An Integrated Assessment

Your challenge is to integrate various Earth Observation-derived features with available socio-economic data in order to discover or enhance our understanding of COVID-19 impacts.

AID PACT: Artificial Intelligence Derived Predictive Analysis of COVID-19 over Time

Summary

Our interactive map displays current and AI-predicted future COVID-19 cases and unemployment rates over time. We trained a Bayesian Ridge Regression model on a combination of Earth Observation-derived features and socioeconomic data. By providing these location-based predictions, we hope to help the public make better-informed decisions regarding safety.

How We Addressed This Challenge

Link to our website: https://hexhax.herokuapp.com/

Our predictive model integrates Earth Observation-derived features like population density and global human modification with socio-economic data including city accessibility, Gini coefficient (the measurement of wealth inequality), and GDP (gross domestic product) to predict future cases of COVID-19 through machine learning.

By mapping current and predicted cases on an international scale, this application could help global health agencies distribute personal protective equipment (PPE), ventilators, gurneys, and other necessary medical resources to the states and countries expected to be hit hardest by COVID-19. Our AI predictions of future cases could help government leaders make better-informed decisions about lifting or extending lockdown and social distancing regulations. Informing the public about potential spread will help ensure that they are protecting themselves appropriately. At the same time, predictions of high case numbers may encourage lawmakers to consider introducing or increasing stimulus checks and other forms of relief.

How We Developed This Project

As COVID-19 spreads throughout the world, individuals of lower socioeconomic status are disproportionately affected. With unprecedented unemployment filings and increasing COVID-19 cases in the United States, our team was interested in understanding how both environmental and socio-economic factors affect the spread of COVID-19.

We used 13 datasets, including COVID-19 cases, global human modification, unemployment data, and accessibility to cities. We also used population density data obtained from NASA. These datasets were used to train our Bayesian Ridge Regression machine learning model, which predicts the future spread of the virus and changes in unemployment. Our model was developed in Python using Google Earth Engine and Google Colab. To visualize our data and predictions, we built a website that makes the information easily accessible. We used repl.it to develop the website collaboratively in HTML, CSS, and JavaScript. Our interactive map was built with Leaflet, and the charts were rendered using Chart.js.

One problem we faced was choosing the model for predictive analytics. We needed a model simple enough to implement in time but robust enough to be accurate. After much discussion of methods ranging from K-Means Clustering to Principal Component Analysis, we decided on Bayesian Ridge Regression. A separate regression was fit for each time-varying quantity (cases, deaths, and testing). Each regression predicts the next day’s value, and that prediction is then fed back in to predict the following day, building up a “forecast.” Although these predictions become less confident the further they are extrapolated, the model can be refit on the most recent data so that forecasts stay as accurate as possible.
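The day-by-day forecasting loop described above could be sketched roughly as follows. This is a minimal illustration rather than our actual code: it assumes scikit-learn's BayesianRidge, uses only the previous day's value as the input feature, and the function name recursive_forecast and the case counts are made up for the example.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge


def recursive_forecast(series, horizon):
    """Fit value[t+1] ~ value[t], then roll predictions forward `horizon` days.

    Hypothetical helper: the single-lag feature is an assumption.
    """
    values = np.asarray(series, dtype=float)
    X = values[:-1].reshape(-1, 1)  # value on day t
    y = values[1:]                  # value on day t + 1

    model = BayesianRidge()
    model.fit(X, y)

    forecasts, stds = [], []
    current = values[-1]
    for _ in range(horizon):
        # return_std=True also returns the predictive standard deviation
        mean, std = model.predict(np.array([[current]]), return_std=True)
        forecasts.append(float(mean[0]))
        stds.append(float(std[0]))
        current = mean[0]  # feed the prediction back in as the next input
    return forecasts, stds


# Made-up cumulative case counts for one region
cases = [100, 120, 145, 175, 210, 250, 300, 360]
forecast, uncertainty = recursive_forecast(cases, horizon=3)
```

Note that the standard deviation returned at each step describes a single one-step prediction given its input. It does not account for the error that accumulates when predictions are fed back in, so the true uncertainty of later days is larger than these numbers suggest.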

Our front-end team encountered a few issues with the evenness of the website’s layout, specifically in the “Meet the Team” section. Since the amount of information varied from member to member, choosing the right height for the member cards was crucial. Using the <br> tag caused member cards to have different heights when the window was resized. Our solution was a media query with a max-width condition that changes the card heights once the window is narrower than a certain threshold.

Overall, we are very proud of what we accomplished, especially the pop-up sidebar on our map. When a state or country is clicked, the selected region is highlighted and the sidebar shows three tabs, each with a different set of graphs or data.

With more time, we plan to improve the accuracy of our predictions by factoring more datasets into our model, including data from past epidemics such as SARS and current climate data like humidity and temperature. (We had planned to integrate climate data, but it was formatted as pixels over a map rather than as aggregate values per country, which made it difficult to classify. Given the limited timeframe, we were unable to use it.) Additionally, we would like to adjust our model to predict based on the trend of cases rather than only the previous day. We would further train our model by examining how COVID-19 case trends in recovering countries correlate with environmental and socioeconomic factors, in order to better predict the future curves of affected countries.
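One possible way to turn pixel-based climate data into per-country values is to average the pixels that fall inside each country. The sketch below is only an illustration of that idea using NumPy. It assumes a second grid of the same shape that labels each pixel with a country ID, and the function name zonal_mean and all values are hypothetical.

```python
import numpy as np


def zonal_mean(values, region_ids):
    """Average a pixel grid per region.

    values:     2-D array of pixel values (NaN marks missing data)
    region_ids: 2-D integer array of the same shape, giving the region
                each pixel belongs to (0 means "no region")
    """
    values = np.asarray(values, dtype=float)
    region_ids = np.asarray(region_ids)
    if values.shape != region_ids.shape:
        raise ValueError("grids must have the same shape")

    # Keep only pixels that have data and belong to a region
    valid = ~np.isnan(values) & (region_ids > 0)
    ids = region_ids[valid]
    vals = values[valid]

    sums = np.bincount(ids, weights=vals)  # per-region sum of pixel values
    counts = np.bincount(ids)              # per-region number of pixels
    return {int(i): float(sums[i] / counts[i]) for i in np.unique(ids)}


# Made-up temperature grid and country labels (1 and 2)
temperature = np.array([[10.0, 12.0, 30.0],
                        [14.0, np.nan, 32.0]])
countries = np.array([[1, 1, 2],
                      [1, 1, 2]])
means = zonal_mean(temperature, countries)
# country 1: (10 + 12 + 14) / 3 = 12.0, country 2: (30 + 32) / 2 = 31.0
```

This simple average treats every pixel equally. On a latitude-longitude grid, pixels cover less area toward the poles, so an area-weighted mean would be more faithful for large countries.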

We would further increase accuracy by harnessing data and displaying information on regions as specific as counties rather than just countries and states. Moreover, we would expand our predictions to more types of COVID-19 impacts, such as unemployment, climate change, and global trade. In order to make this project more sustainable, we would also program the website to read in new data and update its predictions daily.

Tags
#artificial_intelligence #covid-19 #global_human_impact #socio-economic #gini_coefficient #global_population_index #unemployment #medical_capacity #PPE #predictive #machine_learning #data_science #web_development #google_earth_engine #global_human_modification #gdp_per_capita #global #accessibility #map_visualization #python
Global Judging
This project was submitted for consideration during the Space Apps Global Judging process.