COVID Briant - Space Apps Challenge

Human Factors

The emergence and spread of infectious diseases, like COVID-19, are on the rise. Can you identify patterns between population density and COVID-19 cases and identify factors that could help predict hotspots of disease spread?

Test Site Distribution Model

Summary

COVID-19 has hit the world hard, both in terms of health and economically. To get life back to normal safely, ensuring there is enough testing is crucial to help treat, isolate, or hospitalize people who are infected. Until testing becomes more widely available, we want to improve the distribution of testing sites. Using data such as population, infection rate, death rate, and wealth, we hope to identify areas that are deprived of testing, highlight testing sites that accept uninsured citizens, and find where resources can be redistributed so that people can get back out there safely.

How We Addressed This Challenge

Using data on population, infection rate, deaths, and average wealth, we want to find areas with a greater need for testing sites and areas with an abundance of them, so that resources can be diverted to more deprived areas.

How We Developed This Project

What inspired your team to choose this challenge?

We are an international group of programmers who met mostly through Stanford's Code in Place class and united over a common interest in applying our new skills to the fight against COVID-19. Through many virtual meetings and collaboration across 4 different time zones, we worked together to determine which common problems we wanted to solve. Many of our lives were upended by the virus, and we felt that the problem is personal as well as societal. It pains us to see our communities struggling to get the right resources and help that they need to contain this novel virus. We wanted to offer our skill sets and ideas to help healthcare and essential workers combat this pandemic.

We were inspired by this article and wondered if there was a way to use socioeconomic data to address social justice during the pandemic.

What was your approach to developing this project?

We used up-to-date coronavirus testing and case numbers from city/county public health data, geolocation data, and the government census to understand what factors can put an individual and/or community at risk of an outbreak. We then used our data to see whether we could find out which areas need more testing centers in order to control the spread effectively.

Method

Determining Case Sites

We used NASA's SEDAC data to determine where our two case studies would be. Two counties in California were chosen based on two metrics from SEDAC's COVID-19 viewer:

Both counties are in the same quintile of death rate.
One county is in a densely urban area; the other is in a balanced urban/rural area.
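
The two selection criteria above can be sketched in a few lines of Python. This is only an illustration, not the procedure we ran in the SEDAC viewer: the county names, death rates, urban shares, and the thresholds for "dense" and "balanced" are all made-up assumptions.

```python
def quintile(values):
    """Return the quintile (1 = lowest, 5 = highest) of each value by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank * 5 // len(values) + 1
    return result


def candidate_pairs(counties, dense_min=0.9, balanced_range=(0.4, 0.7)):
    """Pair a densely urban county with a balanced urban/rural county
    whenever both fall in the same death-rate quintile.

    The thresholds are hypothetical, chosen only for this example.
    """
    q = quintile([death_rate for _, death_rate, _ in counties])
    low, high = balanced_range
    pairs = []
    for i, (dense_name, _, dense_share) in enumerate(counties):
        for j, (mixed_name, _, mixed_share) in enumerate(counties):
            same_quintile = q[i] == q[j]
            if same_quintile and dense_share >= dense_min and low <= mixed_share <= high:
                pairs.append((dense_name, mixed_name))
    return pairs


# Hypothetical data: (name, deaths per 100k, share of population in urban areas).
COUNTIES = [
    ("County A", 0.4, 0.20), ("County B", 0.9, 0.35),
    ("County C", 1.5, 0.99), ("County D", 2.0, 0.55),
    ("County E", 2.2, 0.95), ("County F", 3.8, 0.92),
    ("County G", 5.1, 0.60), ("County H", 6.0, 0.45),
    ("County I", 9.5, 0.97), ("County J", 12.0, 0.30),
]

print(candidate_pairs(COUNTIES))
```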

Data Collection for Each Case Site

Based on those metrics, San Francisco County and Kern County were chosen. We then pulled data from the US Census and each county's COVID-19 dashboard, taking in COVID-19 data and demographic data by zip code. The data was compiled into a spreadsheet. The data can be found here.

Data Analysis

Once the data was collected, we ran a multivariate regression to determine which demographic factors were most indicative of test rate. These were determined to be population size and the percentage of people without health insurance. Those variables were then used to generate a predicted case rate per zip code. If the current case rate in a zip code was higher than the predicted case rate, that zip code was flagged. A flagged zip code means that testing site capacity might not be sufficient to support the current case rate and that more testing sites are recommended.
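
The predict-and-flag step can be illustrated with a minimal sketch. It assumes an ordinary least squares fit of case rate on the two variables named above; the function names, field names, zip code labels, and numbers are all hypothetical, and the small hand-written solver stands in for whatever regression tool is actually used (in practice a library such as NumPy or statsmodels would be the usual choice).

```python
def fit_ols(rows, targets):
    """Ordinary least squares with an intercept, via the normal equations.

    rows is a list of predictor tuples; returns [intercept, coef_1, ...].
    """
    X = [[1.0, *row] for row in rows]
    k = len(X[0])
    # Augmented system [X^T X | X^T y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coeffs, row):
    """Predicted case rate for one zip code's predictor values."""
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], row))


def flag_zip_codes(records, coeffs):
    """Return zip codes whose observed case rate exceeds the predicted one."""
    return [
        r["zip"]
        for r in records
        if r["case_rate"] > predict(coeffs, (r["population"], r["pct_uninsured"]))
    ]


# Made-up records: population in thousands, percent uninsured, cases per 10k.
RECORDS = [
    {"zip": "ZIP-A", "population": 30, "pct_uninsured": 4.0, "case_rate": 12.0},
    {"zip": "ZIP-B", "population": 45, "pct_uninsured": 9.0, "case_rate": 30.0},
    {"zip": "ZIP-C", "population": 20, "pct_uninsured": 6.0, "case_rate": 15.0},
    {"zip": "ZIP-D", "population": 60, "pct_uninsured": 3.0, "case_rate": 14.0},
    {"zip": "ZIP-E", "population": 35, "pct_uninsured": 12.0, "case_rate": 31.0},
]

coeffs = fit_ols(
    [(r["population"], r["pct_uninsured"]) for r in RECORDS],
    [r["case_rate"] for r in RECORDS],
)
print("Flagged zip codes:", flag_zip_codes(RECORDS, coeffs))
```

Because the model is fitted to the same zip codes it then scores, some zip codes will always sit above the prediction; the flag marks relative need within the county rather than an absolute shortfall.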

Implementation of a Model

Using GeoPandas and Bokeh in a Jupyter Notebook, we created a map of San Francisco County by zip code. Each zip code area is flagged red if the current case rate is higher than the predicted case rate. The map is also interactive, with a drop-down menu that gives policymakers an integrated view of the social characteristics of a zip code.
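
The coloring rule behind the map can be sketched without any mapping library. The example below is not our notebook code: it uses only the Python standard library, hypothetical zip code labels and square outlines, and an assumed gray color for unflagged areas. It produces a GeoJSON string with a per-area fill color, a format that mapping tools such as GeoPandas and Bokeh can read.

```python
import json


def build_flag_geojson(zip_shapes, case_rates, predicted_rates):
    """Build a GeoJSON FeatureCollection with one polygon per zip code.

    zip_shapes maps a zip code to a closed ring of [lon, lat] points.
    An area is colored red when its current case rate is higher than
    its predicted case rate; the gray fallback is an assumption.
    """
    features = []
    for zip_code, ring in zip_shapes.items():
        flagged = case_rates[zip_code] > predicted_rates[zip_code]
        features.append({
            "type": "Feature",
            "geometry": {"type": "Polygon", "coordinates": [ring]},
            "properties": {
                "zip": zip_code,
                "case_rate": case_rates[zip_code],
                "predicted_rate": predicted_rates[zip_code],
                "flagged": flagged,
                "fill_color": "red" if flagged else "lightgray",
            },
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical outlines (unit squares) and rates, for illustration only.
ZIP_SHAPES = {
    "ZIP-A": [[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]],
    "ZIP-B": [[1, 0], [2, 0], [2, 1], [1, 1], [1, 0]],
}
CASE_RATES = {"ZIP-A": 18.0, "ZIP-B": 9.0}
PREDICTED_RATES = {"ZIP-A": 12.5, "ZIP-B": 11.0}

geojson_text = json.dumps(build_flag_geojson(ZIP_SHAPES, CASE_RATES, PREDICTED_RATES))
print(geojson_text[:80])
```

Keeping the flag and the color as properties of each feature means the plotting layer only has to read a column, and the same file can carry the demographic fields shown in the drop-down menu.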

In the future

Currently, the map model is static. The data is from May 31, 2020. In the future, we plan to use APIs to pull in real-time case rate data.
We also only had time to create this program for San Francisco County. In the future, we plan to extend our calculations and mapping to Kern County as well (and hopefully, the rest of the state, and the country! And the world!)

What problems or achievements did your team have?

The biggest problem we faced was coordination: because our team consists of members in several countries, our sleep cycles were out of sync. Another problem was the lack of uniform data reporting among different city/county health departments; some reports were in-depth but still lacked valuable information, e.g. cases by zip code, testing requirements, or a map/directory of testing sites. As for achievements, we are just happy that we finished our project on time and worked as a cohesive unit.

Project Demo

Video Presentation

Project Code

https://github.com/yuyinja/covidbriant

Data & Resources

Data sources:

NASA SEDAC

SEDAC population and urbanization data were used to determine the case study sites. We wanted to choose a densely urbanized county and a more diverse urban/rural county to test our model. We chose San Francisco County and Kern County, both in California.

US Census