Our team looked at population density changes caused by Category 5 Hurricane Michael (2018) and at recent COVID-19 hotspots to predict which areas are most likely to experience an outbreak during an evacuation. By analyzing nighttime imagery from NASA's Earth Observing System Data and Information System (EOSDIS) Worldview dataset from before and after Hurricane Michael, we predicted which neighboring counties would see a notable influx of people displaced by an evacuation in 2020. We then used the Johns Hopkins University COVID-19 cases dataset to identify hotspots in Florida and neighboring states. Finally, we recalculated the predicted number of cases for each county given the predicted change in population. Residents who may need to evacuate can refer to our predictions to help decide where to go to best avoid contracting and transmitting COVID-19.
Our team chose this challenge because it is important to consider the impact of concurrent disasters. We narrowed our focus to the coastal areas most impacted and chose the Florida Panhandle. By analyzing the NASA EOSDIS data from before and after Hurricane Michael, we were able to track the migration of evacuees. We then developed a model that projects the number of COVID-19 cases for each county based on its current cases and the population increase found through the NASA EOSDIS analysis.
Source Code and Analysis Algorithm
Our model parses the different datasets, organizes them into hash tables, and projects the cases for a given county using simplified assumptions. The main data structures we used for data parsing were classes and hash tables.
We developed an Image class, which takes GeoTIFF pixel data from before and after the hurricane and calculates a percentage difference in light intensity. This is what we used as a potential indicator of the population changes. In this class, we also converted the latitude and longitude axes provided in the metadata to a normalized x-y scale with (0, 0) as the origin. The percentage change in light intensity for each county was stored in a hash table, which is used in developing our model.
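As an illustration, below is a minimal Python sketch of how the Image class could compute a per-county percentage change in light intensity. The county pixel regions, the array shapes, and the use of NumPy are assumptions made for the example; the real class reads the GeoTIFF pixel data and metadata described above.

```python
import numpy as np

class Image:
    """Minimal sketch of the Image class: compares nighttime-light pixel data
    from before and after the hurricane and normalizes the lat/long axes."""

    def __init__(self, before_pixels, after_pixels, lat_axis, lon_axis):
        self.before = np.asarray(before_pixels, dtype=float)
        self.after = np.asarray(after_pixels, dtype=float)
        # Normalize the latitude/longitude axes from the GeoTIFF metadata to an
        # x-y scale with (0, 0) as the origin.
        self.x = np.asarray(lon_axis) - np.asarray(lon_axis)[0]
        self.y = np.asarray(lat_axis) - np.asarray(lat_axis)[0]

    def percent_change(self, rows, cols):
        """Percentage change in mean light intensity over a pixel region."""
        before = self.before[rows, cols].mean()
        after = self.after[rows, cols].mean()
        return (after - before) / before * 100.0 if before else 0.0

# Hypothetical pixel regions per county; real bounds would come from the metadata.
county_regions = {"Bay": (slice(10, 40), slice(5, 35)),
                  "Gulf": (slice(50, 90), slice(60, 110))}

def light_change_table(image, regions):
    """Hash table of county -> percentage change in light intensity."""
    return {county: image.percent_change(rows, cols)
            for county, (rows, cols) in regions.items()}
```

The resulting county-keyed table is what the model consumes as its light-data input.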
We also developed a Data class, which parses the JHU dataset containing COVID-19 data for the US since the start of the pandemic. We parsed the data into three hash tables that contain the latitude and longitude of each county, a vector of cases over time for a given county, and the population of each county. These three tables are accessed in the driver using the county as the key.
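The sketch below shows one way the Data class could build those three hash tables (Python dictionaries) from the JHU CSSE US time-series CSV. The column names (Admin2, Lat, Long_, Population) and the date-column detection are assumptions based on the public JHU file layout; in particular, the confirmed-cases file does not carry a Population column, so that table may instead come from the deaths file or the census dataset listed below.

```python
import csv

class Data:
    """Minimal sketch of the Data class: parses a JHU CSSE US time-series CSV
    into three hash tables keyed by county name."""

    def __init__(self, csv_path):
        self.locations = {}    # county -> (latitude, longitude)
        self.cases = {}        # county -> list of cumulative case counts over time
        self.populations = {}  # county -> population estimate
        with open(csv_path, newline="") as f:
            reader = csv.DictReader(f)
            # Date columns in the JHU files look like "3/15/20".
            date_cols = [c for c in reader.fieldnames if c.count("/") == 2]
            for row in reader:
                county = row.get("Admin2", "")
                if not county:
                    continue
                # Note: county names repeat across states; the real driver may
                # need a (state, county) key instead.
                self.locations[county] = (float(row["Lat"] or 0),
                                          float(row["Long_"] or 0))
                self.cases[county] = [int(row[d] or 0) for d in date_cols]
                self.populations[county] = int(row.get("Population", "0") or 0)
```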
We created a model that takes in the light-data hash table, the location hash table, and the COVID-19 cases hash table. From these inputs we get the percentage population increase found from the NASA EOSDIS analysis and the percentage of each county's population with COVID-19. From these percentages, we project the number of COVID-19 cases in a county when evacuees are added to its population following a hurricane, as sketched below.
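A minimal sketch of that projection step follows. The exact formula is an assumption: it treats the percentage increase in light intensity as the percentage increase in population and assumes incoming evacuees carry the county's current infection rate. The location table, which the full model uses to match counties across datasets, is omitted here for brevity.

```python
def project_cases(light_change, cases, populations, county):
    """Project cases for a county after a hurricane-driven influx of evacuees.

    Assumes light_change[county] is the percentage change in light intensity,
    cases[county] is the cumulative case time series, and populations[county]
    is the county population.
    """
    population = populations[county]
    current_cases = cases[county][-1]                     # latest cumulative count
    case_rate = current_cases / population                # fraction of residents with COVID-19
    pop_increase = max(light_change[county], 0) / 100.0   # only treat influxes as migration
    evacuees = population * pop_increase
    # Assumed projection: evacuees arrive carrying the county's current case rate.
    return current_cases + evacuees * case_rate

# Hypothetical numbers: a county of 75,000 people with 1,500 cases (2%) and a
# 12% light increase projects 1,500 + 9,000 * 0.02 = 1,680 cases.
```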
Challenges Faced
A challenge that we knew would be innate to this situation was working as a remote team. To mitigate this, we met Friday evening to discuss project ideas as well as our plan of attack for the weekend. We set aside a couple of hours in the morning and afternoon/evening of both days to ensure we had time to tag up, discuss what we had done and what still needed to be completed, and have collaborative time. We worked a lot on Google Meet, where we could share our screens. We found that Visual Studio Code had a Python environment as well as Live Share, which let multiple people work on a program at the same time. These steps ensured that everyone was aware of progress made and could work with others more easily.
Another challenge was learning Python. While all team members have a programming background, our experience with Python was mixed: some had working knowledge and others had none. This made creating the analysis program difficult, as it took time to learn how to use the analysis packages. While the programming took longer than expected, it was a great learning experience.
We also struggled with navigating the datasets and understanding which resources would be best to use. We consulted experts but got conflicting advice; for example, an expert indicated that a given resource would be helpful, but for our project it was not. In addition, many of the resources were interactive maps, and it was difficult to find downloadable data that we could parse. We spent a lot of time digging through resources and asking people for guidance before finding sources we could use.
Finally, nobody on our team had a machine learning background. We discussed using a machine learning algorithm, but figured it would take too much time to learn how to build an accurate, efficient one within the short timeframe of this challenge, which is why we used the simplified algorithm in our analysis. If we were to continue this project past this weekend's challenge, we would refine the algorithm for identifying hotspots after migration using machine learning. We would train it on past migration data and on trends in how COVID-19 spreads with population and population density, then apply it to estimate the increase in COVID-19 cases after hurricane migration.
Project Presentation: https://docs.google.com/presentation/d/1umy9VOWcWKCe6Ka_Ou2M7O6jhsnza_j0WgQrYrZSE8s/edit?usp=sharing
NASA’s Earth Observing System Data and Information System: worldview.earthdata.nasa.gov
Johns Hopkins University COVID-19 Data: github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data
US Census County Population Totals 2010-2019: https://www.census.gov/data/datasets/time-series/demo/popest/2010s-counties-total.html#par_textimage_70769902