Human Factors

The emergence and spread of infectious diseases, like COVID-19, are on the rise. Can you identify patterns between population density and COVID-19 cases and identify factors that could help predict hotspots of disease spread?

Developing a COVID-19 case forecast tool from Earth observation air pollution data

Summary

Air pollution may act as a surrogate marker for adherence to social distancing, a key variable in mitigating SARS-CoV-2 transmission. Our initial analysis of London, a densely populated city, indicated that an increase in social isolation was associated with a decrease in air pollution (NO2), as measured by ground-based and satellite data. An increase in air pollution (Time+0 days) was also associated with an increased number of new COVID-19 cases (Time+7 days, allowing for the average viral incubation time). This pilot study aims to inform indirect modelling of social distancing and COVID-19 incidence, to aid policy makers with medical resource management.

How We Addressed This Challenge

What was our interpretation of the challenge? The first step was team formation, identification of our skill mix, and brainstorming a list of initial objectives for our project. Our team formed organically from members of the Space Generation Advisory Council Space Medicine and Life Sciences (SGAC SMLS) project group (https://spacegeneration.org/projects/smls), and we all had skill sets that overlapped with aerospace medicine and life sciences. As a team, we shared an interest in addressing the Human Factors challenge, as this stream involved identifying whether there was a link between population density and COVID-19 cases to help predict hotspots of disease spread.

Our team skill-mix, rationale and objectives: Our group consisted of six early-career investigators spanning multiple disciplines, including medical physics and engineering, public health, life sciences, space medicine, emergency medicine, intensive care medicine, and microbiology. Three of our members are early-career physicians who are part of the frontline response to COVID-19 in the UK (Birmingham, London) and the USA (New York), and who have seen COVID-19 cases overwhelm a medical system and limit appropriate resource allocation.

As part of our brainstorming session, we set out our key objectives in a shared document, matched each objective to a team member according to skill mix, and set SMART goals. We also touched base at set times over VoIP (Zoom/Hangouts) to discuss our progress.

Our key objectives evolved during the hackathon and can be distilled to the following:

  1. Evaluate whether air pollution (NO2) is a surrogate marker for social distancing (using London, UK as a proof-of-concept study).
  2. Evaluate the impact of air pollution (NO2) at Time 0 on the incidence of COVID-19 cases (using London, UK as a proof-of-concept study).
  3. Use this pilot data to develop a model that uses Earth Observation pollution data to predict the (post-incubation) COVID-19 cases presenting to hospitals and improve resource management (for public health policy makers).

How We Developed This Project

What inspired your team to choose this challenge? Why Covid-19?

The COVID-19 pandemic is constantly evolving and is challenging the global public health infrastructure in an unprecedented fashion. Coronavirus disease 2019 (COVID-19) is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), formerly called HCoV-19. This is a novel virus first identified in Wuhan, China, in December 2019. Subsequent viral sequencing indicates that it is a betacoronavirus closely related to the SARS virus [1]. The SGAC SMLS team, Team Novidien, was inspired to engage in this NASA Space Apps challenge because we wanted to explore the use of space applications and Earth Observation (EO) data.

Team Novidien project rationale: Earth Observation and COVID-19

Our initial literature review indicated that measures such as social distancing have been effective in “flattening the curve”, the curve being COVID-19 caseload and deaths. Another pertinent point was the time lag from exposure to the virus, to first symptoms, to serious manifestation of COVID-19. Understanding population adherence to social distancing measures may allow governments and public health authorities to plan for future “waves” or surges of COVID-19 cases. Adequate resourcing of medical facilities, including staff, ICU capacity, ventilators and personal protective equipment (PPE), should support better outcomes for patients in subsequent waves of COVID-19 across the world.

What was your approach to developing this project? Please see our workflow with our decision tree in Slide 1 on our slideset.

How did you use space agency data in your project? What tools, coding languages, hardware, software did you use to develop your project? Please see slides 2, 3 and 4 and our summary below.

With this approach and our key objectives in mind, we used three main data sources to build our model:

  1. NO2 pollution data from the NASA OMNO2 product (Aura satellite)
  2. Google Community Mobility Reports, for social distancing levels in London
  3. Daily laboratory-confirmed COVID-19 cases and mortality numbers in London, from the NHS

Our analysis started with an evaluation of the NASA OMNO2 data from the Aura satellite. The animated GIFs in Figures 2 and 3 show a distinct decrease in atmospheric NO2 concentrations during the pandemic period compared with the same timeframe one year earlier (Slide 2).

Next, we compared the NO2 data with the Google Community Mobility Reports, which measure social distancing through geographical location data. The satellite-derived atmospheric NO2 concentrations alone did not correlate significantly with increased residence time (Figure 4). When combined with more granular data from LondonAir ground-based stations, however, NO2 concentration showed a negative correlation with residence time: NO2 fell as residence time increased (Figure 5). The Spearman correlation coefficient was -0.31 (confidence interval -0.48 to -0.11), with a p-value of 0.0021.
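For readers who want to reproduce this kind of statistic, here is a minimal sketch of a Spearman rank correlation using only the Python standard library. The toy numbers are hypothetical and are not our project data. The confidence interval uses the Fisher z-transform, which is one common approximation; it is an assumption here and not necessarily the method behind the interval reported above.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rho: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def fisher_interval(r, n, z_crit=1.96):
    """Approximate 95% confidence interval via the Fisher z-transform."""
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)

# Hypothetical toy values (NOT the project's data):
# % change in time spent at home, and mean NO2 concentration per day.
residential_change = [2, 5, 9, 14, 18, 21, 25, 24]
no2 = [48, 45, 47, 39, 35, 36, 30, 28]

rho = spearman(residential_change, no2)
low, high = fisher_interval(rho, len(no2))
print(f"rho={rho:.3f}, approx. 95% CI=({low:.3f}, {high:.3f})")
```

A rank-based statistic suits this comparison because it only assumes a monotonic relationship between residence time and NO2, not a linear one.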

Finally, we attempted to predict caseload and mortality down to the level of local government areas in London using the mapping data in Figure 7. You can view our mapped data (https://bit.ly/3eD9Jak), with layers for local government, hospital facilities, BAME data, and caseload and mortality data. Our aim was to compare mapped pollution data with the COVID-19 caseload 7 days later. For example, our pollution data would start on the 24th February 2020 and link to COVID-19 cases on the 1st March, and so on until the 27th of May. However, we had difficulty determining NO2 levels at the local government level from the data we had access to, so we decided to demonstrate a proof of concept using city-level data for the whole Greater London area. Figure 8 shows that the combined NO2 levels from NASA OMNO2 data and LondonAir ground data correlate positively with COVID-19 caseloads 7 days later. The Spearman correlation coefficient was 0.3449 (confidence interval 0.1332 to 0.5265), with a significant p-value of 0.0014.
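The lagged comparison can be sketched as a simple date join: each day's NO2 value is paired with the case count a fixed number of days later. The function name, the dictionary layout and the toy values below are hypothetical and for illustration only; they are not our actual pipeline or data.

```python
from datetime import date, timedelta

def pair_with_lag(no2_by_day, cases_by_day, lag_days=7):
    """Pair each day's NO2 value with the case count lag_days later.

    Days with no matching case record are skipped.
    Returns a list of (pollution_day, case_day, no2, cases) tuples.
    """
    pairs = []
    for day in sorted(no2_by_day):
        target = day + timedelta(days=lag_days)
        if target in cases_by_day:
            pairs.append((day, target, no2_by_day[day], cases_by_day[target]))
    return pairs

# Hypothetical toy values (NOT the project's data).
start = date(2020, 3, 9)
no2_by_day = {start + timedelta(days=i): v
              for i, v in enumerate([41.0, 39.5, 44.2])}
cases_by_day = {date(2020, 3, 16): 3, date(2020, 3, 17): 5}

pairs = pair_with_lag(no2_by_day, cases_by_day)
for pollution_day, case_day, no2_value, case_count in pairs:
    print(pollution_day, "->", case_day, no2_value, case_count)
```

Joining on calendar dates, not list positions, keeps the pairing correct when either series has missing days.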

This suggests that NO2 levels in an area could be harnessed as a surrogate marker for social distancing and used to predict new COVID-19 cases. The significance of this is that today's NO2 pollution data could help policy makers and hospital administrators throughout the world to understand adherence to social distancing measures. In effect, it could provide a COVID-19 caseload forecast for 7 days or more into the future.

What problems and achievements did your team have? Please see slide 5 

A key limitation of this analysis is the assumption of a 7-day delay between contact and confirmation of a COVID-19 case. There is huge variability in the time from contact with a COVID-19 positive person to first symptoms, then to confirmation of a new COVID-19 case, and then to hospital admission and death. Further analysis could test a range of delays between pollution data and caseload to find the delay that gives the strongest correlation.

We were not able to obtain Intensive Care Unit (ICU) admission data for COVID-19 in time; this would be the most accurate measure of COVID-19 impact. New-case data is inherently limited by testing bias and the lack of testing in asymptomatic cases, and the death rate does not include deaths where COVID-19 was not confirmed. By contrast, 100% of ICU admission data is collected by the Intensive Care National Audit & Research Centre (https://www.icnarc.org/), which provides a much more accurate picture of severe COVID-19 cases. Although the underlying causes of a severe COVID-19 case are multifactorial, including demographic, environmental and resource factors, this data is also a much more important measure for hospital administrators and healthcare organisations, because ICU beds are among the most resource-intensive demands on the health system. Predictions based on this data would therefore be the most relevant to the intended audience.

One way to verify the robustness and accuracy of our findings in future would be to compare our London data with another capital city such as Stockholm, Sweden, where social distancing and the closure of schools and workplaces were not enforced. A preliminary plot we created clearly shows different caseload trends in the two cities, and it will be interesting to compare NO2 data and its relation to COVID-19 caseload. Other city-by-city comparisons could also be carried out, especially to test how the reliability of this approach varies with the density of the city.

The potential of this workflow is a predictive model that lets policy makers and hospital administrators decide on the degree of lockdown and social distancing in response to real-time data on adherence, and that helps with resource management in healthcare systems as new waves of infection emerge.

Project Demo

Our five-slide presentation showcasing our project for the NASA Space Apps Hackathon

Link 1 to Google slides: https://tinyurl.com/teamnovid 

Alternative link: Google slides https://drive.google.com/file/d/11Zvo0ay6Ldq7DEtnH3rnbGH-64M0YQf8/view?usp=sharing 


Data & Resources
  1. NASA OMNO2 data (https://disc.gsfc.nasa.gov/datasets/OMNO2_003/summary)
    1. Data captured with the Aura satellite, showing the distribution of nitrogen dioxide over London and Sweden. Timeframe: Feb 23 - May 27 2020
  2. Google community mobility reports
  3. LondonAir ground NO2 datasets
  4. Daily new COVID-19 cases confirmed by laboratory test from NHS England (https://coronavirus.data.gov.uk/ )
    1. Dates selected: 1st March - 29th May
  5. Daily death numbers from COVID-19 from NHS England
Tags
#SGAC #SMLS #spacemedicine #airpollution #socialdistancing #covid19 #Covidprediction #model #software #solution #novidien #aim4novid
Global Judging
This project was submitted for consideration during the Space Apps Global Judging process.