Awards & Nominations

Sentinellium has received the following awards and nominations. Way to go!

Global Finalist

Human Factors

The emergence and spread of infectious diseases, like COVID-19, are on the rise. Can you identify patterns between population density and COVID-19 cases and identify factors that could help predict hotspots of disease spread?

Sentinellium

Summary

Sentinellium for public health was built with the fact in mind that only a fraction of the Philippine population has internet access. Using SMS and free mobile data for Messenger, we capture more accurate user data, which is then integrated with official reports and NASA's space assets on population density, urbanization, and aerosols. This more robust dataset enables Sentinellium to form more accurate predictions of developing epidemics to aid health authorities, and to profile users' risk better.

How We Addressed This Challenge

SUMMARY INFOGRAPHIC:

https://drive.google.com/open?id=1Zx7y8hZptU_4w7wZ4rhqja5js_cGCXFq


Solving the Internet Problem

To let users participate even without internet access, we used SMS and chatbots as complementary platforms to the web app. Here's the rationale:

  • SMS - anyone with a phone, from a keypad phone to a smartphone, can submit their data to Sentinellium and have their risk assessed through the platform as well
  • Messenger Chatbots - in the Philippines, Facebook's Messenger is free to use even without mobile data (or internet). Hence, it is accessible to most users, since Messenger has a high penetration according to https://ourworldindata.org/internet
  • Web Apps - to provide additional services for other uses, a complementary web app is used, which we plan to leverage in the future for more robust data collection for forecasting other epidemics

Approaching Gap in Data

One of the essential components of an accurate and powerful forecast is a robust, nearly complete dataset. As presented in the literature that the Sentinellium model is built upon, population density and urbanization levels are essential factors in determining the rate of spread of a disease. Aerosol levels, although their role in transmission is debated in the literature, may bring a significant improvement to the model. Since Sentinellium is geared towards monitoring different epidemics, integrating these types of data is crucial.

However, due to challenges in terms of manpower, technology, and geography, population data in the Philippines can be very difficult to access (it can be outdated, have many missing values, or consist of estimates built on older metrics). Furthermore, measuring urbanization level is very difficult. To close this gap, Sentinellium leveraged space assets, which provide high-quality estimates of population density, accurate and varied measures of urbanization level using satellite imagery, and aerosol levels.

Aiding Health Units with Predictions and Prescriptions

Data analytics and predictions are good and powerful, and the health workers we talked to who have worked in local health units acknowledge this. However, one key challenge they often face is having to run their own analysis on a widely entangled dataset, which consumes time better dedicated to mitigation. Oftentimes, they rely on the national authority's macroscopic advice and then adjust it on their own to localize their efforts. Sentinellium offers local health units an easy-to-access, localized, and tailored analysis. Furthermore, to aid them in responding faster, Sentinellium provides insights and prescriptions on how to optimize the response: which hotspots are best addressed first, which geographic locations are considered safe for the population, and how rapid the response needs to be before the outbreak worsens!
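One way to picture the "which hotspot first, and how fast" prescription is the rough Python sketch below. It is not Sentinellium's actual logic: it assumes simple exponential growth, and the function names, the `threshold` (for example, a local case load that a health unit can still handle), and the tuple layout are all hypothetical choices made for illustration.

```python
import math


def days_until(current_cases, threshold, doubling_time_days):
    """Days until cases reach `threshold`, assuming pure exponential growth.

    Assumption: cases double every `doubling_time_days`; real outbreaks
    deviate from this, so treat the result as a rough urgency indicator.
    """
    if current_cases <= 0 or doubling_time_days <= 0:
        raise ValueError("current_cases and doubling_time_days must be positive")
    if current_cases >= threshold:
        return 0.0  # threshold already reached
    return doubling_time_days * math.log2(threshold / current_cases)


def prioritize(areas, threshold):
    """Sort areas so the one closest to the threshold comes first.

    `areas` is a list of (name, current_cases, doubling_time_days) tuples,
    a hypothetical layout used only in this sketch.
    """
    return sorted(areas, key=lambda a: days_until(a[1], threshold, a[2]))
```

For example, with a threshold of 400 cases, an area with 100 cases doubling every 5 days has about 10 days of lead time, while an area with 200 cases doubling every 2 days has only about 2, so the second would be ranked ahead of the first.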

How We Developed This Project

EARLY WARNING SYSTEM FOR EPIDEMICS

In modelling the spread of a disease, an epidemiological approach can sometimes yield a far more meaningful result through a simpler methodology than implementing advanced machine learning algorithms. As such, two main approaches are taken in developing the forecast of an epidemic based on existing data:

  1. Epidemiological Approach - Building on Tomas Pueyo's study published at https://medium.com/@tomaspueyo/coronavirus-act-today-or-people-will-die-f4d3d9cd99ca, part of Sentinellium's predictive power comes from using historical data on the number of deaths, the estimated mortality rate (which is updated in near real time as more data comes in), temporal constraints (incubation period, time from symptom onset to death, etc.), and how fast the virus spreads. The initial values used for these estimates come from the related literature presented at the end of this description.
  2. Exploration of Transmission and Socio-economic Data - As mentioned, modelling studies of epidemics show that how the population is spread, the level of human activity, and the movement of the air play a crucial role in the development of a health crisis. As such, Sentinellium built on the following research:

  3. Future Implementation - During the lockdown, the Philippine government issued an order banning liquor. The rationale was that drinking in the country is a social activity that brings a lot of people together. Culturally, various social circles spend time together at night through activities such as drinking, karaoke, etc. The group wants to explore how nighttime lights can be used to measure the level of human activity in relation to ongoing outbreaks. Measuring how strictly social distancing measures are implemented is challenging at the granular level. As such, the team felt that nighttime datasets offer a better chance of doing so.
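The epidemiological approach in item 1 can be illustrated with a small Python sketch of the back-calculation described in the cited article, as we read it: deaths observed today imply a number of infections some days ago, which is then projected forward using the doubling time. The function and parameter names are hypothetical, and this is a simplified illustration rather than Sentinellium's actual model.

```python
def estimate_current_cases(deaths_today, fatality_rate,
                           days_infection_to_death, doubling_time_days):
    """Estimate today's true case count from cumulative deaths.

    Assumptions (illustrative, not taken from the project):
    - people dying today were infected `days_infection_to_death` days ago;
    - cases have doubled every `doubling_time_days` since then.
    """
    if not 0 < fatality_rate <= 1:
        raise ValueError("fatality_rate must be in (0, 1]")
    if doubling_time_days <= 0:
        raise ValueError("doubling_time_days must be positive")

    # Cases that existed when today's fatalities were first infected.
    cases_then = deaths_today / fatality_rate

    # Number of doublings between that point in time and today.
    doublings = days_infection_to_death / doubling_time_days

    return cases_then * 2 ** doublings
```

With placeholder inputs of 1 death, a 1% fatality rate, 14 days from infection to death, and a 7-day doubling time, the estimate is about 400 current cases. In practice each input would be updated as new data arrives, as described above.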


Challenge Encountered

Most of our team are data explorers used to dealing with datasets other than satellite data. As a result, we had to wrestle with how to treat and interpret the data and how to integrate them as features of the models we were building. In the end, we explored only half (or maybe even just a quarter) of the NASA datasets for modelling! However, we are more than excited to see it through, as we were thrilled by the realization that there is so much more to learn about data! The world is filled with it, and the organizers of the Space Apps Challenge and their data give us easy access to exploration!


RISK PROFILING OF USERS

Since users can access Sentinellium's assessment, we made sure to provide value to users on each of the platforms accessible to them. After submitting their data, users are served a risk profile based on their input data, current information on epidemic development near their location, and a weight from Sentinellium's model forecast of how the disease will spread in the user's location over the next 30 days.
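A minimal Python sketch of how the three inputs named above might be combined into one score follows. The weights, the 0-to-1 scaling of each input, and the cut-offs for the risk levels are all assumptions made for illustration; the project description does not state the actual values.

```python
from dataclasses import dataclass


@dataclass
class RiskInputs:
    self_report: float      # 0..1, derived from the user's submitted data
    local_situation: float  # 0..1, current epidemic activity near the user
    forecast_30d: float     # 0..1, model forecast for the user's location


def risk_score(inputs, weights=(0.5, 0.3, 0.2)):
    """Weighted average of the three components (weights are hypothetical)."""
    values = (inputs.self_report, inputs.local_situation, inputs.forecast_30d)
    if any(not 0.0 <= v <= 1.0 for v in values):
        raise ValueError("all inputs must be scaled to the range 0..1")
    total_weight = sum(weights)
    # Dividing by the total keeps the score in 0..1 for any positive weights.
    return sum(w * v for w, v in zip(weights, values)) / total_weight


def risk_level(score):
    """Map a 0..1 score to a label; the cut-offs are illustrative only."""
    if score >= 0.66:
        return "high"
    if score >= 0.33:
        return "moderate"
    return "low"
```

A single numeric score with a short label suits the SMS channel, where the reply has to fit in a brief text message.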

Now, risk assessment is accessible to everyone with access to a device as basic as a keypad phone!

Model Training Stage

A. Initial Data Analysis. The model needs a dataset for training. The main challenge here is surveying a sample population in an unbiased and efficient way. After data gathering, cleaned data exported from a data repository will be loaded into Python to prepare for model creation. Since the model may use multiple features, it also helps to use correlation maps to visualize how the features correlate with one another. Simple clustering can also help in understanding the data.

B. Exploratory Data Analysis. Multiple features may vary with each other. In this stage, the correlations and relationships between features will be tested, and related features will be combined.

C. Feature Selection. Only features that are significant or that show positive correlation will be selected. This may be an iterative process, since a good model requires data that are processed and analyzed well.

D. Algorithm Selection. The algorithm chosen is clustering. Clustering can also be used to classify results.

E. Model Building. Well-selected features fitted with the right clustering algorithm should produce a more accurate model.

F. Model Validation. To validate the accuracy of the model, various techniques may be used, such as a confusion matrix or cross-validation.

G. Model Optimization. There may be ways to optimize the model to improve accuracy, such as dimensionality reduction.

H. Model Deployment. The best model will then be deployed.
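Two of the stages above, checking correlation between features and fitting a clustering model, can be sketched in plain Python. The stages do not name a specific clustering algorithm, so k-means (Lloyd's algorithm) is used here purely as an assumption. Taking the first k points as starting centroids is a simplification to keep the sketch deterministic, and all function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length feature columns."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column carries no correlation signal
    return cov / (spread_x * spread_y)


def kmeans(points, k, iterations=20):
    """Cluster `points` (tuples of floats) into k groups.

    Returns (labels, centroids). Initial centroids are the first k points,
    which is a simplification; random or k-means++ seeding is more usual.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

Calling `pearson` on every pair of feature columns yields the values a correlation map would display, and `kmeans` returns one cluster label per row, which could then be compared against known outcomes during validation.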


Challenge Encountered

Since Sentinellium is adamant about capturing as many of the free entry points as possible, the data inflow is decentralized. However, the constituents (the citizen users and the local health units) still need to be served. Thus, the development team pushed through with managing the data flow through APIs for both SMS and chatbots. It was challenging, as it demanded a lot of work, but it was exciting to see it slowly take shape!


DEVELOPING SENTINELLIUM PLATFORM

There are three sources of data: SMS, chatbots, and the web app. This posed a great challenge for the development team. We first did thorough research on which software to use. We concluded that it is better to use Node.js because it is more accessible and makes it easier to integrate various APIs, such as the Facebook Messenger API for the chatbot and the Twilio API for SMS.
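To illustrate the idea of funnelling several channels into one data flow, here is a small sketch of normalizing reports into a single record type. It is written in Python for brevity, although the platform itself is described as Node.js-based. The record fields, the `LOCATION; symptom, symptom` SMS format, and the chatbot payload shape are all hypothetical; they are not the Twilio or Messenger webhook formats.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class SymptomReport:
    channel: str        # "sms", "chatbot", or "web"
    user_id: str        # phone number or platform user id
    location: str
    symptoms: list
    received_at: datetime


def _clean(symptoms):
    """Lower-case and trim symptom names, dropping empty entries."""
    return [s.strip().lower() for s in symptoms if s.strip()]


def from_sms(sender, body, received_at=None):
    """Parse an SMS body in the assumed form 'LOCATION; symptom, symptom'."""
    location, _, symptom_part = body.partition(";")
    return SymptomReport(
        channel="sms",
        user_id=sender,
        location=location.strip(),
        symptoms=_clean(symptom_part.split(",")),
        received_at=received_at or datetime.now(timezone.utc),
    )


def from_chatbot(payload, received_at=None):
    """Build a report from an already-extracted chatbot payload (a dict)."""
    return SymptomReport(
        channel="chatbot",
        user_id=payload["user_id"],
        location=payload["location"].strip(),
        symptoms=_clean(payload.get("symptoms", [])),
        received_at=received_at or datetime.now(timezone.utc),
    )
```

Once every channel produces the same record type, the forecasting and risk-profiling steps can consume reports without caring where they came from.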

Experience the prototype demo yourself here! https://www.figma.com/file/Wz0Gm0f6Mpa5Yuwl6vKSBm/Sentinellim?node-id=0%3A1

The detailed elements of the prototype are documented here:

Techstack: HTML/CSS/ReactJS/NodeJS/Android Studio/Dialogflow

Mobile: https://drive.google.com/open?id=1ChJitnuQZ4z8vy5eT88zEF8A_EDfaaVp

Web/Desktop: https://drive.google.com/open?id=19_Rlm8Gjf4ZBgIgQ_0Ew0boNfleR2Tx7

Data & Resources

DATASETS EXPLORED:

Space Assets

User-Generated Data

REVIEW OF RELATED LITERATURE

Tags
#covid19 #SentinelliumPH #HumanFactors #epidemiology #PredictingEpidemics #InclusiveSolution
Global Judging
This project was submitted for consideration during the Space Apps Global Judging process.