Bee Data | SDGs and COVID-19

SDGs and COVID-19

This challenge invites you to analyze the impact of COVID-19 on the United Nations (UN) Sustainable Development Goals (SDGs) by looking at the current and ongoing change in the monitoring indicators of the UN SDGs using Earth observation/remote sensing and global Earth system model-derived analysis products.

Data homogenization for a better understanding of the effects of COVID-19 on SDGs

Summary

This project addresses the wide and heterogeneous variety of data available for analysis that can be used to measure progress towards the SDGs. It describes a tool that integrates data from many different sources into a single function of time, with an output as simple as a number that describes the status of a given SDG (or target) at any given moment.

How We Addressed This Challenge

The Problem

A wide variety of data sets is available for anyone to explore, and they contain a great deal of data that can be used to estimate progress toward a specific SDG or, more specifically, toward a given target. This enormous volume of data, however, poses a couple of problems for objective analysis:

The available data is heterogeneous; there are many different data formats, representing different kinds of data, mainly images and numbers. In addition, the data is not always easy to access, and the websites presenting it are very dissimilar from one another. One has to spend a good deal of time on each new website just trying to understand where the data or the API access information even is.
The connection of a given data set to a specific SDG or target is not always clear; the targets in the SDG framework are given certain indicators, but one could argue that those indicators are not representative enough, or that new and different indicators could better reflect the status of a given target.

The Proposed Solution

At the end of the day, the overall progress towards a specific SDG or target is best expressed as a single number within a given range of possible values. This is true not only of SDGs but of any objective goal system. This is not to say that qualitative analysis isn't useful: in fact, when the time comes to take action towards improving our position regarding a given goal, the qualitative analysis may be even more important than the numbers themselves. But for the goal system to be objective, and to be able to answer yes or no to the question "did we get better?", a number is needed that can be tracked over time and compared to earlier or later results.

This number (which we will call the "score" from here on) can be calculated from many different inputs; hence a function getScore will exist which takes those inputs and returns the score for a given target.

Because the notion of how much a given input actually tells you about a specific target is rather subjective, we believe it is a good idea to make getScore a sum of products, in which each input (from here on "pre-score") is multiplied by a coefficient that represents how important that pre-score is for a specific target. All the coefficients must sum to 1.
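The sum of products described above can be sketched in a few lines. This is a minimal illustration rather than the project's actual code: the snake_case name get_score stands in for getScore, and the input values are made up.

```python
import math
from typing import Sequence


def get_score(pre_scores: Sequence[float], coefficients: Sequence[float]) -> float:
    """Combine pre-scores into one score as a sum of products."""
    if len(pre_scores) != len(coefficients):
        raise ValueError("each pre-score needs exactly one coefficient")
    if not math.isclose(sum(coefficients), 1.0, abs_tol=1e-9):
        raise ValueError("coefficients must sum to 1")
    return sum(p * c for p, c in zip(pre_scores, coefficients))


# Hypothetical values: three pre-scores weighted 50%, 30% and 20%.
score = get_score([0.8, 0.4, 1.0], [0.5, 0.3, 0.2])
```

If the coefficients are also non-negative and every pre-score lies between 0 and 1, the resulting score stays between 0 and 1 as well, so scores for different targets remain comparable.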

For this to work, the functions that calculate the pre-scores need to share the same range of output values; for practical purposes, we decided to make this range go from 0 to 1, with 0 meaning bad and 1 meaning good.
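The text does not say how a raw indicator is mapped onto the 0 to 1 range. One simple option, shown here purely as an assumption, is a linear rescaling between a "worst" and a "best" reference value chosen by the analyst, with clamping for values outside that interval. The function name to_pre_score and the numbers are hypothetical.

```python
def to_pre_score(value: float, worst: float, best: float) -> float:
    """Map a raw indicator value linearly onto 0 (bad) .. 1 (good).

    `worst` and `best` are reference values chosen by the analyst; `worst`
    may be larger than `best` for indicators where lower is better.
    """
    if worst == best:
        raise ValueError("worst and best must differ")
    fraction = (value - worst) / (best - worst)
    return min(1.0, max(0.0, fraction))  # clamp values outside the interval


# Hypothetical indicator where lower is better: 80 counts as bad and 20 as
# good, so a reading of 35 lands three quarters of the way towards good.
pre_score = to_pre_score(35.0, worst=80.0, best=20.0)
```

Other mappings (logarithmic, threshold-based) would work just as well, as long as every pre-score ends up on the same 0 to 1 scale.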

Now, as we said before, the available data sets are very dissimilar from one another in terms of format and access methods. Because of this, we decided to abstract the task of accessing the data and calculating the pre-scores into an object-oriented interface, IPreScoreProvider, with a single function called getPreScore, which takes a timestamp and returns the value of that pre-score (remember, a value between 0 and 1) for that moment.

Each implementation of IPreScoreProvider must take care of all the details of accessing the data, such as handling API keys, parsing the UI of a website that provides no API, or downloading a file from an FTP server, and then of computing the pre-score for the moment requested by the caller.
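As a rough sketch of this abstraction, the interface can be expressed as a Python Protocol. The names PreScoreProvider and get_pre_score mirror IPreScoreProvider and getPreScore; InMemoryProvider and its sample values are hypothetical and exist only to make the example runnable without any network access.

```python
from bisect import bisect_right
from datetime import datetime
from typing import Mapping, Protocol


class PreScoreProvider(Protocol):
    """Counterpart of IPreScoreProvider: timestamp in, value in 0..1 out."""

    def get_pre_score(self, timestamp: datetime) -> float: ...


class InMemoryProvider:
    """Hypothetical provider backed by pre-loaded (timestamp, pre-score) samples.

    A real provider would hide API keys, scraping or FTP downloads in here.
    """

    def __init__(self, samples: Mapping[datetime, float]) -> None:
        if not samples:
            raise ValueError("at least one sample is required")
        self._times = sorted(samples)
        self._values = [samples[t] for t in self._times]

    def get_pre_score(self, timestamp: datetime) -> float:
        # Return the most recent sample at or before the requested moment.
        index = bisect_right(self._times, timestamp) - 1
        if index < 0:
            raise LookupError("no data available at or before the requested time")
        return self._values[index]


# Made-up samples, for illustration only.
provider: PreScoreProvider = InMemoryProvider(
    {datetime(2020, 1, 1): 0.9, datetime(2020, 4, 1): 0.6}
)
value = provider.get_pre_score(datetime(2020, 5, 15))
```

Callers depend only on get_pre_score, so a provider that reads from a web API could replace InMemoryProvider without any change on the calling side.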

This frees analysts from worrying about how to get and combine the data; they simply choose a set of pre-score providers, give each of them the importance they believe it has for a specific target, and see the overall score evolve over time.
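The analyst's workflow could look roughly like the sketch below. To keep it short, a provider is modelled as a plain function from timestamp to pre-score; score_over_time, provider_a, provider_b and all values are hypothetical and do not represent real measurements.

```python
from datetime import datetime
from typing import Callable, List, Sequence, Tuple

# A provider is modelled as a function: timestamp in, pre-score (0..1) out.
Provider = Callable[[datetime], float]


def score_over_time(
    weighted_providers: Sequence[Tuple[Provider, float]],
    timestamps: Sequence[datetime],
) -> List[float]:
    """Evaluate the weighted sum of all providers at each timestamp."""
    total_weight = sum(weight for _, weight in weighted_providers)
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return [
        sum(weight * provider(t) for provider, weight in weighted_providers)
        for t in timestamps
    ]


# Two made-up providers whose values change on an arbitrary date.
def provider_a(t: datetime) -> float:
    return 0.5 if t < datetime(2020, 3, 1) else 0.75


def provider_b(t: datetime) -> float:
    return 0.75 if t < datetime(2020, 3, 1) else 0.25


series = score_over_time(
    [(provider_a, 0.25), (provider_b, 0.75)],
    [datetime(2020, 1, 1), datetime(2020, 6, 1)],
)
```

The resulting list has one score per timestamp and can be handed directly to a plotting library to show the score's evolution.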

The implementation presented today is a simple console application that serves as a proof of concept (PoC) for the homogenization of the data. The full solution, however, would ship with a UI in which users navigate a gallery of providers offering pre-scores from many different data sets, select as many of those providers as they want, and assign a weight to each one to combine the data into a single result. That result would be displayed in a plot showing the evolution of the score over time. The plot can include specific highlights in the timeline, such as the first case of COVID-19, the first death, the first country to lock down, etc. A mockup displaying this idea is now hosted at https://beedata.us/ .

How We Developed This Project

We chose this challenge because we wanted to help society understand the effects of COVID-19 on the SDGs. We believe the SDGs are great goals to pursue for better human development and better management of the planet's resources.

As mentioned above, one of the problems we encountered was the great heterogeneity of the data, so we decided to encapsulate the logic of accessing it. The idea is that the analyst won't have to worry about data homogenization but will instead work with data already converted to the same format. We use relevant data as an example to show the practicality of the project.

Initially, we used C# and the open data of different organizations, including NASA, to develop the first part of the project, which includes the data aggregation engine and a console demo that shows the usefulness of the solution (code available at https://github.com/nprieto95/BeeData/tree/master/BeeData.CovidSpaceAppsDemo).

Then, a React SPA was built (now hosted at https://beedata.us/) using Material Design to showcase how a UI for the solution would look.