aquilesC/Covid-19-Jupyter-Notebook

Covid-19-Jupyter-Notebook

Author: Aquiles Carattino

Date: 2020-04-27

Example Jupyter notebook to explore the data of the ongoing COVID-19 Pandemic. The plots you see here were last generated on 2020-04-27 17:55:52.801651.

Data is pulled from this repository which is updated regularly by the Whiting School of Engineering of the Johns Hopkins University.

To get started, head to this notebook

Figures

The notebook explores different countries and compares them, also shifting the dates to visualize the differences between countries. These are some of the figures generated by the notebook, with some simple explanations. Over time, I will write more detailed descriptions of what each figure means, how and why it was created, etc.

Doubling Time

5 day window doubling time

The figure above shows the time it takes for the number of cases to double. To generate the figure, I used a 10-day window in order to minimize the statistical fluctuations normally present in this kind of situation. For me, it is shocking to see that the doubling time in Spain and in Germany is almost the same, considering that the lockdown policies they follow are radically different. The Netherlands, with policies similar to Germany's, is showing a milder slow-down.
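As a rough illustration of the idea, a doubling time can be estimated from the growth of the cumulative count over a fixed window, assuming the growth inside the window is exponential. This is a minimal sketch and not necessarily how the notebook computes it; the function name `doubling_time` and the series are made up for the example.

```python
import math

def doubling_time(cumulative, window):
    """Estimate the doubling time (in days) for each day of a cumulative series.

    Assumes exponential growth inside the window:
        N(t) = N(t - window) * 2 ** (window / T)
    so that T = window * ln(2) / ln(N(t) / N(t - window)).
    Returns None where there is not enough history or no growth.
    """
    result = []
    for i, current in enumerate(cumulative):
        if i < window:
            result.append(None)  # not enough history yet
            continue
        previous = cumulative[i - window]
        if previous <= 0 or current <= previous:
            result.append(None)  # undefined without positive growth
            continue
        result.append(window * math.log(2) / math.log(current / previous))
    return result

# Synthetic series that doubles every 3 days (not real data).
cases = [100 * 2 ** (day / 3) for day in range(12)]
times = doubling_time(cases, window=5)
print(times)
```

A longer window gives a smoother estimate but reacts more slowly to changes in the trend.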

Doubling time for some Latin American Countries

Latin American countries are showing a mixture of trends, most likely related to the policies each one followed during the past few weeks. Argentina is in Lockdown, while Brazil and Chile are not.

A lengthier discussion can be found here

China over time

China in Linear Scale

This is the evolution of the cumulative number of detected cases in China over time. In the past, there was data available regarding recoveries, and thus it was possible to make a plot like this one, in which you see the number of active cases. The problem is that recoveries are apparently badly reported throughout the world, and especially in the US.

Since the evolution of the number of cases spans several orders of magnitude, it may be better to plot the information in logarithmic scale:

China in Log Scale

Log-Log Plots number of new cases

The problem with plotting countries as a function of time is that the evolution of the pandemic is shifted. For example, in China, it started in December 2019, while the first detected cases in Europe or the US appeared only at the beginning of 2020. Plotting the number of new cases versus the number of cumulative cases can be a better way of visualizing the status of the pandemic. Let's see how China and South Korea compare to each other:

China Korea in Log-Log

For the plot above, I am defining new cases as the difference in accumulated cases between two consecutive days. For these two countries, it is clear that as the epidemic progresses the number of new cases is proportional to the number of accumulated cases until, at some point, it drops abruptly. However, you can see that the data points are relatively noisy because of large statistical fluctuations in how new cases are reported, etc. In order to have a clearer picture, instead of dealing directly with the number of new cases, I will apply a rolling window of 1 week and of 5 days, in order to smooth the data and account, to some extent, for the variation in reporting days, etc.

China Korea weekly-average new cases

China Korea 5-day-average new cases

The plots above are much cleaner. The idea of having a 5-day window is that it allows seeing quicker changes, which is what people are expecting. For example, is the lockdown in Spain or Italy working? With a 5-day window, there's a better chance to see any differences. See below for more plots.
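The two steps described above (daily differences, then a rolling average) can be sketched without any dependencies. This is only an illustration under my own assumptions: the helper names `daily_new_cases` and `rolling_mean` are hypothetical, the numbers are synthetic, and in the notebook the same thing would more naturally be done with pandas.

```python
def daily_new_cases(cumulative):
    """New cases per day: difference between consecutive cumulative counts."""
    return [today - yesterday for yesterday, today in zip(cumulative, cumulative[1:])]

def rolling_mean(values, window):
    """Trailing average over `window` values; shorter than the input by window - 1."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]

# Synthetic cumulative counts with flat days that mimic reporting gaps (not real data).
cumulative = [10, 12, 18, 18, 30, 41, 41, 60, 75]
window = 5

new_cases = daily_new_cases(cumulative)
smoothed = rolling_mean(new_cases, window)

# Pair each smoothed value with the cumulative count on the last day of its window.
# These (x, y) pairs are what would go on the log-log axes.
points = list(zip(cumulative[window:], smoothed))
print(points)
```

Note how the zeros in `new_cases` (days without reports) disappear after averaging, which is the main reason for smoothing before plotting in log scale.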

Europe over time

Europe as a whole is being very badly hit by the spread of nCOV19. Below you can see the cumulative number of detected cases in different countries.

Europe in Log Scale

It is possible to see that the onset of the epidemic in each country happens at different times. Instead of relying on the first detected case, I shift the temporal information of the countries by the moment in which 10 cases were detected and use Italy as a reference. The reason I use 10 cases instead of the first is that there are a lot of statistical fluctuations when the number of cases is low. For example, one detected case may be contained and eventually recover, until a second case appears. From the plots above, it seems that once there are 10 cases detected in a country, there is no going back in the evolution of the spread.
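The shift can be sketched as follows: find the first day on which each series reaches the threshold, and express it as an offset in days relative to the reference series. The function name `first_day_at_threshold`, the labels "reference" and "later", and the numbers are all made up for illustration; the notebook may implement the shift differently.

```python
def first_day_at_threshold(cumulative, threshold=10):
    """Index of the first day with at least `threshold` cases, or None if never reached."""
    for day, value in enumerate(cumulative):
        if value >= threshold:
            return day
    return None

# Synthetic daily cumulative counts sharing the same calendar start (not real data).
series = {
    "reference": [0, 3, 11, 25, 60, 140],
    "later": [0, 0, 0, 1, 4, 12, 30],
}

reference_day = first_day_at_threshold(series["reference"])

# Offset in days to subtract from each series so that its threshold day
# lines up with the threshold day of the reference series.
offsets = {}
for name, cumulative in series.items():
    day = first_day_at_threshold(cumulative)
    if day is not None:  # skip series that never reach the threshold
        offsets[name] = day - reference_day

print(offsets)
```

With these numbers the second series is shifted back by 3 days, so both curves start from roughly 10 cases at the same point on the horizontal axis.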

Europe shifted according to Italy

To give a bit more context, we can also add the dates of lockdowns or cessation of activities, using the same temporal scale as in the previous plot. Some countries, such as Italy, went into full lockdown, while others, such as The Netherlands, are in a state where some activities are forbidden (bars, restaurants) and schools are closed, but people are not banned from going out, even without a reason.

Europe shifted according to Italy showing lockdowns

And of course, we can show the same data but in a linear scale, since log-scale may be slightly counter-intuitive for people who are just getting acquainted with exponential processes:

Europe shifted according to Italy showing lockdowns in linear scale

Europe in Log-Log

We can see the evolution of the pandemic in some European countries, to try to understand what is actually happening. The plot below is done with a 5-day rolling window in order to lower the fluctuations and consider the delays in case reporting, analysis, etc.

Europe in Log-Log

Both Sweden and Denmark show the same bump, which I can't explain. It is possible to see that Italy is the only country that is actively tilting the curve.

Latin America

Another interesting region to observe is Latin America. Each country is showing very different evolutions over time.

Some Latin American countries over time in log-lin scale

And as an example, we can see the difference between Italy and Argentina, two countries that have established complete lockdowns to prevent the spread:

Argentina and Italy with lockdowns in log-lin scale

The same plot but in linear scale is very eloquent:

Argentina and Italy with lockdowns in lin-lin scale

Latin America in Log-Log

If we plot the Latin American evolution in log-log, with a 5-day rolling window for the calculation of new cases, we obtain the figure below. I am using US data as a reference for comparison:

Latin American countries in Log-Log new cases vs accumulated

All countries fall on the same curve, which also overlaps with the US curve.

Contribute

There are different ways of contributing to the code. If you identify a problem, you can create an Issue explaining what you found. If you know the solution or would like to improve any of its parts, I encourage you to create a fork of this repository, implement it and then submit a pull request. In this way, the authorship of the code will be nicely documented.

Useful information to gather

  • Information on lockdowns or closure of activities: I could gather some information about lockdowns or massive closure of activities in a handful of countries. This information could be systematized.

  • Number of hospitalizations: I don't know of a source to include the number of people who are hospitalized. This is crucial to understand the evolution of the disease, not just the spread of the virus.

  • Number of beds in hospitals: A useful parameter to show in the figures is the total number of beds in hospitals for the countries. Especially with the number of hospitalizations, it may give insight into when countries are locking down and how much room for further expansion of the epidemic is available, globally and locally.
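As a starting point for systematizing the lockdown information mentioned in the list above, one option is a small record type per measure, which can then be converted to the same shifted time axis used in the figures. This is only a suggestion: the `Measure` class, its fields, and the entries below are hypothetical placeholders, not data from the notebook.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Measure:
    """One restriction adopted by a country (hypothetical structure)."""
    country: str
    start: date
    kind: str  # e.g. "full lockdown" or "partial closure"

def days_from_reference(measure, reference_day):
    """Position of a measure on a time axis that starts at `reference_day`."""
    return (measure.start - reference_day).days

# Placeholder entries; the countries and dates are invented for the example.
measures = [
    Measure("Country A", date(2020, 1, 15), "full lockdown"),
    Measure("Country B", date(2020, 1, 20), "partial closure"),
]

# Placeholder for the day on which the reference series reached 10 cases.
reference_day = date(2020, 1, 1)

positions = {m.country: days_from_reference(m, reference_day) for m in measures}
print(positions)
```

Keeping the kind of measure next to the date makes it possible to draw full lockdowns and partial closures differently in the plots.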

LICENSE

The code is released under an MIT License. This means you can reuse it and redistribute it as you see fit. The data is released with stringent terms and conditions, and thus you should be careful with the usage you make out of it. I consider this repository an example of educational usage of the data.
