This repository has been archived by the owner on Jan 8, 2024. It is now read-only.

Visualizations of the novel coronavirus using data science and machine learning techniques


briancpark/COVID-19-Visualizations


COVID-19 Visualizations

Visualizations of the novel coronavirus using data science and machine learning techniques. Please feel free to contribute by sending issues or pull requests.

Automatic daily updates to this repo have stopped.

There have been many changes to the upstream datasets, so data may look incorrect or inconsistent.

To download, please use a shallow clone to improve performance:

git clone --depth 1 git@github.com:briancpark/COVID-19-Visualizations.git

COVID-19 Confirmed

COVID-19 Daily Report and Data Analysis

COVID-19 Worldwide Cases

Global Statistics

Dependencies

This project uses pandas, NumPy, Matplotlib, GeoPandas, Descartes, Plotly, and Selenium. All the code needed to run the project is in COVID19 Visualizations.ipynb. Please make sure you have installed all of these Python libraries before you run the code. Also install ffmpeg if you want to compile graphics into video.
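A quick way to check that these libraries are importable before opening the notebook is sketched below. It is not part of the repository. The import names are the usual module names for each library and are an assumption here, as is the helper name `missing_modules`.

```python
import importlib.util
import shutil

# Assumed import names for the libraries named in this README.
REQUIRED_MODULES = [
    "pandas",
    "numpy",
    "matplotlib",
    "geopandas",
    "descartes",
    "plotly",
    "selenium",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing Python libraries:", ", ".join(missing))
    else:
        print("All Python libraries were found.")

    # ffmpeg is an external program, so it is looked up on PATH instead.
    if shutil.which("ffmpeg") is None:
        print("ffmpeg was not found on PATH (only needed for video output).")
```

Anything reported as missing can then be installed with your package manager of choice before running the notebook.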

Databases

The NYTimes database was used for United States data and the JHU CSSE database was used for international data. The repository was updated twice a day, since the two databases update about 12 hours apart. The notebook includes UNIX shell commands, so updating the visualizations only takes a restart and rerun of the kernel.

Core Functions

country(country_name, data)

Displays the graph for a country for the given type of data (confirmed, deaths, or recovered).

country_legend(country_name)

Displays the graphs of all the data types for a given country.

country_active_cases(country_name)

Displays the graph of active COVID-19 cases for a given country, calculated as active = confirmed - deaths - recovered.
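The active-case formula can be sketched with pandas as follows. This is a minimal illustration, not the notebook's actual implementation: the helper name `active_cases` is hypothetical and the numbers are made up for the example, not real data.

```python
import pandas as pd


def active_cases(confirmed, deaths, recovered):
    """Apply active = confirmed - deaths - recovered, aligned on the date index."""
    return confirmed - deaths - recovered


# Made-up cumulative counts for three days (illustrative only, not real data).
dates = pd.to_datetime(["2020-03-01", "2020-03-02", "2020-03-03"])
confirmed = pd.Series([100, 150, 210], index=dates)
deaths = pd.Series([2, 3, 5], index=dates)
recovered = pd.Series([10, 20, 40], index=dates)

active = active_cases(confirmed, deaths, recovered)
print(active.tolist())  # [88, 127, 165]
```

Because the three series share a date index, pandas subtracts them day by day, so the result can be plotted directly against time.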

compare_countries(list_countries)

Displays the graphs for a given list of countries, overlaid on one another for comparison.
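Overlaying several countries on one set of axes might look like the sketch below. It assumes the per-country series have already been extracted. The function name `plot_countries_overlaid`, the country labels, and the numbers are all hypothetical, and the notebook's own `compare_countries` may be organized differently.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def plot_countries_overlaid(series_by_country, dates):
    """Draw one line per country on shared axes so the curves can be compared."""
    fig, ax = plt.subplots(figsize=(8, 5))
    for country, values in series_by_country.items():
        ax.plot(dates, values, label=country)
    ax.set_xlabel("Date")
    ax.set_ylabel("Confirmed cases")
    ax.set_title("Confirmed cases by country")
    ax.legend()
    return fig, ax


# Made-up values for two placeholder countries (illustrative only).
dates = ["2020-03-01", "2020-03-02", "2020-03-03"]
example = {"Country A": [10, 25, 60], "Country B": [5, 8, 13]}
fig, ax = plot_countries_overlaid(example, dates)
```

Sharing one axes object is what puts the curves on top of each other; calling `fig.savefig(...)` afterwards would write the comparison to an image file.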

Update Functions

update_all_cases_country_individual()

Updates/overwrites all the graphs by country and data type (confirmed, deaths, recovered) in the cases_country_individual/ directory.

update_all_cases_country()

Updates/overwrites all the graphs by country and all data types in the cases_country/ directory.

update_all_cases_country_active()

Updates/overwrites all the graphs of active cases by country in the cases_country_active/ directory.

Global Statistics

worldwide_cases()

Updates/overwrites the worldwide COVID-19 cases. Saved in the main directory as COVID19_worldwide.png

Global Statistics

worldwide_active()

Updates/overwrites the worldwide COVID-19 active cases. Saved in the main directory as COVID19_worldwide_active.png

Global Active Statistics

Geo Functions

These functions utilize the GeoPandas library to visualize COVID-19 cases on the map.

compile_timelapse()

Uses ffmpeg to compile the graphics into video and GIF formats.
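One way to drive ffmpeg from Python is sketched below. The frame naming scheme (`frames/frame_%04d.png`), the output file names, the frame rate, and both helper names are assumptions for illustration; the actual `compile_timelapse()` in the notebook may use different paths and options.

```python
import shutil
import subprocess


def build_ffmpeg_command(frame_pattern, output_path, fps=10):
    """Build an ffmpeg argument list that turns numbered frames into one output file."""
    cmd = ["ffmpeg", "-y", "-framerate", str(fps), "-i", frame_pattern]
    if output_path.endswith(".mp4"):
        # H.264 with yuv420p is widely playable; it expects even frame dimensions.
        cmd += ["-c:v", "libx264", "-pix_fmt", "yuv420p"]
    cmd.append(output_path)
    return cmd


def compile_frames(frame_pattern, output_path, fps=10):
    """Run ffmpeg on the frames, raising an error if ffmpeg is not installed."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    subprocess.run(build_ffmpeg_command(frame_pattern, output_path, fps), check=True)


# Hypothetical frame location and output names.
video_cmd = build_ffmpeg_command("frames/frame_%04d.png", "timelapse.mp4")
gif_cmd = build_ffmpeg_command("frames/frame_%04d.png", "timelapse.gif")
print(" ".join(video_cmd))
print(" ".join(gif_cmd))
```

Keeping the command construction separate from the `subprocess.run` call makes it easy to inspect the exact ffmpeg invocation before any files are written.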

Visualizations

Timelapses

Confirmed COVID-19 Cases Worldwide

COVID-19 Confirmed on Map

Deaths from COVID-19 Worldwide

COVID-19 Deaths on Map

Recovered COVID-19 Cases Worldwide

COVID-19 Recovered on Map

Confirmed COVID-19 Cases by County in United States

COVID-19 Confirmed

COVID-19 Deaths by County in United States

COVID-19 Deaths

US Citizens Who Always Wear Masks

always

US Citizens Who Frequently Wear Masks

frequently

US Citizens Who Sometimes Wear Masks

sometimes

US Citizens Who Rarely Wear Masks

rarely

US Citizens Who Never Wear Masks

never

Data Sources

I used the dataset provided by the NYTimes for the United States visualizations. Although the JHU CSSE dataset covers international data, the NYTimes dataset has more specific metadata that is useful for analyzing United States data, such as coronavirus cases by state and city. At the time of writing, COVID-19 cases are rising dangerously in the United States. The NYTimes has already published useful statistics from its own database, but I decided to take it one step further and add a time dimension by animating the data as timelapses.

Other Resources

Worldometer Coronavirus Tracker