
Python API for working with Santa Fe (Argentina) COVID data reported


mariano22/argcovidapi


Argentina and Santa Fe COVID API

Based on hand-scraped data from government reports:

Santa Fe reports are cumulative. National reports show new daily cases.

We decided to work with cumulative time series. The design rationale: with cumulative confirmed cases we do not have to read every entry, only the entries at the frequency we are interested in (for example, one entry per week for a weekly analysis).
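The benefit of cumulative series can be sketched with pandas. This is a minimal illustration, not part of the API: the numbers below are made up for the example, and the variable names are hypothetical.

```python
import pandas as pd

# Illustrative cumulative confirmed cases, one value per day (made-up numbers).
dates = pd.date_range("2020-03-14", periods=15, freq="D")
cumulative = pd.Series(
    [1, 1, 1, 1, 1, 2, 2, 4, 4, 15, 15, 16, 20, 22, 25],
    index=dates,
)

# Weekly view: sample every 7th cumulative value; no summing over days needed.
weekly_cumulative = cumulative.iloc[::7]

# New cases per week are the differences between consecutive samples.
weekly_new = weekly_cumulative.diff().dropna().astype(int)

# Daily new cases can still be recovered from the full series when needed.
daily_new = cumulative.diff().fillna(cumulative.iloc[0]).astype(int)

print(weekly_cumulative.tolist())
print(weekly_new.tolist())
```

With daily-new data the weekly totals would require summing every day in each week; with cumulative data a single subtraction between two samples is enough.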

'Sospechosos' could decrease because some cases can move to 'Confirmados' or 'Descartados'.

Non-Python users

For non-Python users, CSV files are generated periodically so they can be parsed and used directly (see the ./csv/ folder). All of them contain cumulative time series.

  • For Santa Fe:

csv/SantaFe_AllData.csv

  • For Argentina:

csv/Argentina_Provinces.csv

Check last update time on csv/last_update.txt
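The CSV files can also be read without importing the API. The sketch below assumes the CSV mirrors the df_provinces layout described later in this README (TYPE and PROVINCIA columns followed by one column per date); that layout is an assumption, and the 25/03 values are invented for illustration.

```python
import io
import pandas as pd

# Hypothetical excerpt; the real column layout of the CSV may differ.
sample_csv = io.StringIO(
    "TYPE,PROVINCIA,25/03,26/03\n"
    "CONFIRMADOS,BUENOS AIRES,130,158\n"
    "CONFIRMADOS,CABA,170,197\n"
)

# For the real file, pass 'csv/Argentina_Provinces.csv' instead of sample_csv.
df = pd.read_csv(sample_csv, index_col=["TYPE", "PROVINCIA"])

# Because the series are cumulative, new cases are a column difference.
new_cases = df["26/03"] - df["25/03"]
print(new_cases)
```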

Argentina API

Python API for working with Argentina COVID data reported.

DataTypes exported:

  • COVIDStats namedtuple
  • ArgentinaAPI class

API methods:

  • api.get_stats(date)

API public properties:

  • api.df_provinces : pandas.DataFrame
  • api.provinces : List[str]

Important data types.

from argentina_api import *
print('COVIDStats namedtuple:', COVIDStats._fields)
COVIDStats namedtuple: ('date', 'place_name', 'confirmados', 'muertos', 'recuperados', 'activos')

Create api instance passing the working directory

When loading the data, the API reports any city that has no entry in the 'Info' sheet.

api = ArgentinaAPI('./')
Downloading Argentinian provinces table from google drive (https://docs.google.com/spreadsheets/d/e/2PACX-1vTfinng5SDBH9RSJMHJk28dUlW3VVSuvqaBSGzU-fYRTVLCzOkw1MnY17L2tWsSOppHB96fr21Ykbyv/pub#)

get_stats : Date -> [ COVIDStats ] of all provinces

Date must be expressed in DD/MM format.

api.get_stats('26/03')[:3]
[COVIDStats(date='26/03', place_name='BUENOS AIRES', confirmados=158, muertos=4, recuperados=15, activos=139),
 COVIDStats(date='26/03', place_name='CABA', confirmados=197, muertos=4, recuperados=53, activos=140),
 COVIDStats(date='26/03', place_name='CATAMARCA', confirmados=0, muertos=0, recuperados=0, activos=0)]
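Since get_stats returns plain namedtuples, the result can be filtered and sorted with ordinary Python. The sketch below does not import the API: it defines a local stand-in namedtuple with the same fields printed above and reuses the three sample records. The relation between activos and the other fields holds for these samples; whether the API guarantees it is an assumption.

```python
from collections import namedtuple

# Local stand-in with the same fields this README prints for COVIDStats.
COVIDStats = namedtuple(
    "COVIDStats",
    ["date", "place_name", "confirmados", "muertos", "recuperados", "activos"],
)

stats = [
    COVIDStats("26/03", "BUENOS AIRES", 158, 4, 15, 139),
    COVIDStats("26/03", "CABA", 197, 4, 53, 140),
    COVIDStats("26/03", "CATAMARCA", 0, 0, 0, 0),
]

# Provinces with at least one confirmed case, highest count first.
affected = sorted(
    (s for s in stats if s.confirmados > 0),
    key=lambda s: s.confirmados,
    reverse=True,
)
names = [s.place_name for s in affected]

# In these samples, active = confirmed - deaths - recovered.
consistent = all(
    s.activos == s.confirmados - s.muertos - s.recuperados for s in stats
)
print(names, consistent)
```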

Exported DataFrames

The API also exports a DataFrame, df_provinces, with the content of the Google Drive data by province (see link above). Province names are normalized using the normalize_str function.

api.df_provinces.head(3)
                       03/03  04/03  05/03  06/03  07/03  08/03  09/03  10/03  11/03  12/03    ...  02/04  03/04  04/04  05/04  06/04  07/04  08/04  09/04  10/04  11/04
TYPE     PROVINCIA
ACTIVOS  BUENOS AIRES      0      0      1      2      2      3      3      4      5      9    ...    289    308    333    365    375    405    421    442    460    493
         CABA              1      1      1      5      5      7      8      9     10     13    ...    283    311    345    376    389    411    427    445    455    498
         CATAMARCA         0      0      0      0      0      0      0      0      0      0    ...      0      0      0      0      0      0      0      0      0      0

3 rows × 40 columns

Exported provinces names

The API also exports provinces, a List[str] with all the province names:

api.provinces
['BUENOS AIRES',
 'CABA',
 'CATAMARCA',
 'CHACO',
 'CHUBUT',
 'CORDOBA',
 'CORRIENTES',
 'ENTRE RIOS',
 'FORMOSA',
 'JUJUY',
 'LA PAMPA',
 'LA RIOJA',
 'MENDOZA',
 'MISIONES',
 'NEUQUEN',
 'RIO NEGRO',
 'SALTA',
 'SAN JUAN',
 'SAN LUIS',
 'SANTA CRUZ',
 'SANTA FE',
 'SANTIAGO DEL ESTERO',
 'TIERRA DEL FUEGO',
 'TUCUMAN']

Example of use

provinces = api.df_provinces.loc['CONFIRMADOS']['26/03']
provinces = provinces[provinces>0]
provinces.plot.bar()
<matplotlib.axes._subplots.AxesSubplot at 0x7fce3ff5e280>

[Bar chart: confirmed cases on 26/03 for provinces with at least one case]
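The selection in the example above relies on the two-level (TYPE, PROVINCIA) index. The following sketch rebuilds a tiny frame with the same index shape, using the 26/03 values shown earlier in this README, so the indexing steps can be tried without downloading anything. The frame here is a hand-built stand-in, not the real api.df_provinces.

```python
import pandas as pd

# Hand-built stand-in with the same index shape as df_provinces.
index = pd.MultiIndex.from_tuples(
    [
        ("ACTIVOS", "BUENOS AIRES"),
        ("ACTIVOS", "CABA"),
        ("ACTIVOS", "CATAMARCA"),
        ("CONFIRMADOS", "BUENOS AIRES"),
        ("CONFIRMADOS", "CABA"),
        ("CONFIRMADOS", "CATAMARCA"),
    ],
    names=["TYPE", "PROVINCIA"],
)
df = pd.DataFrame({"26/03": [139, 140, 0, 158, 197, 0]}, index=index)

# Selecting the first index level leaves a frame indexed by PROVINCIA only.
confirmed = df.loc["CONFIRMADOS"]["26/03"]

# Keep only provinces with at least one confirmed case.
confirmed = confirmed[confirmed > 0]
print(confirmed.to_dict())
```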

Santa Fe API

Python API for working with Santa Fe (Argentina) COVID data reported.

DataTypes exported:

  • CityInfo, COVIDStats namedtuples
  • SantaFeAPI class

API methods:

  • api.get_stats(date)
  • api.get_cities_stats(date)
  • api.get_departments_stats(date)

API public properties:

  • api.df : pandas.DataFrame
  • api.all_names : List[str]
  • api.cities : List[str]
  • api.departments : List[str]

Exported functions:

  • is_city(str)
  • is_deparment(str)
  • normalize_str(str)

Important data types.

from santa_fe_api import *
print('COVIDStats namedtuple:', COVIDStats._fields)
COVIDStats namedtuple: ('date', 'place_name', 'confirmados', 'descartados', 'sospechosos')

Create api instance passing the working directory

When loading the data, the API reports any city that has no entry in the 'Info' sheet.

api = SantaFeAPI('./')
Download from google drive...

get_stats : Date -> [ COVIDStats ] of all places

Date must be expressed in D/M/YYYY format, without zero padding (for example '26/3/2020').

api.get_stats('26/3/2020')[:3]
[COVIDStats(date='26/3/2020', place_name='#D_IRIONDO', confirmados=0.0, descartados=1.0, sospechosos=0.0),
 COVIDStats(date='26/3/2020', place_name='ARMSTRONG', confirmados=0.0, descartados=1.0, sospechosos=0.0),
 COVIDStats(date='26/3/2020', place_name='RAFAELA', confirmados=5.0, descartados=1.0, sospechosos=9.0)]
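The two APIs use different date strings: the Santa Fe examples in this README use unpadded dates with a year ('26/3/2020'), while the national examples use zero-padded day and month ('26/03', '03/03'). The helpers below are hypothetical conveniences, not part of either API, that build those strings from a datetime.date.

```python
from datetime import date


def santa_fe_key(d: date) -> str:
    """Format a date like the Santa Fe examples, e.g. '26/3/2020'."""
    return f"{d.day}/{d.month}/{d.year}"


def argentina_key(d: date) -> str:
    """Format a date like the national examples, e.g. '26/03'."""
    return f"{d.day:02d}/{d.month:02d}"


day = date(2020, 3, 26)
print(santa_fe_key(day), argentina_key(day))
```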

get_cities_stats : Date -> [ COVIDStats ] of only cities

api.get_cities_stats('26/3/2020')[:3]
[COVIDStats(date='26/3/2020', place_name='ARMSTRONG', confirmados=0.0, descartados=1.0, sospechosos=0.0),
 COVIDStats(date='26/3/2020', place_name='RAFAELA', confirmados=5.0, descartados=1.0, sospechosos=9.0),
 COVIDStats(date='26/3/2020', place_name='SAN GENARO', confirmados=0.0, descartados=0.0, sospechosos=1.0)]

get_departments_stats : Date -> [ COVIDStats ] of only departments

api.get_departments_stats('26/3/2020')[:10]
[COVIDStats(date='26/3/2020', place_name='#D_IRIONDO', confirmados=0.0, descartados=1.0, sospechosos=0.0),
 COVIDStats(date='26/3/2020', place_name='#D_GENERAL LOPEZ', confirmados=1.0, descartados=5.0, sospechosos=4.0),
 COVIDStats(date='26/3/2020', place_name='#D_SAN CRISTOBAL', confirmados=0.0, descartados=1.0, sospechosos=1.0),
 COVIDStats(date='26/3/2020', place_name='#D_GARAY', confirmados=2.0, descartados=2.0, sospechosos=5.0),
 COVIDStats(date='26/3/2020', place_name='#D_GENERAL OBLIGADO', confirmados=0.0, descartados=2.0, sospechosos=5.0),
 COVIDStats(date='26/3/2020', place_name='#D_SAN JUSTO', confirmados=0.0, descartados=0.0, sospechosos=2.0),
 COVIDStats(date='26/3/2020', place_name='#D_ROSARIO', confirmados=15.0, descartados=88.0, sospechosos=22.0),
 COVIDStats(date='26/3/2020', place_name='#D_CASTELLANOS', confirmados=5.0, descartados=4.0, sospechosos=11.0),
 COVIDStats(date='26/3/2020', place_name='#D_CASEROS', confirmados=2.0, descartados=0.0, sospechosos=1.0),
 COVIDStats(date='26/3/2020', place_name='#D_CONSTITUCION', confirmados=1.0, descartados=3.0, sospechosos=1.0)]
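A list like the one above can be ranked with plain Python. This sketch uses a local stand-in namedtuple with the Santa Fe fields and four of the records shown above; display_name is a hypothetical helper that strips the '#D_' prefix seen on department names.

```python
from collections import namedtuple

# Local stand-in with the fields this README prints for the Santa Fe COVIDStats.
COVIDStats = namedtuple(
    "COVIDStats",
    ["date", "place_name", "confirmados", "descartados", "sospechosos"],
)

departments = [
    COVIDStats("26/3/2020", "#D_IRIONDO", 0.0, 1.0, 0.0),
    COVIDStats("26/3/2020", "#D_GENERAL LOPEZ", 1.0, 5.0, 4.0),
    COVIDStats("26/3/2020", "#D_ROSARIO", 15.0, 88.0, 22.0),
    COVIDStats("26/3/2020", "#D_CASTELLANOS", 5.0, 4.0, 11.0),
]


def display_name(place_name: str) -> str:
    """Hypothetical helper: drop the '#D_' prefix for display purposes."""
    prefix = "#D_"
    return place_name[len(prefix):] if place_name.startswith(prefix) else place_name


# Rank departments by cumulative confirmed cases, highest first.
ranked = sorted(departments, key=lambda s: s.confirmados, reverse=True)
top_two = [(display_name(s.place_name), int(s.confirmados)) for s in ranked[:2]]
print(top_two)
```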

Exported DataFrames

The API also exports a pandas.DataFrame, df, with the content of Google Drive 'AllData', indexed by ['TYPE','DEPARTMENT','PLACE'].

Values are cumulative. City names are normalized using the normalize_str function.

api.df.head(3)
                                           14/3/2020  15/3/2020  16/3/2020  17/3/2020  18/3/2020  19/3/2020  20/3/2020  21/3/2020  22/3/2020  23/3/2020        ...  31/3/2020   1/4/2020   2/4/2020   3/4/2020   4/4/2020   5/4/2020   6/4/2020   7/4/2020   8/4/2020   9/4/2020
TYPE         DEPARTMENT     PLACE
CONFIRMADOS  ##TOTAL        ##TOTAL              1.0        1.0        1.0        1.0        1.0        2.0        2.0        4.0        4.0       15.0        ...      133.0      144.0      152.0      160.0      165.0      176.0      184.0      187.0      189.0      195.0
             #D_9 DE JULIO  #D_9 DE JULIO        0.0        0.0        0.0        0.0        0.0        0.0        0.0        0.0        0.0        0.0        ...        0.0        0.0        0.0        0.0        0.0        0.0        0.0        0.0        0.0        0.0
                            TOSTADO              0.0        0.0        0.0        0.0        0.0        0.0        0.0        0.0        0.0        0.0        ...        0.0        0.0        0.0        0.0        0.0        0.0        0.0        0.0        0.0        0.0

3 rows × 27 columns

Exported Dict[CityName, DepartmentName]

The to_department property stores the CityName to DepartmentName assignments and is indexed like a dict (to_department[city]). The all_names property stores a Set[CityName or DepartmentName], cities stores a Set[CityName], and departments stores a Set[DepartmentName].

print('Some cities: {}'.format(list(api.cities)[:3]))
print('Some departments: {}'.format(list(api.departments)[:3]))
print('Cities with the respective departments: {}'.format([(c,api.to_department[c]) for c in list(api.cities)[:2]]))
Some cities: ['ARMSTRONG', 'RAFAELA', 'SAN GENARO']
Some departments: ['#D_IRIONDO', '#D_CONSTITUCION', '#D_GENERAL OBLIGADO']
Cities with the respective departments: [('ARMSTRONG', '#D_BELGRANO'), ('RAFAELA', '#D_CASTELLANOS')]
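The city-to-department mapping can be inverted to list the cities of each department. The sketch below uses a plain dict holding the two pairs printed above as a stand-in for api.to_department; cities_by_department is a hypothetical name.

```python
from collections import defaultdict

# Stand-in for api.to_department, using the two pairs shown above.
to_department = {"ARMSTRONG": "#D_BELGRANO", "RAFAELA": "#D_CASTELLANOS"}

# Invert the mapping: department name -> sorted list of its cities.
cities_by_department = defaultdict(list)
for city, department in to_department.items():
    cities_by_department[department].append(city)
for cities in cities_by_department.values():
    cities.sort()

print(dict(cities_by_department))
```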

Example of use

Use the is_city(str) and is_deparment(str) functions to check whether a place name is a city or a department.

ciudades = api.df.loc['CONFIRMADOS'][ api.df.loc['CONFIRMADOS'].index.map(lambda x : is_city(x[1]))  ]['26/3/2020']
ciudades = ciudades[ciudades>0].reset_index(['DEPARTMENT'],drop=True) 
ciudades.plot.bar()
<matplotlib.axes._subplots.AxesSubplot at 0x7fce3d839dc0>

[Bar chart: confirmed cases on 26/3/2020 for cities with at least one case]
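The filtering steps of the example above can be tried on a small hand-built frame with the same (TYPE, DEPARTMENT, PLACE) index. Here looks_like_city is a hypothetical stand-in for is_city: it assumes that every non-city name starts with '#', as '##TOTAL' and the '#D_' department names in this README do. The real function may use a different rule.

```python
import pandas as pd


def looks_like_city(name: str) -> bool:
    """Hypothetical stand-in for is_city: '#'-prefixed names are not cities."""
    return not name.startswith("#")


# Hand-built stand-in for api.df, using values shown earlier for 26/3/2020.
index = pd.MultiIndex.from_tuples(
    [
        ("CONFIRMADOS", "#D_BELGRANO", "ARMSTRONG"),
        ("CONFIRMADOS", "#D_CASTELLANOS", "#D_CASTELLANOS"),
        ("CONFIRMADOS", "#D_CASTELLANOS", "RAFAELA"),
    ],
    names=["TYPE", "DEPARTMENT", "PLACE"],
)
df = pd.DataFrame({"26/3/2020": [0.0, 5.0, 5.0]}, index=index)

# After selecting TYPE, each index entry is a (DEPARTMENT, PLACE) tuple.
confirmed = df.loc["CONFIRMADOS"]
mask = [looks_like_city(place) for _, place in confirmed.index]
cities = confirmed[mask]["26/3/2020"]

# Keep cities with cases and drop the DEPARTMENT level for plotting.
cities = cities[cities > 0].reset_index("DEPARTMENT", drop=True)
print(cities.to_dict())
```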

Authors
