Migrate worldbank datareader to wbdata #831

Open
glatterf42 opened this issue Mar 7, 2024 · 3 comments

@glatterf42
Collaborator

Possibly related to #815.
#827 migrates from pandas-datareader to wbdata and implements some changes needed for that.

@glatterf42
Collaborator Author

@danielhuppmann Feel free to close this whenever you consider this migration done. Maybe opening this issue was never needed in the first place since #827 already migrated the read_worldbank() function.

@danielhuppmann
Member

Yes, my bad - I didn't get that you already fixed the issue in your PR. But let's leave this topic open anyway as a reminder to revisit the WorldBank-integration feature and see if the unit-issue can be fixed.

@glatterf42
Collaborator Author

As it is, I'm not sure there is an officially supported way of retrieving the unit from the WorldBank data. For example:

>>> import wbdata
>>> indicator = "NY.GDP.PCAP.PP.KD"
>>> new = wbdata.get_dataframe(indicators={indicator: "GDP"}, country=["CAN", "MEX", "USA"], date=("2003", "2005"))
>>> new
                             GDP
country       date              
Canada        2005  44683.764981
              2004  43704.669134
              2003  42791.094678
Mexico        2005  19144.014627
              2004  19017.753814
              2003  18634.896456
United States 2005  54331.658336
              2004  52989.030694
              2003  51497.734688
>>> new.index
MultiIndex([(       'Canada', '2005'),
            (       'Canada', '2004'),
            (       'Canada', '2003'),
            (       'Mexico', '2005'),
            (       'Mexico', '2004'),
            (       'Mexico', '2003'),
            ('United States', '2005'),
            ('United States', '2004'),
            ('United States', '2003')],
           names=['country', 'date'])
>>> 
>>> new.columns
Index(['GDP'], dtype='object')
>>> result = wbdata.get_indicators(indicator)
>>> result
id                 name
-----------------  ---------------------------------------------------
NY.GDP.PCAP.PP.KD  GDP per capita, PPP (constant 2017 international $)
>>> raw = wbdata.get_data(indicator, country=["CAN", "MEX", "USA"], date=("2003", "2005"))
>>> raw
[{'indicator': {'id': 'NY.GDP.PCAP.PP.KD', 'value': 'GDP per capita, PPP (constant 2017 international $)'}, 'country': {'id': 'CA', 'value': 'Canada'}, 'countryiso3code': 'CAN', 'date': '2005', 'value': 44683.764981042, 'unit': '', 'obs_status': '', 'decimal': 0}, {'indicator': {'id': 'NY.GDP.PCAP.PP.KD', 'value': 'GDP per capita, PPP (constant 2017 international $)'}, 'country': {'id': 'CA', 'value': 'Canada'}, 'countryiso3code': 'CAN', 'date': '2004', 'value': 43704.6691337093, 'unit': '', 'obs_status': '', 'decimal': 0}, {'indicator': {'id': 'NY.GDP.PCAP.PP.KD', 'value': 'GDP per capita, PPP (constant 2017 international $)'}, 'country': {'id': 'CA', 'value': 'Canada'}, 'countryiso3code': 'CAN', 'date': '2003', 'value': 42791.0946777734, 'unit': '', 'obs_status': '', 'decimal': 0}, {'indicator': {'id': 'NY.GDP.PCAP.PP.KD', 'value': 'GDP per capita, PPP (constant 2017 international $)'}, 'country': {'id': 'MX', 'value': 'Mexico'}, 'countryiso3code': 'MEX', 'date': '2005', 'value': 19144.014627364, 'unit': '', 'obs_status': '', 'decimal': 0}, {'indicator': {'id': 'NY.GDP.PCAP.PP.KD', 'value': 'GDP per capita, PPP (constant 2017 international $)'}, 'country': {'id': 'MX', 'value': 'Mexico'}, 'countryiso3code': 'MEX', 'date': '2004', 'value': 19017.7538141902, 'unit': '', 'obs_status': '', 'decimal': 0}, {'indicator': {'id': 'NY.GDP.PCAP.PP.KD', 'value': 'GDP per capita, PPP (constant 2017 international $)'}, 'country': {'id': 'MX', 'value': 'Mexico'}, 'countryiso3code': 'MEX', 'date': '2003', 'value': 18634.8964558406, 'unit': '', 'obs_status': '', 'decimal': 0}, {'indicator': {'id': 'NY.GDP.PCAP.PP.KD', 'value': 'GDP per capita, PPP (constant 2017 international $)'}, 'country': {'id': 'US', 'value': 'United States'}, 'countryiso3code': 'USA', 'date': '2005', 'value': 54331.6583361399, 'unit': '', 'obs_status': '', 'decimal': 0}, {'indicator': {'id': 'NY.GDP.PCAP.PP.KD', 'value': 'GDP per capita, PPP (constant 2017 international $)'}, 'country': {'id': 'US', 
'value': 'United States'}, 'countryiso3code': 'USA', 'date': '2004', 'value': 52989.0306944184, 'unit': '', 'obs_status': '', 'decimal': 0}, {'indicator': {'id': 'NY.GDP.PCAP.PP.KD', 'value': 'GDP per capita, PPP (constant 2017 international $)'}, 'country': {'id': 'US', 'value': 'United States'}, 'countryiso3code': 'USA', 'date': '2003', 'value': 51497.7346884645, 'unit': '', 'obs_status': '', 'decimal': 0}]

What we'd probably want as the unit, 'PPP (constant 2017 international $)', only appears as part of the indicator description: the 'name' column returned by get_indicators() and the 'value' entry under 'indicator' in the raw data. The dedicated 'unit' key that exists in each raw record is empty. It might be worth opening an issue with https://github.com/OliverSherouse/wbdata/tree/master if this is a feature we want to see.
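
Until then, a stopgap could be to parse the unit out of the indicator description ourselves. Below is a minimal sketch of that idea, not an officially supported approach. `guess_unit` is a hypothetical helper (it is not part of wbdata or pyam), and the rule it applies (take the trailing parenthetical, plus the qualifier after the last comma if there is one) is only an assumption based on the single indicator shown above; other indicator names may well not follow this pattern. The `sample` record is copied from the raw output above, so the snippet runs without querying the API.

```python
import re

# Matches a parenthetical at the very end of the description, e.g. "(current US$)".
_TRAILING_PAREN = re.compile(r"\(([^()]*)\)\s*$")


def guess_unit(record):
    """Guess a unit for one raw World Bank record (hypothetical helper).

    Assumption: the unit is the trailing parenthetical of the indicator
    description, optionally preceded by a qualifier after the last comma.
    Returns None if no unit can be guessed.
    """
    # Prefer the dedicated field whenever the API actually fills it.
    unit = record.get("unit") or ""
    if unit:
        return unit

    name = record["indicator"]["value"]
    match = _TRAILING_PAREN.search(name)
    if match is None:
        return None

    # Text before the parenthetical, e.g. "GDP per capita, PPP ".
    head = name[: match.start()]
    _, sep, qualifier = head.rpartition(",")
    qualifier = qualifier.strip()
    if sep and qualifier:
        return f"{qualifier} ({match.group(1)})"
    return match.group(1)


# One record copied from the raw output above.
sample = {
    "indicator": {
        "id": "NY.GDP.PCAP.PP.KD",
        "value": "GDP per capita, PPP (constant 2017 international $)",
    },
    "country": {"id": "CA", "value": "Canada"},
    "countryiso3code": "CAN",
    "date": "2005",
    "value": 44683.764981042,
    "unit": "",
    "obs_status": "",
    "decimal": 0,
}

print(guess_unit(sample))  # PPP (constant 2017 international $)
```

Since this is purely a string heuristic, it would probably need a fallback (for example an empty or user-supplied unit) for indicators whose description has no trailing parenthetical.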
