Processed data contains duplicate data for multiple geographies #63

Open
aboutaaron opened this issue Aug 3, 2020 · 0 comments

aboutaaron commented Aug 3, 2020

Bug/Issue

Census data downloader correctly downloads the raw data but creates a CSV with duplicated data in the processed directory.

Environment

  • Python 3.8
  • Pipenv version 2018.11.27.dev0
  • Latest version of censusdatadownloader

Reproduce

Install the package and then try to download a data set.

pipenv install census-data-downloader
censusdatadownloader --data-dir data/census race states

Expected behavior

A 52-row CSV file with total population by race in the processed directory.

Actual behavior

A 52-row CSV in the processed directory with the same data in each column.

Possible issues/solutions

It looks like the data is correctly downloaded in the raw directory, which makes me think something's going wrong in the process step. I'm seeing this behavior specifically with the race [geography] arguments.

I noticed the same behavior for internet counties but did get the correct data when I used internet states.
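
One way to confirm the symptom across data sets is to check a processed CSV for columns whose values are identical in every row. The following is a minimal sketch using only the Python standard library; the function name `find_duplicate_columns` is hypothetical and not part of census-data-downloader, and it assumes the processed file is a plain CSV with a header row.

```python
import csv
from collections import defaultdict


def find_duplicate_columns(csv_path):
    """Return groups of column names whose values are identical in every row."""
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        # Collect each column's values in row order.
        columns = [[] for _ in header]
        for row in reader:
            for i, value in enumerate(row[:len(header)]):
                columns[i].append(value)

    # Columns that share the exact same sequence of values land in one group.
    groups = defaultdict(list)
    for name, values in zip(header, columns):
        groups[tuple(values)].append(name)
    return [names for names in groups.values() if len(names) > 1]
```

Calling this on a file from the processed directory and on the matching file from the raw directory should show whether the duplication is introduced at the process step: an empty list for the raw file and non-empty groups for the processed one would point there.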

I'll see if I can debug what's happening at the process step but in the meantime I'll rely on the raw data. Thanks for your work on this!
