Update unknown_facilities list generation to include filename #8

Open
cawarren opened this issue May 8, 2020 · 0 comments

Comments

Member

cawarren commented May 8, 2020

Issue
The script currently (and incredibly helpfully) outputs a list of unrecognized facility names after each merge.

Many of these names just need to be added to the facility names mapping sheet, but some are indicative of a collection problem. For example, in the screenshot below it's clear that a scraper has gotten its columns mixed up:

[Screenshot: list of unrecognized facility names]

Unfortunately, we currently have no way to tell which file in the covidprisondata.com dataset the mix-up occurred in, so we can't easily point our collaborators to the problem.

Request
If easy (this is P2), please update the aggregator to include a fourth column in the unknown_facilities.csv output: the filename the facility was found in.
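To illustrate the request, here is a rough sketch of carrying the source filename through to the output. It is not based on the aggregator's actual code: the function names, the input shape, and the first three column names (`state`, `facility_name`, `count`) are all assumptions made up for the example.

```python
import csv
import io

# Hypothetical header: the first three names are guesses at the existing
# columns; "filename" is the requested fourth column.
COLUMNS = ["state", "facility_name", "count", "filename"]


def collect_unknown(records_by_file, known_names):
    """Count unrecognized facility names per source file.

    records_by_file: dict mapping a source filename to a list of
    (state, facility_name) tuples read from that file (assumed shape).
    known_names: set of facility names already in the mapping sheet.
    """
    counts = {}
    for filename, records in records_by_file.items():
        for state, name in records:
            if name not in known_names:
                key = (state, name, filename)
                counts[key] = counts.get(key, 0) + 1
    # One row per (state, name, filename), with the filename last.
    return [
        (state, name, n, filename)
        for (state, name, filename), n in sorted(counts.items())
    ]


def write_unknown_facilities(rows, out):
    """Write the rows as CSV to any file-like object."""
    writer = csv.writer(out)
    writer.writerow(COLUMNS)
    writer.writerows(rows)
```

With something like this, a value that is obviously not a facility name (say, a number that landed in the name column) would appear next to the file it came from, which is what we'd need to pass along to collaborators.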
