-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cleanly drop state data upon failures and continue processing #1571
base: main
Are you sure you want to change the base?
Conversation
Fixes microCOVID#1570 This will drop a state-level locale from the dataset if we encounter an error while processing data within it. microCOVID#1570 was caused by the upstream JHU data no longer sending data from Pitcairn Islands, which is considered a state-level locale under the UK in the JHU dataset.
✅ Deploy Preview for microcov ready!
To edit notification comments on pull requests, go to your Netlify site settings. |
The proximate cause here is that Pitcairn Islands was dropped from JHU's dataset starting 2022-09-14, and $ grep Pitcairn 09-13-2022.csv
,,Pitcairn Islands,United Kingdom,2022-09-14 04:20:44,-24.3768,-128.3242,4,0,,,"Pitcairn Islands, United Kingdom",0.0,0.0
$ grep Pitcairn 09-14-2022.csv
$ (you can check these at https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/09-14-2022.csv and so on) Tested by: Before fix - running
After fix - running
|
if state.app_key in app_locations[country.app_key].subdivisions: | ||
print_and_log_to_sentry( | ||
f"Removing {state} from app_locations[{country.app_key}].subdivisions" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original issue that motivated this was #1570, which only affected the Pitcairn Islands - population 47 people - and left the site stale for a week until JHU noticed and resolved the issue.
If we had a working alert system, we could use the population affected to adjust error handling. We could define something akin to DEBUG/INFO/WARNING/ERROR thresholds, where data issues affecting populations above a certain size go to WARNING (actively alerting us) or even ERROR (where it would fail the prevalence run like it does today as well as alerting us).
Failing prevalence would represent a choice to leave the site with stale data rather than significantly limited data.
app_locations[state.app_key] = state.as_app_data() | ||
for county in state.counties.values(): | ||
app_locations[county.app_key] = county.as_app_data() | ||
try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ha! @apiology - now I see the try/catch 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @coachnate! This is a different PR than your original comment in #1655. This approach would not have had any effect on #1654.
Fixes #1570
This will drop a state-level locale from the dataset if we encounter an error while processing data within it. #1570 was caused by the upstream JHU data no longer sending data from Pitcairn Islands, which is considered a state-level locale under the UK in the JHU dataset.