Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanly drop state data upon failures and continue processing #1571

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

apiology
Copy link
Member

Fixes #1570

This will drop a state-level locale from the dataset if we encounter an error while processing data within it. #1570 was caused by the upstream JHU data no longer sending data from Pitcairn Islands, which is considered a state-level locale under the UK in the JHU dataset.

Fixes microCOVID#1570

This will drop a state-level locale from the dataset if we encounter
an error while processing data within it.  microCOVID#1570 was caused by the
upstream JHU data no longer sending data from Pitcairn Islands, which
is considered a state-level locale under the UK in the JHU dataset.
@netlify
Copy link

netlify bot commented Sep 21, 2022

Deploy Preview for microcov ready!

Name Link
🔨 Latest commit 6ebaea2
🔍 Latest deploy log https://app.netlify.com/sites/microcov/deploys/63309a492c645100083e6b2a
😎 Deploy Preview https://deploy-preview-1571--microcov.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site settings.

@apiology
Copy link
Member Author

The proximate cause here is that Pitcairn Islands was dropped from JHU's dataset starting 2022-09-14, and update_prevalence.py assumes we will have data for every region it discovers for every day during the 16 days of history we gather:

$ grep Pitcairn 09-13-2022.csv
,,Pitcairn Islands,United Kingdom,2022-09-14 04:20:44,-24.3768,-128.3242,4,0,,,"Pitcairn Islands, United Kingdom",0.0,0.0
$ grep Pitcairn 09-14-2022.csv
$ 

(you can check these at https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/09-14-2022.csv and so on)

Tested by:

Before fix - running ./update_prevalence.py locally

After fix - running ./update_prevalence.py locally

  • verified that script completed
  • verified that UK loads up in site rendered locally
  • verified that stale data warning goes away
  • verified that Pitcairn Islands is no longer a choice in UK's 'Region' selection

if state.app_key in app_locations[country.app_key].subdivisions:
print_and_log_to_sentry(
f"Removing {state} from app_locations[{country.app_key}].subdivisions"
)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original issue that motivated this was #1570, which only affected the Pitcairn Islands - population 47 people - and left the site stale for a week until JHU noticed and resolved the issue.

If we had a working alert system, we could use the population affected to adjust error handling. We could define something akin to DEBUG/INFO/WARNING/ERROR thresholds, where data issues affecting populations above a certain size go to WARNING (actively alerting us) or even ERROR (where it would fail the prevalence run like it does today as well as alerting us).

Failing prevalence would represent a choice to leave the site with stale data rather than significantly limited data.

@apiology apiology marked this pull request as draft October 31, 2022 11:54
app_locations[state.app_key] = state.as_app_data()
for county in state.counties.values():
app_locations[county.app_key] = county.as_app_data()
try:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha! @apiology - now I see the try/catch 👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @coachnate! This is a different PR than your original comment in #1655. This approach would not have had any effect on #1654.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Prevalence updates failing
2 participants