Spotty reporting leads to false zero case per week numbers and incorrect microCOVID calculations #1685

Open
apiology opened this issue Dec 8, 2022 · 0 comments

Our current code was written back when case reporting in many places was done daily, and was relatively reliable. This is not true anymore, and as a result our code is not as robust as it should be.

Looking for the most extreme symptom of this, I instrumented update_prevalence.py to see how many people are affected by (presumably false) reports of zero cases in at least one of the two weeks we look at. Here are the new log messages when this was run on 2022-12-08:

WARNING: 271,562,497 people affected by No cases noted for a week - State level
WARNING: 909,848,811 people affected by No cases noted for a week - Country level
WARNING: 48,101,095 people affected by No cases noted for a week - County level
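
The tally behind warnings like these can be sketched roughly as follows. This is only an illustration of the idea, not the actual instrumentation in update_prevalence.py: the `Place` dataclass, its field names, `tally_zero_week_population` and `format_warnings` are all hypothetical, and "Exampleland" is made up.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Place:
    # Hypothetical stand-in for the place records in update_prevalence.py.
    name: str
    level: str  # "Country", "State" or "County"
    population: int
    cases_last_week: int
    cases_week_before: int


def tally_zero_week_population(places):
    """Sum, per level, the population of places with zero cases in either week."""
    affected = defaultdict(int)
    for place in places:
        if place.cases_last_week == 0 or place.cases_week_before == 0:
            affected[place.level] += place.population
    return dict(affected)


def format_warnings(affected):
    """Render one warning line per level, mirroring the log format above."""
    return [
        f"WARNING: {people:,} people affected by No cases noted for a week - {level} level"
        for level, people in affected.items()
    ]


places = [
    Place("Nigeria", "Country", 206_139_587, 0, 0),
    Place("Exampleland", "Country", 1_000_000, 120, 95),  # made-up comparison row
]
for line in format_warnings(tally_zero_week_population(places)):
    print(line)
```

Only places with a zero in at least one of the two weeks are counted, so the made-up second row does not contribute to the total.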

I looked at Nigeria, the largest country affected based on the logs:

No cases noted for a week - Country level (206,139,587 people): No cases reported in at least one week in Nigeria for period 

Indeed, we report ~0 microCOVIDs for this scenario as of the latest data - what update_prevalence.py calls "effective_date" - 2022-12-07:

image

I looked at the JHU-aggregated data we are pulling over the last couple of months using the alasql CLI:

alasql "SELECT Confirmed FROM csv('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/10-09-2022.csv') where Country_Region = 'Nigeria'"
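
For anyone without alasql installed, roughly the same query can be sketched in Python with only the standard library. The function names `fetch_daily_report` and `confirmed_for_country` are hypothetical; the only things assumed about the data are the URL pattern and the `Country_Region` and `Confirmed` columns used in the query above.

```python
import csv
import io
import urllib.request
from datetime import date

BASE_URL = (
    "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/"
    "csse_covid_19_data/csse_covid_19_daily_reports/"
)


def fetch_daily_report(day):
    """Download one daily report as text; files are named MM-DD-YYYY.csv."""
    url = BASE_URL + day.strftime("%m-%d-%Y") + ".csv"
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8-sig")


def confirmed_for_country(csv_text, country):
    """Return the Confirmed value of every row matching Country_Region."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        int(row["Confirmed"])
        for row in reader
        if row["Country_Region"] == country
    ]


if __name__ == "__main__":
    # Needs network access; mirrors the alasql query for 2022-10-09.
    print(confirmed_for_country(fetch_daily_report(date(2022, 10, 9)), "Nigeria"))
```

Calling `fetch_daily_report` in a loop over a date range is one way to collect the day-by-day series.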

Pulling that over multiple days showed these transition points in the reporting numbers:

  • 2022-10-08: 265816 (e - 60)
  • 2022-10-15: 265937 (e - 53) - 7 days later - 121 new cases/week
  • 2022-10-22: 266043 (e - 46) - 7 days later - 106 new cases/week
  • 2022-10-29: 266138 (e - 39) - 7 days later - 95 new cases/week
  • 2022-11-05: 266192 (e - 32) - 7 days later - 54 new cases/week
  • 2022-11-19: 266283 (e - 18) - 14 days later - 45.5 new cases/week
  • 2022-12-07: 266283 (e = effective_date)

I dropped the repeating entries above - the numbers between the items above stay identical until the next item.
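
A minimal sketch of how the list above can be derived from a series of cumulative counts: keep only the days where the number changed, and convert each jump into a per-week rate. The function name `transition_points` is hypothetical; the input values are the ones listed above.

```python
from datetime import date


def transition_points(daily_confirmed):
    """Keep only the days on which the cumulative count changed.

    daily_confirmed: list of (date, cumulative_confirmed), sorted by date.
    Returns (date, confirmed, days_since_previous, new_cases_per_week) tuples;
    the last two fields are None for the first entry.
    """
    points = []
    for day, confirmed in daily_confirmed:
        if not points:
            points.append((day, confirmed, None, None))
            continue
        prev_day, prev_confirmed = points[-1][0], points[-1][1]
        if confirmed == prev_confirmed:
            continue  # repeated value: no new report since the last transition
        gap = (day - prev_day).days
        rate = (confirmed - prev_confirmed) * 7 / gap
        points.append((day, confirmed, gap, rate))
    return points


series = [
    (date(2022, 10, 8), 265816),
    (date(2022, 10, 15), 265937),
    (date(2022, 10, 22), 266043),
    (date(2022, 10, 29), 266138),
    (date(2022, 11, 5), 266192),
    (date(2022, 11, 19), 266283),
    (date(2022, 12, 7), 266283),  # effective_date: unchanged since 2022-11-19
]
for day, confirmed, gap, rate in transition_points(series):
    print(day, confirmed, gap, rate)
```

The final 2022-12-07 entry is dropped because its value repeats the 2022-11-19 count, which is exactly the 18-day gap in reporting.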

This causes the false zeros:

  • Place#cases_last_week grabs e-7 to e and calculates 0 cases that week.
  • Place#cases_week_before grabs e-14 to e-7 and calculates zero cases the week before too.
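
The two bullets above can be reproduced with the Nigeria numbers. This is a sketch of the window arithmetic only, not the real Place implementation: `cumulative_on` and `cases_between` are hypothetical helpers that carry the last reported cumulative count forward.

```python
from datetime import date, timedelta

# Cumulative confirmed counts for Nigeria at the transition points listed above.
TRANSITIONS = [
    (date(2022, 10, 8), 265816),
    (date(2022, 10, 15), 265937),
    (date(2022, 10, 22), 266043),
    (date(2022, 10, 29), 266138),
    (date(2022, 11, 5), 266192),
    (date(2022, 11, 19), 266283),
]


def cumulative_on(day):
    """Most recent cumulative count reported on or before `day`."""
    value = None
    for reported_day, confirmed in TRANSITIONS:
        if reported_day <= day:
            value = confirmed
    if value is None:
        raise ValueError(f"no data on or before {day}")
    return value


def cases_between(start, end):
    """New cases in a window, as a difference of cumulative counts."""
    return cumulative_on(end) - cumulative_on(start)


effective_date = date(2022, 12, 7)
week = timedelta(days=7)

cases_last_week = cases_between(effective_date - week, effective_date)
cases_week_before = cases_between(effective_date - 2 * week, effective_date - week)
cases_last_32_days = cases_between(effective_date - timedelta(days=32), effective_date)

print(cases_last_week, cases_week_before, cases_last_32_days)  # 0 0 91
```

Both weekly windows fall entirely inside the stretch after 2022-11-19 where the count never changes, so both come out as zero even though 91 cases were recorded over the 32 days ending on effective_date.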

Per microCOVID's basic method for person risk, we create a now-cast based on the change between the two weekly numbers, so presumably a false zero in either value is going to cause trouble.
