Spotty reporting leads to false zero case per week numbers and incorrect microCOVID calculations #1685

apiology · 2022-12-08T23:06:06Z

Our current code was written back when case reporting in many places was done daily, and was relatively reliable. This is not true anymore, and as a result our code is not as robust as it should be.

Looking for the most extreme symptom of this, I instrumented update_prevalence.py to count how many people are affected by (presumably false) reports of zero cases in at least one of the two weeks we look at. Here are the new log messages from a run on 2022-12-08:

WARNING: 271,562,497 people affected by No cases noted for a week - State level
WARNING: 909,848,811 people affected by No cases noted for a week - Country level
WARNING: 48,101,095 people affected by No cases noted for a week - County level
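
For anyone who wants to reproduce the idea, here is a minimal sketch of how a count like this could be produced. It is not the actual instrumentation in update_prevalence.py: the `PlaceSummary` record, its field names, and both helper functions are hypothetical. The one sample record uses the Nigeria population and zero-case weeks quoted elsewhere in this issue.

```python
from collections import defaultdict
from typing import NamedTuple


class PlaceSummary(NamedTuple):
    # Hypothetical record; the real script tracks places differently.
    name: str
    level: str  # "Country", "State" or "County"
    population: int
    cases_last_week: int
    cases_week_before: int


def zero_week_population(places):
    """Sum the population, per level, of places with at least one zero-case week."""
    affected = defaultdict(int)
    for place in places:
        if place.cases_last_week == 0 or place.cases_week_before == 0:
            affected[place.level] += place.population
    return dict(affected)


def warning_lines(places):
    """Format one warning per level, in the same shape as the log lines above."""
    return [
        f"WARNING: {population:,} people affected by "
        f"No cases noted for a week - {level} level"
        for level, population in zero_week_population(places).items()
    ]


# Nigeria: population and zero-case weeks as quoted in this issue.
places = [PlaceSummary("Nigeria", "Country", 206_139_587, 0, 0)]
for line in warning_lines(places):
    print(line)
```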

I looked at Nigeria, the largest country affected based on the logs:

No cases noted for a week - Country level (206,139,587 people): No cases reported in at least one week in Nigeria for period

Indeed, we report ~0 microCOVIDs for this scenario as of the latest data (what update_prevalence.py calls "effective_date"), 2022-12-07.

I used the alasql CLI to look at the JHU-aggregated data we pull, covering the last couple of months:

alasql "SELECT Confirmed FROM csv('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/10-09-2022.csv') where Country_Region = 'Nigeria'"
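
If alasql is not handy, roughly the same lookup can be sketched in Python with only the standard library. The function names below are made up for illustration. The sketch assumes the daily report has `Country_Region` and `Confirmed` columns (as the query above does) and that `Confirmed` holds whole numbers.

```python
import csv
import io
from datetime import date
from urllib.request import urlopen

BASE_URL = (
    "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/"
    "csse_covid_19_data/csse_covid_19_daily_reports/"
)


def daily_report_url(day):
    """Build the daily report URL; files are named MM-DD-YYYY.csv."""
    return f"{BASE_URL}{day.strftime('%m-%d-%Y')}.csv"


def confirmed_for(csv_text, country):
    """Return the Confirmed value of every row for the given Country_Region."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        int(row["Confirmed"])
        for row in reader
        if row["Country_Region"] == country
    ]


def fetch_confirmed(day, country):
    """Download one daily report and filter it (needs network access)."""
    with urlopen(daily_report_url(day)) as response:
        # utf-8-sig drops a byte order mark if the file has one.
        text = response.read().decode("utf-8-sig")
    return confirmed_for(text, country)
```

Calling `fetch_confirmed(date(2022, 10, 9), "Nigeria")` corresponds to the alasql command above; looping over a range of dates gives the series to scan for changes.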

Pulling that over multiple days showed these transition points in the reporting numbers:

2022-10-08: 265816 (e - 60)
2022-10-15: 265937 (e - 53) - 7 days later - 121 new cases/week
2022-10-22: 266043 (e - 46) - 7 days later - 106 new cases/week
2022-10-29: 266138 (e - 39) - 7 days later - 95 new cases/week
2022-11-05: 266192 (e - 32) - 7 days later - 54 new cases/week
2022-11-19: 266283 (e - 18) - 14 days later - 45.5 new cases/week
2022-12-07: 266283 (e = effective_date)

I dropped the repeated entries from the list above: the count stays identical from each listed date until the next listed date.
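
The day offsets and per-week rates in that list can be re-derived with a few lines of Python. This is only a check of the arithmetic, with the transition points copied from the list; `weekly_rates` is a throwaway helper, not something in our codebase.

```python
from datetime import date

# (date the new total first appeared, cumulative Confirmed), copied from the list.
transitions = [
    (date(2022, 10, 8), 265816),
    (date(2022, 10, 15), 265937),
    (date(2022, 10, 22), 266043),
    (date(2022, 10, 29), 266138),
    (date(2022, 11, 5), 266192),
    (date(2022, 11, 19), 266283),
]
effective_date = date(2022, 12, 7)


def weekly_rates(points):
    """For each transition after the first, return
    (date, days before effective_date, days since previous point, new cases per week)."""
    rows = []
    for (prev_day, prev_total), (day, total) in zip(points, points[1:]):
        gap_days = (day - prev_day).days
        per_week = (total - prev_total) * 7 / gap_days
        rows.append((day, (effective_date - day).days, gap_days, per_week))
    return rows


for day, offset, gap_days, per_week in weekly_rates(transitions):
    print(f"{day}: (e - {offset}) - {gap_days} days later - {per_week:g} new cases/week")
```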

This causes the false zeros:

Place#cases_last_week grabs e-7 to e and calculates 0 cases that week.
Place#cases_week_before grabs e-14 to e-7 and calculates zero cases the week before too.
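
To make the mechanism concrete, here is a small sketch that replays the two windows against the Nigeria series. It assumes each weekly figure is the difference between the cumulative count at the end and at the start of the window; `cumulative_on` and `cases_in_window` are illustrative stand-ins, not the real Place methods.

```python
from bisect import bisect_right
from datetime import date, timedelta

# Transition points from the Nigeria data; each total holds until the next date.
TRANSITIONS = [
    (date(2022, 10, 8), 265816),
    (date(2022, 10, 15), 265937),
    (date(2022, 10, 22), 266043),
    (date(2022, 10, 29), 266138),
    (date(2022, 11, 5), 266192),
    (date(2022, 11, 19), 266283),
]
DAYS = [day for day, _ in TRANSITIONS]
E = date(2022, 12, 7)  # effective_date


def cumulative_on(day):
    """Cumulative count in effect on `day` (the latest transition on or before it)."""
    index = bisect_right(DAYS, day) - 1
    if index < 0:
        raise ValueError(f"no data on or before {day}")
    return TRANSITIONS[index][1]


def cases_in_window(end, days=7):
    """New cases between `end - days` and `end`, as a cumulative difference."""
    return cumulative_on(end) - cumulative_on(end - timedelta(days=days))


cases_last_week = cases_in_window(E)                        # e-7 to e
cases_week_before = cases_in_window(E - timedelta(days=7))  # e-14 to e-7
days_since_last_change = (E - DAYS[-1]).days

print(cases_last_week, cases_week_before, days_since_last_change)
```

Both windows come out as 0 because the total has not moved in the 18 days before effective_date, while the same calculation for the week ending 2022-11-19 returns 91, two weeks of cases landing in one report. The number of days since the count last changed looks like one possible signal for telling a stale feed apart from a genuine zero.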

Per microCOVID's basic method for person risk, we create a now-cast based on the change between the two weekly numbers, so presumably a false zero in either value is going to cause trouble.


apiology added p1 important documentation / white paper skill: model/research skill: backend engineering labels Dec 9, 2022

This was referenced Dec 9, 2022

Set updatedAt based on when location's case data last changed #1686

Merged

Update api.opencovid.ca call for HR case data #1691

Closed

apiology added this to the Stabilize and monitor data pipeline milestone Dec 21, 2022
