Dashboard needs to account for forecaster prediction missingness #101

Open

brookslogan opened this issue Feb 24, 2024 · 2 comments

brookslogan commented Feb 24, 2024

Currently, there are some surprises in comparisons like:

  • by ahead, dev10v4s shows lower WIS than amoebalike
  • by forecast_date, it looks like amoebalike almost always has lower WIS than dev10v4s.

This is probably explained by amoebalike currently being generated for fewer, mostly lower, aheads and on slightly fewer forecast dates (though somehow it has ~twice as many predictions as dev10v4s when you do x var = forecaster?).

There are a couple of approaches:

  • Intersecting to a common prediction set for the set of forecasters selected.
  • Forecaster-pool-relative WIS approaches (one possible version is sketched below).

Usually I think we'd favor the first unless some weird missingness patterns or high levels of missingness force us to do something like the second.
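
For the second bullet, here's a sketch of one possible reading of "forecaster-pool-relative" scoring, assumed rather than taken from any existing dashboard code: divide each forecaster's WIS by the mean WIS of everyone who forecasted the same task, so scores stay roughly comparable even when forecasters cover different aheads or dates. The wis column name is an assumption; the grouping columns follow the scorecards snippet below.

library(dplyr)

pool_relative <- scorecards %>%
  group_by(data_source, signal, geo_value, forecast_date, target_end_date, ahead, incidence_period) %>%
  # scale each WIS by the average WIS of all forecasters on the same task
  mutate(relative_wis = wis / mean(wis, na.rm = TRUE)) %>%
  ungroup()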

There's a bit of old code for this: evalcast::intersect_averagers() did it one way, and some other old code did it another way; this is the core of the latter:

library(dplyr)

matched_scorecards <- scorecards %>%
  # drop rows that were never evaluated
  filter(!is.na(ae)) %>%
  group_by(data_source, signal, geo_value, forecast_date, target_end_date, ahead, incidence_period) %>%
  # keep only prediction tasks that every forecaster scored
  filter(n() == length(unique(.[["forecaster"]]))) %>%
  ungroup()

There are also variations on this that tried to simultaneously filter to forecast dates or target end dates that had evaluations for all the aheads, like the snippet below, though it's pretty confusing and there's probably a better way to write it. The idea was that for the most recent target dates we may only have evaluations ready for the shorter aheads, which would suggest misleading forecasting "trends" when breaking down by target end date but not simultaneously by ahead.

   # fragment of an old pipeline; matching_aheads and extract_single_unique_value()
   # are helpers defined elsewhere in that code
   group_by(data_source, signal, geo_value, target_end_date, incidence_period) %>%
   {
     n.forecasters <- length(unique(.[["forecaster"]]))
     # keep a target date only if every forecaster has a prediction for every
     # ahead that can land on that weekday
     filter(
       .,
       n() == n.forecasters * length(matching_aheads[matching_aheads %% 7L == extract_single_unique_value(ahead %% 7L)])
     )
   } %>%
   ungroup() %>%

This code looks a little weird because forecast_dates were expected to be exactly weekly while aheads ran from 0 (or 1) to 28, so for target dates sharing a weekday with the forecast dates you'd want 5 (or 4) predictions per forecaster, and for other target dates you'd want 4 predictions per forecaster. (For complete forecast_dates you'd want 29 (or 28) per forecaster.)
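
A quick sanity check of those counts, as an illustration rather than dashboard code:

aheads <- 0:28
# target dates sharing the forecast-date weekday are hit by 5 aheads
sum(aheads %% 7L == 0L)  # 5 (drops to 4 if aheads start at 1)
# any other weekday is hit by 4 aheads, e.g. an offset of 3 days
sum(aheads %% 7L == 3L)  # 4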

dsweber2 added this to the Pipeline improvements milestone Feb 26, 2024
dsweber2 commented

> though somehow it has ~twice as many predictions as dev10v4s when you do x var = forecaster?)

This is because covid_hosp_explore generates forecasts every day of the week, rather than just one day, so for any given ahead there are ~7x the number of points. I'm considering dropping this down to only Mondays. @dshemetov, thoughts? It does make examining weekday effects more difficult, but those are surprisingly uncommon here. It would mean the 21-hour runs become a mere 3 hours.
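
If we do go Mondays-only, restricting the candidate forecast dates could be as simple as the sketch below; all_forecast_dates is a hypothetical Date vector, not an existing covid_hosp_explore object.

library(lubridate)

# keep only Mondays; wday() with week_start = 1 labels Monday as 1
monday_forecast_dates <- all_forecast_dates[wday(all_forecast_dates, week_start = 1) == 1]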

> Intersecting to common prediction set for the set of forecasters selected.

In the meantime, you can filter by ahead to only the matching ones. Having that auto-populate to the minimal shared set wouldn't be a bad idea.
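
A rough sketch of that auto-population, assuming the scorecards data frame and its forecaster/ahead columns from the issue description:

library(dplyr)

# aheads that every selected forecaster actually has predictions for
shared_aheads <- scorecards %>%
  distinct(forecaster, ahead) %>%
  group_by(ahead) %>%
  filter(n() == n_distinct(scorecards$forecaster)) %>%
  ungroup() %>%
  pull(ahead)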

dshemetov commented

Just producing weekly forecasts sounds good to me. I'm not particularly concerned about weekday effects atm and a shorter run time is nice to have.
