Currently predict_time_value is the max time_value in the wide df, which appears to be the max time_value with a non-shifted signal available. This is not equal to the forecast (as-of) date for any covidcast signal that I'm aware of, but we want it to be. This problem is even worse for data sources such as "hhs" that can be missing a few days before the as-of date rather than just one, and for those where wday effects may be important. See also cmu-delphi/covidcast#569
library(dplyr)
library(animalia)
library(evalcast)

## A version of production_forecaster that will place debug info into an object in the global environment:
debug_production_forecaster = `body<-`(
  animalia::production_forecaster,
  value = expr({
    .GlobalEnv[["debug.production_forecaster.env"]] <- environment()
    !!body(animalia::production_forecaster)
  })
)

predictions =
  evalcast::get_predictions(
    debug_production_forecaster,
    "debug_production_forecaster",
    signals = tibble(
      data_source = "jhu-csse",
      signal = "confirmed_incidence_num",
      geo_type = "state",
      geo_values = "pa",
      start_day = "2021-01-01"
    ),
    forecast_dates = as.Date("2021-03-10"),
    incidence_period = "day",
    forecaster_args = list(
      incidence_period = "day",
      lags = c(0L, 7L, 14L)
    )
  )

debug.production_forecaster.env$mats$predict_time_value # expected to be forecast date
debug.production_forecaster.env$predict_params$newx # expected lag 0 to be NA (& trigger an error)
The result in this case is forecasts that target times 1 day earlier than intended. Again, for "hhs" data, or for other sources that are less reliably near-real-time, the offset could be more than 1 day. The impact of such mistargeting would be larger when there are significant wday effects.
(Note: we shouldn't expect to have data for the forecast date for most/all covidcast signals, so including 0L in the lags doesn't really make sense. But if we remove it and have lags=c(7L, 14L), the problem remains: the predict_time_value and relevant newx entries are still the same as in the example above.)
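The size of the offset can be illustrated with a toy example in base R. This is only a sketch of the arithmetic, not the animalia code path: the data frame, the `ahead` value, and all variable names below are hypothetical, and the data are made up so that the latest observation falls one day before the forecast date.

```r
# Hypothetical toy data: the latest available observation is one day
# before the forecast (as-of) date, as described for the example above.
forecast_date <- as.Date("2021-03-10")
toy_df <- data.frame(
  time_value = seq(as.Date("2021-03-01"), forecast_date - 1, by = "day"),
  value = seq_len(9)
)

# Behavior described in this issue: anchor predictions on the latest data date.
predict_time_value_current <- max(toy_df$time_value)

# Behavior requested in this issue: anchor predictions on the forecast date.
predict_time_value_wanted <- forecast_date

# Assumed forecast horizon in days (illustrative only).
ahead <- 7L
target_current <- predict_time_value_current + ahead
target_wanted <- predict_time_value_wanted + ahead

# Number of days by which the targets are shifted.
latency <- as.integer(predict_time_value_wanted - predict_time_value_current)
```

With a 7-day horizon, anchoring on the latest data date shifts the target to a different day of the week than intended, which is why wday effects make the mistargeting more costly.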