Eliminate/explicate differences in training windowing between flatline and arx forecasters #321

Open
brookslogan opened this issue Apr 19, 2024 · 0 comments

brookslogan commented Apr 19, 2024

#290 highlighted that training window sizes similar to the ahead value can trip up the flatline forecaster. It also indicates that the flatline forecaster uses far fewer than n_training training instances per epikey whenever ahead is within an order of magnitude of n_training. This is not the case for arx_forecaster:

library(epipredict)
#> Loading required package: epiprocess
#> 
#> Attaching package: 'epiprocess'
#> The following object is masked from 'package:stats':
#> 
#>     filter
#> Loading required package: parsnip
trace(slather, quote({
  if (inherits(object, "layer_residual_quantiles")) {
    trace(dplyr::summarize, quote({
      cat("Number of non-NA residuals:\n")
      print(.data %>% tidyr::drop_na(.resid) %>% nrow())
    }))
  }
}), quote(untrace(dplyr::summarize)))
#> Tracing function "slather" in package "epipredict"
#> [1] "slather"
case_death_rate_subset %>% flatline_forecaster("case_rate", flatline_args_list(ahead = 28L, n_training = 29L))
#> [...]
#> Number of non-NA residuals:
#> [1] 56
#> [...]
case_death_rate_subset %>% arx_forecaster("case_rate", "case_rate", args_list = arx_args_list(ahead = 28L, n_training = 29L))
#> [...]
#> Number of non-NA residuals:
#> [1] 1624
#> [...]

Created on 2024-04-19 with reprex v2.0.2

However, ?flatline_args_list doesn't explicate this:

n_training: Integer. An upper limit for the number of rows per key that
          are used for training (in the time unit of the 'epi_df').

and the message that slather.layer_residual_quantiles emits when output residuals are NA describes something specific to flatline_forecaster (and is off by one even there, since `n_train` equal to `ahead` is also a problem):

! Residual quantiles could not be calculated due to missing residuals.
ℹ This may be due to `n_train` < `ahead` in your <epi_recipe>.

Approach 1: eliminate these differences. Make n_training make sense for flatline_forecaster by omitting NAs before applying the training window, as arx_forecaster does. Remove the mention of the inequality above from the layer_residual_quantiles error message, since it won't be an issue anymore.

Approach 2: explain the difference in ?flatline_args_list, and change the residual quantiles error message to say that `n_train` <= `ahead` (rather than <) is an issue, and that it is an issue for flatline_forecaster specifically.
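
For reference, the two residual counts in the reprex are consistent with a simple model of the two windowing orders. The base R sketch below is only that model, not epipredict's actual implementation: `count_usable()`, `n_times`, and `n_keys` are hypothetical names, 56 epikeys is inferred from the reprex output (56 = 56 * 1, 1624 = 56 * 29), and 365 time points per key is an arbitrary stand-in that just needs to be much larger than `n_training + ahead`. It also ignores lags, which only affect the start of the series.

```r
# Simplified model of the two windowing orders for a single epikey.
# Assumption: this mirrors the behavior described in this issue; it is not
# epipredict's real code.
count_usable <- function(n_times, ahead, n_training, window_first) {
  time <- seq_len(n_times)
  # The outcome for time t is the value observed at t + ahead, so the last
  # `ahead` time points have an NA outcome.
  has_outcome <- time + ahead <= n_times
  if (window_first) {
    # Flatline-like order: keep the last n_training rows, then drop NAs.
    in_window <- time > n_times - n_training
    sum(in_window & has_outcome)
  } else {
    # ARX-like order: drop NAs first, then keep the last n_training rows.
    min(sum(has_outcome), n_training)
  }
}

n_keys <- 56L   # assumed number of epikeys, inferred from the reprex output
n_times <- 365L # hypothetical number of time points per epikey

flatline_count <- n_keys * count_usable(n_times, 28L, 29L, window_first = TRUE)
arx_count <- n_keys * count_usable(n_times, 28L, 29L, window_first = FALSE)
flatline_count # 56
arx_count      # 1624

# With n_training equal to ahead, windowing first leaves nothing to train on,
# which is the off-by-one in the current error message.
count_usable(n_times, 28L, 28L, window_first = TRUE) # 0
```

Under this model, windowing first leaves `max(n_training - ahead, 0)` usable rows per epikey, while dropping NAs first leaves the full `n_training` whenever enough history is available.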
