Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue 432: Add some basic functions for as_point and as_quantile #790

Open
wants to merge 19 commits into
base: main
Choose a base branch
from

Conversation

seabbs
Copy link
Contributor

@seabbs seabbs commented Apr 15, 2024

Description

This PR closes #432. It adds basic s3 methods for as_point and as_quantile to map from sample and quantile forecasts to quantile and point forecasts.

It also adds an option to as_forecast to skip checks as these are a bit annoying and adds an internal function remake_forecast to reclass a forecast and then use assert_forecast. Potentially we may want to expose the latter.

Do we have an issue for refactoring as_forecast.default in a later release as the current monolithic approach will probably make user class extensions hard/annoying.

@pearsonca this is a good example of having to reinvent the wheel quantile wise (i.e your epinowcast community post)

Note: Several of these implementations could be more efficient but I see this as putting the groundwork and I think improving them etc. are topics for their own issues.

Checklist

  • My PR is based on a package issue and I have explicitly linked it.
  • I have included the target issue or issues in the PR title as follows: issue-number: PR title
  • I have tested my changes locally.
  • I have added or updated unit tests where necessary.
  • I have updated the documentation if required.
  • I have built the package locally and run rebuilt docs using roxygen2.
  • My code follows the established coding standards and I have run lintr::lint_package() to check for style issues introduced by my changes.
  • I have added a news item linked to this PR.
  • I have reviewed CI checks for this PR and addressed them as far as I am able.

@seabbs seabbs requested a review from nikosbosse April 15, 2024 18:20
@seabbs seabbs changed the title Issue 432add some basic functions for as_point and as_quantile Issue 432: Add some basic functions for as_point and as_quantile Apr 15, 2024
Copy link
Contributor

@sbfnk sbfnk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good - would have expected perhaps also to see an as_quantile.forecast_point (setting the 0.5 quantile to the point forecast)?

R/as_point.R Outdated
Comment on lines 41 to 42
#' @param quantile_level The desired quantile level for the point forecast.
#' Defaults to 0.5 (median).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surprised to see this as an argument - do you have a use case in mind where one would want it not to be 0.5?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not reallly but I thought there was little issue with giving this option

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree that it's a minor issue but it does add to the cognitive burden on anyone coming tot he function so if we think there's no use case it might be better to simplify?

R/as_point.R Outdated Show resolved Hide resolved
R/as_point.R Show resolved Hide resolved
R/as_quantile.R Outdated Show resolved Hide resolved
R/forecast.R Outdated Show resolved Hide resolved
R/forecast.R Outdated Show resolved Hide resolved
R/as_point.R Outdated Show resolved Hide resolved
Copy link
Contributor

@nikosbosse nikosbosse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This is cool functionality to have. I left a few comments. My main points are the duplication of the as_quantile() function for sample-based forecasts and the changes to as_forecast() which I'm not sure I completely understand.

NAMESPACE Show resolved Hide resolved
R/as_point.R Show resolved Hide resolved
R/as_point.R Outdated Show resolved Hide resolved
R/as_point.R Show resolved Hide resolved
R/as_point.R Show resolved Hide resolved
R/as_quantile.R Show resolved Hide resolved
R/forecast.R Outdated Show resolved Hide resolved
#'
#' @return The modified forecast object
#' @keywords internal
remake_forecast <- function(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, we could just call as_forecast() instead. The current sample_to_quantile() function does that.
I'd be happy to avoid the additional code complexity of introducing a new function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to do this hence the new checking option and it still wouldn't work. This what motivated the question about refactoring as_forecast as the current approach with the type specific checking really doesn't seem feasible for future extensions.

I think we should make a new issue for this (related to the as_forecast issue). There is also a more general point about whether or not as_forecast should be able to reclass a forecast object or whether it does actually make sense workflow wise to have something that explicitly removes and then readds the class.

forecast, old_classname, new_classname, verbose = TRUE
) {
remade_forecast <- forecast
class(remade_forecast) <- setdiff(class(forecast), old_classname)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new_forecast() calls as.data.table() which strips the class anyway

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this and was having some weird errors. Can you confirm? More generally I think its helpful to make this explicit in the code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I'll check it out and play around a bit with this. Circling back!

@@ -60,13 +62,17 @@ summarise_scores <- function(scores,
by = "model",
across = NULL,
fun = mean,
metrics,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we strictly need an additional argument given that users can specify the metrics in score().

If we decide to have it then I'd vote for making the default metrics = get_metrics(scores).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does get_metric do if the input isn't a score object? That was the rub here.

I only added this so I can could use summarise_scores on more general objects and save introducing more code.

More generally I also quite like the idea that naive users can reduce their score object at this point as many likely won't know to manipulate the metric list going into score regardless of docs.

All of above being said I think this is a new issue to discuss.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it makes sense not to use summarise_scores() here (at least for now).

summarise_scores() ultimately is a glorified one-liner and the actual work happens here:

scores <- scores[, lapply(.SD, fun, ...),
    by = c(by),
    .SDcols = colnames(scores) %like% paste(metrics, collapse = "|")
  ]

Then we could discuss the metrics argument in a separate issue/PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

@seabbs
Copy link
Contributor Author

seabbs commented Apr 17, 2024

would have expected perhaps also to see an as_quantile.forecast_point (setting the 0.5 quantile to the point forecast)?

Given the original issue was just for as_point I think this might be a new feature request.

seabbs and others added 5 commits April 17, 2024 10:22
Co-authored-by: Sebastian Funk <sebastian.funk@lshtm.ac.uk>
Co-authored-by: Sebastian Funk <sebastian.funk@lshtm.ac.uk>
Co-authored-by: Nikos Bosse <37978797+nikosbosse@users.noreply.github.com>
Co-authored-by: Sebastian Funk <sebastian.funk@lshtm.ac.uk>
@seabbs
Copy link
Contributor Author

seabbs commented Apr 17, 2024

My main points are the duplication of the as_quantile() function for sample-based forecasts and the changes to as_forecast() which I'm not sure I completely understand.

Hoopefully the responses cover the rational here and the proposed actions.

@seabbs seabbs requested review from nikosbosse and sbfnk April 17, 2024 10:45
@seabbs seabbs mentioned this pull request Apr 18, 2024
Comment on lines +5 to +9
S3method(as_point,default)
S3method(as_point,forecast_quantile)
S3method(as_point,forecast_sample)
S3method(as_quantile,default)
S3method(as_quantile,forecast_sample)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only minor comment is that this feels as if there is a new point, or quantile class. I believe it would make sense, and be slightly easier to wrap my head around as a potential user, if we re-used the existing classes and terminology. I.e., forecast_point and forecast_quantile.

This would lead to longer names but that's probably still ok 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think if we are going with a abstract class as in #373 we should rename.

NEWS.md Outdated Show resolved Hide resolved
R/forecast.R Outdated Show resolved Hide resolved
R/as_point.R Show resolved Hide resolved
R/as_point.R Outdated Show resolved Hide resolved
#' # Function approach
#' as_point(sample_forecast, fun = mean)
as_point.forecast_sample <- function(forecast, quantile_level = 0.5, fun, ...) {
assert_forecast(forecast, verbose = FALSE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A new issue sounds good. If we're keeping the assert_forecast() then I think we definitely want it here.

The bigger question is probably "do we want to do a validation check at all?" and I think that's up for debate.
But if we're doing a validation check then we should also check whether the object has the right forecast type.

R/as_quantile.R Show resolved Hide resolved
#' @examples
#' as_point(as_forecast(example_quantile))
as_point.forecast_quantile <- function(forecast, quantile_level = 0.5, ...) {
assert_forecast(forecast, verbose = FALSE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assert_forecast(forecast, verbose = FALSE)
assert_forecast(forecast, forecast_type = "quantile", verbose = FALSE)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we validate the forecast at all then we should also make sure the forecast type is correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seee above point and my disagreement with this

forecast, old_classname, new_classname, verbose = TRUE
) {
remade_forecast <- forecast
class(remade_forecast) <- setdiff(class(forecast), old_classname)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I'll check it out and play around a bit with this. Circling back!

@@ -60,13 +62,17 @@ summarise_scores <- function(scores,
by = "model",
across = NULL,
fun = mean,
metrics,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it makes sense not to use summarise_scores() here (at least for now).

summarise_scores() ultimately is a glorified one-liner and the actual work happens here:

scores <- scores[, lapply(.SD, fun, ...),
    by = c(by),
    .SDcols = colnames(scores) %like% paste(metrics, collapse = "|")
  ]

Then we could discuss the metrics argument in a separate issue/PR.

#' @export
#' @examples
#' sample_to_quantile(as_forecast(example_sample_discrete))
#' @keywords internal
sample_to_quantile <- function(forecast,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this function be deleted then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it can be but this PR is getting widely out of scope hence just making it internal in the first instance

#' @examples
#' as_quantile(as_forecast(example_sample_continuous))
as_quantile.forecast_sample <- function(
forecast, quantile_levels = seq(from = 0.01, to = 0.99, by = 0.01), ...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
forecast, quantile_levels = seq(from = 0.01, to = 0.99, by = 0.01), ...
forecast, quantile_level = seq(from = 0.01, to = 0.99, by = 0.01), ...

I vote for calling this quantile_level for two reasons:

  • the argument is called quantile_level everywhere else
  • the column in the output object will also be called quantile_level
    I see the point for using the plural, but overall think it would be easier to use the singular here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't agree but happy to standardise with the rest of the code base. It feels like this needs a new issue though as its not very intuitive. (i.e its probs not prob in quantile for a reason).

seabbs and others added 3 commits April 19, 2024 10:04
Co-authored-by: Sebastian Funk <sebastian.funk@lshtm.ac.uk>
Co-authored-by: Sebastian Funk <sebastian.funk@lshtm.ac.uk>
Co-authored-by: Nikos Bosse <37978797+nikosbosse@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create function as_point() that converts sample- or quantile-based forecasts into point forecasts
4 participants