Feature/scalar with window #2021

Open · wants to merge 12 commits into master
Conversation

@JanFidor (Contributor) commented Oct 6, 2023

Fixes #1540.

Summary

I've added the ability to use a Scaler inside historical_forecasts to avoid information leakage.
As RegressionModel._optimized_historical_forecasts doesn't support model retraining, I decided not to add the scaler-with-window behaviour there. Please let me know if the implementation makes sense or if the code could be changed for the better!
P.S. Sorry for the PR being quite late, but the past two months have been quite hectic. I'll have significantly less time during this school year than I originally planned, but I'll do my best to find an hour or two a week to respond to reviews and slowly but surely bring this and the other PRs to a merge.
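For context, here is a minimal usage sketch of the new argument, based on the `scaler: Scaler = None` parameter shown in the diff below; the model, series and call values are illustrative and not taken from the PR:

```python
from darts.dataprocessing.transformers import Scaler
from darts.models import LinearRegressionModel

model = LinearRegressionModel(lags=12)

# `series` is a darts.TimeSeries assumed to be defined elsewhere.
# With retrain=True, the scaler would be re-fit on each training window,
# so the scaling statistics never see values beyond the forecast point.
backtest = model.historical_forecasts(
    series,
    forecast_horizon=3,
    retrain=True,
    scaler=Scaler(),  # the argument added by this PR
)
```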

@codecov-commenter commented Oct 6, 2023

Codecov Report

Attention: 11 lines in your changes are missing coverage. Please review.


| Files | Coverage Δ |
| --- | --- |
| darts/models/forecasting/forecasting_model.py | 95.00% <100.00%> (+0.07%) ⬆️ |
| darts/models/forecasting/regression_model.py | 96.50% <ø> (ø) |
| ...orical_forecasts/optimized_historical_forecasts.py | 93.54% <71.42%> (-4.02%) ⬇️ |
| darts/utils/historical_forecasts/utils.py | 89.07% <41.66%> (-3.33%) ⬇️ |

... and 6 files with indirect coverage changes


@dennisbader (Collaborator) left a comment


Very nice, thanks a lot @JanFidor for another great contribution!

Only had some minor suggestions, let me know what you think.

@@ -603,6 +604,7 @@ def historical_forecasts(
show_warnings: bool = True,
predict_likelihood_parameters: bool = False,
enable_optimization: bool = True,
scaler: Scaler = None,
Collaborator

A few thoughts on this:

  • could we rename this to something like "transformer", as it doesn't have to be a Scaler?
  • this should ideally be of type darts.dataprocessing.transformers.invertible_data_transformer.InvertibleDataTransformer, as we have to inverse transform to compare with the input series. Fittable is not a must.
  • if we allow a transformer for the target series, then we should also support it for the covariates, what do you think? Maybe series_transformer, past_covariates_transformer, future_covariates_transformer (see the sketch after this list). The covariates transformers don't have to be invertible.
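A rough sketch of what the suggested signature could look like; the parameter names are only the proposal from this comment, not a merged API, and the comment placeholder stands in for the existing parameters:

```python
from typing import Optional

from darts.dataprocessing.transformers import (
    BaseDataTransformer,
    InvertibleDataTransformer,
)

def historical_forecasts(
    self,
    series,
    # ... existing parameters ...
    # must be invertible, so predictions can be mapped back to the original scale
    series_transformer: Optional[InvertibleDataTransformer] = None,
    # covariates are never inverse-transformed, so plain transformers suffice
    past_covariates_transformer: Optional[BaseDataTransformer] = None,
    future_covariates_transformer: Optional[BaseDataTransformer] = None,
):
    ...
```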

@JanFidor (Contributor, Author)

Great point with the InvertibleDataTransformer and covariates, it will definitely make the code cleaner and more robust!

darts/models/forecasting/forecasting_model.py (outdated conversation, resolved)
@@ -959,24 +964,34 @@ def retrain_func(
if train_length_ and len(train_series) > train_length_:
train_series = train_series[-train_length_:]

  # testing `retrain` to exclude `False` and `0`
- if (
+ forecast_time_index_correct = (
Collaborator

Can we revert the changes to this and to is_retrain_func?
We want to avoid having to call retrain_func() if it's not required.

# avoid fitting the same model multiple times
model = model.untrained_model()
if is_scalar_used:
Collaborator

I think here we could try something like the snippet below. We need to respect that some transformers are non-fittable:

                        # apply the same also to the covariates
                        if isinstance(scaler, FittableDataTransformer):
                            scaler.fit(train_series)

                        if scaler is not None:
                            # remember that series was already transformed for the later forecast
                            train_transformed = True
                            train_series = scaler.transform(train_series)

@@ -1032,6 +1047,9 @@ def retrain_func(
verbose=verbose,
predict_likelihood_parameters=predict_likelihood_parameters,
)
if is_scalar_used:
Collaborator

Comment for line 1040 (just before model._predict_wrapper()): We could allow support for passing a transformer even if retrain=False. We would just assume that the transformer was fit before, or that it doesn't require fitting.

                # also apply this to the covariates; here we can assume that the scaler
                # was either fit before or doesn't require fitting at all
                if scaler is not None and not train_transformed:
                    train_series = scaler.transform(train_series)
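Not shown in the snippets above: after the prediction is made, the forecast would still need to be mapped back to the original scale before it can be compared with the untransformed target series (as noted in a later review comment). A hedged sketch of that step, with `forecast` standing in for the result of `model._predict_wrapper()`:

```python
# bring the forecast back to the original scale so it is comparable
# with the untransformed input series
if scaler is not None:
    forecast = scaler.inverse_transform(forecast)
```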

@@ -709,6 +711,8 @@ def historical_forecasts(
Default: ``False``
enable_optimization
Whether to use the optimized version of historical_forecasts when supported and available.
scaler
Parameter Scaler applied to each historical forecast.
Collaborator

Can we add some more information here?
E.g.: if the model is retrained, the data transformer is re-fit on the training data at each historical forecast step.
The fitted transformer is used to transform the input either for model training or for prediction.
The inverse transformation is applied to the prediction.
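For reference, the behaviour described here can already be reproduced manually with the existing API while this PR is pending. A minimal sketch, assuming `model` and `series` are defined, with illustrative values for the horizon, stride and start point:

```python
from darts.dataprocessing.transformers import Scaler

horizon, stride, start = 3, 1, 36
forecasts = []
for i in range(start, len(series) - horizon + 1, stride):
    train = series[:i]

    # re-fit the scaler on the training window only, so no future
    # values leak into the scaling statistics
    scaler = Scaler()
    train_scaled = scaler.fit_transform(train)

    # retrain a fresh copy of the model at every step
    model_ = model.untrained_model()
    model_.fit(train_scaled)

    # predict in the scaled space, then map back to the original scale
    pred = model_.predict(horizon)
    forecasts.append(scaler.inverse_transform(pred))
```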

@ETTAN93 commented Dec 28, 2023

Hi, since my issue #2134 has been linked to this, can I just double-check whether there is a timeline for finishing the implementation of scaler usage in historical forecasts?

@JanFidor (Contributor, Author) commented Jan 4, 2024

Hi @ETTAN93! I'll try to finish up the PR around the end of January or early February. I'm terribly sorry for the delay, but unfortunately this university term was much harder than I expected and I didn't have the free time to work on it.

@@ -1977,6 +1995,7 @@ def _optimized_historical_forecasts(
verbose: bool = False,
show_warnings: bool = True,
predict_likelihood_parameters: bool = False,
scaler=None,


missing type

@@ -1117,6 +1117,7 @@ def _optimized_historical_forecasts(
verbose: bool = False,
show_warnings: bool = True,
predict_likelihood_parameters: bool = False,
scaler=None,


missing type

@@ -1168,6 +1169,7 @@ def _optimized_historical_forecasts(
overlap_end=overlap_end,
show_warnings=show_warnings,
predict_likelihood_parameters=predict_likelihood_parameters,
scaler=scaler,


missing type

)
forecast_value = TimeSeries.from_times_and_values(
times=times[0]
if stride == 1 and model.output_chunk_length == 1


Wouldn't it be better to do this in one more step? Written like this, the method call is really difficult to read and understand...
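One possible reading of this suggestion, with the conditional pulled out into a named intermediate variable before the constructor call; the elided parts are kept elided because the rest of the call is not shown above:

```python
# compute the time index in a separate step instead of inlining the
# conditional inside the TimeSeries.from_times_and_values() call
if stride == 1 and model.output_chunk_length == 1:
    forecast_times = times[0]
else:
    forecast_times = ...  # the alternative index from the original call (not shown above)

forecast_value = TimeSeries.from_times_and_values(
    times=forecast_times,
    values=...,  # remaining arguments as in the original call (not shown above)
)
```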

@ETTAN93 commented Apr 25, 2024

Hi, can I just check when this feature is planned to be merged?


Successfully merging this pull request may close these issues.

[FEATURE] Scaler with rolling/expanding window to eliminate look ahead bias
5 participants