-
Notifications
You must be signed in to change notification settings - Fork 810
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/scalar with window #2021
base: master
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: ❗ Your organization needs to install the Codecov GitHub app to enable full functionality.
... and 6 files with indirect coverage changes 📢 Thoughts on this report? Let us know!. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice, thanks a lot @JanFidor for another great contribution!
Only had some minor suggestions, let me know what you think.
@@ -603,6 +604,7 @@ def historical_forecasts( | |||
show_warnings: bool = True, | |||
predict_likelihood_parameters: bool = False, | |||
enable_optimization: bool = True, | |||
scaler: Scaler = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few thoughts on this:
- could we rename this to something like "transformer", as it doesn't have to be a Scaler?
- this should Ideally be of type
darts.dataprocessing.transformers.invertible_data_transformer.InvertibleDataTransformer
as we have to inverse transform to compare with the input series. Fittable is not a must. - if we allow a transformer for the target series, then we should also support it for the covariates, what do you think? maybe
series_transformer
,past_covariates_transformer
,future_covariates_transformer
. The covariates transformers don't have to invertible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great point with the InvertibleDataTransformer
and covariates, it will definitely make the code cleaner and more robust!
@@ -959,24 +964,34 @@ def retrain_func( | |||
if train_length_ and len(train_series) > train_length_: | |||
train_series = train_series[-train_length_:] | |||
|
|||
# testing `retrain` to exclude `False` and `0` | |||
if ( | |||
forecast_time_index_correct = ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we revert the changes to this and is_retrain_func
?
We want to avoid having to call retrain_func()
if not required
# avoid fitting the same model multiple times | ||
model = model.untrained_model() | ||
if is_scalar_used: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think here we could try something like below. We need to respect that some transformers are non-fittable
# apply also the same covariates
if isinstance(scaler, FittableDataTransformer):
scaler.fit(train_series)
if scaler is not None:
# remember that series was already transformed for later forecast
train_transfomed = True
train_series = scaler.transform(train_series)
@@ -1032,6 +1047,9 @@ def retrain_func( | |||
verbose=verbose, | |||
predict_likelihood_parameters=predict_likelihood_parameters, | |||
) | |||
if is_scalar_used: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment for line 1040 (just before model._predict_wrapper()): We could allow support for passing a transformer even if retrain=False. We would just assume that the transformer was fit before, or that it doesn't require fitting.
# also apply for covariates, here we can assume that the scaler must have been fit before or not
# require fitting at all
if scaler is not None and not train_transfomed:
train_series = scaler.transform(train_series)
@@ -709,6 +711,8 @@ def historical_forecasts( | |||
Default: ``False`` | |||
enable_optimization | |||
Whether to use the optimized version of historical_forecasts when supported and available. | |||
scaler | |||
Parameter Scaler applied to each historical forecast. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add some more information here?
E.g. if model is retrained, data transformer re-fit on the training data at each historical forecast step.
The fitted transformer is used to transform the input either for model training or prediction.
Inverse transformation will be applied to the prediction.
Hi, since my issue #2134 has been linked to this, can I just double check if there is a timeline to finish implementing the usage of scalers in historical forecast? |
Hi @ETTAN93! I'll try to finish up the PR around end of January - early February. I'm terribly sorry for the delay, but unfortunately this university term was much harder than I hoped and I didn't have the free time to work on it. |
@@ -1977,6 +1995,7 @@ def _optimized_historical_forecasts( | |||
verbose: bool = False, | |||
show_warnings: bool = True, | |||
predict_likelihood_parameters: bool = False, | |||
scaler=None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing type
@@ -1117,6 +1117,7 @@ def _optimized_historical_forecasts( | |||
verbose: bool = False, | |||
show_warnings: bool = True, | |||
predict_likelihood_parameters: bool = False, | |||
scaler=None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing type
@@ -1168,6 +1169,7 @@ def _optimized_historical_forecasts( | |||
overlap_end=overlap_end, | |||
show_warnings=show_warnings, | |||
predict_likelihood_parameters=predict_likelihood_parameters, | |||
scaler=scaler, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing type
) | ||
forecast_value = TimeSeries.from_times_and_values( | ||
times=times[0] | ||
if stride == 1 and model.output_chunk_length == 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it not better to do this in one more step? Because in this way it is really difficult to read and understand this method call...
…th-window # Conflicts: # darts/models/forecasting/forecasting_model.py # darts/models/forecasting/regression_model.py # darts/utils/historical_forecasts/optimized_historical_forecasts_regression.py # darts/utils/historical_forecasts/utils.py
Co-authored-by: Dennis Bader <dennis.bader@gmx.ch>
Hi, can I just check when this feature is planned to be merged? |
Fixes #1540.
Summary
I've added the ability to use Scalar inside historical_forecasts to avoid information leakage.
As
RegressionModel._optimized_historical_forecasts
doesn't support model retraining, decided to not add scalar with window there. Please let me know if the implementation makes sense or if the code could be changed for the better!P.S. Sorry for the PR being quite late, but the past 2 months have been quite hectic. I'll have significantly less time during this school year, than I originally planned, but I'll do my best to find an hour or 2 a week to respond to reviews and slowly but surely bring this and the others PRs to a merge