[BUG] Error when forecasting with exogenous regressors for each of the nodes #80

Open
aakashparsi opened this issue Jul 9, 2021 · 4 comments
Labels
bug Something isn't working

Comments

@aakashparsi
Collaborator

aakashparsi commented Jul 9, 2021

Describe the bug
When I pass a different exogenous regressor for each node in the hierarchy, I get an error related to the shape of the data frame.

To Reproduce
I have a three-level hierarchy tree, just like load_mobility_data(), but each column has one external regressor. I pass the data along with the regressors, but the predict() method still raises an error.

import hts

# trainSet, testSet, hierarchy, exogenous_hier, exog_reorder and forecast_horizon are defined earlier in my notebook
htsmodel = hts.HTSRegressor(model='auto_arima', revision_method='BU', n_jobs=0)
htsfit = htsmodel.fit(trainSet, hierarchy, exogenous=exogenous_hier)
pred = htsfit.predict(steps_ahead=forecast_horizon, exogenous_df=testSet[exog_reorder])

Here's the stack trace:

Fitting models:   4%|▍         | 87/2157 [00:00<00:07, 286.59it/s]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~/.local/lib/python3.6/site-packages/statsmodels/tsa/statespace/mlemodel.py in _validate_out_of_sample_exog(self, exog, out_of_sample)
   1757             try:
-> 1758                 exog = exog.reshape(required_exog_shape)
   1759             except ValueError:

ValueError: cannot reshape array of size 24840 into shape (6,1)

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-217-a527e0f5f0c9> in <module>
----> 1 pred = htsfit.predict(steps_ahead = forecast_horizon, exogenous_df = testSet[exog_reorder])

~/.local/lib/python3.6/site-packages/hts/core/regressor.py in predict(self, exogenous_df, steps_ahead, distributor, disable_progressbar, show_warnings, **predict_kwargs)
    344             disable_progressbar=disable_progressbar,
    345             show_warnings=show_warnings,
--> 346             distributor=distributor,
    347         )
    348         for key, forecast, error, residual in results:

~/.local/lib/python3.6/site-packages/hts/core/utils.py in _do_predict(models, function_kwargs, n_jobs, disable_progressbar, show_warnings, distributor)
     86 
     87     result = distributor.map_reduce(
---> 88         _do_actual_predict, data=models, function_kwargs=function_kwargs
     89     )
     90     distributor.close()

~/.local/lib/python3.6/site-packages/hts/utilities/distribution.py in map_reduce(self, map_function, data, function_kwargs, chunk_size, data_length)
    174         )
    175 
--> 176         result = list(itertools.chain.from_iterable(result))
    177 
    178         return result

~/.local/lib/python3.6/site-packages/tqdm/std.py in __iter__(self)
   1176 
   1177         try:
-> 1178             for obj in iterable:
   1179                 yield obj
   1180                 # Update and possibly print the progressbar.

~/.local/lib/python3.6/site-packages/hts/utilities/distribution.py in _function_with_partly_reduce(chunk_list, map_function, kwargs)
     39     kwargs = kwargs or {}
     40     results = (map_function(chunk, kwargs) for chunk in chunk_list)
---> 41     return list(results)
     42 
     43 

~/.local/lib/python3.6/site-packages/hts/utilities/distribution.py in <genexpr>(.0)
     38 
     39     kwargs = kwargs or {}
---> 40     results = (map_function(chunk, kwargs) for chunk in chunk_list)
     41     return list(results)
     42 

~/.local/lib/python3.6/site-packages/hts/core/utils.py in _do_actual_predict(model, function_kwargs)
    119         node=node,
    120         steps_ahead=function_kwargs["steps_ahead"],
--> 121         **function_kwargs["predict_kwargs"]
    122     )
    123     return key, model_instance.forecast, model_instance.mse, model_instance.residual

~/.local/lib/python3.6/site-packages/hts/model/ar.py in predict(self, node, steps_ahead, alpha, exogenous_df)
     62             ex = None
     63         in_sample_preds = self.model.predict_in_sample(X=ex, alpha=alpha)
---> 64         y_hat = self.model.predict(X=exogenous_df, alpha=alpha, n_periods=steps_ahead)
     65         return self._set_results_return_self(in_sample_preds, y_hat)
     66 

~/.local/lib/python3.6/site-packages/pmdarima/utils/metaestimators.py in <lambda>(*args, **kwargs)
     51 
     52         # lambda, but not partial, allows help() to work with update_wrapper
---> 53         out = (lambda *args, **kwargs: self.fn(obj, *args, **kwargs))
     54         # update the docstring of the returned function
     55         update_wrapper(out, self.fn)

~/.local/lib/python3.6/site-packages/pmdarima/arima/auto.py in predict(self, n_periods, X, return_conf_int, alpha, **kwargs)
    259             X=X,
    260             return_conf_int=return_conf_int,
--> 261             alpha=alpha,
    262         )
    263 

~/.local/lib/python3.6/site-packages/pmdarima/arima/arima.py in predict(self, n_periods, X, return_conf_int, alpha, **kwargs)
    679             end=end,
    680             X=X,
--> 681             alpha=alpha)
    682 
    683         if return_conf_int:

~/.local/lib/python3.6/site-packages/pmdarima/arima/arima.py in _seasonal_prediction_with_confidence(arima_res, start, end, X, alpha, **kwargs)
     81         end=end,
     82         exog=X,
---> 83         **kwargs)
     84 
     85     f = results.predicted_mean

~/.local/lib/python3.6/site-packages/statsmodels/tsa/statespace/mlemodel.py in get_prediction(self, start, end, dynamic, index, exog, extend_model, extend_kwargs, **kwargs)
   3301             kwargs = self.model._get_extension_time_varying_matrices(
   3302                 self.params, exog, out_of_sample, extend_kwargs,
-> 3303                 transformed=True, includes_fixed=True, **kwargs)
   3304 
   3305         # Make sure the model class has the current parameters

~/.local/lib/python3.6/site-packages/statsmodels/tsa/statespace/sarimax.py in _get_extension_time_varying_matrices(self, params, exog, out_of_sample, extend_kwargs, transformed, includes_fixed, **kwargs)
   1716 
   1717         # Get the appropriate exog for the extended sample
-> 1718         exog = self._validate_out_of_sample_exog(exog, out_of_sample)
   1719 
   1720         # Get the tmp endog, exog

~/.local/lib/python3.6/site-packages/statsmodels/tsa/statespace/mlemodel.py in _validate_out_of_sample_exog(self, exog, out_of_sample)
   1761                                  ' appropriate shape. Required %s, got %s.'
   1762                                  % (str(required_exog_shape),
-> 1763                                     str(exog.shape)))
   1764         elif self.k_exog > 0:
   1765             exog = None

ValueError: Provided exogenous values are not of the appropriate shape. Required (6, 1), got (6, 4140).
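
For anyone reading the trace: the two numbers in the messages are consistent with each other, since 6 forecast steps times 4140 exogenous columns is 24840 values. Here is a minimal NumPy sketch of the same mismatch. It does not call statsmodels at all; the array of zeros is only a stand-in for the real exogenous data, and the variable names are made up for illustration.

```python
import numpy as np

steps_ahead = 6      # forecast horizon, taken from the error message
n_columns = 4140     # exogenous columns in the full frame, taken from the error message

# Stand-in for the full exogenous frame that reaches the model.
full_exog = np.zeros((steps_ahead, n_columns))

# A model fitted with one regressor expects one column per forecast step.
required_shape = (steps_ahead, 1)

try:
    full_exog.reshape(required_shape)
    reshaped = True
except ValueError:
    # Same failure as in the trace: too many values for the required shape.
    reshaped = False

# Selecting a single column first gives an array that does fit.
single_column = full_exog[:, [0]]
fits = single_column.reshape(required_shape).shape == required_shape
```

So the model for one node seems to receive the regressors of every node, not only its own.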

Expected behavior
The scikit-hts documentation states that we have to pass a data frame containing the exogenous data for each of the nodes, so this error should not occur.

Desktop (please complete the following information):

  • OS: Windows 10
  • scikit-hts version: 0.5.11
  • Python version: 3.6.9
@aakashparsi aakashparsi added the bug Something isn't working label Jul 9, 2021
@aakashparsi aakashparsi changed the title [BUG] Error when forecasting with exogenous regressors for each of the node. [BUG] Error when forecasting with exogenous regressors for each of the nodes Jul 9, 2021
@aakashparsi
Collaborator Author

I also tried the same with the prophet model and there were no issues; everything worked fine.

@aakashparsi
Collaborator Author

y_hat = self.model.predict(X=exogenous_df, alpha=alpha, n_periods=steps_ahead)

I think the line above passes the entire exogenous data frame to the predict() function every time. Instead, we could pass only the exogenous columns required for the given node in the tree.

Updating the code like this fixed the issue:
y_hat = self.model.predict(X=exogenous_df[self.node.exogenous], alpha=alpha, n_periods=steps_ahead)
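
The idea behind the change, as a standalone pandas sketch. This is not the scikit-hts code: select_node_exog is a hypothetical helper, and the column names are invented. It assumes each node knows the names of its own exogenous columns (the role self.node.exogenous plays above).

```python
import pandas as pd


def select_node_exog(exogenous_df, node_columns, steps_ahead):
    """Return only the exogenous columns that belong to one node.

    Hypothetical helper for illustration; not part of scikit-hts.
    """
    missing = [c for c in node_columns if c not in exogenous_df.columns]
    if missing:
        raise KeyError(f"exogenous columns not found: {missing}")
    selected = exogenous_df[list(node_columns)]
    if len(selected) != steps_ahead:
        raise ValueError(
            f"expected {steps_ahead} rows of exogenous data, got {len(selected)}"
        )
    return selected


# Made-up frame holding the regressors of two nodes over a 6-step horizon.
exogenous_df = pd.DataFrame(
    {"node_a_exog": range(6), "node_b_exog": range(10, 16)}
)

# Only node A's regressor is handed to node A's model.
node_a_exog = select_node_exog(exogenous_df, ["node_a_exog"], steps_ahead=6)
```

With this selection the frame for one node has shape (6, 1), which is the shape the error message asks for.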

@Downfor-u

Downfor-u commented Aug 29, 2021

Hello, I'm facing the same problem here (see my post on [SO](https://stackoverflow.com/questions/68939597/grouped-time-series-forecasting-with-scikit-hts)).

@aakashparsi: I implemented the fix you proposed but ended up with negative forecasts. Have you encountered the same problem?
@carlomazzaferro: have concerns regarding negative forecasts been raised with you?

Thanks in advance for the fix!

@aakashparsi
Collaborator Author

@Downfor-u To fix negative forecasts, you could try a square root transformation.
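
A rough sketch of what I mean, using only NumPy. The linear-trend extrapolation is a stand-in for whatever model you actually fit, and the series is made up. The idea is to model sqrt(y) and square the forecasts afterwards. Clipping at zero before squaring is my own choice here, so that a negative value on the square-root scale does not turn into a positive forecast.

```python
import numpy as np


def linear_trend_forecast(y, steps_ahead):
    """Stand-in forecaster: fit a straight line and extrapolate it."""
    x = np.arange(len(y))
    slope, intercept = np.polyfit(x, y, deg=1)
    future_x = np.arange(len(y), len(y) + steps_ahead)
    return slope * future_x + intercept


# Made-up non-negative series that falls towards zero.
y = np.array([25.0, 16.0, 9.0, 4.0, 1.0])

# Forecasting on the raw scale lets the trend run below zero.
raw_forecast = linear_trend_forecast(y, steps_ahead=2)

# Forecast sqrt(y) instead, clip at zero, then square to return to the original scale.
sqrt_forecast = linear_trend_forecast(np.sqrt(y), steps_ahead=2)
back_transformed = np.clip(sqrt_forecast, 0.0, None) ** 2
```

The back-transformed values cannot be negative because they are squares. Whether they still add up across the hierarchy after reconciliation is a separate question that I have not checked.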
