[Question] AutoARIMA.forward #797

Open
peregrinter opened this issue Mar 10, 2024 · 5 comments
Comments

@peregrinter

Hi all,

I have a question regarding AutoARIMA.forward().

I'm using a pandas Series converted to a NumPy array for y, which works fine.
However, I can't get X_future to work: it fails because the shape is wrong.

My data comes from a pd.DataFrame with a datetime index.

How exactly should I reshape my future exogenous variables to work with this method?

Would appreciate help and thx for the package!

@elephaint

AutoARIMA's fit takes a dataframe containing the columns ds, unique_id, y, ex_1, ex_2, ..., ex_n. The first three are the timestamp, the id of the time series, and the value of the time series. The exogenous variables follow. Note that you can use arbitrary column names for these exogenous variables.

AutoARIMA's predict (which I assume you're referring to in your question) can take an input X_df for future exogenous inputs. This dataframe should have the columns ds, unique_id, ex_1, ex_2, ..., ex_n. So it contains the same columns as your training dataset, except for the target value.

How many rows should X_df have? Suppose you're predicting two time series with a horizon of 12. Then X_df should contain (at least) 12 * 2 = 24 rows, each row holding the future exogenous variables for one step of the forecast horizon of one of the time series.
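The row count above can be checked with a small sketch. The helper name `make_future_exog`, the series ids, the dates, the daily frequency, and the placeholder value for ex_1 are all made up for illustration; only the column layout (unique_id, ds, ex_1, ...) and the 12 * 2 = 24 rows come from the description above. The `fit_and_predict` function shows how such a frame would typically be passed to StatsForecast; it is a sketch under those assumptions, not a definitive recipe.

```python
import pandas as pd


def make_future_exog(ids, last_ds, h, freq="D"):
    """Build an X_df skeleton: one row per (unique_id, future timestamp)."""
    frames = []
    for uid in ids:
        # h timestamps strictly after the last training timestamp
        future_ds = pd.date_range(start=last_ds, periods=h + 1, freq=freq)[1:]
        frames.append(pd.DataFrame({"unique_id": uid, "ds": future_ds}))
    return pd.concat(frames, ignore_index=True)


# Two series, horizon 12 -> 24 rows (hypothetical ids and dates)
X_df = make_future_exog(ids=["series_a", "series_b"], last_ds="2024-01-31", h=12)
# Placeholder values; in practice these are the known future regressors.
X_df["ex_1"] = 0.0


def fit_and_predict(train_df, X_df, h, freq="D"):
    """Fit on train_df (unique_id, ds, y, ex_1, ...) and predict with X_df."""
    # Imported lazily so the frame-building part runs without statsforecast.
    from statsforecast import StatsForecast
    from statsforecast.models import AutoARIMA

    sf = StatsForecast(models=[AutoARIMA()], freq=freq)
    sf.fit(train_df)
    return sf.predict(h=h, X_df=X_df)
```

The future timestamps in X_df should continue directly from the last timestamp of each training series, and the exogenous columns should match the ones used during fit.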

Let me know if this helps, and if not, please provide a (minimal) example of your code so that I can reproduce your issue.

@peregrinter
Author

Hi @elephaint,

Thanks for your quick reply. I think you mean using the StatsForecast class for this, right? Also, just to clarify: are the future exogenous inputs (X_df) matched to the forecast by timestamp, so that values beyond the forecast horizon are not considered? And are the exogenous features provided during fit also future exogenous features?

I tried integrating the AutoARIMA class into my forecasting pipeline, which uses another package's TimeSeries class for data representation.

Basically, what I am after is the ability to evaluate the fitted model on different datasets, mainly for comparability with other ML models. This is also why I chose the forward method (according to the documentation, it works for predicting different/new series).

The snippet below works fine if no future exogenous variables are provided. However, supporting them would be a nice addition for me.

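import numpy as np
from typing import Optional

# Assumed to be statsforecast's model; aliased so the wrapper class below
# does not shadow the model it wraps.
from statsforecast.models import AutoARIMA as SFAutoARIMA

# TimeSeries (from the other package) and the helpers one_sigma_rule,
# unpack_sf_dict, create_normal_samples and _build_forecast_series are
# defined elsewhere in the pipeline.


class AutoARIMA:
    def __init__(self, *autoarima_args, **autoarima_kwargs):
        self.model = SFAutoARIMA(*autoarima_args, **autoarima_kwargs)

    def fit(self, series: TimeSeries, future_covariates: Optional[TimeSeries] = None):
        self.model.fit(
            y=np.squeeze(series[0].values(copy=False)),
            X=future_covariates.values(copy=False) if future_covariates is not None else None,
        )
        return self

    def predict(
        self,
        n: int,
        series: Optional[TimeSeries] = None,
        future_covariates: Optional[TimeSeries] = None,
        num_samples: int = 1,
        verbose: bool = False,
    ):
        y = np.squeeze(series.values(copy=False))

        forecast_dict = self.model.forward(
            h=n,
            y=y,
            X=future_covariates.values(copy=False) if future_covariates is not None else None,
            level=(one_sigma_rule,),
        )

        # Draw samples from the predictive distribution if more than one is requested
        mu, std = unpack_sf_dict(forecast_dict)
        if num_samples > 1:
            samples = create_normal_samples(mu, std, num_samples, n)
        else:
            samples = mu

        return _build_forecast_series(samples, input_series=series)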

@peregrinter
Author

peregrinter commented Mar 11, 2024

Edit: Fitting works fine as long as y and X have the same length (after slicing the future exogenous regressors).

For the forward method, I still get either a shape error or an xreg error.

@peregrinter
Author

peregrinter commented Mar 11, 2024

@elephaint It seems I got this to work now; I had misunderstood the docs. X holds the exogenous features for the series supplied in y, and X_future holds those for the forecast horizon.

Does this look plausible to you? Maybe you can also help me understand the difference between forecast and forward. Does forecast fit the model again to the newly provided series?

Thx for your help.

    def predict(
        self,
        n: int,
        series: Optional[TimeSeries] = None,
        future_covariates: Optional[TimeSeries] = None,
        num_samples: int = 1,
        verbose: bool = False,
    ):

        forecast_dict = self.model.forward(
            h=n,
            y=np.squeeze(series[0].values(copy=False)),
            # X: covariates covering the observed window of `series`
            X=(
                future_covariates[0]
                .drop_before(series[0].start_time() - timedelta(days=1))
                .drop_after(series[0].end_time() + timedelta(days=1))
                .values(copy=False)
                if future_covariates else None
            ),
            # X_future: covariates for the n steps after the end of `series`
            X_future=(
                future_covariates[0]
                .drop_before(series[0].end_time())
                .drop_after(n)
                .values(copy=False)
                if future_covariates else None
            ),
            level=(one_sigma_rule,),
        )

@elephaint

Indeed, your understanding of the exogenous variables is now correct.

The code seems plausible. You're right about the difference between forward and forecast: the latter will fit the model again.
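For readers working with plain arrays, the split discussed in this thread can be sketched as follows. The helper names `split_exog` and `forward_on_new_series` are hypothetical, and the sketch assumes the covariate array starts at the same timestamp as y and extends at least h steps beyond it. X gets the rows aligned with y, and X_future gets the next h rows; forward then applies an already fitted model to the new series, whereas forecast would fit again.

```python
import numpy as np


def split_exog(y, covariates, h):
    """Split covariates into X (aligned with y) and X_future (next h rows)."""
    y = np.asarray(y)
    covariates = np.asarray(covariates)
    if covariates.ndim == 1:
        # A single regressor still has to be a 2-D (n_rows, 1) array
        covariates = covariates.reshape(-1, 1)
    n_obs = y.shape[0]
    if covariates.shape[0] < n_obs + h:
        raise ValueError(
            f"need at least {n_obs + h} covariate rows, got {covariates.shape[0]}"
        )
    X = covariates[:n_obs]                 # one row per observation in y
    X_future = covariates[n_obs:n_obs + h]  # one row per forecast step
    return X, X_future


def forward_on_new_series(fitted_model, y_new, covariates, h):
    """Apply an already fitted AutoARIMA to a new series without refitting."""
    X, X_future = split_exog(y_new, covariates, h)
    return fitted_model.forward(
        y=np.asarray(y_new, dtype=float), h=h, X=X, X_future=X_future
    )
```

The model passed as `fitted_model` is assumed to have been fitted with the same number of exogenous columns as the covariates supplied here.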
