Improved forecasting functionality #30

Open
tom-andersson opened this issue Jul 26, 2023 · 0 comments
Labels
enhancement New feature or request

Comments

@tom-andersson
Collaborator

tom-andersson commented Jul 26, 2023

Currently, users can set up a forecasting model using context_delta_t/target_delta_t args with a TaskLoader. See the docs and example below:

task_loader = TaskLoader(context=[era5_ds['t2m'], era5_ds['t2m'], aux_ds], target=era5_ds['t2m'],
                         context_delta_t=[-1, 0, 0], target_delta_t=1)

Then, when predicting directly to xarray with model.predict, the prediction xr.Dataset will have a different data_var for each lead time.
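To make the offsets concrete, here is a minimal sketch of how the delta_t args in the example above map a task's initialisation date to the dates at which each context set and the target set are sampled. It assumes the offsets are in days, and `sample_dates` is a hypothetical helper written for illustration, not part of DeepSensor.

```python
from datetime import date, timedelta


def sample_dates(init_date, context_delta_t, target_delta_t):
    """Hypothetical helper: dates at which each context/target set is sampled.

    Assumes the delta_t offsets are expressed in days.
    """
    context_dates = [init_date + timedelta(days=dt) for dt in context_delta_t]
    target_date = init_date + timedelta(days=target_delta_t)
    return context_dates, target_date


# Same offsets as the TaskLoader example: two t2m context sets (previous day
# and initialisation day), one auxiliary set, and a one-day-ahead target.
context_dates, target_date = sample_dates(date(2020, 1, 10), [-1, 0, 0], 1)
print(context_dates)
print(target_date)
```

With these offsets the model sees t2m from the previous day and the initialisation day and is asked to predict t2m one day ahead.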

DeepSensor's forecasting functionality has room for improvement. Here are three ideas with increasing implementation difficulty:

  • [Easy] Currently, the time dimension of the xarray/pandas output of model.predict refers to the initialisation date, which may be confusing and lead to bugs when users compare predictions with ground truth. It would be safer to make the time dimension the target date, so that predicted and true values can be compared straight away for forecast validation.
  • [Intermediate] Rather than having to repeat the context/target datasets when initialising a TaskLoader as above, it could be useful to specify a single array of lead times for a particular context/target set. All lead times would then share one context/target set, rather than each lead time needing its own. (However, if missing data varies with time, one still needs a different context/target set for each lead time to avoid losing data; otherwise a single missing value will 'broadcast' across time.)
  • [Difficult] Autoregressive forecasting (passing the model its own forecast as input and rolling the model out iteratively). I envision this being part of the model.predict high-level inference functionality. I label this as 'difficult' because the right target predictions must be mapped to the right context sets for the roll-out, which may be tricky when the prediction variable is repeated in the context sets. There is also a design choice: either we ask the user to state in model.predict which target sets map to which context sets (less legwork for us, but not that user-friendly), or we try to handle this under the hood by inspecting the context/target objects and the delta_t args (which would be tough to implement).
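As a rough illustration of the roll-out idea, the sketch below feeds a one-step forecaster its own predictions and keys the results by target date rather than initialisation date. Everything here is an assumption for illustration: `toy_model` and `rollout` are hypothetical names, not DeepSensor API. The time step is one day, the auxiliary context set is omitted, and the target-to-context mapping is hard-coded (each prediction becomes the offset-0 context at the next step and the offset -1 context at the step after).

```python
from datetime import date, timedelta


def toy_model(prev_t2m, curr_t2m):
    """Stand-in one-step forecaster: linear extrapolation of the last two values."""
    return 2 * curr_t2m - prev_t2m


def rollout(model, history, init_date, n_steps):
    """Roll a one-step model forward, returning forecasts keyed by target date.

    history: dict mapping date -> observed value. It must contain init_date
             and the day before it.
    """
    state = dict(history)  # copy so the caller's observations are untouched
    forecasts = {}
    for step in range(1, n_steps + 1):
        target_date = init_date + timedelta(days=step)
        # Context sets at offsets -1 and 0 relative to this step's init date.
        prev_val = state[target_date - timedelta(days=2)]
        curr_val = state[target_date - timedelta(days=1)]
        pred = model(prev_val, curr_val)
        state[target_date] = pred  # feed the prediction back in as context
        forecasts[target_date] = pred
    return forecasts


history = {date(2020, 1, 9): 1.0, date(2020, 1, 10): 2.0}
forecasts = rollout(toy_model, history, date(2020, 1, 10), n_steps=3)
for target_date, value in forecasts.items():
    print(target_date, value)
```

Because the same variable appears in two context sets, each prediction is consumed twice at different offsets. This is the bookkeeping that a general implementation would need either from the user or from the delta_t args.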
tom-andersson added the enhancement label on Jul 26, 2023