Currently, users can set up a forecasting model using `context_delta_t`/`target_delta_t` args with a `TaskLoader`. See the docs and example below. Then, when predicting directly to xarray with `model.predict`, the prediction `xr.Dataset` will have a different `data_var` for each lead time.
DeepSensor's forecasting functionality has room for improvement. Here are three ideas with increasing implementation difficulty:
[Easy] Currently, the time dimension of the xarray/pandas output of `model.predict` refers to the initialisation date, which may be confusing and lead to bugs when users compare predictions with ground truth. It would be safer to make the time dimension the target date, so that predicted and true values can be compared straight away for forecast validation.
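To make the [Easy] idea concrete, here is a minimal stdlib-only sketch of the proposed re-indexing. The `preds` dictionary and its (initialisation date, lead time in days) keying are hypothetical stand-ins for the xarray output, not DeepSensor's actual data model:

```python
from datetime import date, timedelta

# Hypothetical predictions keyed by (init_date, lead_time_days), mirroring
# the current behaviour where the time dimension is the initialisation date.
preds = {
    (date(2020, 1, 1), 1): 0.5,
    (date(2020, 1, 1), 2): 0.7,
    (date(2020, 1, 2), 1): 0.6,
}

# Re-index by target date (init date + lead time) so predictions line up
# directly against ground truth at the same timestamp.
by_target = {}
for (init, lead), value in preds.items():
    target = init + timedelta(days=lead)
    by_target.setdefault(target, []).append((lead, value))

# Forecast validation then becomes a simple key-aligned comparison.
truth = {date(2020, 1, 2): 0.55, date(2020, 1, 3): 0.65}
errors = {
    t: [(lead, value - truth[t]) for lead, value in v]
    for t, v in by_target.items()
    if t in truth
}
```

With target-date indexing, several (init date, lead time) pairs naturally collapse onto the same target date, which is exactly the alignment one wants for validation.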
[Intermediate] Rather than having to repeat the context/target datasets when initialising a `TaskLoader` like above, it could be useful to be able to specify a single array of lead times associated with a particular context/target set. This would then correspond to a single context/target set, rather than a different context/target set for each lead time. (Although, if missing data varies with time, then one needs a different context/target set for each lead time to avoid losing data; otherwise a single missing value will 'broadcast' across all lead times.)
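The caveat in parentheses can be illustrated with a toy stdlib example; the `series` data and all names here are hypothetical, not DeepSensor API:

```python
# A toy time series with a single gap at t=1.
series = {0: 1.0, 1: None, 2: 3.0, 3: 4.0}
lead_times = [1, 2]

# Per-lead-time target sets: each (init, lead) pair is kept whenever its
# own target value exists, so the gap only removes the affected pairs.
per_lead = {
    lead: [
        (t, series[t + lead])
        for t in series
        if t + lead in series and series[t + lead] is not None
    ]
    for lead in lead_times
}

# Single shared target set: an init date is usable only if targets exist
# for *all* lead times, so the one gap 'broadcasts' and removes init t=0.
shared = [
    t
    for t in series
    if all(t + l in series and series[t + l] is not None for l in lead_times)
]
```

The per-lead-time sets retain two usable pairs per lead, while the shared set keeps only a single initialisation date, which is the data loss the parenthetical warns about.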
[Difficult] Autoregressive forecasting (passing the model its own forecast as input and rolling the model out iteratively). I would envision this being part of the `model.predict` high-level inference functionality. I label this as 'difficult' because of the need to map the right target predictions to the right context sets for the roll-out, which may be tricky when the prediction variable is repeated in the context sets. There is also the design choice of whether we ask the user to state which target sets map to which context sets in `model.predict` (less legwork for us, but not very user-friendly), or whether we try to handle this all under the hood by looking at the context/target objects and the `delta_t` args (which would be tough to implement).
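A minimal sketch of what the roll-out loop might look like, with a toy `step_model` standing in for a trained model's predict call (none of these names are DeepSensor API):

```python
def step_model(context):
    """Toy stand-in for a trained model: predicts the mean of its context."""
    return sum(context) / len(context)

def rollout(initial_context, n_steps, window=3):
    """Autoregressive roll-out: each forecast is appended to the context
    and fed back into the model for the next step."""
    context = list(initial_context)
    forecasts = []
    for _ in range(n_steps):
        pred = step_model(context[-window:])
        forecasts.append(pred)
        context.append(pred)  # feed the model its own forecast
    return forecasts
```

The hard part flagged above is hidden inside the `context.append(pred)` line: in the real multi-set case, each predicted target variable must be written back into the *correct* context set (and at the correct `delta_t` offset) before the next step, which is where the target-to-context mapping question arises.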