Reconciliation interface design #366

Open
21 tasks
mitchelloharawild opened this issue Oct 6, 2022 · 11 comments
@mitchelloharawild
Member

mitchelloharawild commented Oct 6, 2022

User-defined control parameters.

  1. Construction
     • Projection
     • Structural
     • ERM (low-priority)
  2. Weight matrix (typically requires access to model object and varies with data structure)
     • OLS
     • WLS
     • Structural
     • Sample
     • Shrinkage
     • More common types...
     • Time varying (maybe?)
     • Custom matrix
  3. Optimisation technique
     • Regular minimisation
     • Non-negative (LP, Heuristic)
     • Constraint matrix LP
  4. Data structure
     • Cross-sectional (Hierarchical & Grouped)
     • Temporal (Hierarchical & Grouped)
     • Cross-temporal
     • Arbitrary acyclical graphs (maybe?)
     • Disjoint
  5. Combination method/type
     • Additive
     • Linear combination

Are there more things that can be customised here?
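
To make the weight-matrix options above concrete, here is a minimal base R sketch of how several of them could be built for a three-node hierarchy (Total = A + B). The residual matrix `E`, the fixed shrinkage intensity `lambda`, and all object names are assumptions for illustration, not part of fabletools; in practice the shrinkage intensity would normally be estimated from the data rather than fixed.

```r
# Summing matrix for Total = A + B: rows are nodes, columns are bottom series.
S <- rbind(Total = c(1, 1), A = c(1, 0), B = c(0, 1))

# Hypothetical in-sample residuals: 20 time points, one column per node.
set.seed(366)
E <- matrix(rnorm(60), nrow = 20, dimnames = list(NULL, rownames(S)))

w_ols    <- diag(nrow(S))         # OLS: identity weights
w_wls    <- diag(diag(cov(E)))    # WLS: residual variances on the diagonal
w_struct <- diag(rowSums(S))      # Structural: number of bottom series under each node
w_sample <- cov(E)                # Sample: full residual covariance

# Shrinkage: pull the sample covariance towards its diagonal.
lambda   <- 0.5                   # hypothetical fixed intensity, for illustration only
w_shrink <- lambda * diag(diag(w_sample)) + (1 - lambda) * w_sample
```

Every option yields a square matrix with one row and column per node, which is what would let a single `weights = weight_fn` style argument swap between them.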


User interface

Data structure and value combination method/type

Data structure and combination method are passed in via data attributes created at the aggregate_*() step.
Allow the user to directly impose data structure constraints, for example by declaring an aggregation structure that already exists in the data.
This can also be used to remove aggregation structure to create disjoint hierarchies.
For example, you may have a cross-temporal structure but only want to make it temporally coherent. To achieve this, you can remove the key aggregation constraints.
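
A rough base R sketch of that idea, assuming the aggregation structure is stored as a constraint matrix with one row per constraint (this storage format and all names below are hypothetical, not the current tsibble or fabletools representation). Removing the key (cross-sectional) constraints is then just dropping rows.

```r
series  <- c("Total", "A", "B")
periods <- c("Year", "H1", "H2")
# Series-major node order: Total_Year, Total_H1, Total_H2, A_Year, ...
nodes <- as.vector(t(outer(series, periods, paste, sep = "_")))

agg <- matrix(c(1, -1, -1), nrow = 1)    # parent - child1 - child2 = 0

C_cross <- kronecker(agg, diag(3))       # Total = A + B, for each period
C_temp  <- kronecker(diag(3), agg)       # Year = H1 + H2, for each series
C_full  <- rbind(C_cross, C_temp)        # cross-temporal structure
colnames(C_full) <- nodes

is_coherent <- function(C, y) all(abs(C %*% y) < 1e-8)

# Hypothetical values: temporally coherent, but Total_Year != A_Year + B_Year.
y <- c(100, 40, 60,   50, 20, 30,   40, 15, 25)

is_coherent(C_temp, y)   # TRUE: only the temporal constraints are kept
is_coherent(C_full, y)   # FALSE: the key constraints are violated
```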

Retain the aggregation structure in <tsibble> and <mdl_lst>.

Code

Allow reconciliation of mables, fitted models, and model definitions.

Option A - reconcile() on model with all params as args

reconcile(<mbl_df>, lm = min_trace(lm, ...), ...) # as before, maybe soft-deprecated?

mutate(<mbl_df>, lm = reconcile(lm, ...), ...)
mutate(<mbl_df>, lm_ols = reconcile(lm, weights = weight_ols), lm_shr = reconcile(lm, weights = weight_shr), ...)

reconcile(<mdl_lst>, ???) 
reconcile(<mdl_def>, ???)

reconcile(object, weights = weight_fn, construction = constr_fn, opt_method = opt_fn)
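
As a rough illustration of how Option A could compose, here is a self-contained base R sketch in which each control parameter is a plain function. It works on a vector of base forecasts rather than a mable, and every name (`reconcile_sketch`, `constr_proj`, `opt_regular`, `opt_nonneg`, and the `weight_*` helpers) is hypothetical. The non-negative step is a simple clamp-and-reaggregate heuristic, not an LP, and it assumes the bottom-level series are the last rows of `S`.

```r
S <- rbind(Total = c(1, 1), A = c(1, 0), B = c(0, 1))

# Weight functions: return a weight matrix given the summing matrix.
weight_ols    <- function(S) diag(nrow(S))
weight_struct <- function(S) diag(rowSums(S))

# Construction function: projection matrix S (S' W^-1 S)^-1 S' W^-1.
constr_proj <- function(S, W) {
  W_inv <- solve(W)
  S %*% solve(t(S) %*% W_inv %*% S, t(S) %*% W_inv)
}

# Optimisation functions: turn the projection and base forecasts into output.
opt_regular <- function(P, y_hat, S) P %*% y_hat
opt_nonneg  <- function(P, y_hat, S) {
  y <- P %*% y_hat
  n <- nrow(S); m <- ncol(S)
  bottom <- pmax(y[(n - m + 1):n], 0)   # clamp bottom level at zero
  S %*% bottom                          # re-aggregate so the result stays coherent
}

reconcile_sketch <- function(y_hat, S, weights = weight_ols,
                             construction = constr_proj,
                             opt_method = opt_regular) {
  P <- construction(S, weights(S))
  drop(opt_method(P, y_hat, S))
}

y_hat <- c(Total = 10, A = -2, B = 11)   # hypothetical base forecasts
reconcile_sketch(y_hat, S)
reconcile_sketch(y_hat, S, weights = weight_struct, opt_method = opt_nonneg)
```

With this shape, adding a new weight type or optimiser means writing one more small function rather than adding a new top-level verb.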

Option B - reconcile() on mable with opt function as reconcile input fn

reconcile(<mbl_df>, lm = gls(lm, weights = weight_fn, ...), ...)
reconcile(<mbl_df>, lm = nn(lm, weights = weight_fn, ...), ...)
reconcile(<mbl_df>, lm = lp_constrained(lm, weights = weight_fn, ...), ...)

reconcile(<mdl_lst>, opt_fn = gls, weights = weight_fn, ... ) #??? 
reconcile(<mdl_def>, ???)

Option C - reconcile() on mable with construction function as reconcile input fn

reconcile(<mbl_df>, lm = proj(lm, weights = weight_fn, ...), ...)
reconcile(<mbl_df>, lm = struc(lm, weights = weight_fn, ...), ...)

reconcile(<mdl_lst>, opt_fn = gls, weights = weight_fn, ... ) #??? 

Option D - reconcile() on mable with node utilisation function as reconcile input fn

reconcile(<mbl_df>, lm = top_down(lm, weights = weight_fn, optimiser = opt_fn, ...), ...)
reconcile(<mbl_df>, lm = middle_out(lm, weights = weight_fn, optimiser = opt_fn, ...), ...)
reconcile(<mbl_df>, lm = bottom_up(lm, weights = weight_fn, optimiser = opt_fn, ...), ...)
reconcile(<mbl_df>, lm = all_nodes(lm, weights = weight_fn, optimiser = opt_fn, ...), ...)
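
One way to read Option D is that each function chooses which nodes' base forecasts feed the reconciled result, which can be written as a choice of mapping matrix G with reconciled forecasts S G y_hat. The base R sketch below uses hypothetical helper names (`g_bottom_up`, `g_top_down`, `g_all_nodes`, `apply_g`) and hypothetical top-down proportions; it assumes the top node is the first row of `S` and the bottom series are the last rows. Middle-out is omitted because this toy hierarchy has only two levels.

```r
S <- rbind(Total = c(1, 1), A = c(1, 0), B = c(0, 1))
y_hat <- c(Total = 100, A = 45, B = 50)   # hypothetical, incoherent base forecasts

# Bottom-up: use only the bottom-level forecasts.
g_bottom_up <- function(S) {
  n <- nrow(S); m <- ncol(S)
  cbind(matrix(0, m, n - m), diag(m))
}

# Top-down: use only the top forecast, split by fixed proportions.
g_top_down <- function(S, proportions) {
  cbind(proportions, matrix(0, length(proportions), nrow(S) - 1))
}

# All nodes: weighted projection using every base forecast.
g_all_nodes <- function(S, W = diag(nrow(S))) {
  W_inv <- solve(W)
  solve(t(S) %*% W_inv %*% S, t(S) %*% W_inv)
}

apply_g <- function(G, S, y_hat) drop(S %*% G %*% y_hat)

apply_g(g_bottom_up(S), S, y_hat)               # Total becomes 45 + 50 = 95
apply_g(g_top_down(S, c(0.4, 0.6)), S, y_hat)   # A = 40, B = 60, Total = 100
apply_g(g_all_nodes(S), S, y_hat)               # every node is adjusted
```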

Attention: @danigiro, @robjhyndman, @GeorgeAthana

mitchelloharawild added the enhancement, help wanted, and reconciliation labels Oct 6, 2022
mitchelloharawild self-assigned this Oct 6, 2022
@FinYang

FinYang commented Oct 7, 2022

Do people get to vote on it ;)? I like options A and B. Isn't it possible to implement both of them (or A and C, for that matter) if reconcile() is an S3 generic? (I'm not sure whether implementing both is a good design choice.)

@danigiro

danigiro commented Oct 7, 2022

One step back

After talking with Tommy, I think the role of reconciliation is unclear. In this framework, we are doing:

data |>
  ... |>
  model(...) |>
  reconcile(...) |>
  forecast(...)

However, the real strength of reconciliation is that it is based on forecasts, not models. For example, the previous structure does not accommodate judgmental forecasts. In such situations, we need something like this:

data |>
  ... |>
  forecast(...) |>
  reconcile(...)

However, with this configuration it is still unclear how to obtain the residuals needed for the covariance matrix.
We need to talk more about that.

@mitchelloharawild
Member Author

Yes, welcoming votes and discussion.
I just chatted with @robjhyndman and came up with Option D, where the function describes how forecasts are utilised across the graph. This is my current preference, and it is the most similar to our current interface (min_trace() would become all_nodes() or something similar).

It's possible to implement all of the above at the same time, but that could be confusing as many functions give the same result.

@mitchelloharawild
Member Author

We can also have a reconcile() method for <fable> classes if it is really needed, but I don't see why this is required yet.
Could you elaborate on the judgemental forecast reconciliation a bit more?

@danigiro

danigiro commented Oct 7, 2022

> We can also have a reconcile() method for <fable> classes if it is really needed, but I don't see why this is required yet. Could you elaborate on the judgemental forecast reconciliation a bit more?

Yes sure. The judgemental forecasting (e.g. the Delphi method) I was referring to is just an example where reconciliation should be applicable when the model object is not readily available.

For example, forecasts may not come from the fable package at all (they may come from computationally intensive machine learning models in Python or C++) but instead be stored in a CSV file and loaded into R as a fable object. Reconciliation should still be possible, since it depends on the forecasts themselves (and the covariance matrix), not on the models that generated them.

If fable contained every possible forecasting model, so that forecasts could always come from fable, then building the reconcile function only on top of mable objects might be reasonable, but that is probably too strong an assumption to make.

@FinYang

FinYang commented Oct 7, 2022

I agree with @danigiro that reconciliation should be independent of the models. When I first read how fable implements reconciliation (the current interface), I assumed that reconcile() comes before forecast() for practical reasons, since some information (e.g. the covariance matrix) is only accessible in mables, and the actual reconciliation is done inside the forecast method anyway. Still,

data |>
  ... |>
  forecast(...) |>
  reconcile(...)

is really how I think of reconciliation.

@mitchelloharawild
Member Author

I agree that it should be possible to reconcile a <fable>, but also think that it should be possible to impose reconciliation constraints on a list of models in a <mable>. I think we should support both, but reconciling a <fable> will require more inputs (such as its weights / residuals / response / etc.)
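
A minimal S3 sketch of supporting both, under heavy simplification: the classes `mbl_sketch` and `fbl_sketch`, the generic `reconcile_sketch()`, and the helper `project()` are all hypothetical stand-ins, and the mable-like object simply stores base forecasts next to its residuals. The point is only that the fable-like method needs the residuals passed in, and here falls back to OLS weights when they are missing.

```r
S <- rbind(Total = c(1, 1), A = c(1, 0), B = c(0, 1))

# Shared helper: weighted projection of base forecasts onto the coherent subspace.
project <- function(y_hat, S, W) {
  W_inv <- solve(W)
  drop(S %*% solve(t(S) %*% W_inv %*% S, t(S) %*% W_inv %*% y_hat))
}

reconcile_sketch <- function(object, ...) UseMethod("reconcile_sketch")

# Mable-like object: residuals travel with the models, so weights are available.
reconcile_sketch.mbl_sketch <- function(object, S, ...) {
  W <- diag(apply(object$residuals, 2, var))
  project(object$forecasts, S, W)
}

# Fable-like object: only forecasts are stored, so residuals are an extra input.
reconcile_sketch.fbl_sketch <- function(object, S, residuals = NULL, ...) {
  W <- if (is.null(residuals)) diag(nrow(S)) else diag(apply(residuals, 2, var))
  project(object$forecasts, S, W)
}

set.seed(1)
res <- matrix(rnorm(30), nrow = 10)      # hypothetical in-sample residuals
fc  <- c(Total = 100, A = 45, B = 50)    # hypothetical base forecasts

mbl <- structure(list(forecasts = fc, residuals = res), class = "mbl_sketch")
fbl <- structure(list(forecasts = fc), class = "fbl_sketch")

reconcile_sketch(mbl, S)
reconcile_sketch(fbl, S)                    # OLS fallback, no residuals supplied
reconcile_sketch(fbl, S, residuals = res)   # matches the mable-like result
```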

The current interface of reconciling a mable is part practical and part conceptual. Broadly speaking, I think reconciliation (or producing coherent forecasts) amounts to satisfying some additional constraints on the model. If these constraints are imposed, they should also hold for in-sample fitted values and residuals. In the future I plan for fitted() to optionally provide a <fable> output that can be coherent.
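
A small base R check of that claim, assuming OLS projection and made-up numbers: if the observations are coherent and the fitted values are projected onto the coherent subspace, the resulting residuals satisfy the same aggregation constraint. All values and names here are hypothetical.

```r
S <- rbind(Total = c(1, 1), A = c(1, 0), B = c(0, 1))
P <- S %*% solve(t(S) %*% S, t(S))       # OLS projection matrix
C <- matrix(c(1, -1, -1), nrow = 1)      # constraint: Total - A - B = 0

# Coherent observations (rows are time points, columns are nodes).
y <- cbind(Total = c(10, 12, 15), A = c(4, 5, 7), B = c(6, 7, 8))

# Incoherent base fitted values, e.g. from independently fitted models.
fitted_base <- cbind(Total = c(11, 11, 14), A = c(4, 6, 6), B = c(5, 7, 9))

fitted_rec <- fitted_base %*% t(P)       # reconcile each time point
resid_rec  <- y - fitted_rec

max(abs(C %*% t(fitted_rec)))            # numerically zero
max(abs(C %*% t(resid_rec)))             # numerically zero
```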

@cynthiahqy

From my discussions with @mitchelloharawild today, there seems to be some overlap between "reconciliation of forecasts" and "reconciliation of data" more generally -- i.e. users might need to do "reconciliation" on imported data before any modelling.

In general, it could be useful to make something like top_down() in OPTION D applicable to more than just the forecasts. I'm working on a functional approach (in the "matrices as a map" sense) to data harmonisation in {conformr} that could be extended to facilitate reconciliation/coherency of data.

@mitchelloharawild
Member Author

Had another discussion with @robjhyndman today, mostly about graph reconciliation data structures.

We have also discussed functions to impose aggregation constraints on a tsibble. A weights column could be used for defining linear combination weights across nodes, but for arbitrary graphs an edge-linked weights matrix might be needed.

Option D is the interface we're leaning toward: the choice of function drastically changes the output, and it is simple to learn. All other parameters can then be arguments with suitable defaults.

Some practical examples of graph coherency would be useful before finalising an interface for this. As far as I can imagine, graphs are usually best suited to being the only aggregation column, but it is theoretically possible to nest and cross these graph hierarchies.

I'm struggling to think of a graph linear combination reconciliation problem that isn't adequately represented with grouped hierarchies. The closest I've come is this toy example:

[Image: graph of the toy example]

The numbers of apples and oranges sold determine both the total weight of produce and the total price/sales over time. These metrics are then combined to give some measure of value. 🤷
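
For what it's worth, the toy example can be written down as an edge list with weights and turned into matrices in a few lines of base R. The unit weights, prices, and the choice of value = weight + sales are all made-up numbers for illustration, and the edge-list format is only an assumption about how such a graph might be stored.

```r
# Each edge says: node `to` receives `weight` times node `from`.
edges <- data.frame(
  from   = c("apples", "oranges", "apples", "oranges", "weight", "sales"),
  to     = c("weight", "weight",  "sales",  "sales",   "value",  "value"),
  weight = c(0.2,      0.3,       1.5,      2.0,       1,        1)
)
nodes  <- c("value", "weight", "sales", "apples", "oranges")
bottom <- c("apples", "oranges")

# Weighted adjacency matrix: A[to, from] = edge weight.
A <- matrix(0, length(nodes), length(nodes), dimnames = list(nodes, nodes))
A[cbind(edges$to, edges$from)] <- edges$weight

# I - A: each row says "node minus weighted inputs".
I_minus_A <- -A
diag(I_minus_A) <- 1

# Constraint matrix: one row per non-bottom node, C %*% y = 0 when coherent.
C <- I_minus_A[setdiff(nodes, bottom), ]

# Linear-combination analogue of the summing matrix: every node as a
# function of the bottom nodes (the graph is acyclic, so I - A is invertible).
S <- solve(I_minus_A)[, match(bottom, nodes)]
dimnames(S) <- list(nodes, bottom)
S
```

With these numbers the `value` row of `S` is 1.7 apples plus 2.3 oranges, so the graph collapses to a linear-combination structure over the two bottom nodes.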

I'm going to continue trying to think of useful graph reconciliation problems, but it may not be something necessary to incorporate into the interface. If it is incorporated I think the graphs would be represented via a single key column with some extra attributes that describe the relations between nodes.

@mitchelloharawild
Member Author

I've thought about this more from a data structures perspective and I think it is neatest to store a graph of the constraints in the tsibble/mable/fable objects.

https://arxiv.org/abs/2204.09231 provides some details on how to keep some nodes immutable, which I think is how we should handle zero-variance nodes.
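
To illustrate what keeping a node immutable could mean, here is a generic equality-constrained least squares sketch in base R with identity weights. It is not necessarily the method in the linked paper; the function name `reconcile_immutable()` and the numbers are hypothetical, and it assumes the constraint rows restricted to the free nodes have full row rank.

```r
C <- matrix(c(1, -1, -1), nrow = 1,
            dimnames = list(NULL, c("Total", "A", "B")))   # Total - A - B = 0
y_hat <- c(Total = 100, A = 45, B = 50)                    # hypothetical base forecasts

reconcile_immutable <- function(y_hat, C, fixed) {
  free   <- setdiff(names(y_hat), fixed)
  C_free <- C[, free, drop = FALSE]
  C_fix  <- C[, fixed, drop = FALSE]

  # Constraint violation at the base forecasts.
  r <- C_free %*% y_hat[free] + C_fix %*% y_hat[fixed]

  # Smallest (least squares) change to the free nodes that removes the violation.
  y <- y_hat
  y[free] <- y_hat[free] - drop(t(C_free) %*% solve(C_free %*% t(C_free), r))
  y
}

reconcile_immutable(y_hat, C, fixed = "Total")
# Total stays at 100; A and B each move up by 2.5 to 47.5 and 52.5
```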

@danigiro

> I've thought about this more from a data structures perspective and I think it is neatest to store a graph of the constraints in the tsibble/mable/fable objects.

Yes, I think so too
