Hierarchical optimization in D2D #147

Open
wants to merge 24 commits into master

Conversation

gdudziuk commented Feb 10, 2020

I have implemented a simple form of hierarchical optimization in D2D and I would like to propose pulling this feature upstream.

The method of hierarchical optimization is described in detail in (Weber et al., 2011) and (Loos et al., 2018). Let me, however, illustrate the method with the following simplified parameter estimation problem:

(Eq. 1) \min_{p} J_1(p) = \sum_{j=1}^{J} (Y_j - y(p,t_j))^2/{std}_j^2

where Y_j and {std}_j represent experimental measurements and associated standard deviations given on an arbitrary scale, and y(p,t_j) is the model observable corresponding to the measurements Y_j. Assume that the observable y is of the form y(p,t)=s*x(r,t), where x(r,t) denotes a model species (or some derived variable) corresponding to the measurements Y_j, s is a scale parameter to be estimated along with the dynamic parameters r, and p=(r,s). Then the above optimization problem becomes

(Eq. 2) \min_{r,s} J_2(r,s) = \sum_{j=1}^{J} (Y_j - s*x(r,t_j))^2/{std}_j^2

The core observation is that the function J_2 is quadratic and positive definite w.r.t. the scale s, assuming that at least some of x(r,t_j) are nonzero. Thus, given parameters r, the optimal scale s=s(r) is uniquely determined and the optimization problem (Eq. 2) is equivalent to

(Eq. 3) \min_{r} J_3(r) = \sum_{j=1}^{J} (Y_j - s(r)*x(r,t_j))^2/{std}_j^2

The optimal scales s(r) may be expressed explicitly. To do so, it is enough to solve the condition \partial_{s} J_2(r,s)=0 for s, which gives:

(Eq. 4) s(r) = \frac{\sum_{j} x(r,t_j)*Y_j/{std}_j^2}
                    {\sum_{j} x(r,t_j)^2/{std}_j^2}
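
For completeness, that condition spelled out is

\partial_{s} J_2(r,s) = -2 \sum_{j=1}^{J} x(r,t_j)*(Y_j - s*x(r,t_j))/{std}_j^2 = 0

and solving this linear equation for s yields exactly (Eq. 4).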

Having this, we may also explicitly compute the gradient of J_3 w.r.t. r (assuming that we can compute the sensitivities of x w.r.t. r) and feed it to any gradient-based optimization algorithm.
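
For reference, here is a sketch of that gradient (a standard chain-rule calculation; not necessarily the exact form used in the code). Since \partial_{s} J_2(r,s(r)) = 0, the terms involving the derivatives of s(r) cancel and we are left with

\partial_{r_k} J_3(r) = -2*s(r) \sum_{j=1}^{J} (Y_j - s(r)*x(r,t_j)) * \partial_{r_k} x(r,t_j) / {std}_j^2

where \partial_{r_k} x(r,t_j) are the sensitivities of x w.r.t. r mentioned above.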

To sum up, thanks to the specific form of the observable y, namely y=s*x, we can replace the optimization problem (Eq. 2) with the equivalent problem (Eq. 3), which has fewer parameters to optimize. This is the method of hierarchical optimization. From my experience (and as one could expect), this can significantly improve the optimization speed. The trade-off is that we need to assume a specific form of the observable y. Nevertheless, I think that the case of y=s*x is common enough to implement hierarchical optimization for it.
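
To make the inner step concrete, here is a minimal standalone MATLAB sketch (illustration only, not D2D code; the function name and interface are made up). Given the model values x(r,t_j) for the current dynamic parameters r, it evaluates the optimal scale via (Eq. 4) and the corresponding value of J_3:

% Minimal sketch of the hierarchical step (illustration only, not D2D code).
% x, Y and stds are equally sized vectors holding x(r,t_j), Y_j and {std}_j.
function [J3, sOpt] = hierarchicalStep(x, Y, stds)
    w    = 1 ./ stds.^2;                       % weights 1/{std}_j^2
    sOpt = sum(w .* x .* Y) / sum(w .* x.^2);  % optimal scale s(r), cf. (Eq. 4)
    res  = (Y - sOpt .* x) ./ stds;            % weighted residuals
    J3   = sum(res.^2);                        % value of J_3(r), cf. (Eq. 3)
end

In the "hierarchical mode", an outer gradient-based optimizer then varies only r and evaluates something like this inside the objective.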

Theoretically, the method outlined above may be generalized to more complex observables, e.g. y=s*x+b, where b denotes an offset parameter. Such observables perhaps cover the vast majority of practical applications. Having y=s*x+b, we proceed analogously to the above, namely instead of (Eq. 2) we have

(Eq. 2a) \min_{r,s,b} J_2(r,s,b) = \sum_{j=1}^{J} (Y_j - s*x(r,t_j) - b)^2/{std}_j^2

and we need to compute both the optimal scale s=s(r) and the optimal offset b=b(r). The problem with such a generalization is purely pragmatic: the formulas for s(r) and b(r) become much more complex than in the simple case of y=s*x discussed above. For this reason I have implemented the hierarchical optimization just for the latter simple case and, to be honest, I do not think I will implement the generalization to y=s*x+b soon.
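
Just to record the algebra (not implemented in this PR): with weights w_j = 1/{std}_j^2, the conditions \partial_{s} J_2 = 0 and \partial_{b} J_2 = 0 form the 2x2 linear system

s * \sum_{j} w_j*x(r,t_j)^2 + b * \sum_{j} w_j*x(r,t_j) = \sum_{j} w_j*x(r,t_j)*Y_j
s * \sum_{j} w_j*x(r,t_j)   + b * \sum_{j} w_j          = \sum_{j} w_j*Y_j

which has a unique solution (s(r), b(r)) whenever the x(r,t_j) are not all equal. So the solution is still closed-form (e.g. via Cramer's rule), just noticeably more cumbersome than (Eq. 4).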

Usage

There is a function arInitHierarchical which detects all observables of the form y=s*x and enables the "hierarchical mode" of optimization. To get it to work, just put arInitHierarchical before arFit (or arFitLHS or anything like that). There is also a function arDisableHierarchical which switches back to the "normal mode". So an example setup with hierarchical optimization may look like this:

arInit
arLoadModel(.....)
arLoadData(.....) % Assume that at least some data has observables of form y=s*x
arCompileAll

arFit % Here we optimize in the "normal mode", i.e. we optimize J_2(r,s) as in (Eq. 2)
arInitHierarchical % Here we enter the "hierarchical mode"...
arFit  % ... and fine-tune the fit by optimizing J_3(r) as in (Eq. 3)
arDisableHierarchical % Switch back to the "normal mode"

The optimal scales s(r) are computed by function arCalcHierarchical, which is called from arCalcRes and arCalcSimu. As a result, any D2D function that internally calls arSimu can benefit from the "hierarchical mode". For example, in the following snippet

arInitHierarchical
arPlot

the scale parameters (if any) will be automatically fine-tuned prior to generating the plots.

Limitations

The implementation in this pull request strongly relies on the specific form (Eq. 2) of the optimization problem. Consequently, any D2D feature that modifies J_2 as in (Eq. 2) into something else cannot be used with the hierarchical optimization, at least until suitable generalizations are implemented. To the best of my knowledge, the interfering features are:

  • Interpreting observables and data in the log10 scale
  • Error fitting (Edit: Even more, it is required that the errors do not depend on r nor s, even implicitly; cf. the comment below)
  • Detection limits
  • The features handled in arCollectRes, namely:
    • Intercondition constraints and steady state conditions
    • Priors other than standard flat box
    • Random effects
    • User-defined residuals

With these features enabled, the current implementation of the hierarchical optimization works incorrectly, since the optimal scales s(r) are in such cases different from those in (Eq. 4). The function arCalcHierarchical raises an error if any of these features is in use. I have done my best to make the above list complete. However, I'm a newcomer to D2D, so there may be some other features that I don't know yet and that may interfere with the hierarchical optimization by modifying the merit function J_2. So if anybody finds that there are more features like that in D2D, please let me know.

Further development

Possible directions for generalizing the method are:

  1. Add support for the hierarchical optimization for observables with offsets, y=s*x+b, possibly with some restricting assumptions.
  2. Add support for the hierarchical optimization in combination with the features listed in the preceding section. In this respect, some features are more cumbersome than others. Specifically, support for priors should be simple to implement.

 - Improve the docstring in arCalcHierarchical
 - Remove a redundant line in arInitHierarchical

Previously, this argument was accepted but unused by arCalcHierarchical. Now it is used to avoid computing scale gradients and sensitivities of observables when not necessary.

Previously, arInitHierarchical was using isSymType, a function introduced in R2019a. This commit removes that dependency and instead implements a small custom function which does the trick.

The scale parameters suitable for hierarchical optimization are those which appear _only_ in observables of the form `s*xz`, where `s` is the subject scale parameter and `xz` is a model species or a derived variable. Until now, arInitHierarchical has been detecting parameters which appear _in at least one_ observable of the form `s*xz` (meaning that they were allowed to also appear in observables of other forms). Consequently, parameters not suitable for the hierarchical optimization could be detected by arInitHierarchical and the results of the hierarchical optimization could be incorrect. This commit fixes this by adding extra checks in arInitHierarchical so that it detects only the right parameters.

The implementation of hierarchical optimization depends on the specific
form of the merit function. We currently assume the following one:

    J(s,p) = \sum_j (data_j - s*x(p,t_j))^2/std_j^2

Each feature that modifies this expression also breaks the method of hierarchical optimization, making its results incorrect. To avoid this, we try to check in arCalcHierarchical whether D2D features affecting the form of the merit function are enabled. This commit adds extra checks of this kind in arCalcHierarchical, hopefully making the checklist complete (or close to complete).

The removed settings changes are redundant since these settings are
tested anyway in arCalcHierarchical.

Currently arCalcHierarchical requires no error fitting, meaning that there can be no dependence of errors on any fittable parameters. Until now, arCalcHierarchical has detected error fitting incorrectly, excluding some possible settings which actually had no error fitting. In other words, the test for error fitting has been too strict. This commit fixes the issue by implementing a better test for error fitting in arCalcHierarchical.

This commit updates the help text of arInitHierarchical to reflect the changes in its functionality introduced in commit c3bbe5a. This should have been done in that commit. Additionally, the present commit fixes a typo in the subject help text.

Hierarchical optimization is not feasible if the solution of the model is zero for all measurement times. This commit adds the respective assertion in arCalcHierarchical.

This extra assertion is not strictly necessary. Degeneration of the
solution to zero would result in NaN scales, subsequently resulting in
NaN residuals, which would trigger an error anyway. Nevertheless, with
the extra assertion we get a more specific error message which may make
it easier to debug.

gdudziuk commented Feb 11, 2020

An update. As long as this implementation relies on the form (Eq. 2) of the target function, it is important that {std}_j do not depend on the dynamic parameters r nor on the scale parameter s, neither explicitly nor implicitly, e.g. via model observables. This excludes error models with relative measurement noise, e.g. std(q1,q2,s,r) = sqrt(q1^2 + q2^2*y(s,r)^2), even if the error parameters q1 and q2 are constant.

So I need to write another commit to fix this. I will enforce the use of experimental errors if the hierarchical optimization is enabled. Such a solution seems restrictive, but:

  1. Error models that do not depend on s nor r and whose error parameters are constant can always be represented as concrete error values for the particular replicates, i.e. can be represented as experimental errors. So in fact such a solution is no more restrictive than necessary.
  2. It is also simpler to implement than detecting error definitions that have any dependence on s or r.

I will commit in a moment. (Edit: Done, cf. commit 1b7fb04)

@clemenskreutz
Collaborator

Dear gdudziuk,
Thanks for your contribution. At first glance, it looks very valuable. I also deeply appreciate your efforts in documentation. We will check the code during the next days.

Currently arCalcHierarchical requires no dependence of errors on any
fittable parameters, neither explicitly nor implicitly. Until now,
arCalcHierarchical has been asserting that there is no fitting of error
parameters, but at the same time has been allowing implicit dependence
of errors on fittable parameters of the model via observables. This is
a bug since with such implicit dependence of errors on fittable
parameters the results of arCalcHierarchical are incorrect. This commit
fixes it by asserting the use of experimental errors, which are
independent of the fittable parameters by definition.
@clemenskreutz
Collaborator

Dear gdudziuk,
Sorry for my delayed response.
I started testing and found the following problem:
You (sometimes) do not consider that there might be several models, i.e. ar.model needs an index ar.model(m). So you always need a loop: for m=1:length(ar.model), ar.model(m) ...

Example: Lines 24-25 in arInitHierarchical.m:
xSyms = cellfun(@sym, ar.model.x, 'UniformOutput', false);
zSyms = cellfun(@sym, ar.model.z, 'UniformOutput', false);

Could you please update the code accordingly?
Thanks in advance.
Best, Clemens.

clemenskreutz commented Feb 28, 2020

Another question: I did the following:

arCopyBenchmarkModels('Swa')
cd Swameye_PNAS2003\
Setup
arInitHierarchical
arDisableHierarchical % Switch back to the "normal mode"

There is a warning that I don't understand and obviously no scaling parameters were found. Why?

arInitHierarchical
Warning: Unable to prove 'offset_tSTAT == 0'.
In symengine
In sym/isAlways (line 42)
In arInitHierarchical (line 80)
Warning: Unable to prove 'offset_pSTAT == 0'.
In symengine
In sym/isAlways (line 42)
In arInitHierarchical (line 80)
Warning: No scale parameters found for hierarchical optimization.
In arInitHierarchical (line 224)

arDisableHierarchical % Switch back to the "normal mode"
Error using arDisableHierarchical (line 18)
Hierarchical optimization is not enabled. You have not called
arInitHierarchical or there are no scale parameters suitable for
hierarchical optimization in your observables.

At the moment, arCalcHierarchical allows experimental errors only, so
the lines which handle the case of model-based errors are redundant and
obscure the code. This commit removes such lines.

See also commit 1b7fb04.

In arInitHierarchical, the cell arrays xSyms and zSyms have been created assuming that there is only one model loaded (i.e. the length of the ar.model structure is 1). This commit corrects the support for multiple models in arInitHierarchical.

gdudziuk commented Mar 2, 2020

You (sometimes) do not consider that there might be several models, i.e. ar.model needs an index ar.model(m).

Right. Fixed in commit a35790f. Thank you for pointing this out.

In addition, I have found another small issue; I will fix it this week. I will also take a look at the model of Swameye and let you know what's going on there.

For certain purposes, in arCalcHierarchical we have been testing if all elements of ar.model(im).data(id).useHierarchical are NaN. But this cannot happen, since the array useHierarchical is constructed such that its elements may only be false or true (and not NaN). So we should rather test if the elements of useHierarchical are all false. This commit fixes this issue.

Until now, arCalcHierarchical has been working correctly only for data series with no duplicate time points (or, more generally, predictor values), even though D2D allows such duplicates in the data. This commit fixes this issue.

There has been a code snippet in a loop over observables which was independent of the observables, resulting in unnecessary repeated assignments inside this snippet. This commit moves the snippet to a more suitable location in the body of arInitHierarchical.

The function arInitHierarchical calls the function isAlways, which by default raises a warning when testing a condition like `sym('a')==0`. Such conditions appear during normal operation of arInitHierarchical, so the warning in question is unnecessary. This commit switches it off.

Some of the assertion tests in arCalcHierarchical should be required
only for those observables which have parameters suitable for
hierarchical optimization. Until now, arCalcHierarchical has been
testing such assertions for all observables, thus being overly
restrictive. This commit fixes this issue.

Until now, arCalcHierarchical has been working correctly only for data series with no missing observations, even though D2D allows such data. This commit fixes this issue.

gdudziuk commented Mar 5, 2020

I have played a bit more with the example models shipped with D2D and discovered some issues in my code. They are already fixed. From the model "Becker_Science2010" I have learned that D2D allows repeated measurements in the data sheets (which I should have figured out earlier; see commit 7feb387), and from the model "Swameye_PNAS2003" I have learned that there may also be missing measurements in the data sheets (see commit 14319ec). I hope that now hierarchical optimization works correctly with all possible data.

Concerning the warning "Unable to prove 'some_parameter == 0'.", this is a warning raised by the symbolic function isAlways, which is used in arInitHierarchical. Since this warning was occurring during normal operation of arInitHierarchical, I have decided to switch it off (see commit 050f114).

Concerning the warning "No scale parameters found for hierarchical optimization.", this is correct since there are no such parameters in the model "Swameye_PNAS2003". At the moment, my implementation supports hierarchical optimization only for observables of form scale*xz and the model of Swameye has observables of form scale*xz + offset, which is a different case.

I suspect that this may be a problem for you. I can see that many D2D examples involve not only scales but also offsets, so you would probably like to have the hierarchical optimization generalized to cover setups involving offsets. But if it is acceptable for you to have the simple implementation as it is now, I can provide a toy model which demonstrates the use of hierarchical optimization in its present state of development.

Of course, for testing purposes, you can get hierarchical optimization to work with the model of Swameye by manually removing the offset parameters from that model.

@gdudziuk
Author

So, what do you think, @clemenskreutz?
