
Prior statistics and set-up for multidimensional priors #4907

Open · wants to merge 12 commits into base: main
Conversation

katrinstreil (Contributor)

Description

This pull request changes how priors are evaluated, introducing a separate stat method: prior_fit_statistic(priors).
The priors themselves are no longer stored on the Parameters object but on Models.
To allow the evaluation of multidimensional priors, the information on the model parameters is stored on the prior instance as modelparameters.
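A minimal sketch of how such a separate stat method might work. Only the names prior_fit_statistic and modelparameters come from this PR; ParameterStub, GaussianPriorSketch, and the quadratic penalty are illustrative assumptions, not gammapy API:

```python
class ParameterStub:
    """Stand-in for gammapy's Parameter (just holds a value) -- hypothetical."""

    def __init__(self, value):
        self.value = value


class GaussianPriorSketch:
    """Illustrative 1D Gaussian prior that keeps a reference to the
    model parameter it constrains via ``modelparameters``."""

    def __init__(self, modelparameter, mu=0.0, sigma=1.0):
        self.modelparameters = [modelparameter]
        self.mu = mu
        self.sigma = sigma

    def __call__(self):
        # evaluate on the *current* value of the linked parameter
        value = self.modelparameters[0].value
        return ((value - self.mu) / self.sigma) ** 2


def prior_fit_statistic(priors):
    """Sum the fit-statistic contributions of all given priors."""
    return sum(prior() for prior in priors)
```

Because each prior carries its own parameter references, the stat method only needs the list of priors, not the parameters.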

@katrinstreil katrinstreil added this to the 1.2 milestone Nov 16, 2023

codecov bot commented Nov 16, 2023

Codecov Report

Attention: 10 lines in your changes are missing coverage. Please review.

Comparison is base (303a7c9) 75.73% compared to head (f194f46) 75.76%.
Report is 40 commits behind head on main.

Files Patch % Lines
gammapy/modeling/parameter.py 44.44% 5 Missing ⚠️
gammapy/stats/fit_statistics.py 66.66% 3 Missing ⚠️
gammapy/modeling/models/core.py 85.71% 1 Missing ⚠️
gammapy/modeling/models/prior.py 95.65% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4907      +/-   ##
==========================================
+ Coverage   75.73%   75.76%   +0.02%     
==========================================
  Files         226      226              
  Lines       33149    33197      +48     
==========================================
+ Hits        25105    25151      +46     
- Misses       8044     8046       +2     


@adonath (Member) left a comment

Thanks @katrinstreil! I have left some first comments in line.

prior_stat_sum -= prior_fit_statistic(dataset.models.priors)
# compute prior_fit_statistics from all datasets
if self.models is not None:
prior_stat_sum += prior_fit_statistic(self.models.priors)
Member

Why does one prior count positive the other negative?

Member

Datasets.models includes all models, why is there an independent sum for dataset.models?

prior_stat_sum = self.models.parameters.prior_stat_sum()

counts, npred = self.counts.data.astype(float), self.npred().data
prior_stat_sum = prior_fit_statistic(self.models.priors)
Member

Wouldn't this double count the prior, when it is evaluated again at the Datasets.stat_sum() level?

@@ -32,7 +32,15 @@ def _build_priorparameters_from_dict(data, default_parameters):
class Prior(ModelBase):
_unit = ""

def __init__(self, **kwargs):
def __init__(self, modelparameters, **kwargs):
Member

I think taking the Parameter object here on init is rather confusing to establish the linking between parameters and their priors. I think this also deviates from the API proposed in the PIG.

@@ -144,8 +162,7 @@ class GaussianPrior(Prior):
mu = PriorParameter(name="mu", value=0)
sigma = PriorParameter(name="sigma", value=1)

@staticmethod
def evaluate(value, mu, sigma):
def evaluate(self, value, mu, sigma):
Member

self is not used; this can remain a static method.

@@ -172,8 +189,7 @@ class UniformPrior(Prior):
min = PriorParameter(name="min", value=-np.inf, unit="")
max = PriorParameter(name="max", value=np.inf, unit="")

@staticmethod
def evaluate(value, min, max):
def evaluate(self, value, min, max):
Member

Same as above...

@adonath (Member)

adonath commented Nov 16, 2023

There are some aspects that are confusing to me in the current implementation.

  • I'm not sure we should bind the prior evaluation to .stat_sum(). Currently priors might be double counted, as they are evaluated in Dataset.stat_sum() as well as Datasets.stat_sum().
  • We could possibly introduce .stat_sum_prior() and .stat_sum_posterior() on both Dataset and Datasets if we want to distinguish.

The next question is how to evaluate hierarchical priors. I think that priors can exist on multiple levels of the model hierarchy:

  • Parameter.prior
  • SpectralModel.prior, SpatialModel.prior, etc.
  • SkyModel.prior
  • Models.prior

And I think they can be treated independently. So e.g. if a user defines:

norm_pl = PowerLawNormSpectralModel()
norm_pl.prior = MultivariateGaussianPrior()

Then norm_pl.norm_1.prior is still None. This is, for example, different from the covariance handling we have.

In general I think it makes sense to define a MultivariatePrior class. This class could be created "on the fly" on the objects higher in the hierarchy and gather the priors from the models lower in the hierarchy. So the information flows bottom to top only(!). The MultivariatePrior class could indeed take the Parameters object internally, very similar to the Covariance class we already have. However users are never required to pass the parameters themselves, the linking happens internally.
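A rough sketch of such a MultivariatePrior, assuming a toy ParameterStub. It holds the parameter list and covariance internally, similar to the Covariance class; all names and the Mahalanobis-style penalty are illustrative assumptions, not gammapy API:

```python
import numpy as np


class ParameterStub:
    """Stand-in for gammapy's Parameter (just holds a value) -- hypothetical."""

    def __init__(self, value):
        self.value = value


class MultivariatePriorSketch:
    """Illustrative multivariate Gaussian prior: gathers the parameters
    from the models below it and keeps them internally, so information
    flows bottom to top only."""

    def __init__(self, parameters, covariance, mu=None):
        self.parameters = list(parameters)
        self.inv_cov = np.linalg.inv(np.asarray(covariance, dtype=float))
        # centre of the prior; defaults to the current parameter values
        values = np.array([p.value for p in self.parameters])
        self.mu = values if mu is None else np.asarray(mu, dtype=float)

    def __call__(self):
        # Mahalanobis-distance penalty on the current parameter values
        delta = np.array([p.value for p in self.parameters]) - self.mu
        return float(delta @ self.inv_cov @ delta)
```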

These are a few half-structured thoughts; maybe they help a bit in developing a clearer vision of the implementation.

@katrinstreil (Contributor, Author)

Thanks @adonath for the comments!

About the first thing you mention concerning the evaluation of the prior in dataset and datasets:

  • In the implementation proposed here, the prior_stat_sum is always added to the stat_sum. When the stat_sum of a Datasets instance is computed, the prior_stat_sum is counted multiple times. To avoid this, I subtract the individual prior_stat_sums and only count the joint prior_stat_sums. I agree that this could be more elegant.
  • I like the proposed way of having .prior_stat_sum(), .stat_sum() and .posterior_stat_sum() = .prior_stat_sum() + .stat_sum(). However, this might be complex for the user.
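A hedged sketch of that second option: toy Dataset/Datasets classes (all names and the joint_prior_stat argument are hypothetical, not gammapy API) that keep likelihood and prior terms separate so each prior is counted exactly once:

```python
class DatasetSketch:
    """Toy dataset: keeps the data likelihood and the prior penalty
    separate, so priors are counted once at a chosen level."""

    def __init__(self, stat, prior_stat):
        self._stat = stat
        self._prior_stat = prior_stat

    def stat_sum(self):
        # data likelihood only, no priors
        return self._stat

    def prior_stat_sum(self):
        # prior penalty only
        return self._prior_stat

    def posterior_stat_sum(self):
        return self.stat_sum() + self.prior_stat_sum()


class DatasetsSketch:
    """Toy datasets container; joint_prior_stat stands for priors
    shared between datasets."""

    def __init__(self, datasets, joint_prior_stat=0.0):
        self.datasets = datasets
        self._joint_prior_stat = joint_prior_stat

    def stat_sum(self):
        return sum(d.stat_sum() for d in self.datasets)

    def prior_stat_sum(self):
        # each dataset's own priors once, plus the joint priors once
        return sum(d.prior_stat_sum() for d in self.datasets) + self._joint_prior_stat

    def posterior_stat_sum(self):
        return self.stat_sum() + self.prior_stat_sum()
```

With this split no subtraction trick is needed: stat_sum() never includes priors, so nothing is double counted.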

About the hierarchical treatment of the prior:
I am not sure if we could cover all possible use cases without the user having to define the Parameters on which a multivariate prior sits.

For instance, coupling two indices from two SkyModels?
If one passes the Parameters to the MultiVariatePrior, it would look like this:

model1 = PowerLawSpectralModel()
model2 = PowerLawSpectralModel()
MultiVariatePrior(covariance_matrix=C,
                  modelparameters=[model1.parameters['index'],
                                   model2.parameters['index']])

This intrinsically sets .prior on the two Parameter instances.

How would this be implemented without passing the model parameters to the prior instance? The only possibility I can think of is the following:

multivprior = MultiVariatePrior(covariance_matrix = C)
model1.parameters['index'].prior = multivprior
model2.parameters['index'].prior = multivprior

And then the mapping happens internally.

In your example:

norm_pl = PowerLawNormSpectralModel()
norm_pl.prior = MultivariateGaussianPrior()

Would the MultivariateGaussianPrior be applied to all three parameters of PowerLawNormSpectralModel? Even the reference? How could one control this?

About the non-static evaluation of the priors:

  • I know that this deviates from the implementation outlined in the PIG. However, a static evaluation is no longer possible if the model parameters are passed to the multidimensional prior, as suggested in this PR.
  • Even if the linking happens internally, priors depending on the values of multiple parameters cannot be evaluated statically.
  • I think it is easier for the user if priors sitting on single parameters and priors sitting on multiple parameters/models are handled consistently. Therefore, I also removed the static evaluation from the GaussianPrior and UniformPrior.
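A small sketch of the contrast, with hypothetical names (ParameterStub, CoupledIndexPrior) and an illustrative difference penalty: a 1D prior could stay static because it only needs the scalar inputs, while a prior coupling two parameters must read instance state:

```python
class ParameterStub:
    """Stand-in for gammapy's Parameter (just holds a value) -- hypothetical."""

    def __init__(self, value):
        self.value = value


class GaussianPrior1D:
    """A 1D prior *can* stay static: it depends only on the passed-in scalars."""

    @staticmethod
    def evaluate(value, mu, sigma):
        return ((value - mu) / sigma) ** 2


class CoupledIndexPrior:
    """A prior coupling two parameters cannot be static: evaluate()
    must read the current values stored on the instance."""

    def __init__(self, par_a, par_b, sigma):
        self.modelparameters = [par_a, par_b]  # linked at construction
        self.sigma = sigma

    def evaluate(self):
        # penalize the *difference* of the two indices: instance state needed
        a, b = (p.value for p in self.modelparameters)
        return ((a - b) / self.sigma) ** 2
```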

@adonath (Member)

adonath commented Nov 16, 2023

Your first example with the coupled indices is an interesting use case that will help us! I think we are going in the exact same direction, with some small differences. Your first use case, I think, should look like:

models = Models([model1, model2])

# this creates a full covariance matrix internally
prior = MultiVariateGaussianPrior.from_subcovariance(
    pars=[model1.spectral_model.index, model2.spectral_model.index],
    subcovar=C_sub
)

models.prior = prior


# but this should also work:
prior = MultiVariateGaussianPrior(
    parameters=None,
    covariance=C_full
)

# just based on the fact that the number of parameters in models is the
# same size as the full covariance matrix, one can assign it internally
models.prior = prior

But yes, internally the MultiVariateGaussianPrior needs to store a covariance matrix and a list of parameters.

For this use case we can probably just re-use the existing Covariance class.
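A sketch of what the hypothetical from_subcovariance constructor might do internally: embed the sub-covariance for the selected parameter indices into a full matrix that is identity elsewhere. The function name and behaviour are assumptions, not existing gammapy API:

```python
import numpy as np


def expand_subcovariance(n_parameters, indices, subcovar):
    """Embed a sub-covariance for the parameters at ``indices`` into a
    full n x n covariance matrix; unconstrained parameters keep unit
    variance and zero correlation (an illustrative choice)."""
    covar = np.eye(n_parameters)
    idx = np.asarray(indices)
    # write the sub-block into the rows/columns of the selected parameters
    covar[np.ix_(idx, idx)] = np.asarray(subcovar, dtype=float)
    return covar
```

For example, coupling the indices of the second and fourth parameter out of four would place the 2x2 sub-block at positions (1, 1), (1, 3), (3, 1), (3, 3).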

@adonath (Member)

adonath commented Nov 16, 2023

This intrinsically sets .prior on the two Parameter instances.

This is the part I'm less sure about. This would require passing information in both directions: up and down the hierarchy. I implemented this for the covariance handling of models, and it is rather complex, with not such a huge benefit. Or do you propose to only pass the information down?

As priors are always set explicitly by the user, I think it is not important that we always synchronize the information:

norm_pl = PowerLawNormSpectralModel()
norm_pl.prior = MultivariateGaussianPrior(covariance=np.eye(len(norm_pl.norms)))

assert norm_pl.norms[0].prior is None

I think it is fine, when the prior is set on a higher level to account for correlations, not to represent this on a lower level.

@registerrier registerrier modified the milestones: 1.2, 1.3 Jan 24, 2024
@registerrier (Contributor)

registerrier commented May 16, 2024

Any news here @katrinstreil ?
It would be nice to have multi-dimensional priors and spectral deconvolution in for 1.3.

3 participants