Implement FitStatistics / Priors ideas. #4237
Conversation
Thanks @nbiederbeck. I have left inline comments. Maybe adding docstrings would help. My main comment is that `FitStatistic` should be independent of the `Dataset` structure and only expect array-like inputs.
```diff
@@ -163,6 +163,9 @@ def parameters(self):
             [getattr(self, name) for name in self.default_parameters.names]
         )

+    def stat_sum(self):
```
It seems to me that `stat_sum` should rather be on the `Models` or `Parameters` object. That is where we have the global information on the models used. Is there a use case for computing the priors associated with a single model?
Note that priors might inherit from `ModelBase`. In that case, the `evaluate` or `__call__` methods would be used to compute the associated statistic.
I think you are right. To me it seems that a single model is a subset of `Models` (e.g. `len(Models) == 1`), right? Then this would surely be better placed there. Please forgive my ignorance about the Gammapy internals and where to put this best.
I think we should figure out a clean API here. I find `Models.stat_sum()` a bit confusing as well. Long-term we would like a more separated API between `MapDataset`, `Models`, and `Fit`. I am fairly sure we will evolve in a direction where models are passed explicitly, e.g. `MapDataset.stat_sum(models=models)`. So I would propose to evolve towards implementing e.g. a `PriorFitStatistic` class, which takes the `Models` object and evaluates the priors, like `PriorFitStatistic(beta=1.).stat_sum(models=models)`. For now it seems to me this `PriorFitStatistic` will only need to be supported for `Datasets`.
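A minimal sketch of this proposed API (everything here is an assumption for illustration — `PriorFitStatistic`, the meaning of `beta`, and the duck-typed `models` object exposing `.parameters` are not existing Gammapy API):

```python
class PriorFitStatistic:
    """Hypothetical statistic that sums the prior terms of all model parameters."""

    def __init__(self, beta=1.0):
        # beta weights the prior contribution relative to the data statistic
        self.beta = beta

    def stat_sum(self, models):
        # duck-typed: models only needs to expose .parameters,
        # where each parameter may carry a .prior with a stat_sum method
        return self.beta * sum(
            par.prior.stat_sum(par.value)
            for par in models.parameters
            if par.prior is not None
        )
```

The statistic never touches a `Dataset`; it only needs the parameter values and their priors, which fits the "array-like inputs only" principle discussed above.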
gammapy/modeling/parameter.py (Outdated)

```diff
@@ -145,6 +147,12 @@ def __init__(
         self.interp = interp
         self.scale_method = scale_method

+    def stat_sum(self):
```
This does very little in practice. It is probably not needed. Everything could be calculated in the global `stat_sum`, where the loop over parameters is performed.
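The global loop meant here could look like the following sketch (the `par.prior.stat_sum(value)` signature is a hypothetical stand-in, not the PR's actual code):

```python
def priors_stat_sum(parameters):
    """Sum the prior statistic of every parameter that defines a prior.

    With this loop, a per-parameter Parameter.stat_sum method is not
    needed: each prior is evaluated directly on the parameter value.
    """
    total = 0.0
    for par in parameters:
        if par.prior is not None:
            total += par.prior.stat_sum(par.value)
    return total
```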
gammapy/datasets/map.py (Outdated)

```python
class FitStatistic:
    """Calculate -2 * log(L)."""

    def stat_sum(self, dataset):
```
I think it does not make sense to bind `FitStatistic` to the `Dataset` classes. They should only expect array-like input (see `CountsStatistic`).
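For comparison, a statistic working on plain arrays could be as small as the following sketch of a Cash-like statistic (the class name `CashFitStatistic` and the exact truncation conventions are assumptions; Gammapy's real implementation lives in `gammapy.stats`):

```python
import numpy as np

class CashFitStatistic:
    """Cash fit statistic, -2 log(L) for Poisson data, on plain arrays."""

    @staticmethod
    def stat_array(counts, npred):
        # per-bin contribution: 2 * (npred - counts * log(npred))
        counts = np.asarray(counts, dtype=float)
        npred = np.asarray(npred, dtype=float)
        return 2.0 * (npred - counts * np.log(npred))

    def stat_sum(self, counts, npred):
        return self.stat_array(counts, npred).sum()
```

Because the inputs are arrays, the same class works for any dataset that can hand over `counts` and `npred` data, without the statistic knowing about the `Dataset` structure.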
Yes, you are totally right about that!
```diff
@@ -180,6 +200,7 @@ class MapDataset(Dataset):
     psf = LazyFitsData(cache=True)
     mask_fit = LazyFitsData(cache=True)
     mask_safe = LazyFitsData(cache=True)
+    fit_statistic = CashFitStatistic()
```
This will likely require a setter method, because we will have to check that a `FitStatistic` is applicable to a `Dataset`.
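A hedged sketch of what such a setter could look like (the `tag` attribute used for the compatibility check, and the `_allowed_stats` whitelist, are assumptions for illustration, not Gammapy API):

```python
class MapDataset:
    # hypothetical whitelist of statistics applicable to this dataset type
    _allowed_stats = ("cash", "wstat")

    @property
    def fit_statistic(self):
        return self._fit_statistic

    @fit_statistic.setter
    def fit_statistic(self, value):
        # reject statistics that are not applicable to this dataset
        if getattr(value, "tag", None) not in self._allowed_stats:
            raise ValueError(
                f"Fit statistic {value!r} is not applicable "
                f"to {type(self).__name__}"
            )
        self._fit_statistic = value
```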
```python
if par.prior is not None:
    args = {}
    # this is almost duplicated code from gammapy.dataset.map
    for key in signature(par.prior.stat_sum).parameters.keys():
```
I think currently this might evaluate all priors for all parameters. This would of course have to be fixed properly. I was thinking about

```python
parameters.x
# but need to do
parameters['x'].value
```

since each parameter has the attribute `.value`.
```diff
@@ -319,3 +322,41 @@ def test_stat_contour():

     # Check that original value state wasn't changed
     assert_allclose(dataset.models.parameters["y"].value, 300)


+def test_with_prior():
```
This test is certainly in the wrong place, but I added it to show how setting and evaluating priors can work. While doing this, I was thinking about how to implement #3955. I need to investigate this further, but I assume it is going in a good direction. Please let me know what you think. Thank you.
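For reference, the shape of such a test could be sketched as follows (the `GaussianPrior` class is hypothetical, mirroring the idea rather than reproducing the PR's actual test code):

```python
import numpy as np

class GaussianPrior:
    """Hypothetical Gaussian prior returning -2 log(prior) up to a constant."""

    def __init__(self, mu, sigma):
        self.mu = mu
        self.sigma = sigma

    def stat_sum(self, value):
        # ((x - mu) / sigma)^2 is -2 log of a Gaussian, dropping constants
        return ((value - self.mu) / self.sigma) ** 2

def test_with_prior():
    prior = GaussianPrior(mu=1.0, sigma=0.5)
    # a value one sigma away from the prior mean contributes 1.0
    assert np.isclose(prior.stat_sum(1.5), 1.0)
    # the prior mean itself contributes nothing
    assert prior.stat_sum(1.0) == 0.0
```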
Codecov Report

```
@@           Coverage Diff           @@
##            main    #4237    +/-  ##
=======================================
- Coverage  94.83%   94.71%   -0.12%
=======================================
  Files        216      216
  Lines      30501    30597      +96
=======================================
+ Hits       28925    28980      +55
- Misses      1576     1617      +41
```
Force-pushed the branch from e5d7932 to 6d23f1c.
Force-pushed the branch from 6d23f1c to ddb263a.
```diff
@@ -1080,13 +1179,18 @@ def plot_residuals(
         return ax_spatial, ax_spectral

     def stat_sum(self):
-        """Total statistic function value given the current model parameters."""
+        """Total likelihood given the current model parameters."""
         counts, npred = self.counts.data.astype(float), self.npred().data
```
Is it really the likelihood or -2log(L)?
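The distinction matters for the docstring: the quantity summed here is the fit statistic -2 log(L), not the likelihood itself. A plain-numpy illustration of the difference (not Gammapy code; the constant log(counts!) term is dropped, as is conventional):

```python
import numpy as np

def poisson_log_likelihood(counts, npred):
    """log L for Poisson data, up to the constant -log(counts!) term."""
    counts = np.asarray(counts, dtype=float)
    npred = np.asarray(npred, dtype=float)
    return np.sum(counts * np.log(npred) - npred)

def fit_statistic(counts, npred):
    # the quantity minimized in the fit is -2 log(L), so smaller is better
    return -2.0 * poisson_log_likelihood(counts, npred)
```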
Force-pushed the branch from ddb263a to 6da8280.
This is a very preliminary proposal for implementing (1) more flexible FitStatistics and (2) priors on Parameters. With this implementation, both can have the same base class, are equally serializable, can have non-breaking defaults, and can be provided by users.

Signed-off-by: Noah Biederbeck <noah.biederbeck@tu-dortmund.de>
Force-pushed the branch from 6da8280 to 1032771.
@nbiederbeck, can you update the code of your branch from main, so that the tests pass? Thanks.
Hi @bkhelifi, thanks for reaching out. My plate is full this month; I can start looking into this in November. If someone steps in earlier and takes over, I'm happy to assist.
This is a very preliminary proposal for implementing (1) more flexible FitStatistics, and (2) priors on Parameters.
With this implementation, both can have the same base class, are equally serializable, can have non-breaking defaults, and can be provided by users.
Dear reviewer, certainly I am missing something obvious. I open this PR so we can look at the same code at the same time. Please see also my slides from the coding sprint and the earlier notebook from Axel.