Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add different options to compute stat_array on FluxPointsDatasets #5135

Merged
merged 25 commits into from May 17, 2024

Conversation

QRemy
Copy link
Contributor

@QRemy QRemy commented Feb 29, 2024

Add different options to compute stat_array on FluxPointsDatasets :

  • chi2 : etimate from chi2 statistics.
  • profile : estimate from interpolation of the likelihood profile.
  • distrib : estimate from probability distributions,
    assumes that flux points correspond to asymmetric gaussians and upper limits complemantary error functions.

Default is chi2, in that case upper limits are ignored and the mean of asymetrics error is used.
The distrib case provides an approximation if the profile is not available
which allows to take into accounts upper limits and asymetrics errors.

Updated /tutorials/analysis-1d/spectral_analysis.py to present the different cases:
image
Without ul the results are the same:
image

@QRemy QRemy added the feature label Feb 29, 2024
@QRemy QRemy added this to the 1.3 milestone Mar 2, 2024
@QRemy QRemy marked this pull request as ready for review March 2, 2024 10:59
Copy link

codecov bot commented Mar 2, 2024

Codecov Report

Attention: Patch coverage is 97.36842% with 3 lines in your changes are missing coverage. Please review.

Project coverage is 94.46%. Comparing base (a048535) to head (a55f23c).
Report is 79 commits behind head on main.

Files Patch % Lines
gammapy/datasets/flux_points.py 97.46% 2 Missing ⚠️
gammapy/estimators/points/tests/test_lightcurve.py 95.45% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5135      +/-   ##
==========================================
+ Coverage   94.25%   94.46%   +0.20%     
==========================================
  Files         234      235       +1     
  Lines       35226    35493     +267     
==========================================
+ Hits        33204    33528     +324     
+ Misses       2022     1965      -57     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@AtreyeeS AtreyeeS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @QRemy ! The profile is an elegant solution for considering upper limits during the fit!

I don't understand the implementation of the distrib option. Maybe you can add some information in gammapy.stat docs page?

Otherwise, only minor comments inline

examples/tutorials/analysis-1d/spectral_analysis.py Outdated Show resolved Hide resolved
examples/tutorials/analysis-1d/spectral_analysis.py Outdated Show resolved Hide resolved
examples/tutorials/analysis-1d/spectral_analysis.py Outdated Show resolved Hide resolved
gammapy/datasets/flux_points.py Outdated Show resolved Hide resolved
gammapy/datasets/flux_points.py Outdated Show resolved Hide resolved
gammapy/datasets/flux_points.py Outdated Show resolved Hide resolved
gammapy/datasets/flux_points.py Show resolved Hide resolved
gammapy/datasets/flux_points.py Outdated Show resolved Hide resolved
gammapy/datasets/flux_points.py Outdated Show resolved Hide resolved
gammapy/datasets/flux_points.py Outdated Show resolved Hide resolved
gammapy/datasets/flux_points.py Outdated Show resolved Hide resolved
gammapy/datasets/flux_points.py Outdated Show resolved Hide resolved
@QRemy
Copy link
Contributor Author

QRemy commented Mar 6, 2024

Thanks @AtreyeeS I implemented most of the comments.

I don't understand the implementation of the distrib option. Maybe you can add some information in gammapy.stat docs page?

Assuming gaussian errors the likelihood is given by the probability density function of the normal distribution.
For the upper limit case it is necessary to marginalize over the unknown measurement so we integrate the normal distribution up to the upper limit value which gives the complementary error function.

Screenshot 2024-03-06 at 17 03 23
see eq. C7 from https://iopscience.iop.org/article/10.1088/0004-637X/773/2/168/pdf
(in eq C6 they are doing something more complicated as they also marginalised over the unknown true value)

@adonath
Copy link
Member

adonath commented Mar 6, 2024

Thanks @QRemy, the statistical approach to handle the ULs looks reasonable to me. I was wonderign whether we can maybe find a more descriptive name. I think even "chi2-marginalize-ul" would be better. We should definitely give the paper reference in addition.

AtreyeeS
AtreyeeS previously approved these changes Mar 6, 2024
Copy link
Member

@AtreyeeS AtreyeeS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @QRemy ! Elegant indeed.
Maybe you can add a little info the documentation (what you wrote in the comment, and the paper reference).
And add stat_type option in the example here

Copy link
Member

@AtreyeeS AtreyeeS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The distrib option is often returning a nan for some bins, leading to the total stat becoming nan and a failure of the minimisation. Any idea why?

gammapy/datasets/flux_points.py Show resolved Hide resolved
gammapy/datasets/flux_points.py Show resolved Hide resolved
gammapy/datasets/tests/test_flux_points.py Show resolved Hide resolved
gammapy/datasets/flux_points.py Outdated Show resolved Hide resolved
Copy link
Contributor

@registerrier registerrier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @QRemy ! I have left some comments inline.

My main comment is that you could compute most of the necessary arrays in advance. I suppose that for large SEDs this might be much more efficient.

examples/tutorials/analysis-1d/spectral_analysis.py Outdated Show resolved Hide resolved
examples/tutorials/analysis-1d/spectral_analysis.py Outdated Show resolved Hide resolved
examples/tutorials/analysis-1d/spectral_analysis.py Outdated Show resolved Hide resolved
gammapy/datasets/flux_points.py Outdated Show resolved Hide resolved
gammapy/datasets/flux_points.py Outdated Show resolved Hide resolved
gammapy/datasets/flux_points.py Outdated Show resolved Hide resolved
gammapy/datasets/flux_points.py Outdated Show resolved Hide resolved
gammapy/datasets/flux_points.py Show resolved Hide resolved
gammapy/datasets/flux_points.py Show resolved Hide resolved
gammapy/datasets/flux_points.py Outdated Show resolved Hide resolved
@QRemy QRemy added this to In progress in gammapy.estimators via automation Mar 11, 2024
Copy link
Member

@bkhelifi bkhelifi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @QRemy.. I have some questions for you.

gammapy/utils/interpolation.py Show resolved Hide resolved
gammapy/datasets/tests/test_flux_points.py Show resolved Hide resolved
bkhelifi
bkhelifi previously approved these changes May 6, 2024
Copy link
Member

@bkhelifi bkhelifi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @QRemy . All good for me;)

registerrier
registerrier previously approved these changes May 16, 2024
Copy link
Contributor

@registerrier registerrier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @QRemy . This looks good to me.

I have left a simple comment regarding not providing the dictionary of stat functions but simply their keys.

Otherwise, this would be much cleaner with a stat functions being defined as external objects. This can be a further PR though.

Signed-off-by: Quentin Remy <quentin.remy@mpi-hd.mpg.de>
QRemy and others added 23 commits May 17, 2024 11:07
Signed-off-by: Quentin Remy <quentin.remy@mpi-hd.mpg.de>
Signed-off-by: Quentin Remy <quentin.remy@mpi-hd.mpg.de>
Signed-off-by: Quentin Remy <quentin.remy@mpi-hd.mpg.de>
Signed-off-by: Quentin Remy <quentin.remy@mpi-hd.mpg.de>
Signed-off-by: Quentin Remy <quentin.remy@mpi-hd.mpg.de>
Signed-off-by: Quentin Remy <quentin.remy@mpi-hd.mpg.de>
Signed-off-by: Quentin Remy <quentin.remy@mpi-hd.mpg.de>
Co-authored-by: Atreyee Sinha <asinha@ucm.es>
Signed-off-by: Quentin Remy <quentin.remy@mpi-hd.mpg.de>
Signed-off-by: Quentin Remy <quentin.remy@mpi-hd.mpg.de>
Signed-off-by: Quentin Remy <quentin.remy@mpi-hd.mpg.de>
Signed-off-by: Quentin Remy <quentin.remy@mpi-hd.mpg.de>
Signed-off-by: Quentin Remy <quentin.remy@mpi-hd.mpg.de>
Co-authored-by: Régis Terrier <regis.terrier@m4x.org>
Signed-off-by: Quentin Remy <quentin.remy@mpi-hd.mpg.de>
Signed-off-by: Quentin Remy <quentin.remy@mpi-hd.mpg.de>
Signed-off-by: Quentin Remy <quentin.remy@mpi-hd.mpg.de>
Signed-off-by: Quentin Remy <quentin.remy@mpi-hd.mpg.de>
Co-authored-by: Bruno Khélifi <khelifi@in2p3.fr>
Signed-off-by: Quentin Remy <quentin.remy@mpi-hd.mpg.de>
Signed-off-by: Quentin Remy <quentin.remy@mpi-hd.mpg.de>
Signed-off-by: Quentin Remy <quentin.remy@mpi-hd.mpg.de>
@QRemy QRemy merged commit 813ec3e into gammapy:main May 17, 2024
16 checks passed
gammapy.estimators automation moved this from In progress to Done May 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Development

Successfully merging this pull request may close these issues.

None yet

5 participants