feat: local c2st metric #1109

JuliaLinhart · 2024-03-22T21:07:45Z

What does this implement/fix? Explain your changes

L-C2ST(-NF) diagnostic: class, tutorial and tests.

Does this close any currently open issues?

Fixes #1005

Any relevant code examples, logs, error output, etc?

...

Any other comments?

...

Checklist

Put an x in the boxes that apply. You can also fill these out after creating
the PR. If you're unsure about any of them, don't hesitate to ask. We're here to
help! This is simply a reminder of what we are going to look for before merging
your code.

[x] I have read and understood the contribution guidelines
[x] I agree with re-licensing my contribution from AGPLv3 to Apache-2.0.
[x] I have commented my code, particularly in hard-to-understand areas
[x] I have added tests that prove my fix is effective or that my feature works
[x] I have reported how long the new tests run and potentially marked them with pytest.mark.slow.
[x] New and existing unit tests pass locally with my changes
[x] I performed linting and formatting as described in the contribution guidelines
I rebased on main (or there are no conflicts with main)

sbi/analysis/plot.py

sbi/simulators/gaussian_mixture.py

tests/lc2st_test.py

JuliaLinhart · 2024-03-23T08:48:25Z

Thanks for this review! I'll do the changes, review the doc and fix typing issues.

JuliaLinhart · 2024-03-23T14:01:31Z

Almost all the suggestions by @agramfort have been addressed, except:

pandas is still used for the .groupby() method in the marginal_plot_with_proba_intensity function from sbi/analysis/plot.py.
the added simulator has no corresponding pytest script.

Future additional features could include:

a function that regroups the marginal_plot_with_proba_intensity plots into a single pairplot.
a generic HypothesisTest class to centralize sbc, lc2st and other diagnostics relying on hypothesis testing.

janfb

Awesome, thanks a lot for adding this!
Looks great already. I added a couple of comments and questions. I think this needs some renaming here and there to match our variable naming conventions and PEP 8.

sbi/analysis/plot.py

sbi/analysis/test_utils.py

sbi/diagnostics/lc2st.py

sbi/simulators/gaussian_mixture.py

tests/lc2st_test.py

codecov · 2024-04-08T13:45:26Z

Codecov Report

Attention: Patch coverage is 65.87537% with 115 lines in your changes are missing coverage. Please review.

Project coverage is 83.04%. Comparing base (9a8c7c0) to head (7e0bc11).
Report is 1 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1109      +/-   ##
==========================================
- Coverage   83.93%   83.04%   -0.90%     
==========================================
  Files          90       92       +2     
  Lines        6930     7272     +342     
==========================================
+ Hits         5817     6039     +222     
- Misses       1113     1233     +120

| Flag | Coverage Δ |
|---|---|
| unittests | `83.04% <65.87%> (-0.90%)` ⬇️ |

Flags with carried forward coverage won't be shown.

| Files | Coverage Δ |
|---|---|
| sbi/analysis/__init__.py | `100.00% <ø> (ø)` |
| sbi/diagnostics/lc2st.py | `96.03% <96.03%> (ø)` |
| sbi/utils/analysis_utils.py | `52.94% <20.00%> (-47.06%)` ⬇️ |
| sbi/simulators/gaussian_mixture.py | `44.44% <44.44%> (ø)` |
| sbi/analysis/plot.py | `61.32% <10.12%> (-5.92%)` ⬇️ |

... and 1 file with indirect coverage changes

JuliaLinhart · 2024-04-08T16:21:57Z

I think I have addressed all your comments and requests @janfb, except the one where I should get rid of the groupby method from pandas. I will try to fix this as soon as I can.

JuliaLinhart · 2024-04-09T12:28:06Z

All done @janfb

janfb

Thanks for the updates!

I added a couple of additional comments, but overall I think we are close to convergence 🙂

I had a look at the tutorial as well. It reads very well! I think it's great to have a comprehensive introduction to the method so that users know how to use it and how to interpret the results. Just one comment: at the moment you are generating the different plots but you are not explaining them. I think it would be essential to add an explanation and interpretation to each diagnostic plot.

Thanks for the effort!

sbi/analysis/test_utils.py

sbi/diagnostics/lc2st.py

sbi/simulators/gaussian_mixture.py

tests/lc2st_test.py

tutorials/17_diagnostics_lc2st.ipynb

psteinb · 2024-04-15T15:14:07Z

I am happy to help review this PR. But given the activity that is already visible, I'd push this effort to a later stage. Feel free to ping me if my help is needed.

JuliaLinhart · 2024-04-22T10:49:46Z

Response to the review from @janfb: the above commit addresses the following requests

rename the tutorial to 18_..., and add descriptions of the plots and results
move the content of sbi/analysis/test_utils.py to sbi/utils/analysis_utils.py
description of the p-value computation in lc2st.py
explicit names and descriptions for the tests in lc2st_test.py, and code adapted to test the true positive and true negative rates of the hypothesis test. Runtime is longer with 100 test runs (7 min), but otherwise the "rate" is not a trustworthy empirical result, in my opinion.

Only remaining question: do you want to make the theta_o generation in the LC2ST_NF test the user's responsibility?
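
The p-value computation and the true positive / true negative rates mentioned in the list above can be illustrated with a small, self-contained sketch. This is not the code in lc2st.py: the function names `empirical_p_value` and `rejection_rate` are hypothetical, the numbers are toy values, and the sketch only assumes the common convention that the test statistic grows when the null hypothesis is violated, so that the p-value is the fraction of null statistics at least as large as the observed one.

```python
def empirical_p_value(t_observed, t_null):
    """Fraction of null-hypothesis statistics >= the observed statistic."""
    return sum(t >= t_observed for t in t_null) / len(t_null)


def rejection_rate(p_values, alpha=0.05):
    """Fraction of test runs in which the null hypothesis is rejected."""
    return sum(p < alpha for p in p_values) / len(p_values)


# Toy numbers, purely illustrative (not results from this PR).
t_null = [0.1, 0.2, 0.3, 0.4]
p_value = empirical_p_value(0.3, t_null)  # 0.3 and 0.4 are >= 0.3

# p-values from repeated runs where the null is false: the rejection rate
# estimates the true positive rate (power) of the test.
p_values_null_false = [0.01, 0.2, 0.04, 0.5]
true_positive_rate = rejection_rate(p_values_null_false, alpha=0.05)

# p-values from repeated runs where the null is true: one minus the
# rejection rate estimates the true negative rate.
p_values_null_true = [0.3, 0.8, 0.02, 0.6]
true_negative_rate = 1.0 - rejection_rate(p_values_null_true, alpha=0.05)
```

With only 10 runs, each estimated rate can only move in steps of 0.1, which is one way to see why a larger number of runs gives a more trustworthy empirical rate.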

janfb

Thanks for the updates - looks good!

One more thing, the theta_o generation in the LC2ST_NF needs clarification.

Also, 8 tests are failing at the moment.

sbi/diagnostics/lc2st.py

tests/lc2st_test.py

janfb · 2024-04-22T11:51:23Z

> I am happy to help review this PR. But given the activity that is already visible, I'd push this effort to a later stage. Feel free to ping me if my help is needed.

That's great @psteinb ! Do you have capacity to review the tutorial? That'd be great 🙏

JuliaLinhart · 2024-04-22T12:16:49Z

Oh, I don't know why the tests fail... they pass when I run them locally!
Any idea why that would be the case?

JuliaLinhart · 2024-04-23T07:43:17Z

I think the npe.sample method (where npe is a DensityEstimator object) is different for the tests vs. on the branch I am working on. It seems to be the handling of the context variable, but I am not sure because I can't verify anything locally

michaeldeistler · 2024-04-23T07:44:59Z

just rebase on main

git checkout main
git pull
git checkout 1005-implement-l-c2st-metric
git rebase main
git push -f

JuliaLinhart · 2024-04-23T10:13:27Z

Oh right.. Sorry! So I fixed it, but had to reshape a lot...

JuliaLinhart · 2024-04-29T16:57:26Z

I did some experiments and I was wondering how you chose the parameters for the MLPClassifier from sklearn corresponding to the default classifier=mlp in c2st and the LC2ST class. Especially early_stopping seems to be a limitation in some application examples... (and also, to a lesser extent, the regularization parameter alpha)

@psteinb you worked on that right?

For me, the MLPClassifier from sklearn with default parameters except alpha=0 and max_iter=25000 yields pretty stable results, but is prone to overfitting. I therefore suggest that, if we stick with your mlp, we do ensembling with different seeds (over 5 models by default). It yields more stable results with smaller confidence regions, but is slower (and the small confidence regions can lead to high rejection rates). This is my last commit and I added tests.

Let me know what you think :)
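
The seed-ensembling idea described above can be sketched as follows. This is a minimal illustration, not the implementation added in this PR: the helpers `fit_seed_ensemble` and `ensemble_predict_proba` are hypothetical names, the data are toy Gaussian samples, and the only settings taken from the comment are `alpha=0`, `max_iter=25000` and an ensemble of 5 models that differ by their `random_state`.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier


def fit_seed_ensemble(X, y, n_ensemble=5, **mlp_kwargs):
    """Fit n_ensemble MLPs that differ only in their random_state."""
    return [
        MLPClassifier(random_state=seed, **mlp_kwargs).fit(X, y)
        for seed in range(n_ensemble)
    ]


def ensemble_predict_proba(classifiers, X):
    """Average the class probabilities predicted by all ensemble members."""
    return np.mean([clf.predict_proba(X) for clf in classifiers], axis=0)


# Toy two-class problem: two overlapping 2D Gaussians (illustrative only).
rng = np.random.default_rng(0)
X = np.concatenate(
    [rng.normal(0.0, 1.0, (100, 2)), rng.normal(0.5, 1.0, (100, 2))]
)
y = np.concatenate([np.zeros(100), np.ones(100)])

classifiers = fit_seed_ensemble(X, y, n_ensemble=5, alpha=0.0, max_iter=25000)
probs = ensemble_predict_proba(classifiers, X)  # shape (n_samples, 2)
```

Training cost grows linearly with the number of ensemble members, which matches the "more stable but slower" trade-off noted in the comment.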

psteinb

This is an awesome PR. Thanks @JuliaLinhart and all reviewers so far. I felt very humble reviewing it as a lot of dedication, discipline and rigor went into it.

I focused on reviewing the tutorial. Note that I couldn't directly suggest edits in the last quarter of the notebook: for some reason, the GitHub web page always wanted to remove images whenever I made a code suggestion.

sbi/analysis/plot.py

tutorials/18_diagnostics_lc2st.ipynb

Co-authored-by: Peter Steinbach <p.steinbach@hzdr.de>

JuliaLinhart · 2024-05-06T11:31:03Z

There you go @psteinb :) Thanks a lot for your review!

janfb

Thanks for looking into the classifier performance and variance!

There is one open question from my previous review and one question regarding the ensemble training.

Thanks! 🙏

sbi/diagnostics/lc2st.py

psteinb

The tutorial looks good to me!

JuliaLinhart · 2024-05-16T09:53:43Z

Hello everyone. I propose in this last commit a solution to the ensembling / cross-val issue stated above.

I chose to create an EnsembleClassifier class whose predicted probabilities are the average prediction over all classifiers (which differ only by their random_state). Cross-val is just a training strategy that can also be performed on an ensemble classifier.

Here's something to think about: The cross-val scores are the test statistics obtained for each fold. If someone wishes to perform a test, i.e. compute p-values, with the cross-val strategy, the test statistic becomes the average statistic over all folds. This is different from ensembling, where the test statistic is computed on the average prediction.
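
The difference between the two orders of averaging can be made concrete with a toy calculation. The sketch below is only an illustration under stated assumptions: the statistic `mse_from_half` (mean squared distance of the predicted probabilities from 0.5) is assumed here for the example, the probabilities are made-up numbers, and both classifiers are evaluated at the same points to isolate the effect, whereas in real cross-validation each fold is scored on its own held-out data.

```python
def mse_from_half(probs):
    """Mean squared distance of predicted class-1 probabilities from 0.5."""
    return sum((p - 0.5) ** 2 for p in probs) / len(probs)


# Predicted probabilities of two classifiers at the same two evaluation points.
probs_a = [0.6, 0.4]
probs_b = [0.4, 0.6]

# Cross-val style: one statistic per classifier (fold), then average them.
mean_of_statistics = (mse_from_half(probs_a) + mse_from_half(probs_b)) / 2

# Ensemble style: average the predictions first, then compute one statistic.
mean_probs = [(a + b) / 2 for a, b in zip(probs_a, probs_b)]
statistic_of_mean = mse_from_half(mean_probs)
```

Here each classifier deviates from 0.5 by 0.1 at every point, so the averaged statistic is about 0.01, while the averaged prediction sits at 0.5 and its statistic is 0. Because the squared deviation is convex, the statistic of the averaged prediction can never exceed the average of the individual statistics.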

I also added a section ## Classifier choice and calibration data size: how to ensure meaningful test results in the tutorial if you want to check it out @psteinb .

Finally, I added a small description for the LC2ST_NF to be more explicit on what the theta_o are. @janfb let me know if that answers your questions.

janfb

Thanks @JuliaLinhart for these additional changes and for the commitment during this long PR! 👏
It all looks good to me now!

Thanks also @psteinb for your review on this.

Will be merged soon 🎉

JuliaLinhart · 2024-05-17T07:48:06Z

Thank you for your valuable comments and reviews !!

JuliaLinhart linked an issue Mar 22, 2024 that may be closed by this pull request

Implement l-c2st (local validation without reference samples) #1005

Closed

JuliaLinhart requested review from psteinb, janfb and michaeldeistler March 22, 2024 21:08

agramfort reviewed Mar 23, 2024

janfb mentioned this pull request Mar 25, 2024

Local sbc, issue #621 #628

Closed

janfb reviewed Mar 25, 2024

janfb reviewed Apr 12, 2024

janfb reviewed Apr 22, 2024

sbi/diagnostics/lc2st.py (outdated)

tests/lc2st_test.py (outdated)

JuliaLinhart added 10 commits April 23, 2024 11:47

lc2st class - first imp

bb32a93

null hypothesis

3c81d9c

start of notebook

51d4516

LC2ST class and notebook version 2

6d428f6

notebook with graphical diagnostics on GaussianMixture

568b69e

missing text in notebook and final fixes

db6a8c6

move gaussian_mixture model from utils to sbi.simulators

81d7cea

ruff fix

5ee8e50

ruff fix

0d6af83

typing fix and small doc changes

2510c66

JuliaLinhart added 2 commits April 23, 2024 11:52

10 --> 100 test runs

2843d57

negatif --> negativ

32b375a

JuliaLinhart force-pushed the 1005-implement-l-c2st-metric branch from 6e7f3f7 to 32b375a Compare April 23, 2024 09:52

rebase changes + pytest fix

f039b1c

ensembling

73ff7d5

JuliaLinhart and others added 2 commits April 29, 2024 19:02

Merge branch 'main' into 1005-implement-l-c2st-metric

2acbf35

ruff fix

0fa228f

psteinb requested changes May 2, 2024

JuliaLinhart and others added 2 commits May 6, 2024 09:13

add reference for pp-plot

4a2f808

Co-authored-by: Peter Steinbach <p.steinbach@hzdr.de>

tutorial changes and ruff

df36001

JuliaLinhart added 2 commits May 6, 2024 17:12

change the default n_ensemble back to 1, explain in tutorial and lc2st doc

947fc00

change the default n_ensemble back to 1, explain in tutorial and lc2st doc

a2c1527

janfb reviewed May 7, 2024

sbi/diagnostics/lc2st.py (outdated)

sbi/diagnostics/lc2st.py (outdated)

psteinb self-requested a review May 13, 2024 11:11

psteinb approved these changes May 13, 2024

ensembling, clf-choice in tutorial, lc2st-nf description in doc

c4ce834

pyright fix

7e0bc11

JuliaLinhart requested a review from janfb May 16, 2024 13:23

janfb approved these changes May 16, 2024

janfb self-assigned this May 16, 2024

janfb added the enhancement New feature or request label May 16, 2024

janfb changed the title ~~1005 implement l c2st metric~~ feat #1005: local c2st metric May 16, 2024

janfb changed the title ~~feat #1005: local c2st metric~~ feat: local c2st metric May 16, 2024

janfb merged commit 3c1e725 into main May 17, 2024
7 checks passed

janfb deleted the 1005-implement-l-c2st-metric branch May 17, 2024 10:35
