Handle NaNs in the logp in SMC #7293

astoeriko · 2024-04-29T13:34:53Z

Description

These changes handle the case where the logp is NaN in the SMC sampler.
Samples with a NaN logp are now assigned a logp of -inf, so they are simply discarded during the resampling step.

I know that this is not an ideal solution to the problem, but rather a pragmatic workaround. Maybe it would be wise to add a warning that there were NaN values?
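
As a rough illustration of the idea described above, the following NumPy sketch replaces NaN log-probabilities with -inf before computing normalized importance weights, so the affected particles get zero weight and are never picked when resampling. It is not the code from this PR; the function name `nan_safe_weights` and the optional warning are assumptions made for illustration only.

```python
import warnings

import numpy as np


def nan_safe_weights(logp, warn=True):
    """Hypothetical helper: turn per-particle logp values into resampling weights."""
    logp = np.asarray(logp, dtype=float)
    nan_mask = np.isnan(logp)
    if warn and nan_mask.any():
        warnings.warn(f"{nan_mask.sum()} particle(s) had a NaN logp and were given zero weight.")
    # Treat NaN as -inf so the particle cannot survive resampling.
    logp = np.where(nan_mask, -np.inf, logp)
    max_logp = logp.max()
    if max_logp == -np.inf:
        raise ValueError("Every particle has a logp of -inf or NaN.")
    # Subtract the maximum before exponentiating for numerical stability.
    weights = np.exp(logp - max_logp)
    return weights / weights.sum()


rng = np.random.default_rng(0)
weights = nan_safe_weights([-1.0, np.nan, -2.0, np.nan], warn=False)
# Resample particle indices in proportion to their weights.
resampled = rng.choice(len(weights), size=1000, p=weights)
```

With this approach the particles at indices 1 and 3 can never appear in `resampled`, which is the behaviour the description asks for. Whether a warning should be emitted is left open, as in the question above.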

Related Issue

Closes BUG: NaNs in logp mess up resampling step in SMC #7292
Related to #

Checklist

Checked that the pre-commit linting/style checks pass
Included tests that prove the fix is effective or that the new feature works
Added necessary documentation (docstrings and/or example notebooks)
If you are a pro: each commit corresponds to a relevant logical change

Type of change

📚 Documentation preview 📚: https://pymc--7293.org.readthedocs.build/en/7293/

welcome · 2024-04-29T13:34:57Z

💖 Thanks for opening this pull request! 💖 The PyMC community really appreciates your time and effort to contribute to the project. Please make sure you have read our Contributing Guidelines and filled in our pull request template to the best of your ability.

aseyboldt · 2024-04-29T15:54:20Z

Thank you for the PR!

It would be nice if we could come up with a test showing that this is in fact enough.
Maybe we can artificially introduce some NaN values into a model and sample it? For instance:

import numpy as np
import pymc as pm
import pytensor.tensor as pt

with pm.Model():
    x = pm.Normal("x")
    pm.Normal("y", mu=x, sigma=0.1, observed=1)
    # Return nan in 50% of the prior draws (whenever x < 0)
    pm.Potential("make_nan", pt.where(pt.geq(x, 0), 0, np.nan))
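
To see why this potential yields NaN for about half of the prior draws, here is a plain NumPy analogue. It is an illustrative sketch, not PyMC code, and the variable names are made up: `x_prior` is drawn from a standard normal, and every draw with `x_prior < 0` maps to NaN.

```python
import numpy as np

rng = np.random.default_rng(42)
x_prior = rng.normal(size=10_000)  # stands in for prior draws of x

# Same rule as the potential: 0 where x >= 0, NaN otherwise.
potential = np.where(x_prior >= 0, 0.0, np.nan)
nan_fraction = np.isnan(potential).mean()
```

Since the standard normal is symmetric around zero, `nan_fraction` comes out close to one half.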

astoeriko · 2024-04-29T16:16:08Z

Testing this with a simple example would be great! I was wondering if there is a way to artificially introduce NaN values in a model. So thanks for providing an example, this will help me verify with a simpler model if the problems I am seeing when sampling with SMC are indeed related to not handling NaN values.
I am still not very clear about what the test should test in the end. That we do not end up with a single sample per chain?

aseyboldt · 2024-04-29T16:59:51Z

I think checking that all samples are positive and the posterior variance is reasonable should be enough.
The true posterior standard deviation should be $\sqrt{(1 + 1/0.1^2)^{-1}} = \sqrt{1/101} \approx 0.1$, so maybe we just check that it's between 0.05 and 0.2 or so? We could also be more thorough and do a KS test against the true posterior, but I think for our purposes here that shouldn't be necessary.
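
For reference, the posterior standard deviation can be worked out from the conjugate normal update, ignoring the truncation at zero (which is negligible here because the posterior mass sits roughly ten standard deviations above zero). The sketch below computes it and shows what the suggested check could look like. `check_draws` is a hypothetical helper, and the draws are simulated from the analytic posterior rather than produced by SMC.

```python
import numpy as np

prior_sigma = 1.0  # x ~ Normal(0, 1)
obs_sigma = 0.1    # y ~ Normal(x, 0.1)
y_obs = 1.0

# Conjugate normal update: precisions add, and the prior mean is 0.
post_precision = 1 / prior_sigma**2 + 1 / obs_sigma**2
post_sd = np.sqrt(1 / post_precision)
post_mean = (y_obs / obs_sigma**2) / post_precision


def check_draws(draws, low=0.05, high=0.2):
    """Hypothetical test helper: all draws positive, spread in a sane range."""
    draws = np.asarray(draws)
    return bool(np.all(draws > 0) and low < draws.std() < high)


rng = np.random.default_rng(1)
simulated = rng.normal(post_mean, post_sd, size=2000)
ok = check_draws(simulated)
# A chain that collapsed onto a single repeated sample has zero spread.
collapsed = check_draws(np.full(2000, post_mean))
```

The lower bound on the standard deviation is what catches the failure mode mentioned earlier in the thread, where resampling ends up with a single sample per chain.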

Handle NaNs in the logp in SMC

2c08417
