Probability function returned NaN #415

omaarattia · 2021-12-13T09:06:58Z

Good morning,

I have an issue on emcee. The EnsembleSampler crashes after a few iterations and raises a ValueError: Probability function returned NaN. The thing is my user-defined probability function cannot return NaNs, as it defined something like this:

def my_probability_function(x):
    prob = logP(*x)
    if prob > 0 or np.isnan(prob): return -np.inf
    return prob

where logP is basically an evaluation of a beforehand-constructed RBF interpolator.

I call the EnsembleSampler like this:

import emcee
import numpy as np
from pathos.pools import ProcessPool

NW = 64 #Number of workers
ND = 7 #Number of dimensions
MCMC_PATH = 'path/to/backend.h5'
backend = emcee.backends.HDFBackend(MCMC_PATH)
backend.reset(NW, ND)
with ProcessPool(nodes=8) as pool:
    pool.restart()
    sampler = emcee.EnsembleSampler(NW, ND, my_probability_function, pool=pool, backend=backend)
    sampler.run_mcmc(p0, 5000, progress=True)

Changing the logP function to e.g. a nearest interpolator solves the problem, so I guess the issue comes from a bad interaction between emcee and scipy's RBF interpolator?

I tried advancing the MCMC iteration by iteration and comparing at each step the manual and emcee-stored evaluation of the probability function:

sampler.run_mcmc(pp, 1, skip_initial_state_check=True)
pp = sampler.get_chain(flat=True)[-NW:]

logP_manual = [my_probability_function(p) for p in pp]
logP_emcee, _ = sampler.compute_log_prob(pp)

diff = np.abs(1. - logP_manual/logP_emcee)

and at some point, diff was not identically zero, meaning that my_probability_function sometimes returns different results depending on if one evaluates it outside or inside emcee!

Can you help me please? Unfortunately, I was not able to replicate the problem in a minimalist example, as it only emerges when the RBF interpolator is constructed on my personal 2M data points in 7 dimensions.

Many thanks in advance.

The text was updated successfully, but these errors were encountered:

dfm · 2021-12-13T13:10:40Z

I'm not too sure what to say here. There isn't any real reason why emcee would cause different behavior with your interpolator - it just directly calls the function you provide:

emcee/src/emcee/ensemble.py

Line 489 in afba7a5

results = list(map_func(self.log_prob_fn, p))

Some thoughts:

There isn't any strong reason why a probability density needs to be less than 1, so you probably don't want the prob > 0 check in your function.
The only processing that emcee does with the results of your function call is to pass it though a Python float, so perhaps there's some numerical instability there, but if there is that suggests that there's probably something else wrong with your function. You might want to debug what's happening there?
Do you find the same behavior if you don't use multiprocessing? Perhaps there's an issue with multiprocessing and your interpolator.

omaarattia · 2021-12-13T13:49:12Z

Thank you very much for your answer. I'll try your suggestions and let you know.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Probability function returned NaN #415

Probability function returned NaN #415

omaarattia commented Dec 13, 2021 •

edited

dfm commented Dec 13, 2021

omaarattia commented Dec 13, 2021

Probability function returned NaN #415

Probability function returned NaN #415

Comments

omaarattia commented Dec 13, 2021 • edited

dfm commented Dec 13, 2021

omaarattia commented Dec 13, 2021

omaarattia commented Dec 13, 2021 •

edited