
Probabilities containing NaNs when choosing n_samples too small #4

Open · stennebroek opened this issue Nov 10, 2020 · 5 comments
stennebroek commented Nov 10, 2020

I'm experiencing an error whose root cause I'm having trouble finding. Apparently some probabilities contain NaN values when the number of samples for the prior(s) (n_samples) is chosen too small; the chance of hitting this error increases as n_samples decreases. Please find the error below.

/home/sam/anaconda3/envs/semproj/lib/python3.8/site-packages/optbayesexpt/particlepdf.py:167: RuntimeWarning: invalid value encountered in true_divide
  self.particle_weights = temp / np.sum(temp)
Traceback (most recent call last):
  File "benchmark_protocol_2.8.py", line 468, in <module>
    unknown_res.fit("Unknown resonance", "Simple Bayesian", unknown_res.SimpleBayesian, 
  File "benchmark_protocol_2.8.py", line 163, in fit
    self.save_full_metrics(*protocol_function(sample, *exp_parameters))
  File "benchmark_protocol_2.8.py", line 391, in SimpleBayesian
    single_probe_freq = obe.opt_setting()
  File "/home/sam/anaconda3/envs/semproj/lib/python3.8/site-packages/optbayesexpt/obe_base.py", line 387, in opt_setting
    utility = self.utility()
  File "/home/sam/anaconda3/envs/semproj/lib/python3.8/site-packages/optbayesexpt/obe_base.py", line 364, in utility
    var_p = self.yvar_from_parameter_draws()
  File "/home/sam/anaconda3/envs/semproj/lib/python3.8/site-packages/optbayesexpt/obe_base.py", line 299, in yvar_from_parameter_draws
    paramsets = self.randdraw(self.N_DRAWS).T
  File "/home/sam/anaconda3/envs/semproj/lib/python3.8/site-packages/optbayesexpt/particlepdf.py", line 250, in randdraw
    indices = self.rng.choice(self.n_particles, size=n_draws,\
  File "_generator.pyx", line 644, in numpy.random._generator.Generator.choice
ValueError: probabilities contain NaN

Any help is much appreciated. Thank you in advance.

rmcmichael-nist (Collaborator) commented

One workaround for bad weight values is to implement the OptBayesExpt.enforce_parameter_constraints() method in a child class of OptBayesExpt() to ensure that your likelihood function can't generate NaNs. There are examples in demos/sweeper/obe_sweeper.py, and in demos/lockin/lockin_of_coil.py.
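
In outline, a minimal sketch patterned on those demos (the attribute names self.parameters and self.particle_weights are taken from the demos, so check them against your installed version; the constrained parameter index here is purely illustrative):

import numpy as np
from optbayesexpt import OptBayesExpt

class ConstrainedOBE(OptBayesExpt):

    def enforce_parameter_constraints(self):
        # Zero the weight of any particle whose parameter 0 has been
        # nudged negative, so it can never be drawn again.
        bad_ones = np.argwhere(self.parameters[0] < 0)
        if len(bad_ones) > 0:
            self.particle_weights[bad_ones] = 0
            # Renormalize so the weights still sum to 1.
            self.particle_weights = self.particle_weights \
                                    / np.sum(self.particle_weights)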

Here's how the problem NaNs can happen. The particles that represent parameter values are given random displacements in the ParticlePDF.resample() method. With these random steps, parameters that are close enough to zero may become very small, or may change sign, leading to large or nonsense model_function outputs and subsequent NaN likelihood values.
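
A tiny illustration of that failure mode (the numbers are made up, and the division stands in for any model with a parameter in a denominator):

import numpy as np

rng = np.random.default_rng(0)
gamma = np.full(5, 0.01)                  # particles parked close to zero
nudged = gamma + rng.normal(0, 0.05, 5)   # a resample()-style random displacement
print(nudged)                             # some entries shrink or flip sign
print(1.0 / nudged)                       # huge or sign-flipped model outputs

Once the outputs blow up like this, the likelihoods can all underflow to zero, and normalizing an all-zero weight vector (the temp / np.sum(temp) line in your first warning) is exactly the 0/0 that produces NaN weights.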

Note that while reducing n_samples may make optbayesexpt run faster, it also increases the chance of incorrect results. "Sample impoverishment" may prevent the distribution means from converging to known parameter values, and may produce falsely narrow distributions.

stennebroek (Author) commented Nov 16, 2020

Thanks for your answer! I'm working on it and hope to get back to you soon.

EDIT: Implementing the OptBayesExpt.enforce_parameter_constraints() method seems to work as suggested, thanks!

stennebroek (Author) commented

Excuse me, I thought it worked, but I still keep encountering this problem even after enforcing parameter constraints. My model function has 3 parameters, and the priors of two of them are nowhere near zero (one is Unif(5, 15), the other Unif(2720, 3030)). The prior of the third parameter (C0) is Unif(0, 0.4), but assigning a weight of 0 to particles with C0 < 0 didn't change the behaviour, and the errors persisted even when I set the weight of any particle with C0 < 0.05 to 0.

Now I get a mix of the error above and an error saying "SVD did not converge" when I decrease n_samples, for example from 50000 to 10000. With n_samples=50000 everything still works fine. Any thoughts?

rmcmichael-nist (Collaborator) commented

Hi Stennebroek,

Thanks for helping get to the bottom of this.

I'm wondering if parameter values might be the wrong target. Maybe we should examine the weight values too.

I'd like to pinpoint the routines that are raising these errors. Could you post the error messages? How do you determine the uncertainty values? Could they ever be negative or NaN? It might be helpful to see your model function as well.
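
For example, something like this quick check before and after each update would tell us whether the weights themselves go bad (obe and measurement here stand for your own objects; particle_weights is the attribute that appears in your tracebacks):

import numpy as np

def check_weights(obe, label=""):
    # Flag NaN, negative, or all-zero particle weights.
    w = obe.particle_weights
    if np.any(np.isnan(w)):
        print(label, "NaN weights detected")
    if np.any(w < 0):
        print(label, "negative weights detected")
    if np.sum(w) == 0:
        print(label, "all weights are zero")

# usage in your measurement loop:
# check_weights(obe, "before update")
# obe.pdf_update(measurement)
# check_weights(obe, "after update")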

stennebroek (Author) commented Nov 25, 2020

Sure, this is the error saying that the SVD did not converge:

/home/sam/anaconda3/envs/semproj/lib/python3.8/site-packages/optbayesexpt/particlepdf.py:135: RuntimeWarning: Degrees of freedom <= 0 for slice
  raw_covariance = np.cov(self.particles, aweights=self.particle_weights)
/home/sam/anaconda3/envs/semproj/lib/python3.8/site-packages/numpy/lib/function_base.py:2480: RuntimeWarning: divide by zero encountered in true_divide
  c *= np.true_divide(1, fact)
Traceback (most recent call last):
  File "benchmark_protocol_2.11.py", line 658, in <module>
    simple_bayes_2 = unknown_res.plot_compare_time("Unknown resonance", "Simple Bayesian", 
  File "benchmark_protocol_2.11.py", line 383, in plot_compare_time
    self.fit(setting_name, protocol_name, protocol_function, exp_parameter_names, exp_parameters)
  File "benchmark_protocol_2.11.py", line 195, in fit
    self.save_full_metrics(*protocol_function(sample, *exp_parameters))
  File "benchmark_protocol_2.11.py", line 555, in SimpleBayesian
    obe.pdf_update(measurement)
  File "/home/sam/anaconda3/envs/semproj/lib/python3.8/site-packages/optbayesexpt/obe_base.py", line 222, in pdf_update
    self.bayesian_update(likyhd)
  File "/home/sam/anaconda3/envs/semproj/lib/python3.8/site-packages/optbayesexpt/particlepdf.py", line 169, in bayesian_update
    self.resample_test()
  File "/home/sam/anaconda3/envs/semproj/lib/python3.8/site-packages/optbayesexpt/particlepdf.py", line 181, in resample_test
    self.resample()
  File "/home/sam/anaconda3/envs/semproj/lib/python3.8/site-packages/optbayesexpt/particlepdf.py", line 227, in resample
    nudged = newcenters + self.rng.multivariate_normal(origin, newcovar,
  File "_generator.pyx", line 3508, in numpy.random._generator.Generator.multivariate_normal
  File "<__array_function__ internals>", line 5, in svd
  File "/home/sam/anaconda3/envs/semproj/lib/python3.8/site-packages/numpy/linalg/linalg.py", line 1661, in svd
    u, s, vh = gufunc(a, signature=signature, extobj=extobj)
  File "/home/sam/anaconda3/envs/semproj/lib/python3.8/site-packages/numpy/linalg/linalg.py", line 97, in _raise_linalgerror_svd_nonconvergence
    raise LinAlgError("SVD did not converge")
numpy.linalg.LinAlgError: SVD did not converge

The other error is the same as the one in my first message. I'm fitting many Lorentzian dips to noisy data (assuming shot noise, so the standard deviation is known). These dips all have a random location, FWHM, and depth, though each is guaranteed to lie within an interval I have defined. When fitting many of these spectra, sometimes the SVD error pops up and sometimes the NaN error does.

My model function is the following:

@staticmethod
def obe_lorentzian(settings, parameters, constants):
    # omega should be a 2D array with shape (len(probe_freq), sample_size);
    # omega0, gamma and C0 should be 1D arrays of length sample_size
    omega, = settings
    omega0, gamma, C0 = parameters

    return 1. - C0/(1. + 4*((omega - omega0)/gamma)**2)

If you mean the standard deviations that are given as input to pdf_update(), then I estimate them from the measured value assuming Poissonian noise (simply sqrt(M) for a measured value M, which is reasonable because M is large (100000-200000) for all measurements). I have looked at my script again and I cannot identify anything that would make the uncertainties negative or NaN.
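
Concretely, each update looks roughly like this (a sketch assuming the (setting, measured_value, sigma) measurement-record format; single_probe_freq and obe are the names from my script, and any normalization of the raw counts to the model scale is omitted here):

import numpy as np

M = 150000             # raw counts, typically 100000-200000
sigma = np.sqrt(M)     # shot-noise estimate: std of a Poisson count is sqrt(M)
# measurement = ((single_probe_freq,), M, sigma)
# obe.pdf_update(measurement)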
