Issue when running on SLURM cluster #76

rfranceschi · 2022-11-24T16:01:24Z

UltraNest version: >= 3.3.3
Python version: 3.8.15
Operating System: SUSE Linux Enterprise Server 12 SP5

Description

Running UltraNest on a remote SLURM cluster on multiple nodes using Intel MPI 2019.9.

What I Did

My code previously ran without issues. Nothing in the fitting part has changed, but it now fails with the following message:

  File ".../fitters.py", line 437, in fit
    for i, result in enumerate(self.sampler.run_iter(**self.run_kwargs)):
  File ".../lib/python3.8/site-packages/ultranest/integrator.py", line 2579, in run_iter
    dlogz_min_num_live_points, (Llo_KL, Lhi_KL), (Llo_ess, Lhi_ess) = self._find_strategy(
  File ".../lib/python3.8/site-packages/ultranest/integrator.py", line 1545, in _find_strategy
    widthratio = 1 - np.exp(logweights[1:,0] - logweights[:-1,0])

This is how the sampler is initialized, nothing special happening here:

self.sampler = ultranest.ReactiveNestedSampler(
    [str(param) for param in self.parameters],
    lnprob,
    self.transform,
    log_dir=self.log_dir,
    resume=self.resume,
    storage_backend=self.storage_backend,
    **self.fitter_kwargs
)

This is not happening on my local machine and I suspect this may be due to a change in the cluster. Could this be the case, or could this be an issue within UltraNest or my code?

Thank you in advance!

JohannesBuchner · 2022-11-24T16:09:49Z

Please post the entire error message and the debug.log.

Please also post all of your arguments: storage_backend, fitter_kwargs.

Perhaps there is a rounding issue, if the sampled points receive strange weights.

You could try adding a print statement at that line to see what logweights and widthratio contain.
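A minimal sketch of such a diagnostic, assuming only that the values can be converted with NumPy. The helper name describe_array is hypothetical and not part of UltraNest; it reports the dimensionality and shape, which is usually more telling than dumping the raw values:

```python
import numpy as np


def describe_array(name, values):
    """Return a one-line summary of an array-like (hypothetical helper)."""
    arr = np.asarray(values)
    return f"{name}: ndim={arr.ndim} shape={arr.shape} dtype={arr.dtype}"


# Stand-in data for illustration; in practice, pass the real logweights.
print(describe_array("logweights", np.zeros((5, 3))))
print(describe_array("logweights", np.zeros(5)))
```

Printing this just before the failing line, once per MPI rank, would show whether every process sees the same shape.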

rfranceschi · 2022-11-25T14:35:11Z

Apologies, I forgot the last line of the error message:

  File ".../fitters.py", line 437, in fit
    for i, result in enumerate(self.sampler.run_iter(**self.run_kwargs)):
  File ".../lib/python3.8/site-packages/ultranest/integrator.py", line 2579, in run_iter
    dlogz_min_num_live_points, (Llo_KL, Lhi_KL), (Llo_ess, Lhi_ess) = self._find_strategy(
  File ".../lib/python3.8/site-packages/ultranest/integrator.py", line 1545, in _find_strategy
    widthratio = 1 - np.exp(logweights[1:,0] - logweights[:-1,0])
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

Here are the parameters:

storage_backend: Literal['hdf5', 'csv', 'tsv'] = 'hdf5'
run_kwargs={'dlogz': 1, 'dKL': 1, 'frac_remain': 0.5, 'Lepsilon': 0.01, 'min_num_live_points': 100}

I am attaching two files with the values of logweights and widthratio, along with the debug log.

debug.log
logweights.txt
widthratio.txt

JohannesBuchner · 2022-11-25T14:54:13Z

That's odd: logweights is a 1-dimensional array instead of a 2-dimensional one.
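The failure can be reproduced outside UltraNest with plain NumPy. The sketch below uses made-up weights and a hypothetical width_ratio function that copies only the expression from the traceback; it is not the library's implementation:

```python
import numpy as np


def width_ratio(logweights):
    # Same expression as the failing line in the traceback.
    return 1 - np.exp(logweights[1:, 0] - logweights[:-1, 0])


# Made-up 2-D log-weights: rows are iterations, columns are illustrative.
two_d = np.log(np.array([[0.5, 0.5], [0.25, 0.3], [0.125, 0.1]]))
ratio = width_ratio(two_d)  # works: the first column halves at each step

# The same data reduced to one dimension triggers the reported error.
one_d = two_d[:, 0]
try:
    width_ratio(one_d)
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(ratio)
print(error_message)
```

So the expression itself is fine for 2-D input; the open question is why logweights arrives with only one dimension on the cluster.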

JohannesBuchner · 2022-11-25T14:57:44Z

Could you check if changing the line https://github.com/JohannesBuchner/UltraNest/blob/v3.3.3/ultranest/integrator.py#L1543

logweights = np.array(main_iterator.logweights[:itmax])

to

logweights = np.array(main_iterator.logweights[:itmax,:])

fixes this issue?
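For reference, a small NumPy-only sketch of how the two slicing forms behave, assuming the stored weights are a NumPy array (the variable names and data are illustrative, not taken from UltraNest). On a 2-D array both forms give the same result, while on a 1-D array the explicit form already fails at the slice, which would at least move the error to the line where the array is built:

```python
import numpy as np

itmax = 2
# Made-up 2-D log-weights standing in for the stored values.
stored = np.log(np.array([[0.5, 0.5], [0.25, 0.3], [0.125, 0.1]]))

implicit = np.array(stored[:itmax])      # original form
explicit = np.array(stored[:itmax, :])   # suggested form
same_on_2d = np.array_equal(implicit, explicit)

# With 1-D input, the original form succeeds silently and stays 1-D ...
flat = stored[:, 0]
implicit_flat = np.array(flat[:itmax])

# ... while the suggested form raises immediately.
try:
    np.array(flat[:itmax, :])
    explicit_error = None
except IndexError as exc:
    explicit_error = type(exc).__name__

print(same_on_2d, implicit_flat.ndim, explicit_error)
```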
