saved_logwt_bs error after completion #112

Open
shiningsurya opened this issue Nov 1, 2023 · 1 comment

Comments

@shiningsurya

  • UltraNest version: 3.4.4
  • Python version: 3.8.8
  • Operating System: linux, hpc

Description

(let me preface by saying i love this piece of code)

I am solving a 6D fitting problem using UltraNest. Five of the parameters are angles and are therefore wrapped parameters; the sixth is a plain (non-wrapped) parameter.
I am running it on an HPC cluster with 480 MPI tasks.

It crashes after it has converged to the ML point.

CHI       :    +0.000|                                 +2.756  *     |   +3.142
NU        :     +0.00|                         +2.17  *  +2.19       |    +3.14
DELTA     :     +0.00|                     +1.88  *  +1.91           |    +3.14
TAU       :    +0.000|   +1.569  *  +1.577                           |   +6.283
PHI       :    +0.000|                    +3.799  *  +3.803          |   +6.283
OMEGA_SPIN: +0.050006283|    +2.520271104  *  +2.520271106              |+6.333179024

[ultranest] Explored until L=-2e+02  9.61 [-189.6137..-189.6128]*| it/evals=950560/566217664 eff=0.1679% N=16384 
[ultranest] Likelihood function evaluations: 566217664
Traceback (most recent call last):
  File "/hercules/results/sbethapudi/frbm/frb-modeling/python/run_precession_kp.py", line 171, in <module>
    result  = sampler.run (
  File "/u/sbethapudi/.local/lib/python3.8/site-packages/ultranest/integrator.py", line 2173, in run
    for result in self.run_iter(
  File "/u/sbethapudi/.local/lib/python3.8/site-packages/ultranest/integrator.py", line 2539, in run_iter
    self._update_results(main_iterator, saved_logl, saved_nodeids)
  File "/u/sbethapudi/.local/lib/python3.8/site-packages/ultranest/integrator.py", line 2635, in _update_results
    results = combine_results(
  File "/u/sbethapudi/.local/lib/python3.8/site-packages/ultranest/netiter.py", line 897, in combine_results
    recv_saved_logwt_bs = mpi_comm.gather(saved_logwt_bs, root=0)
  File "mpi4py/MPI/Comm.pyx", line 1262, in mpi4py.MPI.Comm.gather
  File "mpi4py/MPI/msgpickle.pxi", line 680, in mpi4py.MPI.PyMPI_gather
  File "mpi4py/MPI/msgpickle.pxi", line 685, in mpi4py.MPI.PyMPI_gather
  File "mpi4py/MPI/msgpickle.pxi", line 148, in mpi4py.MPI.Pickle.allocv
  File "mpi4py/MPI/msgpickle.pxi", line 139, in mpi4py.MPI.Pickle.alloc
SystemError: Negative size passed to PyBytes_FromStringAndSize

The same error has occurred in multiple runs.
Judging by the traceback, it fails in the gather step.

This happens after the iterations have completed.
In my code, after sampler.run, I call store_tree, print_results, and plot_corner.

What I Did

These are the parameters I pass to sampler.run:

        min_num_live_points=16384,
        ## target evidence uncertainty
        dlogz=1e-1,
        dKL=1e-1,
        frac_remain=1E-9,
        Lepsilon=1E-5,
        ## less than live_points
        min_ess=8192,
        max_num_improvement_loops=8,
@JohannesBuchner
Owner

JohannesBuchner commented Nov 2, 2023

If you can reproduce this, please add a print statement just before the following line to see what saved_logwt_bs contains and how large it is:

    recv_saved_logwt_bs = mpi_comm.gather(saved_logwt_bs, root=0)

in file "/u/sbethapudi/.local/lib/python3.8/site-packages/ultranest/netiter.py", line 897, in combine_results
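
A minimal sketch of such a diagnostic print, in case it is useful. The helper name describe_payload is hypothetical and not part of UltraNest or mpi4py. It assumes that the lowercase mpi_comm.gather pickles its argument, so the pickled length is a rough proxy for what each rank sends; the exact size depends on the pickle protocol mpi4py is configured to use.

```python
import pickle
import sys


def describe_payload(name, obj, stream=sys.stderr):
    """Print type, shape, length and pickled size of obj; return the size in bytes.

    Hypothetical debugging helper. Pickling a large object makes a temporary
    copy in memory, so use it only for diagnosis.
    """
    n_bytes = len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))
    shape = getattr(obj, "shape", None)  # set for NumPy arrays, None otherwise
    length = len(obj) if hasattr(obj, "__len__") else None
    print(
        f"{name}: type={type(obj).__name__} shape={shape} "
        f"len={length} pickled_bytes={n_bytes}",
        file=stream,
        flush=True,
    )
    return n_bytes
```

Placed directly before the gather call in netiter.py, it could be called as `describe_payload(f"rank {mpi_comm.Get_rank()} saved_logwt_bs", saved_logwt_bs)`, which prints one line per MPI rank.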

This is an error I have not seen before.

mpi4py/mpi4py#23 suggests you may have crossed a 2 GB threshold that your MPI does not support. I guess this translates into a limit on the number of live points × the number of iterations; the latter increases with the improvement loops.
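
To get a feel for when such a threshold might be crossed, here is a back-of-the-envelope sketch. It rests on several assumptions that are not confirmed in this thread: that each rank sends a float64 array with one row per iteration, that the column count (30 below) is a placeholder, and that the limit applies to the combined buffer assembled on the root rank, counted with a signed 32-bit integer. The function names are hypothetical.

```python
INT32_MAX = 2**31 - 1  # largest byte count a signed 32-bit integer can hold


def gathered_bytes(n_rows, n_cols, n_ranks, itemsize=8):
    """Estimate the bytes arriving at the root rank when every rank sends an
    (n_rows, n_cols) array of itemsize-byte values. Pickle overhead is ignored."""
    return n_rows * n_cols * itemsize * n_ranks


def exceeds_int32_count(n_bytes):
    """Return True if n_bytes does not fit in a signed 32-bit count."""
    return n_bytes > INT32_MAX


# Iteration count and task count are taken from the report above;
# the column count of 30 is an assumed placeholder.
total = gathered_bytes(n_rows=950560, n_cols=30, n_ranks=480)
print(total, exceeds_int32_count(total))
```

The actual shape printed for saved_logwt_bs should replace the assumed numbers; the sketch only illustrates how rows × columns × ranks can outgrow a 32-bit count.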
