
"Overflow Error: value too large to convert to int" When I used "ScatterArray()" to scatter large numpy arrays #649

WangYun1995 opened this issue Feb 28, 2021 · 1 comment


@WangYun1995

Hi, @rainwoodman

To run FOF in parallel, I want to load the coordinates of all 1820**3 particles on comm.rank == 0 and then scatter them to the other ranks. My code snippet is shown below:

import numpy as np
from nbodykit import CurrentMPIComm
from nbodykit.utils import ScatterArray

comm = CurrentMPIComm.get()
assert comm.size == 8

if comm.rank == 0:
    # read the flat float32 file and reshape to (3, N) in Fortran order
    with open("/home2/TNG100-1/postprocessing/dm_coordinates_z0.dat", "rb") as ff:
        dm_pos = np.fromfile(ff, dtype='float32').reshape((3, 6028568000), order="F")

    # transpose to (N, 3) and rescale the units
    ddm_coord = (dm_pos.transpose()) / 1000.0
else:
    ddm_coord = None

dm_coord = ScatterArray(ddm_coord, comm=comm, root=0)

However, after typing mpiexec -n 8 ipython filename.py, it returned the following OverflowError:

---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
~/test_01.py in <module>
     23     ddm_coord = None
     24 
---> 25 dm_coord = ScatterArray(ddm_coord, comm=comm, root=0)
     26 
     27 print("rank:", comm.rank, "Shape of dm_coord", dm_coord.shape)

/opt/miniconda3/envs/nbodykit-env/lib/python3.8/site-packages/nbodykit/utils.py in ScatterArray(data, comm, root, counts)
    339     # do the scatter
    340     comm.Barrier()
--> 341     comm.Scatterv([data, (counts, offsets), dt], [recvbuffer, dt])
    342     dt.Free()
    343     return recvbuffer

mpi4py/MPI/Comm.pyx in mpi4py.MPI.Comm.Scatterv()

mpi4py/MPI/msgbuffer.pxi in mpi4py.MPI._p_msg_cco.for_scatter()

mpi4py/MPI/msgbuffer.pxi in mpi4py.MPI._p_msg_cco.for_cco_send()

mpi4py/MPI/msgbuffer.pxi in mpi4py.MPI.message_vector()

mpi4py/MPI/asarray.pxi in mpi4py.MPI.chkarray()

mpi4py/MPI/asarray.pxi in mpi4py.MPI.getarray()

OverflowError: value too large to convert to int

Can you help me fix this issue?

@rainwoodman
Member

1820**3 is a lot of particles. mpi4py has an internal limit of 2^31 (the range of a C int) on the counts and offsets it can pass to MPI, and I think you are running up against that.
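
For a rough sense of scale, here is a small sketch of the arithmetic (plain Python, nothing nbodykit-specific):

n_elements = 1820**3 * 3    # three float32 coordinates per particle
print(n_elements)           # 18085704000 elements to send in total
print(2**31 - 1)            # 2147483647, the largest count a C int can hold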

In general, for reading a large file I recommend reading the data directly on the rank where you want it to end up, rather than read-then-scatter. This way we avoid the memory and communication bottleneck on the root rank.

It would be something like this:

N = 1820**3
start = N * comm.rank // comm.size
end = N * (comm.rank + 1) // comm.size

# assuming the file stores all x values, then all y values, then all z values,
# as contiguous blocks of 4-byte floats
with open('f.dat', 'rb') as fp:
    fp.seek(start * 4)
    local_x = np.fromfile(fp, count=end - start, dtype='f4')
    fp.seek(N * 4 + start * 4)
    local_y = np.fromfile(fp, count=end - start, dtype='f4')
    fp.seek(N * 8 + start * 4)
    local_z = np.fromfile(fp, count=end - start, dtype='f4')
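
If you then want a per-rank (N, 3) position array, a minimal follow-up sketch (assuming the same division by 1000 as in your original snippet) would be:

# stack the per-rank columns into an (end - start, 3) position array,
# rescaling the units as in the original snippet
local_pos = np.column_stack([local_x, local_y, local_z]) / 1000.0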

A similar algorithm is implemented in BinaryCatalog, so you can also use that instead:

https://nbodykit.readthedocs.io/en/latest/catalogs/reading.html#Binary-Data

cat = BinaryCatalog(..., [('X', 'f4'), ('Y', 'f4'), ('Z', 'f4')], size=1820**3, comm=comm)
cat['Position'] = transform.StackColumns(cat['X'], cat['Y'], cat['Z'])
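
From there, a hedged sketch of the FOF step you mentioned might look like this (the box size, linking length and nmin below are placeholders, not values taken from this thread):

from nbodykit.lab import FOF

# FOF with periodic wrapping needs the box size, in the same units as Position;
# the value here is only a placeholder
cat.attrs['BoxSize'] = 75.0

# linking_length and nmin are placeholder parameters, not recommendations
fof = FOF(cat, linking_length=0.2, nmin=20)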
