Is there any way to parallelize the computation of multiple convolutions? #683

WangYun1995 opened this issue Sep 15, 2023 · 8 comments


@WangYun1995

Hi,
First, I have a 3D density mesh (Nmesh = 1536^3) produced with nbodykit and stored as a bigfile on my disk.
Then I want to convolve the mesh with a band-pass filter at multiple scales, which is equivalent to a multiplication in the Fourier domain. Finally, at each scale, I compute a binned statistic env_WPS of the convolved mesh. My code snippet is shown below.

import numpy as np
from scipy import stats
from nbodykit.lab import BigFileMesh

def bandpass(scale):
   '''The band-pass filter in the Fourier space
   '''
   def filter( k, v ):
      cn       = 4.0*np.sqrt( np.sqrt(np.pi**3)/(9.0+55.0*np.e) )
      k2       = k[0]**2 + k[1]**2 + k[2]**2
      k2      /= scale**2 
      k        = np.sqrt(k2)
      filter_  = cn*( np.exp(k-0.5*k2)*(k2-k) + np.exp(-0.5*k**2-k)*(k2+k) )
      filter_ /= np.sqrt(scale**3)
      return v*filter_
   return filter

# Load the density mesh
path_mesh  = '/.../.../dens_field.bigfile'
dens_mesh  = BigFileMesh(path_mesh, 'Field')

# Parameters
Lbox  = dens_mesh.attrs['BoxSize'][0]    # Unit: Mpc/h
Nmesh = dens_mesh.attrs['Nmesh'][0]     # Integer
kf = 2*np.pi/Lbox
kNyq  = Nmesh*np.pi/Lbox
scales  = np.geomspace(kf, 0.4*kNyq, 25)
dens_bins = np.geomspace(0.1,100,7,endpoint=True) # The density bins

# Compute the volume fraction of each density environment
dens_m     = dens_mesh.compute(mode='real', Nmesh=Nmesh)
dens_m_    = np.ravel(dens_m)
dens_bins_ = np.pad( dens_bins, (1, 1), 'constant', constant_values=(0,np.amax(dens_m)) )
Nvol, _, _ = stats.binned_statistic(dens_m_, dens_m_, 'count', bins=dens_bins_)
fvol       = Nvol/Nmesh**3

# Initialize the env_WPS and global-WPS
env_WPS    = np.zeros( (len(scales), len(dens_bins)+1) )
global_WPS = np.zeros_like( scales )
# Perform the CWT and compute the env-WPS and global-WPS
for i, scale in enumerate(scales):
   cwt_mesh           = dens_mesh.apply(bandpass(scale), mode='complex', kind='wavenumber')
   cwt_rmesh          = cwt_mesh.compute(mode='real', Nmesh=Nmesh)
   cwt2               = cwt_rmesh**2
   env_WPS[i,:], _, _ = stats.binned_statistic(dens_m_, np.ravel(cwt2), 'mean', bins=dens_bins_)
   global_WPS[i]      = np.sum( env_WPS[i,:]*fvol )

Since the above code uses only one core, it is inefficient for a large Nmesh. How can I parallelize the code in the context of nbodykit?

@rainwoodman
Member

  1. Run the code under mpirun. Each MPI process will then get a part of the full density field.
  2. I think 'stats.binned_statistic' would only compute the partial statistics on the part of the full density field that is local to each process. You will likely need to call mpi.COMM_WORLD.allreduce(...) on the result of stats.binned_statistic to add up the partial statistics (undo the mean first), and then recompute the mean; a sketch of that combination step is shown below.
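
For concreteness, here is an untested sketch of that combination step for one scale. It assumes mpi4py is available and reuses dens_m_, cwt2, dens_bins_, Nmesh, and the loop index i from the snippet above; the script itself runs unchanged under mpirun.

# Per-rank partial statistics for the current scale: use 'count' and 'sum'
# instead of 'mean', since counts and sums can be added across processes.
from mpi4py import MPI
import numpy as np
from scipy import stats

comm = MPI.COMM_WORLD

local_count, _, _ = stats.binned_statistic(dens_m_, np.ravel(cwt2), 'count', bins=dens_bins_)
local_sum,   _, _ = stats.binned_statistic(dens_m_, np.ravel(cwt2), 'sum',   bins=dens_bins_)

# Add up the partial statistics from all MPI ranks, then recompute the mean,
# guarding against bins that are empty on every rank.
global_count = comm.allreduce(local_count, op=MPI.SUM)
global_sum   = comm.allreduce(local_sum,   op=MPI.SUM)
env_WPS[i,:] = np.divide(global_sum, global_count,
                         out=np.zeros_like(global_sum), where=global_count > 0)

# The volume fractions need the same treatment:
Nvol_local, _, _ = stats.binned_statistic(dens_m_, dens_m_, 'count', bins=dens_bins_)
fvol = comm.allreduce(Nvol_local, op=MPI.SUM) / Nmesh**3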

@WangYun1995
Author

Thank you for your nice reply; I will try what you suggested.

@WangYun1995
Author

Hi Yu,
The method you suggest does work. However, the parallelization is not very efficient. For example, using a mesh with Nmesh = 768**3 as a test: with one core the run takes 1407 seconds, with 64 cores it takes 722 seconds, and with 96 cores it takes 611 seconds.
So turning on parallelization only cuts the time roughly in half (a speedup of about 2x), rather than scaling down to 1407/64 or 1407/96 seconds.
Why is this the case?

@rainwoodman
Member

rainwoodman commented Sep 20, 2023 via email

@WangYun1995
Author

The operations inside the for loop consume the most time.
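A rough way to check which step dominates, assuming mpi4py, is to bracket each call in the loop with synchronized timers, for example using a small hypothetical helper timed():

from mpi4py import MPI

comm = MPI.COMM_WORLD

def timed(label, func, *args, **kwargs):
   '''Run func with MPI-synchronized wall-clock timing and report from rank 0.'''
   comm.Barrier()
   t0 = MPI.Wtime()
   result = func(*args, **kwargs)
   comm.Barrier()
   if comm.rank == 0:
      print('%s: %.2f s' % (label, MPI.Wtime() - t0))
   return result

# e.g., inside the loop:
# cwt_mesh  = timed('apply',   dens_mesh.apply,  bandpass(scale), mode='complex', kind='wavenumber')
# cwt_rmesh = timed('compute', cwt_mesh.compute, mode='real', Nmesh=Nmesh)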

@rainwoodman
Member

rainwoodman commented Sep 22, 2023 via email

@WangYun1995
Author

This really improves the efficiency. I will acknowledge you in my new paper.

@rainwoodman
Member

rainwoodman commented Sep 23, 2023 via email
