install nbodykit on NERSC Perlmutter #675

Open
biweidai opened this issue Feb 5, 2023 · 5 comments


biweidai commented Feb 5, 2023

Hi,

The nbodykit build for NERSC Cori ( https://nbodykit.readthedocs.io/en/latest/getting-started/install.html#nbodykit-on-nersc ) does not seem to work on Perlmutter. When I try to load nbodykit with

source /global/common/software/m3035/conda-activate.sh 3.8

I got the following error:

/global/common/software/m3035/conda/envs/bcast-bccp-3.8/etc/conda/activate.d/activate-nersc-prgenv-gnu.sh: line 6: /etc/profile.d/nerschost.sh: No such file or directory
/global/common/software/m3035/conda/envs/bcast-bccp-3.8/etc/conda/activate.d/activate-nersc-prgenv-gnu.sh: line 7: /etc/profile.d/modules.sh: No such file or directory
/global/common/software/m3035/conda/envs/bcast-bccp-3.8/etc/conda/activate.d/activate-nersc-prgenv-gnu.sh: line 8: /etc/profile.d/mpi-selector.sh: No such file or directory
/global/common/software/m3035/conda/envs/bcast-bccp-3.8/etc/conda/activate.d/activate-nersc-prgenv-gnu.sh: line 9: /etc/bash.bashrc.local: No such file or directory
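The four messages suggest that the activation script sources host-specific files that existed on Cori but are absent on Perlmutter. A minimal sketch of a guarded version follows, assuming that is what lines 6 to 9 of the script do. The helper names `source_if_readable` and `load_host_profiles` are hypothetical. Guarding the calls would only silence these errors; it would not by itself make an environment built for Cori usable on Perlmutter.

```shell
# Hypothetical helper: source a file only when it exists and is readable,
# so a missing host-specific profile script does not raise an error.
source_if_readable() {
    if [ -r "$1" ]; then
        . "$1"
    else
        echo "skipping missing file: $1" >&2
    fi
}

# Hypothetical wrapper around the four paths named in the error messages.
load_host_profiles() {
    for f in /etc/profile.d/nerschost.sh \
             /etc/profile.d/modules.sh \
             /etc/profile.d/mpi-selector.sh \
             /etc/bash.bashrc.local; do
        source_if_readable "$f"
    done
}
```

Calling `load_host_profiles` in place of the four unconditional source lines would skip whichever files are absent on the current host.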

So I tried to create my own conda environment with nbodykit. I installed mpi4py with

module swap PrgEnv-${PE_ENV,,} PrgEnv-gnu
MPICC="cc -shared" pip install --force-reinstall --no-cache-dir --no-binary=mpi4py mpi4py

following https://docs.nersc.gov/development/languages/python/parallel-python/#mpi4py-in-your-custom-conda-environment
I tested it on a Perlmutter compute node and it works fine.

But when I try to install nbodykit with

conda install -c bccp nbodykit

it does not use the mpi4py I built. Instead it reinstalls mpi4py through conda, and that copy does not work on Perlmutter compute nodes. Can I force it to use the mpi4py I built?
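One way to see whether conda has replaced a pip-built mpi4py is to look at the channel column of `conda list`. The sketch below assumes that pip-installed packages are listed with channel `pypi`; `mpi4py_installer` is a hypothetical helper name, not part of conda or nbodykit.

```shell
# Hypothetical helper: report whether the mpi4py in the active conda
# environment came from pip (for example, your own build) or from a
# conda channel. Assumes `conda list` shows channel "pypi" for pip installs.
mpi4py_installer() {
    conda list mpi4py 2>/dev/null | awk '
        $1 == "mpi4py" {
            found = 1
            if ($NF == "pypi") print "pip"; else print "conda"
        }
        END { if (!found) print "missing" }'
}
```

Running `mpi4py_installer` before and after `conda install -c bccp nbodykit` would show whether the pip build was overwritten.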

I also tried reinstalling mpi4py to overwrite the copy that conda installed, but I then got the following error when running the code:

Attempting to use an MPI routine before initializing MPICH

rainwoodman (Member) commented Feb 5, 2023 via email


jmsull commented Jun 28, 2023

Lagging behind Biwei, I am only now running into this issue since Cori has been retired (and I have not yet found a way to make this work).

biweidai (Author) commented

Yu's suggestion works for me! I manually installed the nbodykit dependencies and nbodykit itself with pip, using PrgEnv-gnu. Have you tried this?
By the way, if you are going to run nbodykit in JupyterLab, I think the conda install should work.


jmsull commented Jun 28, 2023

For posterity, here is what I did:

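module load PrgEnv-gnu
module load gsl
conda create --name nbodykit_env python=3.7 pip
conda activate nbodykit_env # assumed step, not in the original list: activate the new environment first
env MPICC=cc python -m pip install --no-cache-dir mpi4py # build mpi4py with the cc compiler wrapper
srun -n 5 python -m mpi4py.bench helloworld # test mpi4py
pip install numpy cython
pip install nbodykit[extras]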

This seems to work on a Perlmutter compute node.
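After an install like the one above, a quick import check can confirm that each package loads in the new environment. This is a minimal sketch: `check_imports` is a hypothetical helper, and the `PYTHON` variable is an assumption that lets you point it at a specific interpreter.

```shell
# Hypothetical smoke test: try importing each named module with the
# interpreter given in $PYTHON (default: python) and report the result.
# Returns non-zero if any import fails.
check_imports() {
    local py="${PYTHON:-python}" failed=0 mod
    for mod in "$@"; do
        if "$py" -c "import $mod" >/dev/null 2>&1; then
            echo "ok      $mod"
        else
            echo "FAILED  $mod"
            failed=1
        fi
    done
    return "$failed"
}
```

For example, `check_imports mpi4py numpy nbodykit` could be run on a compute node once the environment is activated.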


jmsull commented Mar 11, 2024

@rainwoodman As you alluded to toward the top of this issue, recreating the bcast-pip scripts would be nice to have. I am running some jobs on fewer than 4 nodes and am seeing 5-10 minutes of startup time. IIRC bcast-pip improves this.
Biwei and I were discussing this today and may have some bandwidth to do this if you can get us started.
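To judge whether a bcast-style workaround helps, it can be useful to measure startup time the same way before and after. The sketch below is one simple way to do that; `time_cmd` is a hypothetical helper, and the srun arguments in the usage comment are placeholders rather than a recommended job layout.

```shell
# Hypothetical helper: run a command, discard its output, and print the
# elapsed wall-clock time in whole seconds. Returns the command's status.
time_cmd() {
    local start end rc
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    rc=$?
    end=$(date +%s)
    echo "$((end - start))"
    return "$rc"
}

# Usage on a compute node (placeholder arguments):
#   time_cmd srun -n 4 python -c "import nbodykit"
```

Timing only the import isolates the environment startup cost from the runtime of the actual analysis.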
