HOOMD-blue-4.0.1-CUDA-11.7.0 installation fails on remote GPU (libldap.so.2 not found) #20369

Open
dfmcub opened this issue Apr 16, 2024 · 0 comments

dfmcub commented Apr 16, 2024

Greetings!

I need to install HOOMD on a cluster that uses Slurm as the workload manager, and it is essential that it works on both CPU compute nodes and GPU compute nodes. These are the most important characteristics:

  • OS: Rocky Linux 8.9 (all nodes)
  • GPU cards: NVIDIA GeForce RTX 4090
  • CUDA version: 12.3

The control node has all the packages installed with EasyBuild inside /usr/local/, which is mounted over NFS on top of each compute node's own /usr/local/.

Since the control node has no GPUs, I'm trying to install HOOMD-blue-4.0.1-foss-2022a-CUDA-11.7.0.eb by sending the command in a batch job to one of the GPU compute nodes.

This is the content of the script I used, install_hoomd.sh:

#!/bin/bash
#SBATCH --output=install_job_%j_result.log
#SBATCH --error=install_job_%j_error.log

eb HOOMD-blue-4.0.1-foss-2022a-CUDA-11.7.0.eb --robot

I submitted it with:

$ sbatch -w gpuNode --mem=6G install_hoomd.sh

But then the following error shows up:

== FAILED: Installation ended unsuccessfully (build directory: /usr/local/easybuild/builds/HOOMDblue/4.0.1/foss-2022a-CUDA-11.7.0): build failed (first 300 chars): cmd " cmake -DCMAKE_INSTALL_PREFIX=/usr/local/easybuild/software/HOOMD-blue/4.0.1-foss-2022a-CUDA-11.7.0 -DCMAKE_INSTALL_LOCALSTATEDIR=/usr/local/easybuild/software/HOOMD-blue/4.0.1-foss-2022a-CUDA-11.7.0/var -DCMAKE_INSTALL_RUNSTATEDIR=/usr/local/easybuild/software/HOOMD-blue/4.0.1-foss-2022a-CUDA (took 53 secs)

And the only error text found in the temporary log file is this one:

error while loading shared libraries: libldap.so.2: cannot open shared object file: No such file or directory

Command that triggers the error:

cmake -DCMAKE_INSTALL_PREFIX=/usr/local/easybuild/software/HOOMD-blue/4.0.1-foss-2022a-CUDA-11.7.0
-DCMAKE_INSTALL_LOCALSTATEDIR=/usr/local/easybuild/software/HOOMD-blue/4.0.1-foss-2022a-CUDA-11.7.0/var
-DCMAKE_INSTALL_RUNSTATEDIR=/usr/local/easybuild/software/HOOMD-blue/4.0.1-foss-2022a-CUDA-11.7.0/var/run
-DCMAKE_INSTALL_SYSCONFDIR=/usr/local/easybuild/software/HOOMD-blue/4.0.1-foss-2022a-CUDA-11.7.0/etc
-DCMAKE_BUILD_TYPE=Release
-DCMAKE_VERBOSE_MAKEFILE=ON
-DCMAKE_FIND_USE_PACKAGE_REGISTRY=OFF
-DENABLE_GPU=ON
-DHOOMD_GPU_PLATFORM=CUDA
-DENABLE_MPI=ON
-DBUILD_MD=ON
-DBUILD_METAL=ON
-DENABLE_TBB=ON
-DBUILD_TESTING=ON
-DPYTHON_EXECUTABLE=$EBROOTPYTHON/bin/python /usr/local/easybuild/builds/HOOMDblue/4.0.1/foss-2022a-CUDA-11.7.0/hoomd-4.0.1/
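
The "error while loading shared libraries" message comes from the dynamic loader, so it points at an executable that could not be started rather than at HOOMD's own sources. One way to narrow it down is to ask ldd which libraries the cmake binary cannot resolve on the GPU node. The following is only a sketch: the helper name filter_missing is made up for illustration, and it assumes that the cmake found on PATH is the one the build uses (that is, the same modules are loaded).

```shell
#!/bin/bash
# Hypothetical helper: print the names of shared libraries that ldd
# reports as "not found". It reads ldd output on stdin, so it can be
# reused for any binary.
filter_missing() {
    awk '/=> not found/ { print $1 }'
}

# Assumption: the cmake on PATH is the one EasyBuild runs during the build.
# Run this on the GPU node, for example inside a batch job.
if command -v cmake >/dev/null 2>&1; then
    ldd "$(command -v cmake)" | filter_missing
fi
```

If this prints libldap.so.2, the cmake binary itself (or one of the libraries it links against) expects that library on the node where it runs. If it prints nothing, the failing executable is probably something cmake invokes, and the same check can be repeated on that binary.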

I browsed the HOOMD documentation and it never mentions that the OpenLDAP libraries are required for its installation, and that file is only used by the cluster's login server. What am I doing wrong?
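
Since the library may be present on one node and absent on another, it could help to compare what the dynamic linker cache knows on each node. This is a minimal sketch under assumptions: lib_in_cache is a hypothetical helper, and it only inspects the system cache printed by ldconfig -p, so a library reachable solely through LD_LIBRARY_PATH or a module would not show up.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the named library appears as the first
# field of a line in `ldconfig -p` style output read from stdin.
lib_in_cache() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Check the node this runs on; run it once on the control node and once
# on the GPU node (for example from a batch job) and compare the results.
if command -v ldconfig >/dev/null 2>&1; then
    if ldconfig -p | lib_in_cache libldap.so.2; then
        echo "libldap.so.2: present on $(hostname)"
    else
        echo "libldap.so.2: missing on $(hostname)"
    fi
fi
```

A difference between the two nodes would suggest that something in the software stack was built where the library exists and is now being run where it does not.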

This is the easyconfig I used:

easyblock = 'CMakeMake'

name = 'HOOMD-blue'
version = "4.0.1"
versionsuffix = '-CUDA-%(cudaver)s'

homepage = "https://glotzerlab.engin.umich.edu/hoomd-blue/"
description = """HOOMD-blue is a general-purpose particle simulation
toolkit, implementing molecular dynamics and hard particle Monte Carlo
optimized for fast execution on both GPUs and CPUs."""

toolchain = {'name': 'foss', 'version': '2022a'}
toolchainopts = {'usempi': True}

github_account = 'glotzerlab'
source_urls = [GITHUB_LOWER_RELEASE]
sources = ['hoomd-%(version)s.tar.gz']
checksums = ['b63dd8debb96f9c530983bd54ecbafa8fd07e017ded3ea64604cfb1f41a644b8']

builddependencies = [
    ('CMake', '3.24.3'),
    ('pybind11', '2.9.2'),
]

dependencies = [
    ('CUDA', '11.7.0', '', SYSTEM),
    ('UCX-CUDA', '1.12.1', versionsuffix),
    ('Python', '3.10.4'),
    ('SciPy-bundle', '2022.05'),
    ('tbb', '2021.5.0'),
    ('Eigen', '3.4.0'),
    ('Cereal', '1.3.2', '', SYSTEM),
]

_copts = [
    '-DENABLE_GPU=ON',
    '-DHOOMD_GPU_PLATFORM=CUDA',
    '-DENABLE_MPI=ON',
    '-DBUILD_MD=ON',
    '-DBUILD_METAL=ON',
    '-DENABLE_TBB=ON',
    '-DBUILD_TESTING=ON',
    '-DPYTHON_EXECUTABLE=$EBROOTPYTHON/bin/python',
]
configopts = ' '.join(_copts)
postinstallcmds = [
    'ln -s hoomd/include %(installdir)s/include',
]

runtest = 'test'

sanity_check_paths = {
    'files': ['hoomd/__init__.py', 'hoomd/include/hoomd/Compute.h'],
    'dirs': ['lib/cmake'],
}

sanity_check_commands = [
    "python -c 'import hoomd'",
]

modextrapaths = {'PYTHONPATH': ''}

moduleclass = 'phys'

Many thanks for the great work!
