Problems with AutoDock-GPU compiled on a cluster #241

Open
xavgit opened this issue Sep 3, 2023 · 2 comments

xavgit commented Sep 3, 2023

Hi,
I have compiled the latest version of AutoDock-GPU from source on the Leonardo cluster.
I have built two versions:
the first with DEVICE=CUDA and NUMWI=256, and the second with DEVICE=OCLGPU and NUMWI=256.

I have used the following commands:
module load cuda
module load python
and
module list returns:
Currently Loaded Modulefiles:

  1) profile/base   2) python/3.10.8--gcc--11.3.0   3) cuda/11.8

Key:
default-version
Then
export GPU_INCLUDE_PATH=/leonardo/prod/opt/compilers/cuda/11.8/none/include
export GPU_LIBRARY_PATH=/leonardo/prod/opt/compilers/cuda/11.8/none/lib64
make DEVICE= ............................
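Spelled out, given the DEVICE and NUMWI settings mentioned above, the two builds would be roughly (a sketch, assuming no other Makefile variables were set):

make DEVICE=CUDA NUMWI=256
make DEVICE=OCLGPU NUMWI=256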
If I run both executables without any arguments, there is no problem.
I have run a test with a single ligand using the compiled versions, and I get errors.
First I run prepare_gpf4.py and ADFR's autogrid4, and then the AutoDock-GPU executables.
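(For context, a sketch of that map-preparation step; the receptor file name and the prepare_gpf4.py flags here are assumptions for illustration, not taken from the original report:)

prepare_gpf4.py -r receptor.pdbqt -l DB16260.pdbqt -o receptor.gpf
autogrid4 -p receptor.gpf -l receptor.glg    # writes receptor.maps.fld plus the per-atom-type .map files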

For the CUDA version I run:
$ python3 test_ad4gpu.py
sh: line 1: 3637611 Aborted (core dumped) /leonardo/home/userexternal/slemme00/sources/AutoDock-GPU/bin_256wi/autodock_gpu_256wi -x 0 --ffile receptor.maps.fld --lfile DB16260.pdbqt --nrun 100 -N ./docking_res/DB16260.pdbqt_docking_res --gbest 1 > ./docking_res/DB16260.pdbqt_docking_res.log 2>&1
$ less docking_res/DB16260.pdbqt_docking_res.log
autodock_gpu_256wi: ./host/src/performdocking.cpp:128: void setup_gpu_for_docking(GpuData&, GpuTempData&): Assertion `0' failed.

I have also used the previous Python script with SLURM directives and 40 ligands, but I get the same problems.

For the OCLGPU version I run:
$ python3 test_ad4gpu_ocl.py
$ less docking_res_ocl/DB16260.pdbqt_docking_res.log
AutoDock-GPU version: v1.5.3-54-g41083c5e1224d54ad043b62ca53f6618d5e8325d-dirty

Running 1 docking calculation

Kernel source used for development: ./device/calcenergy.cl
Kernel string used for building: ./host/inc/stringify.h
Kernel compilation flags: -I ./device -I ./common -DN256WI -cl-mad-enable
Error: clGetPlatformIDs(): -1001

The system I log in to is:
Atos Bull Sequana XH21355 "Da Vinci" Blade
Red Hat Enterprise Linux 8.6 (Ootpa)

3456 compute nodes with:
- 32 Ice Lake cores at 2.60 GHz
- 4 x NVIDIA Ampere A100 GPUs, 64GB
- 512 GB RAM

Internal Network: Nvidia Mellanox HDR DragonFly++
SLURM 22.05.7

test_ad4gpu.py
import os
os.system( '/leonardo/home/userexternal/slemme00/sources/AutoDock-GPU/bin_256wi/autodock_gpu_256wi -x 0 --ffile receptor.maps.fld --lfile ' + 'DB16260.pdbqt' + ' --nrun 100 -N ' + './docking_res/' + 'DB16260.pdbqt' + '_docking_res --gbest 1 > ./docking_res/' + 'DB16260.pdbqt' + '_docking_res.log 2>&1' )

test_ad4gpu_ocl.py
import os
os.system( '/leonardo/home/userexternal/slemme00/sources/AutoDock-GPU/bin_oclgpu_256/autodock_gpu_256wi -x 0 --ffile receptor.maps.fld --lfile ' + 'DB16260.pdbqt' + ' --nrun 100 -N ' + './docking_res_ocl/' + 'DB16260.pdbqt' + '_docking_res --gbest 1 > ./docking_res_ocl/' + 'DB16260.pdbqt' + '_docking_res.log 2>&1' )
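(A slightly more robust variant of these scripts, sketched with subprocess so that a failed run such as the core dump above is reported explicitly; paths and file names are the ones from the CUDA script above:)

import subprocess

ligand = 'DB16260.pdbqt'
out_dir = './docking_res'
cmd = ['/leonardo/home/userexternal/slemme00/sources/AutoDock-GPU/bin_256wi/autodock_gpu_256wi',
       '-x', '0', '--ffile', 'receptor.maps.fld', '--lfile', ligand,
       '--nrun', '100', '-N', f'{out_dir}/{ligand}_docking_res', '--gbest', '1']
with open(f'{out_dir}/{ligand}_docking_res.log', 'w') as log:
    result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
# a negative return code means the process was killed by a signal (e.g. -6 for SIGABRT / core dump)
print(ligand, 'exit code:', result.returncode)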

The Python scripts above were adapted from code that works on a PC with a single RTX 2080 Ti.

What can I do?

Thanks.

Saverio

atillack (Collaborator) commented Sep 3, 2023

@xavgit The CUDA runtime error should be resolved by compiling with TARGETS="80" (plus any other desired compute capabilities if other architectures are needed). The OpenCL error you are seeing usually means no OpenCL platform is registered (installed) on the system.
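(A rebuild along those lines, using the NUMWI value from above and compute capability 8.0 for the A100s, would be roughly:)

make DEVICE=CUDA NUMWI=256 TARGETS="80"

(For the OpenCL build, running a tool such as clinfo on a compute node, if it is installed, shows whether any OpenCL platform is registered at all.)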

xavgit (Author) commented Sep 3, 2023

Hi,
everything is fine now, thanks to your help.

Thanks.

Saverio
