HYPRE CUDA build problems #1049

slapgas · 2024-01-10T12:36:32Z

I want to compile HYPRE with CUDA support using the NVIDIA HPC SDK.

I load the nvhpc environment and use the following script:

cmake -G Ninja \
-DMPI_C_COMPILER=mpicc \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_INSTALL_PREFIX=$TARGET \
-DHYPRE_PRINT_ERRORS=ON \
-DHYPRE_BUILD_EXAMPLES=ON \
-DHYPRE_BUILD_TESTS=ON \
-DHYPRE_WITH_CUDA=ON \
-DHYPRE_CUDA_SM='75' \
-DHYPRE_WITH_GPU_AWARE_MPI=ON \
-DHYPRE_ENABLE_UNIFIED_MEMORY=ON \
../src/

ninja
ninja install
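To confirm which options CMake actually recorded for this build, the cache in the build directory can be inspected. The following is only a minimal sketch: `show_hypre_cache` is a hypothetical helper name, and it assumes the standard `NAME:TYPE=VALUE` layout of `CMakeCache.txt`.

```shell
# Hypothetical helper: list the HYPRE_* entries recorded in a CMake cache.
# $1 is the build directory (defaults to the current directory).
show_hypre_cache() {
    grep -E '^HYPRE_[A-Za-z0-9_]+:' "${1:-.}/CMakeCache.txt" | sort
}

# Usage, from the build directory:
#   show_hypre_cache .
```

Entries such as `HYPRE_WITH_CUDA`, `HYPRE_CUDA_SM` and `HYPRE_ENABLE_UNIFIED_MEMORY` should appear with the values passed on the command line above.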

The build completes successfully, but none of the examples run.

For instance, trying to run example 3 using

mpirun -np 1 ./ex3

I get

CUDA ERROR (code = 700, an illegal memory access was encountered) at /home/sadboysquad/lib/hypre/src/utilities/device_utils.c:258
CUDA ERROR (code = 700, an illegal memory access was encountered) at /home/sadboysquad/lib/hypre/src/utilities/memory.c:346
hypre error in file "/home/sadboysquad/lib/hypre/src/utilities/memory.c", line 65, error code = 2 - Out of memory trying to allocate 9800 bytes

--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode -1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------

Similar errors appear in all examples.

My nvaccelinfo output is:

CUDA Driver Version:           12030
NVRM version:                  NVIDIA UNIX x86_64 Kernel Module  545.29.06  Thu Nov 16 01:59:08 UTC 2023

Device Number:                 0
Device Name:                   NVIDIA GeForce GTX 1650 Ti
Device Revision Number:        7.5
Global Memory Size:            4085317632
Number of Multiprocessors:     16
Concurrent Copy and Execution: Yes
Total Constant Memory:         65536
Total Shared Memory per Block: 49152
Registers per Block:           65536
Warp Size:                     32
Maximum Threads per Block:     1024
Maximum Block Dimensions:      1024, 1024, 64
Maximum Grid Dimensions:       2147483647 x 65535 x 65535
Maximum Memory Pitch:          2147483647B
Texture Alignment:             512B
Clock Rate:                    1485 MHz
Execution Timeout:             Yes
Integrated Device:             No
Can Map Host Memory:           Yes
Compute Mode:                  default
Concurrent Kernels:            Yes
ECC Enabled:                   No
Memory Clock Rate:             6001 MHz
Memory Bus Width:              128 bits
L2 Cache Size:                 1048576 bytes
Max Threads Per SMP:           1024
Async Engines:                 3
Unified Addressing:            Yes
Managed Memory:                Yes
Concurrent Managed Memory:     Yes
Preemption Supported:          Yes
Cooperative Launch:            Yes
Default Target:                cc75
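As a cross-check that the value given to `HYPRE_CUDA_SM` matches the device, the compute capability can be pulled out of the `nvaccelinfo` output. This is a rough sketch rather than a diagnosis of the error: `sm_from_nvaccelinfo` is a hypothetical helper name, and it assumes the `Device Revision Number:` line has the format shown above.

```shell
# Hypothetical helper: read nvaccelinfo output on stdin and print the
# compute capability of the first device without the dot ("7.5" -> "75").
sm_from_nvaccelinfo() {
    awk -F: '/^Device Revision Number/ {
        gsub(/[ \t.]/, "", $2)   # drop whitespace and the dot
        print $2
        exit                     # first device only
    }'
}

# Usage, with the nvhpc environment loaded:
#   nvaccelinfo | sm_from_nvaccelinfo
```

For the output listed above this gives 75, which agrees with both `Default Target: cc75` and the `-DHYPRE_CUDA_SM='75'` setting used in the build.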
