HYPRE CUDA build problems #1049

Open
slapgas opened this issue Jan 10, 2024 · 0 comments
slapgas commented Jan 10, 2024

I want to compile HYPRE with CUDA support using the NVIDIA HPC SDK.

I load the nvhpc environment and use the following script:

cmake -G Ninja \
-DMPI_C_COMPILER=mpicc \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_INSTALL_PREFIX=$TARGET \
-DHYPRE_PRINT_ERRORS=ON \
-DHYPRE_BUILD_EXAMPLES=ON \
-DHYPRE_BUILD_TESTS=ON \
-DHYPRE_WITH_CUDA=ON \
-DHYPRE_CUDA_SM='75' \
-DHYPRE_WITH_GPU_AWARE_MPI=ON \
-DHYPRE_ENABLE_UNIFIED_MEMORY=ON \
../src/

ninja
ninja install

The build completes successfully. However, none of the examples run.

For instance, running example 3 with

mpirun -np 1 ./ex3

I get

CUDA ERROR (code = 700, an illegal memory access was encountered) at /home/sadboysquad/lib/hypre/src/utilities/device_utils.c:258
CUDA ERROR (code = 700, an illegal memory access was encountered) at /home/sadboysquad/lib/hypre/src/utilities/memory.c:346
hypre error in file "/home/sadboysquad/lib/hypre/src/utilities/memory.c", line 65, error code = 2 - Out of memory trying to allocate 9800 bytes

--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode -1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------

Similar errors appear in all examples.

My nvaccelinfo output is:

CUDA Driver Version:           12030
NVRM version:                  NVIDIA UNIX x86_64 Kernel Module  545.29.06  Thu Nov 16 01:59:08 UTC 2023

Device Number:                 0
Device Name:                   NVIDIA GeForce GTX 1650 Ti
Device Revision Number:        7.5
Global Memory Size:            4085317632
Number of Multiprocessors:     16
Concurrent Copy and Execution: Yes
Total Constant Memory:         65536
Total Shared Memory per Block: 49152
Registers per Block:           65536
Warp Size:                     32
Maximum Threads per Block:     1024
Maximum Block Dimensions:      1024, 1024, 64
Maximum Grid Dimensions:       2147483647 x 65535 x 65535
Maximum Memory Pitch:          2147483647B
Texture Alignment:             512B
Clock Rate:                    1485 MHz
Execution Timeout:             Yes
Integrated Device:             No
Can Map Host Memory:           Yes
Compute Mode:                  default
Concurrent Kernels:            Yes
ECC Enabled:                   No
Memory Clock Rate:             6001 MHz
Memory Bus Width:              128 bits
L2 Cache Size:                 1048576 bytes
Max Threads Per SMP:           1024
Async Engines:                 3
Unified Addressing:            Yes
Managed Memory:                Yes
Concurrent Managed Memory:     Yes
Preemption Supported:          Yes
Cooperative Launch:            Yes
Default Target:                cc75
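
For reference, the "CUDA Driver Version" value above uses CUDA's integer encoding (1000 × major + 10 × minor), so 12030 should correspond to CUDA 12.3. A minimal sketch for decoding it follows; `decode_cuda_version` is a hypothetical helper name, not part of nvaccelinfo or HYPRE:

```shell
#!/bin/sh
# Hypothetical helper: decode CUDA's integer version encoding,
# which is 1000 * major + 10 * minor (e.g. 12030 -> 12.3).
decode_cuda_version() {
    v=$1
    major=$(( v / 1000 ))
    minor=$(( (v % 1000) / 10 ))
    echo "${major}.${minor}"
}

# Value taken from the nvaccelinfo output above.
decode_cuda_version 12030
```

The decoded driver version can then be compared against the toolkit version that `nvcc --version` reports in the loaded nvhpc environment, in case a mismatch between the two is relevant here.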