
Errors when compiling and running the hypre GPU example of amrex-tutorials #3854

Open

ruohai0925 opened this issue Mar 27, 2024 · 4 comments

@ruohai0925 (Contributor) commented Mar 27, 2024

Hi, I plan to run the hypre GPU example in amrex-tutorials/ExampleCodes/LinearSolvers/ABecLaplacian_C. I use the latest AMReX, amrex-tutorials, and hypre. Here are my detailed steps, followed by some questions.

  1. I build the hypre library without unified memory, MPI, or OpenMP, using CUDA and single precision:

cmake -DHYPRE_WITH_MPI=OFF -DHYPRE_WITH_OPENMP=OFF -DHYPRE_BUILD_EXAMPLES=OFF -DHYPRE_ENABLE_UNIFIED_MEMORY=OFF -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON -DCMAKE_CXX_STANDARD=17 -DHYPRE_ENABLE_SINGLE=ON -DHYPRE_WITH_CUDA=ON -DCUDA_HOME="/usr/local/cuda" -DHYPRE_CUDA_SM=86 -DHYPRE_BUILD_TYPE=Release -DCMAKE_BUILD_TYPE=Release -DHYPRE_ENABLE_DEVICE_POOL=ON ..

cmake --build . --config Release 

cmake --install . --prefix /home/zengx372/Codes/hypre/src/hypre_gpu_wounifiedmemory --config Release
  2. Then I set:

export HYPRE_DIR=/home/zengx372/Codes/hypre/src/hypre_gpu_wounifiedmemory
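As a sanity check, the precision and CUDA settings hypre was actually built with can be read back from the generated config header (assuming the usual hypre install layout and macro names):

grep -E "HYPRE_SINGLE|HYPRE_USING_CUDA" ${HYPRE_DIR}/include/HYPRE_config.h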

  3. Next I go directly to the third step of this page, https://amrex-codes.github.io/amrex/tutorials_html/Hypre_Install.html#building-with-hypre-via-gnumake, and compile the example with GNUmake using the following script.
DEBUG = FALSE

USE_MPI  = FALSE
USE_OMP  = FALSE
USE_CUDA = TRUE
USE_PRECISION = FLOAT
USE_HYPRE = TRUE

USE_PETSC = FALSE

COMP = gnu

DIM = 3

AMREX_HOME ?= ../../../../amrex

include $(AMREX_HOME)/Tools/GNUMake/Make.defs

include ./Make.package

Pdirs   := Base Boundary LinearSolvers/MLMG

Ppack   += $(foreach dir, $(Pdirs), $(AMREX_HOME)/Src/$(dir)/Make.package)

include $(Ppack)

include $(AMREX_HOME)/Tools/GNUMake/Make.rules

The compile error is:

nvcc -MMD -MP -ccbin=g++ -Xcompiler=' -Werror=return-type -g -O3 -std=c++17  -pthread  -std=c++17' --std=c++17 -Wno-deprecated-gpu-targets -m64 --generate-code arch=compute_80,code=sm_80 -maxrregcount=255 --expt-relaxed-constexpr --expt-extended-lambda --forward-unknown-to-host-compiler -Xcudafe --diag_suppress=esa_on_defaulted_function_ignored -Xcudafe --diag_suppress=implicit_return_from_non_void_function --Werror cross-execution-space-call -lineinfo --ptxas-options=-O3 --ptxas-options=-v --use_fast_math  --Werror ext-lambda-captures-this --display-error-number --diag-error 20092 -x cu -dc   -DAMREX_USE_CUDA -DAMREX_USE_GPU -DBL_COALESCE_FABS -DAMREX_GPU_MAX_THREADS=256 -DAMREX_USE_GPU_RDC -DBL_SPACEDIM=3 -DAMREX_SPACEDIM=3 -DBL_FORT_USE_UNDERSCORE -DAMREX_FORT_USE_UNDERSCORE -DBL_Linux -DAMREX_Linux -DNDEBUG -DAMREX_USE_HYPRE -Itmp_build_dir/s/3d.gnu.CUDA.EXE -I. -I../../../../amrex/Src/Extern/HYPRE -I../../../../amrex/Src/Base -I../../../../amrex/Src/Base/Parser -I../../../../amrex/Src/Boundary -I../../../../amrex/Src/LinearSolvers/MLMG -isystem /home/zengx372/Codes/hypre/src/hypre_gpu_wounifiedmemory/include -isystem /usr/local/cuda/include  -c MyTest.cpp -o tmp_build_dir/o/3d.gnu.CUDA.EXE/MyTest.o
../../../../amrex/Src/Extern/HYPRE/AMReX_HypreIJIface.cpp(153): error: argument of type "amrex::HypreIJIface::HypreRealType *" is incompatible with parameter of type "amrex::Real *"
      m_solverFinalResidualNormPtr(m_solver, &m_final_res_norm);
                                             ^

../../../../amrex/Src/Extern/HYPRE/AMReX_HypreIJIface.cpp(392): error: a value of type "HYPRE_Int (*)(HYPRE_Solver, HYPRE_Real)" cannot be assigned to an entity of type "amrex::HypreIJIface::HypreIntType (*)(HYPRE_Solver, amrex::Real)"
      m_solverSetTolPtr = &HYPRE_BoomerAMGSetTol;
                        ^

../../../../amrex/Src/Extern/HYPRE/AMReX_HypreIJIface.cpp(396): error: a value of type "HYPRE_Int (*)(HYPRE_Solver, HYPRE_Real *)" cannot be assigned to an entity of type "amrex::HypreIJIface::HypreIntType (*)(HYPRE_Solver, amrex::Real *)"
      m_solverFinalResidualNormPtr = &HYPRE_BoomerAMGGetFinalRelativeResidualNorm;

It seems there are some data-type mismatches in the output above.
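The errors are consistent with amrex::Real being double in this build while hypre's HYPRE_Real is float (as an aside, I believe amrex's GNUMake system spells the precision option PRECISION rather than USE_PRECISION, which would leave AMReX at its default double precision here). A minimal standalone sketch of the incompatibility, using illustrative stand-in types rather than the real headers:

```cpp
// Stand-ins for the two real types: with -DHYPRE_ENABLE_SINGLE=ON hypre's
// HYPRE_Real is float, while a default AMReX build has amrex::Real = double.
using hypre_real = float;
using amrex_real = double;

// hypre-style setter, analogous in shape to HYPRE_BoomerAMGSetTol:
int setTol(void* /*solver*/, hypre_real /*tol*/) { return 0; }

int main() {
    // AMReX-style function-pointer member, parameterized on amrex::Real:
    int (*solverSetTolPtr)(void*, amrex_real) = nullptr;
    // Uncommenting the next line reproduces the error from
    // AMReX_HypreIJIface.cpp(392): float vs double parameters make the
    // function-pointer types incompatible.
    // solverSetTolPtr = &setTol;
    return (solverSetTolPtr == nullptr) ? 0 : 1;
}
```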

  4. I also try to compile this example with CMake, following https://amrex-codes.github.io/amrex/tutorials_html/Hypre_Install.html#building-with-hypre-via-cmake. The commands are:

cmake .. -DAMReX_HYPRE=ON -DHYPRE_LIBRARIES=${HYPRE_DIR}/lib64/libHYPRE.a -DHYPRE_INCLUDE_DIRS=${HYPRE_DIR}/include -DAMReX_LINEAR_SOLVERS=ON -DAMReX_FORTRAN=ON -DAMReX_PRECISION=SINGLE -DAMReX_MPI=NO -DAMReX_OMP=OFF -DAMReX_CUDA_DEBUG=NO -DAMReX_GPU_BACKEND=CUDA -DAMReX_CUDA_ARCH=86 -DAMReX_AMRLEVEL=NO

cmake --build . -j8

The build completes successfully.
[screenshot: build output]

But when I run the example with:

./ABecLaplacian_C inputs.hypre

It shows the error:

Initializing AMReX (24.03-37-g2a9e992fce9b)...
Initializing CUDA...
CUDA initialized with 1 device.
AMReX (24.03-37-g2a9e992fce9b) initialized
MLMG: # of AMR levels: 1
      # of MG levels on the coarsest AMR level: 1
MLMG: Initial rhs               = 2347.400635
MLMG: Initial residual (resid0) = 2347.400635
Erroneous arithmetic operation
See Backtrace.0 file for details

And here is the Backtrace file:

0: ./ABecLaplacian_C() [0x4e50e6]
   amrex::BLBackTrace::print_backtrace_info(_IO_FILE*) at ??:?

1: ./ABecLaplacian_C() [0x4e77c2]
   amrex::BLBackTrace::handler(int) at ??:?

2: /lib64/libc.so.6(+0x4ad50) [0x7fe7a0419d50]

3: ./ABecLaplacian_C() [0x94a3bd]
   hypre_BoomerAMGSolve at ??:?

4: ./ABecLaplacian_C() [0x8de441]
   amrex::HypreIJIface::solve(float, float, int) at ??:?

5: ./ABecLaplacian_C() [0x8bfd1e]
   amrex::HypreABecLap3::solve(amrex::MultiFab&, amrex::MultiFab const&, float, float, int, amrex::BndryDataT<amrex::MultiFab> const&, int) at ??:?

6: ./ABecLaplacian_C() [0x753df6]
   void amrex::MLMGT<amrex::MultiFab>::bottomSolveWithHypre<amrex::MultiFab, 0>(amrex::MultiFab&, amrex::MultiFab const&) at ??:?

7: ./ABecLaplacian_C() [0x75467e]
   amrex::MLMGT<amrex::MultiFab>::actualBottomSolve() at ??:?

8: ./ABecLaplacian_C() [0x754c50]
   amrex::MLMGT<amrex::MultiFab>::mgVcycle(int, int) at ??:?

9: ./ABecLaplacian_C() [0x75660c]
   amrex::MLMGT<amrex::MultiFab>::oneIter(int) at ??:?

10: ./ABecLaplacian_C() [0x45a636]
   float amrex::MLMGT<amrex::MultiFab>::solve<amrex::MultiFab>(amrex::Vector<amrex::MultiFab*, std::allocator<amrex::MultiFab*> > const&, amrex::Vector<amrex::MultiFab const*, std::allocator<amrex::MultiFab const*> > const&, float, float, char const*) at ??:?

11: ./ABecLaplacian_C() [0x45b724]
   float amrex::MLMGT<amrex::MultiFab>::solve<amrex::MultiFab>(std::initializer_list<amrex::MultiFab*>, std::initializer_list<amrex::MultiFab const*>, float, float, char const*) at ??:?

12: ./ABecLaplacian_C() [0x44ce3e]
   MyTest::solveABecLaplacian() at ??:?

13: ./ABecLaplacian_C() [0x426f59]
   main at ??:?

14: /lib64/libc.so.6(__libc_start_main+0xef) [0x7fe7a040429d]

15: ./ABecLaplacian_C() [0x44868a]
   _start
../sysdeps/x86_64/start.S:122

Can someone give me some suggestions? I'm happy to provide more info if needed.

@ruohai0925 (Contributor, Author) commented:

The errors above still appear even if I build hypre with -DHYPRE_ENABLE_UNIFIED_MEMORY=ON and add amrex.the_arena_is_managed = 1 to the inputs file.
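For reference, that inputs-file addition is a single ParmParse line appended to the same inputs.hypre used above:

# make AMReX's default arena allocate CUDA managed (unified) memory
amrex.the_arena_is_managed = 1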

@WeiqunZhang (Member) commented:

Does it work if you use double precision?

@WeiqunZhang (Member) commented:

I can reproduce the failure. It might be that this particular problem is simply too hard in single precision. The relative tolerance in the setup is hard-wired to 1.e-10; I changed them all to 1.e-4. Then if I run with prob_type=1 (i.e., Poisson), it seems to work. If I run with prob_type=2 n_cell=32, it runs for a number of iterations and then the residual blows up.
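Single precision carries only about 7 decimal digits, so a relative tolerance of 1.e-10 is below what the solver can resolve. A fragment sketching the kind of change described, with illustrative variable names (the actual code in MyTest.cpp may differ; the solve signature matches the one visible in the backtrace above):

```cpp
// Relax the hard-wired tolerances to something achievable in float.
amrex::Real tol_rel = amrex::Real(1.0e-4);  // was 1.0e-10
amrex::Real tol_abs = amrex::Real(0.0);
mlmg.solve({&solution}, {&rhs}, tol_rel, tol_abs);
```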

@WeiqunZhang (Member) commented:

If you make problem type 2 easier by making it more diagonally dominant,

index 97c928a88d..f3d7bb697d 100644
--- a/Tests/LinearSolvers/ABecLaplacian_C/MyTest.H
+++ b/Tests/LinearSolvers/ABecLaplacian_C/MyTest.H
@@ -75,7 +75,7 @@ private:
     amrex::Vector<amrex::MultiFab> acoef;
     amrex::Vector<amrex::MultiFab> bcoef;
 
-    amrex::Real ascalar = 1.e-3;
+    amrex::Real ascalar = 1.e-1;
     amrex::Real bscalar = 1.0;
 };

it also works. So I don't think there is a real issue here.
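This makes sense given the form of the operator: the example solves the standard AMReX ABecLaplacian problem

```latex
\left(\alpha\, a(\mathbf{x}) - \beta\, \nabla \cdot b(\mathbf{x}) \nabla\right)\phi = f,
\qquad \alpha = \texttt{ascalar},\ \beta = \texttt{bscalar},
```

and the alpha * a(x) * phi term contributes only to the diagonal of the discretized matrix, so raising ascalar from 1.e-3 to 1.e-1 makes every row more diagonally dominant and the system easier to solve in single precision.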
