Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] OCCA Backend Update #1043

Draft
wants to merge 22 commits into
base: main
Choose a base branch
from
Draft

Conversation

kris-rowe
Copy link
Collaborator

When complete, this PR will update the OCCA backend to be compatible with OCCA v1.4 and add support for the OCCA OpenCL and SYCL backends.

This PR will close #816.
This PR conflicts with #1007.

@jedbrown
Copy link
Member

Thanks @kris-rowe! Feel free to tag in myself or @jeremylt at any time.

@jedbrown
Copy link
Member

Great, I see this compiling. This issue is probably an easy fix, but I get this failure.

$ make test search=t001 BACKENDS=/cpu/self/occa
[...]
not ok 3 t001-ceed /cpu/self/occa stderr
# +/home/jed/src/libCEED/backends/occa/ceed-occa.cpp:336 in registerBackend():
# +---[ Error ]--------------------------------------------------------------------
# +File     : /home/jed/src/occa/src/types/json.cpp
# +Line     : 491
# +Function : operator[]
# +Message  : Path '' is not an object
# +Stack
# +4 build/t001-ceed(+0x11a6)
# +3 /usr/lib/libc.so.6(+0x232d0)
# +2 /usr/lib/libc.so.6(__libc_start_main+0x8a)
# +1 build/t001-ceed(+0x10a5)

Building with OPT='-g' gives this trace

#0  0x00007ffff776a4dc in ?? () from /usr/lib/libc.so.6
#1  0x00007ffff771a998 in raise () from /usr/lib/libc.so.6
#2  0x00007ffff770453d in abort () from /usr/lib/libc.so.6
#3  0x00007ffff7a32887 in CeedErrorAbort (ceed=0x5555558af080, filename=0x7ffff7bb36b8 "/home/jed/src/libCEED/backends/occa/ceed-occa.cpp", line_no=336, func=0x7ffff7bb37c9 "registerBackend", err_code=-2, format=0x5555558b4010 "\n---[ Error ]", '-' <repeats 68 times>, "\n    File     : /home/jed/src/occa/src/types/json.cpp\n    Line     : 491\n    Function : operator[]\n    Message  : Path "..., args=0x7fffffffd060) at /home/jed/src/libCEED/interface/ceed.c:1245
#4  0x00007ffff7a325c5 in CeedErrorImpl (ceed=0x5555558af080, filename=0x7ffff7bb36b8 "/home/jed/src/libCEED/backends/occa/ceed-occa.cpp", lineno=336, func=0x7ffff7bb37c9 "registerBackend", ecode=-2, format=0x5555558b4010 "\n---[ Error ]", '-' <repeats 68 times>, "\n    File     : /home/jed/src/occa/src/types/json.cpp\n    Line     : 491\n    Function : operator[]\n    Message  : Path "...) at /home/jed/src/libCEED/interface/ceed.c:1168
#5  0x00007ffff7aa3b42 in ceed::occa::registerBackend (resource=0x7fffffffde0b "/cpu/self/occa", ceed=0x5555558af080) at /home/jed/src/libCEED/backends/occa/ceed-occa.cpp:336
#6  0x00007ffff7a31e0a in CeedInit (resource=0x7fffffffde0b "/cpu/self/occa", ceed=0x7fffffffd7e0) at /home/jed/src/libCEED/interface/ceed.c:967
#7  0x00005555555551b8 in main (argc=2, argv=0x7fffffffd908) at /home/jed/src/libCEED/tests/t001-ceed.c:10

@kris-rowe
Copy link
Collaborator Author

I tracked down the configuration issue. Will see if the rest of the tests pass locally now.

@@ -0,0 +1,4 @@
#ifndef _OCCA_INCLUDE_CEED_H_
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to include this during JiT, see the example files like this in `/include/ceed/jit-source/cuda/cuda-jit.h'

Any files that will be used as JiT source should go in include/ceed/jit-source/occa

@@ -19,10 +19,11 @@ CEED_QFUNCTION(Mass2DBuild)(void *ctx, const CeedInt Q,
// *INDENT-OFF*
// in[0] is Jacobians with shape [2, nc=2, Q]
// in[1] is quadrature weights, size (Q)
const CeedScalar (*J)[2][CEED_Q_VLA] = (const CeedScalar(*)[2][CEED_Q_VLA])in[0],
*w = in[1];
typedef CeedScalar array_t[3][CEED_Q_VLA];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that we want to force users to have to do this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For sure. This is a temporary hack since there is a deadline next week.

I will update the OCCA parser to handle pointers-to-arrays correctly in the near future.

@jedbrown
Copy link
Member

jedbrown commented Sep 6, 2022

Our CI is back online so I think we'll be able to test this now with CUDA and/or ROCm. We have oneAPI compilers installed, though not with Intel hardware. What would you consider a good choice for testing?

@jedbrown jedbrown added this to the v0.11 milestone Sep 6, 2022
@jedbrown jedbrown mentioned this pull request Sep 6, 2022
6 tasks
@@ -29,11 +28,12 @@ struct ChannelContext_ {
CeedScalar B; // !< Body-force driving the flow
struct NewtonianIdealGasContext_ newtonian_ctx;
};
#define ChannelContext struct ChannelContext_*;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#define ChannelContext struct ChannelContext_*;
#define ChannelContext struct ChannelContext_*

@jedbrown
Copy link
Member

jedbrown commented Sep 7, 2022

I can run this variant of the channel test (which includes output validation) with /cpu/self/occa and /gpu/cuda, but it fails with /gpu/cuda/occa.

$ build/fluids-navierstokes -options_file examples/fluids/channel.yaml -compare_final_state_atol 2e-11 -compare_final_state_filename examples/fluids/tests-output/fluids-navierstokes-channel.bin -dm_plex_box_faces 5,5,1 -ts_max_steps 5 -ceed /cpu/self/occa -dm_mat_preallocate_skip 0 -snes_fd_color

The trace is:

Thread 1 "fluids-navierst" received signal SIGSEGV, Segmentation fault.
0x00007fffbae86c51 in occa::modeMemory_t::addMemoryRef(occa::memory*) () from /home/jed/src/occa/lib/libocca.so
(gdb) bt
#0  0x00007fffbae86c51 in occa::modeMemory_t::addMemoryRef(occa::memory*) () from /home/jed/src/occa/lib/libocca.so
#1  0x00007ffff5aeed56 in ceed::occa::arrayToMemory<double> (array=array@entry=0x7ffe4be05400) at /home/jed/src/libCEED/backends/occa/ceed-occa-vector.hpp:27
#2  0x00007ffff5aecdef in ceed::occa::Vector::useArrayPointer (this=0x555559006800, mtype=<optimized out>, array=0x7ffe4be05400) at /home/jed/src/libCEED/backends/occa/ceed-occa-vector.cpp:246
#3  0x00007ffff5aed2f0 in ceed::occa::Vector::setArray (this=0x555559006800, mtype=mtype@entry=CEED_MEM_DEVICE, cmode=cmode@entry=CEED_USE_POINTER, array=array@entry=0x7ffe4be05400) at /home/jed/src/libCEED/backends/occa/ceed-occa-vector.cpp:166
#4  0x00007ffff5aed366 in ceed::occa::Vector::ceedSetArray (vec=<optimized out>, mtype=CEED_MEM_DEVICE, cmode=CEED_USE_POINTER, array=0x7ffe4be05400) at /home/jed/src/libCEED/backends/occa/ceed-occa-vector.cpp:450
#5  0x00007ffff5aac1fe in CeedVectorSetArray (vec=0x5555590f9920, mem_type=<optimized out>, copy_mode=copy_mode@entry=CEED_USE_POINTER, array=<optimized out>) at /home/jed/src/libCEED/interface/ceed-vector.c:274
#6  0x000055555557d9ea in ICs_FixMultiplicity (dm=0x555555d65480, ceed_data=0x555555d11720, user=<optimized out>, Q_loc=0x55555911d3b0, Q=0x555559118b30, time=<optimized out>, time@entry=0) at /home/jed/src/libCEED/examples/fluids/src/misc.c:38
#7  0x000055555555a4ef in main (argc=<optimized out>, argv=<optimized out>) at /home/jed/src/libCEED/examples/fluids/navierstokes.c:174

I'm just testing locally so far; does this work for you at JLSE?

@kris-rowe
Copy link
Collaborator Author

Our CI is back online so I think we'll be able to test this now with CUDA and/or ROCm. We have oneAPI compilers installed, though not with Intel hardware. What would you consider a good choice for testing?

The OCCA CUDA backend should be sufficient for testing. If you really wanted to test the SYCL backend specifically, you could build the public Intel LLVM compilers with the SYCL CUDA plugin enabled.

@jeremylt
Copy link
Member

jeremylt commented Sep 26, 2022

Correct me if I'm wrong, but I understand from the comments and commits that the core tests, t1*-t5* and ex1, ex2 all pass with these changes.

In preparation for our upcoming release, I'd like to get this initial work merged. Specifically, I would like to merge the changes in Makefile, backends/occa/*, backends/ceed-backend-list.h and OMIT the changes in examples/*, tests/*, and include/ceed/jit-source/gallery/*

I think a couple of changes need to be added

  • Update for changes in main (rebase or merge?)
  • Move files used in JiT to include/ceed/jit-source/occa
  • Update tests/junit.py (replaces tests/tap.sh to only run tests for t1*-t3* on OCCA backends
  • Update README.md with new OCCA min version and backends
  • Update docs/sphinx/source/releasenotes.md

I can help with any/all of those changes

@jeremylt
Copy link
Member

Thanks for all the hard work here @kris-rowe!

PETSc released last week, so I'd like to tidy up the big open PRs and get a libCEED release soon. I don't want to step on any toes, but I have the time to run down those small tasks I listed above so we get this PR into the release if that's ok with you.

@kris-rowe
Copy link
Collaborator Author

PETSc released last week, so I'd like to tidy up the big open PRs and get a libCEED release soon. I don't want to step on any toes, but I have the time to run down those small tasks I listed above so we get this PR into the release if that's ok with you.

If you have time that would be a huge help.

@jeremylt
Copy link
Member

Ok, #1072 should be ready to merge now

@jedbrown
Copy link
Member

Shall we rebase this after #1072? And is it possible to make block diagonal assembly fall back to CPU for now? This will allow us to run ceed-fluids on PVC with a narrowly localized performance distortion.

@jeremylt jeremylt removed this from the v0.11 milestone Oct 17, 2022
#ifndef _OCCA_INCLUDE_CEED_H_
#define _OCCA_INCLUDE_CEED_H_
// Phony header to include when compiling OKL
#endif
Copy link
Member

@jeremylt jeremylt Oct 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's really intended that JiT headers don't go here in the backend folder.

Comment on lines 110 to 123
<<<<<<< HEAD
const CeedScalar(*J)[CEED_Q_VLA] = (const CeedScalar(*)[CEED_Q_VLA])in[0];
const CeedScalar(*w) = in[1];

// Outputs
CeedScalar(*q_data_sur)[CEED_Q_VLA] = (CeedScalar(*)[CEED_Q_VLA])out[0];
=======
typedef CeedScalar vec_t[CEED_Q_VLA];
const vec_t* J = (const vec_t*) in[0];
const CeedScalar * const w = in[1];
// Outputs
vec_t * q_data_sur = (vec_t*) out[0];
// *INDENT-ON*
>>>>>>> f0e9d76d (Rewrites fluids example qfunctions to be compatible with OCCA.)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conflict markers

@jedbrown
Copy link
Member

jedbrown commented Dec 2, 2022

Here's a suggested test.

diff --git i/examples/fluids/blasius.yaml w/examples/fluids/blasius.yaml
index cf3056b1e..fd7516ee6 100644
--- i/examples/fluids/blasius.yaml
+++ w/examples/fluids/blasius.yaml
@@ -4,7 +4,7 @@ implicit: true
 ts:
   adapt_type: 'none'
   type: 'beuler'
-  dt: 0.2e-5
+  dt: 0.1e-5
   max_time: 1.0e-3
 output_freq: 10
 
@@ -51,3 +51,15 @@ stg:
   use: false
   inflow_path: "./STGInflow_blasius.dat"
   mean_only: true
+
+pmat_pbdiagonal:
+ksp_type: bcgsl
+pc_type: vpbjacobi
+amat_type: shell
+
+# monitors
+ts_monitor:
+snes_monitor:
+ksp_converged_reason:
+
+ceed: /gpu/cuda

It should produce output like

 build/fluids-navierstokes  -options_file examples/fluids/blasius.yaml
[...]
0 TS dt 1e-06 time 0.
    0 SNES Function norm 6.801978174922e-04
    Linear solve converged due to CONVERGED_RTOL iterations 222
    1 SNES Function norm 1.515152456321e-07
    Linear solve converged due to CONVERGED_RTOL iterations 286
    2 SNES Function norm 4.628934945817e-11
1 TS dt 1e-06 time 1e-06
[...]

@jedbrown
Copy link
Member

jedbrown commented Dec 2, 2022

For performance monitoring, you can add -log_view -log_view_gpu_time.

@jeremylt jeremylt added the 0-WIP label Jan 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

OCCA Backend
3 participants