
memcheck and CEED_STRIDES_BACKEND #1460

Closed
jedbrown opened this issue Feb 2, 2024 · 9 comments

@jedbrown
Member

jedbrown commented Feb 2, 2024

From @knepley

If you call CeedElemRestrictionCreateStrided() with CEED_STRIDES_BACKEND, and you stick in values on the CPU, you can get code that works fine with the CPU device and fails with the GPU device, and it is very hard to see why that is happening.

A couple ideas to catch this sooner:

  • We could make memcheck use a weird permutation for its backend so you can at least detect this on the host.
  • We could forbid access to vectors associated with CEED_STRIDES_BACKEND. I think this would require extra state. It should still be possible to compute permutation-invariant things like norms.
@jeremylt
Member

jeremylt commented Feb 2, 2024

Really, the issue is that CEED_STRIDES_BACKEND was specifically intended to only be valid when you keep using the same backend. Everything works fine if you use the exact same backend throughout. I thought the documentation called this out.

@ref CEED_STRIDES_BACKEND may be used for `CeedVector` ordered by the same `Ceed` backend.

That isn't the best explanation though. We should make it clearer (though people don't always read the documentation carefully to catch all implications of what's in there).

Technically, the L-vector ordering is supposed to be consistent across backends when using CEED_STRIDES_BACKEND, so it just matters when you try to get at the raw E-vector ordering, which isn't something users are supposed to be doing.

Making the memcheck backend use a bizarre permutation would also mean we need a brand new memcheck CeedOperator implementation, so I'd recommend against that.

@knepley
Collaborator

knepley commented Feb 2, 2024

> Technically, the L-vector ordering is supposed to be consistent across backends when using CEED_STRIDES_BACKEND, so it just matters when you try to get at the raw E-vector ordering, which isn't something users are supposed to be doing.

What I did is pull out the array as CEED_MEMORY_HOST and fill it up (with geometric data), and then use that as a kernel input. On the CPU it worked fine, but on the GPU it was reordered. When I used explicit strides, everything worked normally.

@jeremylt
Member

jeremylt commented Feb 2, 2024

Whoops, I misremembered and misspoke above. The strides specify the L-vector ordering, so CEED_STRIDES_BACKEND says you are using a specific ordering that is only designed to work for the backend that created the CeedElemRestriction. So the only valid usage of CEED_STRIDES_BACKEND is for a CeedOperator filling an output vector which is an input vector for a different CeedOperator from the exact same backend.

Taking an ordering from one backend and handing it off to a different backend is specifically a usage that CEED_STRIDES_BACKEND was intended not to support.

I don't think taking the pointer out of the CeedVector and using it is something that's easy for libCEED to warn about.

But I now think that actually changing the meaning of CEED_STRIDES_BACKEND for the memcheck backend will make that backend yell at you appropriately in the cases we can warn about.

@jedbrown
Member Author

jedbrown commented Feb 2, 2024

I think we could forbid host access to such a vector (using an extra flag). Most vectors transparently mirror, but we could say it's invalid to do that on these L-vectors. We could possibly do something like that on the host too, but that would require that backend access use an internal interface to bypass that check.

@jeremylt
Member

jeremylt commented Feb 2, 2024

OK, I at least have a way to make the problem visible, though it doesn't explain what is happening.

#1463

It would be handy for #1454 to have some sort of internal API for bypassing some checks when accessing vector memory spaces. I haven't worked through exactly what I need, but some deeper replication of the dual memory spaces we have for device backends would have caught the small vector issues we had in the deal.II example when first testing on HIP.

@sebastiangrimberg
Collaborator

> Technically, the L-vector ordering is supposed to be consistent across backends when using CEED_STRIDES_BACKEND, so it just matters when you try to get at the raw E-vector ordering, which isn't something users are supposed to be doing.

> What I did is pull out the array as CEED_MEMORY_HOST and fill it up (with geometric data), and then use that as a kernel input. On the CPU it worked fine, but on the GPU it was reordered. When I used explicit strides, everything worked normally.

I’ve done a similar thing, but used CeedElemRestrictionGetELayout on the host when indexing into the quadrature point data E-vector. It seems to work on both CPU and GPU backends, but is this not a supported way of doing things? The use case: most of the operator quadrature data is computable from inside a QFunction, but there is an extra component that is more easily populated with a simple loop, without going through the libCEED operator machinery.

@jeremylt
Member

jeremylt commented Feb 3, 2024

CEED_STRIDES_BACKEND means that the backend picks the L-vector ordering. It doesn't guarantee that the L-vector is ordered identically to the E-vector. It's really intended for transferring data between CeedOperators of the same backend.

The guaranteed approach under the API as written would be to apply the CeedElemRestriction and then go through the E-vector. We could also add a CeedElemRestrictionGetLLayout that would provide the actual backend L-vector layout.

@jeremylt
Member

jeremylt commented Feb 5, 2024

OK, I think I like my PR better now. Summary of the changes:

  1. Expanded note in the documentation about CEED_STRIDES_BACKEND

  2. Set CEED_STRIDES_BACKEND to be different for memcheck than the default layouts for both CPU and GPU

  3. Add CeedElemRestrictionGetLLayout for strided restrictions

  4. Fix the CeedElemRestrictionGet[L,E]Layout signature; previously there was an unneeded pointer

#1463

I think this helps the memcheck backend catch some cases of misuse and offers a new function to help users muck around with the contents of the L-vector themselves.

@jeremylt
Member

OK, I've merged the fixes.
