Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

portable device-side abort() or throw ? #2258

Open
fwyzard opened this issue Apr 22, 2024 · 3 comments
Open

portable device-side abort() or throw ? #2258

fwyzard opened this issue Apr 22, 2024 · 3 comments

Comments

@fwyzard
Copy link
Contributor

fwyzard commented Apr 22, 2024

In a few places in the CMS code we would like to signal an error from the device side to the host code.

Something similar to a device-side assert(false), but with a more meaningful error message and always available (ALPAKA_ASSERT_ACC may not be implemented on some back-ends).

For example, in one case we have something like

#if defined(__CUDACC__) && defined(__CUDA_ARCH__)
#define CMS_DEVICE_THROW(MSG)           \
  {                                     \
    printf("%s\n", (MSG));              \
    __trap();                           \
  }
#elif defined(__HIPCC__) && defined(__HIP_DEVICE_COMPILE__)
#define CMS_DEVICE_THROW(MSG)           \
  {                                     \
    printf("%s\n", (MSG));              \
    abort();                            \
  }
#else
#define CMS_DEVICE_THROW(MSG)           \
  {                                     \
    throw std::runtime_error(MSG);      \
  }
#endif

Does Alpaka already provide something like this ?

If not, would it be interesting to have it in Alpaka ?

@SimeonEhrig
Copy link
Member

I talked with @psychocoderHPC . I did a mistake. I was not alpaka, it was PIC which I saw it and it use native traps.

@psychocoderHPC
Copy link
Member

If not, would it be interesting to have it in Alpaka ?

IMO this is interesting. A cool feature would be could handle such throws to avoid an application crash but as I know this is not possible.
As @SimeonEhrig said we have it in PMacc too so if alpaka would provide these we can remove vendor code in PMacc https://github.com/ComputationalRadiationPhysics/picongpu/blob/d18b6822e3074e55b3eab6d7c4c9024564efea33/include/pmacc/verify.hpp#L64-L72

@fwyzard For HIP we used the Clang intrinsic __builtin_trap(); not sure if this is equal to abort()
Take care abort() could introduce additional register usage if I remember correctly but maybe __builtin_trap is increasing the register footprint too.

@fwyzard
Copy link
Contributor Author

fwyzard commented Apr 23, 2024

With CUDA 12.3 and nvcc -O2 -g -arch=sm_75 (link)

__global__ void test() {
    __trap();
}

compiles to

{
    trap;
    ret;
}

while

__global__ void test() {
    assert(0);
}

compiles to

{
    mov.u64         %rd1, $str;
    cvta.global.u64 %rd2, %rd1;
    mov.u64         %rd3, $str$1;
    cvta.global.u64 %rd4, %rd3;
    mov.u64         %rd5, __unnamed_1;
    cvta.global.u64 %rd6, %rd5;
    { // callseq 0, 0
        st.param.b64    [param0+0], %rd2;
        st.param.b64    [param1+0], %rd4;
        st.param.b32    [param2+0], 4;
        st.param.b64    [param3+0], %rd6;
        st.param.b64    [param4+0], 1;
        call.uni __assertfail, (param0, param1, param2, param3, param4);
    } // callseq 0
    ret;
}

CUDA does not have abort(); there is cooperative_groups::details::abort() that expands either to __trap() or to __assert(0) (if extra debug information is enabled).


With ROCm 5.7 and hipcc -O2 -g --offload-arch=gfx90a (link)

__global__ void test() {
    __trap();
}

compiles to

test1():                              ; @test1()
        s_trap 2

while

__global__ void test() {
    assert(0);
}

compiles to

test2():                              ; @test2()
        s_add_u32 flat_scratch_lo, s6, s9
        s_addc_u32 flat_scratch_hi, s7, 0
        s_add_u32 s0, s0, s9
        s_addc_u32 s1, s1, 0
        s_mov_b64 s[8:9], s[4:5]
        s_mov_b32 s32, 0
        s_getpc_b64 s[6:7]
        s_add_u32 s6, s6, __assert_fail@rel32@lo+4
        s_addc_u32 s7, s7, __assert_fail@rel32@hi+12
        s_swappc_b64 s[30:31], s[6:7]```

plus a very long definition for __assert_fail.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants