UnexpectedException test failing on ARM ABIs #247

Open
triplef opened this issue Oct 4, 2023 · 21 comments

@triplef
Member

triplef commented Oct 4, 2023

This test was recently added in #220 and subsequently disabled on ARM (226455b) due to failing on the cross-build ARM CI targets.

I’m able to reproduce it on an arm64 Android emulator with this result in gdb:

Program received signal SIGABRT, Aborted.
0x0000007fbdee5f74 in abort () from target:/apex/com.android.runtime/lib64/bionic/libc.so
(gdb) bt
#0  0x0000007fbdee5f74 in abort () from target:/apex/com.android.runtime/lib64/bionic/libc.so
#1  0x0000007fbe2e7178 in objc_exception_throw (object=0x5555558ee0 <.objc_str_Exception>)
    at eh_personality.c:261
#2  0x0000005555556a4c in main ()
    at UnexpectedException.m:38

Any idea what might be going on here? Interestingly the exception hook is working fine in our app on Android ARM devices and has been for years.

Steps to reproduce with an Android emulator (on macOS ARM host, NDK paths may vary):

adb push libobjc.so /data/local/tmp
adb push Test/UnexpectedException /data/local/tmp
adb push $ANDROID_NDK_ROOT/sources/cxx-stl/llvm-libc++/libs/arm64-v8a/libc++_shared.so /data/local/tmp

# run test
adb shell LD_LIBRARY_PATH=/data/local/tmp/ /data/local/tmp/UnexpectedException

# debug with gdb
adb push $ANDROID_NDK_ROOT/prebuilt/android-arm64/gdbserver/gdbserver /data/local/tmp
adb forward tcp:5039 tcp:5039
adb shell
> cd /data/local/tmp
> LD_LIBRARY_PATH=$PWD ./gdbserver :5039 ./UnexpectedException

# on host machine
$ANDROID_NDK_ROOT/prebuilt/darwin-x86_64/bin/gdb
> target remote :5039
> c

@davidchisnall
Member

This abort is at the end of the throw function. The unwind library is returning from the unwind function, which happens only if unwinding fails. For some reason, it looks as if it is not reporting the end of the stack as the reason. Can you see what the value of err is? If you define DEBUG_EXCEPTIONS at the top of the file, it will log it (and a load of other info) for you.
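
For readers following along, here is a rough sketch of the control flow described in this comment. It is not the code from eh_personality.c: every name is hypothetical, and it assumes the unexpected-exception hook runs only when the unwinder reports the end of the stack, which is how the comment reads. The real throw function would call abort() where this sketch returns.

```c
#include <stddef.h>
#include <stdio.h>

/* 5 is the numeric value of _URC_END_OF_STACK. */
enum { SKETCH_END_OF_STACK = 5 };

/* Hypothetical hook type standing in for the runtime's exception hook. */
typedef void (*sketch_hook_fn)(void *object);

/* Counting hook so the behaviour can be observed without aborting. */
static int sketch_hook_calls = 0;
void sketch_counting_hook(void *object)
{
    (void)object;
    sketch_hook_calls++;
}

/*
 * Models what happens after the unwinder returns to the throw function.
 * Returns 1 if the hook was invoked, 0 otherwise.
 */
int sketch_after_unwinder_returns(int err, void *object, sketch_hook_fn hook)
{
    int hook_ran = 0;
    if (err == SKETCH_END_OF_STACK && hook != NULL) {
        hook(object);
        hook_ran = 1;
    }
    /* Same shape as the debug log line quoted later in the thread. */
    fprintf(stderr, "Throw returned %d\n", err);
    /* A real throw function would abort() here instead of returning. */
    return hook_ran;
}
```

Under that assumption, any return value other than 5 skips the hook and falls straight through to the abort, which matches the backtrace in the report.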

@triplef
Member Author

triplef commented Oct 4, 2023

These are the logs I get:

Exception caught by C++: 0
Throwing 0x5555558ee0
Throw returned -1115110992

(I commented out the log "Throwing %p, in flight exception: %p" because it doesn’t build as td->lastThrownObject doesn’t exist.)

@triplef
Member Author

triplef commented Oct 4, 2023

Sometimes it also returns -1113013840, but it seems to always be either that or, more often, -1115110992.

@davidchisnall
Member

Hmm, that's deeply strange. That looks like _Unwind_RaiseException is returning something that isn't a valid enum value, and possibly not even a valid integer. I've run this on FreeBSD/AArch64 (which uses the LLVM unwind library), and it passes.

Does Android use the GNU unwinder? Can you look in _Unwind_RaiseException and see if you can see what it thinks it's returning?
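
To make "isn't a valid enum value" concrete, here is a small lookup sketch. The names and numbers follow the _Unwind_Reason_Code list in the Itanium C++ ABI exception-handling specification; the table is written out by hand instead of using unwind.h because some targets define a different subset. The function name is hypothetical.

```c
#include <stddef.h>

/* Reason codes 0..8 as listed in the Itanium C++ ABI (hand-copied table). */
static const char *const sketch_reason_names[] = {
    "_URC_NO_REASON",                /* 0 */
    "_URC_FOREIGN_EXCEPTION_CAUGHT", /* 1 */
    "_URC_FATAL_PHASE2_ERROR",       /* 2 */
    "_URC_FATAL_PHASE1_ERROR",       /* 3 */
    "_URC_NORMAL_STOP",              /* 4 */
    "_URC_END_OF_STACK",             /* 5 */
    "_URC_HANDLER_FOUND",            /* 6 */
    "_URC_INSTALL_CONTEXT",          /* 7 */
    "_URC_CONTINUE_UNWIND",          /* 8 */
};

/* Returns the name for a valid reason code, or NULL if the value is out of range. */
const char *sketch_reason_code_name(long code)
{
    long count = (long)(sizeof sketch_reason_names / sizeof sketch_reason_names[0]);
    if (code < 0 || code >= count) {
        return NULL;
    }
    return sketch_reason_names[code];
}
```

Both values logged above, -1115110992 and -1113013840, fall outside the 0 to 8 range, so the lookup returns NULL for them.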

@triplef
Member Author

triplef commented Oct 4, 2023

Interestingly when running without the debugger the err value seems to be random.

I’m not sure about which unwinder Android uses, and I haven’t been able to locate the sources for _Unwind_RaiseException so far. I’d also be interested to know whether the failure is the same using cross-builds like on the CI targets. If so maybe it’s easier to debug this there?

Unfortunately I don’t have time to dig into this deeper right now, but I wanted to at least document this issue here.

@hmelder
Collaborator

hmelder commented Apr 23, 2024

> Can you look in _Unwind_RaiseException and see if you can see what it thinks it's returning?

The error originates from the libgcc implementation of _Unwind_RaiseException and the generated asm code. Somehow all registers from the start of _Unwind_RaiseException are reloaded before returning _URC_END_OF_STACK in unwind.inc:108. As a result the first parameter passed gets returned.

Note that uw_frame_state_for (&cur_context, &fs); returns _URC_END_OF_STACK.

I just requested an account creation for GCC Bugzilla.

Dump of _Unwind_RaiseException
libgcc_s.so.1`_Unwind_RaiseException:
    0xfffff7f27700 <+0>:   sub    sp, sp, #0xc10
    0xfffff7f27704 <+4>:   stp    x29, x30, [sp]
    0xfffff7f27708 <+8>:   mov    x29, sp
    0xfffff7f2770c <+12>:  xpaclri 
    0xfffff7f27710 <+16>:  stp    x21, x22, [sp, #0x40]
    0xfffff7f27714 <+20>:  add    x22, sp, #0xc0
    0xfffff7f27718 <+24>:  add    x21, sp, #0x840
    0xfffff7f2771c <+28>:  stp    x0, x1, [sp, #0x10]
    0xfffff7f27720 <+32>:  add    x1, sp, #0xc10
    0xfffff7f27724 <+36>:  stp    x2, x3, [sp, #0x20]
    0xfffff7f27728 <+40>:  mov    x2, x30
    0xfffff7f2772c <+44>:  stp    x19, x20, [sp, #0x30]
    0xfffff7f27730 <+48>:  mov    x20, x0
    0xfffff7f27734 <+52>:  add    x19, sp, #0x480
    0xfffff7f27738 <+56>:  mov    x0, x22
    0xfffff7f2773c <+60>:  stp    x23, x24, [sp, #0x50]
    0xfffff7f27740 <+64>:  stp    x25, x26, [sp, #0x60]
    0xfffff7f27744 <+68>:  stp    x27, x28, [sp, #0x70]
    0xfffff7f27748 <+72>:  stp    d8, d9, [sp, #0x80]
    0xfffff7f2774c <+76>:  stp    d10, d11, [sp, #0x90]
    0xfffff7f27750 <+80>:  stp    d12, d13, [sp, #0xa0]
    0xfffff7f27754 <+84>:  stp    d14, d15, [sp, #0xb0]
    0xfffff7f27758 <+88>:  bl     0xfffff7f27020 ; uw_init_context_1 at unwind-dw2.c:1324:1
    0xfffff7f2775c <+92>:  mov    x1, x22
    0xfffff7f27760 <+96>:  mov    x0, x19
    0xfffff7f27764 <+100>: mov    x2, #0x3c0 ; =960 
    0xfffff7f27768 <+104>: bl     0xfffff7f13740
    0xfffff7f2776c <+108>: b      0xfffff7f277a0 ; <+160> at unwind.inc:104:14
    0xfffff7f27770 <+112>: cbnz   w2, 0xfffff7f27814 ; <+276> at unwind.inc:113:9
    0xfffff7f27774 <+116>: ldr    x5, [sp, #0xbe0]
    0xfffff7f27778 <+120>: cbz    x5, 0xfffff7f27794 ; <+148> at unwind.inc:127:7
    0xfffff7f2777c <+124>: ldr    x2, [x20]
    0xfffff7f27780 <+128>: blr    x5
    0xfffff7f27784 <+132>: cmp    w0, #0x6
    0xfffff7f27788 <+136>: b.eq   0xfffff7f2781c ; <+284> at unwind.inc:132:18
    0xfffff7f2778c <+140>: cmp    w0, #0x8
    0xfffff7f27790 <+144>: b.ne   0xfffff7f27814 ; <+276> at unwind.inc:113:9
    0xfffff7f27794 <+148>: mov    x1, x21
    0xfffff7f27798 <+152>: mov    x0, x19
    0xfffff7f2779c <+156>: bl     0xfffff7f27264 ; uw_update_context at unwind-dw2.c:1266:1
    0xfffff7f277a0 <+160>: mov    x1, x21
    0xfffff7f277a4 <+164>: mov    x0, x19
    0xfffff7f277a8 <+168>: bl     0xfffff7f25f00 ; uw_frame_state_for at unwind-dw2.c:997:3
    0xfffff7f277ac <+172>: mov    w2, w0
    0xfffff7f277b0 <+176>: mov    w1, #0x1 ; =1 
    0xfffff7f277b4 <+180>: mov    x4, x19
    0xfffff7f277b8 <+184>: mov    x3, x20
    0xfffff7f277bc <+188>: mov    w0, w1
    0xfffff7f277c0 <+192>: cmp    w2, #0x5
    0xfffff7f277c4 <+196>: b.ne   0xfffff7f27770 ; <+112> at unwind.inc:110:10
->  0xfffff7f277c8 <+200>: mov    x4, #0x0 ; =0 
    0xfffff7f277cc <+204>: mov    w0, w2
    0xfffff7f277d0 <+208>: ldp    x29, x30, [sp]
    0xfffff7f277d4 <+212>: ldp    x0, x1, [sp, #0x10]
    0xfffff7f277d8 <+216>: ldp    x2, x3, [sp, #0x20]
    0xfffff7f277dc <+220>: ldp    x19, x20, [sp, #0x30]
    0xfffff7f277e0 <+224>: ldp    x21, x22, [sp, #0x40]
    0xfffff7f277e4 <+228>: ldp    x23, x24, [sp, #0x50]
    0xfffff7f277e8 <+232>: ldp    x25, x26, [sp, #0x60]
    0xfffff7f277ec <+236>: ldp    x27, x28, [sp, #0x70]
    0xfffff7f277f0 <+240>: ldp    d8, d9, [sp, #0x80]
    0xfffff7f277f4 <+244>: ldp    d10, d11, [sp, #0x90]
    0xfffff7f277f8 <+248>: ldp    d12, d13, [sp, #0xa0]
    0xfffff7f277fc <+252>: ldp    d14, d15, [sp, #0xb0]
    0xfffff7f27800 <+256>: add    sp, sp, #0xc10
    0xfffff7f27804 <+260>: cbz    x4, 0xfffff7f27810 ; <+272> at unwind.inc:141:1
    0xfffff7f27808 <+264>: add    sp, sp, x5
    0xfffff7f2780c <+268>: br     x6
    0xfffff7f27810 <+272>: ret    
    0xfffff7f27814 <+276>: mov    w2, #0x3 ; =3 
    0xfffff7f27818 <+280>: b      0xfffff7f277c8 ; <+200> at unwind.inc:141:1
    0xfffff7f2781c <+284>: str    xzr, [x20, #0x10]
    0xfffff7f27820 <+288>: mov    x0, x19
    0xfffff7f27824 <+292>: bl     0xfffff7f13860 ; symbol stub for: pthread_key_create
    0xfffff7f27828 <+296>: mov    x4, x0
    0xfffff7f2782c <+300>: ldr    x3, [sp, #0x7c0]
    0xfffff7f27830 <+304>: mov    x1, x22
    0xfffff7f27834 <+308>: mov    x2, #0x3c0 ; =960 
    0xfffff7f27838 <+312>: mov    x0, x19
    0xfffff7f2783c <+316>: sub    x3, x4, x3, lsr #63
    0xfffff7f27840 <+320>: str    x3, [x20, #0x18]
    0xfffff7f27844 <+324>: bl     0xfffff7f13740
    0xfffff7f27848 <+328>: mov    x2, x21
    0xfffff7f2784c <+332>: mov    x1, x19
    0xfffff7f27850 <+336>: mov    x0, x20
    0xfffff7f27854 <+340>: bl     0xfffff7f273ac ; _Unwind_RaiseException_Phase2 at unwind.inc:41:1
    0xfffff7f27858 <+344>: mov    w2, #0x2 ; =2 
    0xfffff7f2785c <+348>: cmp    w0, #0x7
    0xfffff7f27860 <+352>: b.ne   0xfffff7f277c8 ; <+200> at unwind.inc:141:1
    0xfffff7f27864 <+356>: mov    x1, x19
    0xfffff7f27868 <+360>: mov    x0, x22
    0xfffff7f2786c <+364>: bl     0xfffff7f24a60 ; uw_install_context_1 at unwind-dw2.c:1405:1
    0xfffff7f27870 <+368>: mov    x19, x0
    0xfffff7f27874 <+372>: ldr    x20, [sp, #0x798]
    0xfffff7f27878 <+376>: ldr    x0, [sp, #0x790]
    0xfffff7f2787c <+380>: mov    x1, x20
    0xfffff7f27880 <+384>: bl     0xfffff7f276f0 ; _Unwind_DebugHook at unwind-dw2.c:1382:1
    0xfffff7f27884 <+388>: bl     0xfffff7f2af40 ; __arm_za_disable
    0xfffff7f27888 <+392>: mov    x5, x19
    0xfffff7f2788c <+396>: mov    x6, x20
    0xfffff7f27890 <+400>: mov    x4, #0x1 ; =1 
    0xfffff7f27894 <+404>: b      0xfffff7f277cc ; <+204> at unwind.inc:141:1

Current position right after if (code == _URC_END_OF_STACK).
The return value of uw_frame_state_for is 5 (_URC_END_OF_STACK).

      if (code == _URC_END_OF_STACK)
	/* Hit end of stack with no handler found.  */
	return _URC_END_OF_STACK;

After mov w0, w2 (-> 0xfffff7f277d0 <+208>: ldp x29, x30, [sp])

0xfffff7f27894 <+404>: b      0xfffff7f277cc ; <+204> at unwind.inc:141:1
(lldb) register read
General Purpose Registers:
        x0 = 0x0000000000000005
        x1 = 0x0000000000000001
        x2 = 0x0000000000000005
        x3 = 0x0000aaaaaaafa790
        x4 = 0x0000000000000000
        x5 = 0x0000ffffffffe160
        x6 = 0xfffffffffffffff8
        x7 = 0x0000000000000004
        x8 = 0x0000000000000001
        x9 = 0x0000fffff7f402a8
       x10 = 0x0000000000000000
       x11 = 0x0000fffff7f40308
       x12 = 0x0000fffff7ff77c0
       x13 = 0x0000000000000010
       x14 = 0x0000000000000000
       x15 = 0x0000fffff7bc63c0  
       x16 = 0x0000fffff7f40000
       x17 = 0x0000000000000000
       x18 = 0x0000000000000007
       x19 = 0x0000ffffffffe610
       x20 = 0x0000aaaaaaafa790
       x21 = 0x0000ffffffffe9d0
       x22 = 0x0000ffffffffe250
       x23 = 0x0000ffffffffefc8
       x24 = 0x0000fffff7ffdb90  ld-linux-aarch64.so.1`_rtld_global_ro
       x25 = 0x0000000000000000
       x26 = 0x0000fffff7ffe008  _rtld_global
       x27 = 0x0000aaaaaaabfda0  UnexpectedExceptionDebug`__do_global_dtors_aux_fini_array_entry
       x28 = 0x0000000000000000
        fp = 0x0000ffffffffe190
        lr = 0x0000fffff7f277ac  libgcc_s.so.1`_Unwind_RaiseException + 172 at unwind.inc:104:14
        sp = 0x0000ffffffffe190
        pc = 0x0000fffff7f277d0  libgcc_s.so.1`_Unwind_RaiseException + 208 at unwind.inc:141:1
      cpsr = 0x60201000

After ldp x0, x1

(lldb) register read
General Purpose Registers:
        x0 = 0x0000aaaaaaafa790
        x1 = 0x0000000000000000
        x2 = 0x0000000000000058
        x3 = 0x0000000000000000
        x4 = 0x0000000000000000
        x5 = 0x0000ffffffffe160
        x6 = 0xfffffffffffffff8
        x7 = 0x0000000000000004
        x8 = 0x0000000000000001
        x9 = 0x0000fffff7f402a8
       x10 = 0x0000000000000000
       x11 = 0x0000fffff7f40308
       x12 = 0x0000fffff7ff77c0
       x13 = 0x0000000000000010
       x14 = 0x0000000000000000
       x15 = 0x0000fffff7bc63c0  
       x16 = 0x0000fffff7f40000
       x17 = 0x0000000000000000
       x18 = 0x0000000000000007
       x19 = 0x0000ffffffffefb8
       x20 = 0x0000000000000001
       x21 = 0x0000aaaaaaabfda0  UnexpectedExceptionDebug`__do_global_dtors_aux_fini_array_entry
       x22 = 0x0000aaaaaaaa0ae8  UnexpectedExceptionDebug`main at UnexpectedException.m:33
       x23 = 0x0000ffffffffefc8
       x24 = 0x0000fffff7ffdb90  ld-linux-aarch64.so.1`_rtld_global_ro
       x25 = 0x0000000000000000
       x26 = 0x0000fffff7ffe008  _rtld_global
       x27 = 0x0000aaaaaaabfda0  UnexpectedExceptionDebug`__do_global_dtors_aux_fini_array_entry
       x28 = 0x0000000000000000
        fp = 0x0000ffffffffee00
        lr = 0x0000fffff7f76d38  libobjc.so.4.6`objc_exception_throw + 520 at eh_personality.c:256:22
        sp = 0x0000ffffffffeda0
        pc = 0x0000fffff7f27810  libgcc_s.so.1`_Unwind_RaiseException + 272 at unwind.inc:141:1
      cpsr = 0x60201000
(lldb) 
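
The two register dumps can be condensed into a toy model in C. This is only an illustration of the instruction order at <+204> and <+212> in the listing above, not GCC or libgcc code, and the function names are hypothetical. The conversion back to a 32-bit signed value assumes the usual wrap-around behaviour of GCC and Clang.

```c
#include <stdint.h>

/*
 * Models the epilogue as disassembled: the return code is moved into w0,
 * then x0 is reloaded from the slot where the first argument was spilled
 * at <+28>, so the caller sees the low 32 bits of that argument.
 */
int32_t sketch_epilogue_as_observed(uint64_t saved_x0, int32_t code)
{
    uint64_t x0;
    x0 = (uint32_t)code; /* <+204> mov w0, w2 */
    x0 = saved_x0;       /* <+212> ldp x0, x1, [sp, #0x10] overwrites it */
    return (int32_t)(uint32_t)x0; /* caller reads w0 as the reason code */
}

/* Models what the C source asks for: the reason code is returned unchanged. */
int32_t sketch_epilogue_as_written_in_c(uint64_t saved_x0, int32_t code)
{
    (void)saved_x0;
    return code;
}
```

With the values from the dumps (first argument 0x0000aaaaaaafa790, code 5), the first function yields the truncated pointer instead of 5. That fits the symptoms reported earlier: a large, seemingly arbitrary value that changes from run to run.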

@hmelder
Collaborator

hmelder commented Apr 25, 2024

@davidchisnall
Member

Thanks for root causing this! Let's leave this open until there's a fix, but until then we can recommend that Arm users use LLVM's unwinder instead of GCC's.

@triplef
Member Author

triplef commented Apr 25, 2024

How does one choose between the LLVM vs. GCC unwinder, and is the GCC one also ever used when building with Clang?

I’m a bit confused about this issue because we don’t see issues with exception handling on Android in our app (built with Clang and the NDK toolchain) and the exception hook is working fine too, but I was originally able to reproduce this with the test using the NDK toolchain.

@hmelder
Collaborator

hmelder commented Apr 25, 2024

> How does one choose between the LLVM vs. GCC unwinder, and is the GCC one also ever used when building with Clang?

Here is an example:

clang-18 test.c -o test -rtlib=compiler-rt --unwindlib=libunwind -fuse-ld=lld-18

@hmelder
Collaborator

hmelder commented Apr 25, 2024

> but I was originally able to reproduce this with the test using the NDK toolchain

Yep, that's weird. Do you know the NDK version you ran the test on?

@davidchisnall
Member

I think Android uses the LLVM unwinder. It's typically integrated in the C support files, but precisely where depends on the platform. FreeBSD uses the LLVM one as well, which is why I didn't see these issues.

It's also not always clear which one you're using. On FreeBSD, we ship the LLVM one but with the same name as the GCC one to avoid configure scripts failing.

I'd report this to distros as a bug and tell them that the simple fix is to use the LLVM one instead of the GCC one.

@hmelder
Collaborator

hmelder commented Apr 25, 2024

Just tested it with libunwind-18 on Ubuntu 23.10 and it works as expected:

hugo@ubuntu:/tmp$ clang -L/usr/lib/llvm-18/lib test.c -o test -rtlib=compiler-rt --unwindlib=libunwind -fuse-ld=lld -lunwind
hugo@ubuntu:/tmp$ ./test
RaiseException returned 0x5

I'll test it on Android this evening.
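
The source of test.c is not shown in the thread. A minimal sketch of what such a reproducer might look like is below: it raises an exception that no frame handles and reports what _Unwind_RaiseException returns. The function names and the "SKETCH" exception class are made up for illustration; only _Unwind_RaiseException, struct _Unwind_Exception and unwind.h come from the unwinder itself.

```c
#include <stdio.h>
#include <string.h>
#include <unwind.h>

/*
 * Raises a foreign exception that nothing on the stack will catch and
 * returns the unwinder's reason code as an int.
 */
int sketch_raise_without_handler(void)
{
    /* Arbitrary 8-byte exception class; unused bytes are zero. */
    static const char tag[8] = "SKETCH";
    struct _Unwind_Exception ex;

    memset(&ex, 0, sizeof ex);
    memcpy(&ex.exception_class, tag, sizeof tag);

    /* With no handler on the stack, phase 1 should reach the end of the stack. */
    return (int)_Unwind_RaiseException(&ex);
}

/* Prints the result in the same format as the output quoted above. */
void sketch_report_raise_result(void)
{
    printf("RaiseException returned 0x%x\n",
           (unsigned)sketch_raise_without_handler());
}
```

Calling sketch_report_raise_result() from main and linking against a working unwinder should print 0x5 (_URC_END_OF_STACK). On a toolchain affected by the bug discussed here, the printed value would instead be the low bits of the exception pointer.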

@triplef
Member Author

triplef commented Apr 25, 2024

> Do you know the NDK version you ran the test on?

Unfortunately no. I think the NDK migrated to the LLVM toolchain over the last couple of releases, so it might have been a release that was still using the GCC unwinder.

Can we detect in CMake which unwinder is used and already enable the test when LLVM is used?

@triplef
Member Author

triplef commented Apr 25, 2024

Just found this re. unwinder on Android:

The unwinder APIs are exposed from the platform's libc.so starting with API 30 (Android R).

@hmelder
Collaborator

hmelder commented Apr 25, 2024

> Can we detect in CMake which unwinder is used and already enable the test when LLVM is used?

I would enable it unconditionally. The libgcc patch will probably be backported to gcc 13 and maybe gcc 12.

@hmelder
Collaborator

hmelder commented Apr 25, 2024

> Just found this re. unwinder on Android:
>
> The unwinder APIs are exposed from the platform's libc.so starting with API 30 (Android R).

Seems like libgcc was removed in NDK r23. Here is the change in clang: https://reviews.llvm.org/D96403

@pinskia

pinskia commented Apr 26, 2024

> I would enable it unconditionally. The libgcc patch will probably be backported to gcc 13 and maybe gcc 12.

s/libgcc/gcc/. The bug is not in libgcc itself; rather, the code that GCC generates for it has the bug.

I hope to get it backported in time for the GCC 11.5 release but we will see. I will be posting the patch over the weekend and I doubt it will be reviewed until Monday or later.

@pinskia

pinskia commented Apr 26, 2024

I should note that a few other targets have a similar bug:
powerpc: PR 114846
arm: PR 114847

loongarch had a similar bug, but it was fixed in GCC 14: loongarch: PR 114848

Those are the "major" targets I tried that had the bug.

@pinskia

pinskia commented Apr 26, 2024

Just an FYI I have posted the GCC patch: https://gcc.gnu.org/pipermail/gcc-patches/2024-April/650080.html .

@hmelder
Collaborator

hmelder commented Apr 27, 2024

Thank you @pinskia!
