aarch64 - Failed to iterate the unwind table: UnknownCallFrameInstruction(DwCfa(45)) #114

Open
krzysiek6d opened this issue Jul 24, 2023 · 16 comments

Comments

@krzysiek6d

krzysiek6d commented Jul 24, 2023

I'm trying to cross-compile bytehound for aarch64

git clone https://github.com/koute/bytehound.git
cd bytehound
CC='aarch64-linux-gnu-gcc' cargo build --target aarch64-unknown-linux-gnu --release -p bytehound-preload

....
		Compiling bytehound-preload v0.11.0 (/home/pawluch/temp/bytehound/preload)
	    Building [=========================> ] 88/89: bytehound-preload
	    <some warnings>
	    warning: variant `Glibc` is never constructed
	    warning: constant `STT_GNU_IFUNC` is never used
	    <and more>
	    <and finally error>
	    error: linking with `cc` failed: exit status: 1
  		|
  		= note: "cc" "-Wl,--version-script=/tmp/rustc37VL1r/list" "/tmp/rustc37VL1r/symbols.o" "/home/pawluch/temp/bytehound/target/aarch64-unknown-linux-gnu/release/deps/bytehound.bytehound.1141c7cc-cgu.6.rcgu.o" "-Wl,--as-needed" "-L" "/home/pawluch/temp/bytehound/target/aarch64-unknown-linux-gnu/release/deps" "-L" "/home/pawluch/temp/bytehound/target/release/deps" "-L" "/home/pawluch/temp/bytehound/target/aarch64-unknown-linux-gnu/release/build/libmimalloc-sys-2a7d2fcd1ef6bcf9/out" "-L" "/home/pawluch/temp/bytehound/target/aarch64-unknown-linux-gnu/release/build/nwind-45c923a06e07786b/out" "-L" "/home/pawluch/temp/bytehound/target/aarch64-unknown-linux-gnu/release/build/tikv-jemalloc-sys-56e739c58320660d/out/build/lib" "-L" "/home/pawluch/.rustup/toolchains/nightly-2022-10-13-x86_64-unknown-linux-gnu/lib/rustlib/aarch64-unknown-linux-gnu/lib" "-Wl,-Bstatic" "/tmp/rustc37VL1r/liblibmimalloc_sys-c81f41ca1d102f5f.rlib" "/tmp/rustc37VL1r/libtikv_jemalloc_sys-d8283604389b0ad8.rlib" "/tmp/rustc37VL1r/libnwind-223626fcb709192a.rlib" "/home/pawluch/.rustup/toolchains/nightly-2022-10-13-x86_64-unknown-linux-gnu/lib/rustlib/aarch64-unknown-linux-gnu/lib/libcompiler_builtins-e0edb30d3bc4ef0f.rlib" "-Wl,-Bdynamic" "-lpthread" "-lstdc++" "-lgcc_s" "-lutil" "-lrt" "-lpthread" "-lm" "-ldl" "-lc" "-Wl,--eh-frame-hdr" "-Wl,-znoexecstack" "-L" "/home/pawluch/.rustup/toolchains/nightly-2022-10-13-x86_64-unknown-linux-gnu/lib/rustlib/aarch64-unknown-linux-gnu/lib" "-o" "/home/pawluch/temp/bytehound/target/aarch64-unknown-linux-gnu/release/deps/libbytehound.so" "-Wl,--gc-sections" "-shared" "-Wl,-zrelro,-znow" "-Wl,-O1" "-nodefaultlibs"
  		<lot of lines with: >
  		/usr/bin/ld: /home/pawluch/temp/bytehound/target/aarch64-unknown-linux-gnu/release/deps/bytehound.bytehound.1141c7cc-cgu.6.rcgu.o: Relocations in generic ELF (EM: 183)
  		<and finally>
  		/usr/bin/ld: /home/pawluch/temp/bytehound/target/aarch64-unknown-linux-gnu/release/deps/bytehound.bytehound.1141c7cc-cgu.6.rcgu.o: error adding symbols: file in wrong format
          collect2: error: ld returned 1 exit status
          

		warning: `bytehound-preload` (lib) generated 16 warnings
		error: could not compile `bytehound-preload` due to previous error; 16 warnings emitted

What am I doing wrong with this cross-compilation?
I also tried with an SDK, additionally passing --sysroot, and got the same output.

P.S.

I'm trying to recompile it because the bytehound build I'm currently using crashes binaries on that architecture, and I thought that maybe my version is too old and doesn't match some libs in the binary I'm trying to profile. It collects some data for ~5 minutes and then the app crashes. The profiled app doesn't produce any good backtrace ;(
image
Disabling shadow-based stack unwinding did not help. Maybe you have some idea what caused that problem?

@koute
Owner

koute commented Jul 24, 2023

For cross-compilation to work you also need to specify a linker, otherwise it'll use the one that's default on your system. (You can see in the error message that it prints out /usr/bin/ld, which is not what you want.)

It's probably best to do this in ~/.cargo/config with something like this:

[target.aarch64-unknown-linux-gnu]
linker = "/path/to/clang-or-gcc"

@koute
Owner

koute commented Jul 24, 2023

I'm trying to recompile it because the bytehound build I'm currently using crashes binaries on that architecture, and I thought that maybe my version is too old and doesn't match some libs in the binary I'm trying to profile. It collects some data for ~5 minutes and then the app crashes. The profiled app doesn't produce any good backtrace ;(

Yeah, unfortunately without detailed logs it's not possible to figure out why this happens.

@krzysiek6d
Author

Thanks, setting CARGO_TARGET_AARCH64_UNKNOWN_LINUX_GNU_LINKER worked for compilation of libbytehound.so
I'll check how it behaves with those changes ;)
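For reference, the environment variable used above follows Cargo's general rule for config keys: the key `target.<triple>.linker` is upper-cased, and dots and dashes become underscores. A minimal sketch of that mapping (the function name is hypothetical and only for illustration):

```rust
/// Builds the name of the environment variable that Cargo reads for a
/// target's linker. Cargo upper-cases the config key and replaces `-`
/// and `.` with `_`. The function name is made up for this sketch.
fn cargo_linker_env_var(target_triple: &str) -> String {
    let normalized: String = target_triple
        .chars()
        .map(|c| match c {
            '-' | '.' => '_',
            other => other.to_ascii_uppercase(),
        })
        .collect();
    format!("CARGO_TARGET_{}_LINKER", normalized)
}
```

Calling `cargo_linker_env_var("aarch64-unknown-linux-gnu")` yields the variable name used in this comment.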

@krzysiek6d
Author

krzysiek6d commented Jul 24, 2023

Unfortunately the backtraces are still invalid ;(
This calltrace is from some quite old bytehound version

Fragment of calltrace which I'm allowed to share points to
image

after checking what is inside I found (note that this is from some old bytehound build):

#frame 9:

addr2line -isfCe libbytehound.so 0x2c5f4                                                                                                                                                                  
bytehound::global::StrongThreadHandle::system_tid
global.rs:610
bytehound::api::allocate
api.rs:233
malloc
api.rs:244

 

#frame 8:

addr2line -isfCe libbytehound.so 0x2092c
<core::option::Option<T> as core::ops::try_trait::Try>::branch
option.rs:2049
<nwind::arch::aarch64::Arch as nwind::arch::Architecture>::unwind
aarch64.rs:216
nwind::unwind_context::UnwindHandle<A>::unwind
unwind_context.rs:104
nwind::local_unwinding::LocalAddressSpace::unwind_through_fresh_frames
local_unwinding.rs:940
bytehound::unwind::grab
unwind.rs:361

 

#frame 7:

addr2line -isfCe libbytehound.so 0x2acf0
nwind::frame_descriptions::FrameDescriptions<E>::find_unwind_info
frame_descriptions.rs:?
nwind::address_space::Binary<A>::lookup_unwind_row::{{closure}}
address_space.rs:351
core::option::Option<T>::and_then
option.rs:1053
nwind::address_space::Binary<A>::lookup_unwind_row
address_space.rs:351
nwind::dwarf::dwarf_unwind
dwarf.rs:264

 

#frame 6:

addr2line -isfCe libbytehound.so 0x6a540
log::__private_api_log
lib.rs:1470

 

#frame 5:

addr2line -isfCe libbytehound.so 0x371cc
<&bytehound::raw_file::RawFile as std::io::Write>::write
raw_file.rs:92
std::io::Write::write_all
mod.rs:1557
<bytehound::logger::FileLogger as log::Log>::log::{{closure}}
logger.rs:224
bytehound::utils::stack_format
utils.rs:86
bytehound::utils::stack_format_bytes
utils.rs:115
<bytehound::logger::FileLogger as log::Log>::log
logger.rs:221

Moreover, my colleague found messages like this:
Failed to iterate the unwind table: UnknownCallFrameInstruction(DwCfa(45))
But I can't find that log message in your project.

What kind of logs would be helpful?

Maybe the issue is not in bytehound's unwinding but in the flags I'm using? Do you know which compilation flags could possibly be problematic?
I profiled my own simple app and it seems to work fine on that architecture - the backtraces and graphs are good.

@koute
Copy link
Owner

koute commented Jul 24, 2023

Maybe the issue is not in bytehound's unwinding but in the flags I'm using? Do you know which compilation flags could possibly be problematic?

Well, that's a possibility. Could you figure out which exact flags are used to compile the program you want to profile?

Moreover, my colleague found messages like this:
Failed to iterate the unwind table: UnknownCallFrameInstruction(DwCfa(45))
But I can't find that log message in your project.

That error is from the nwind crate which is part of not-perf. It's complaining that it doesn't know how to process the DW_CFA_GNU_window_save DWARF instruction, or possibly the DW_CFA_AARCH64_negate_ra_state aarch64 extension which apparently uses the same opcode (so that's probably the problem). From what I can see (but I can be wrong here) this is possibly enabled by the branch protection security mitigation when compiling your code.
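To illustrate the opcode clash described above: 45 decimal is 0x2d, the value shared by DW_CFA_GNU_window_save and DW_CFA_AARCH64_negate_ra_state, so a decoder has to look at the target architecture to tell them apart. The sketch below uses made-up types (`Arch`, `VendorCfi`) and is not the API of gimli or nwind:

```rust
/// Architectures relevant to the shared opcode (illustrative only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Arch {
    Aarch64,
    Sparc,
    Other,
}

/// Possible meanings of a vendor-specific call frame opcode (illustrative only).
#[derive(Debug, PartialEq, Eq)]
enum VendorCfi {
    /// DW_CFA_AARCH64_negate_ra_state: toggles the return address signing state.
    NegateRaState,
    /// DW_CFA_GNU_window_save: SPARC register window save.
    GnuWindowSave,
    /// Anything this sketch does not know how to interpret.
    Unknown(u8),
}

/// 0x2d == 45, the value reported in the error message.
const SHARED_OPCODE: u8 = 0x2d;

/// Picks the meaning of an opcode based on the architecture being unwound.
fn decode_vendor_opcode(arch: Arch, opcode: u8) -> VendorCfi {
    match (opcode, arch) {
        (SHARED_OPCODE, Arch::Aarch64) => VendorCfi::NegateRaState,
        (SHARED_OPCODE, Arch::Sparc) => VendorCfi::GnuWindowSave,
        (other, _) => VendorCfi::Unknown(other),
    }
}
```

A decoder without the architecture-specific arm ends up in the `Unknown` case, which corresponds to the UnknownCallFrameInstruction(DwCfa(45)) error.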

So fixing this specific error would require either recompiling your program so that the unsupported DWARF instruction is not emitted (possibly by disabling the security mitigation, but in your case this is most likely not going to be practical as you'd have to recompile everything), or adding support to gimli so that it can process it, and then maybe (or maybe not, I haven't looked into this in a lot of detail) adding something to nwind so that it also handles it.

Possibly relevant LLVM/libunwind pull request for inspiration.

@krzysiek6d
Author

krzysiek6d commented Jul 27, 2023

Hi
Am I right that llvm-dwarfdump should show me these DW_CFA_* entries?

I ran it on the binary, and it does not contain any debug symbols. Then I thought that maybe some libs used by this binary contain this entry, so I ran llvm-dwarfdump on all the libs listed by ldd, and everything is stripped.
Then I thought that maybe not all libs are listed there, so I took the libs from /proc/X/maps and ran llvm-dwarfdump on them, and everything is empty there as well. Only libbytehound.so contains some debug info, but it has no DW_CFA_* entries either.

Regarding my simple binary - I was wrong. It did work, in the sense that the graph shown by bytehound is OK, but 'Failed to iterate the unwind table: UnknownCallFrameInstruction(DwCfa(45))' messages are still printed.

Base compilation has such flags:
aarch64-linux-gnu-g++ -march=armv8-a -mcpu=cortex-a57 -mtune=cortex-a57 --sysroot=//sysroots/aarch64-linux-gnu main.cpp
I also tried it with -mbranch-protection=none with the same result.

Now I am confused - why is the UnknownCallFrameInstruction message printed if there are no debug symbols in any of the libs or in the binary?

Anyway, I'll try to modify gimli so it will not return UnknownCallFrameInstruction

@krzysiek6d krzysiek6d reopened this Jul 27, 2023
@krzysiek6d krzysiek6d changed the title Compilation issues for aarch64 aarch64 - Failed to iterate the unwind table: UnknownCallFrameInstruction(DwCfa(45)) Jul 27, 2023
@koute
Owner

koute commented Jul 27, 2023

Now I am confused - why is the UnknownCallFrameInstruction message printed if there are no debug symbols in any of the libs or in the binary?

These are not necessarily from debug symbols. DWARF is also used for normal unwind tables in non-debug binaries. (Usually the .eh_frame section.)

Anyway, I'll try to modify gimli so it will not return UnknownCallFrameInstruction

Well, the way to go here is to make it handle it properly, not just ignore it. (: This error pops up for a reason, and AFAIK this DWARF instruction may be used when fetching the return address register, so if you make it ignore it you might get incorrect results later.

The easiest way to fix it is probably to write a test program which is affected by this, then from within that program grab the backtrace using nwind (in not-perf/nwind/src/local_unwinding.rs you have test_unwind_* tests from which you could copy-paste the code to grab a backtrace) and print out the backtrace. This should trigger the error and will probably result in either no backtrace or an incorrect backtrace. Then you can try to fix it in gimli, and since you have a test program you'll be able to quickly retest it.

@krzysiek6d
Author

krzysiek6d commented Jul 28, 2023

You're right, it's eh_frame - it contains DW_CFA_AARCH64_negate_ra_state entries. llvm-dwarfdump --eh-frame libc.so.6 shows for example

  00000054 00000018 00000058 FDE cie=00000000 pc=0002ae40...0002ae50
  Format:       DWARF32
  DW_CFA_advance_loc: 4
  DW_CFA_AARCH64_negate_ra_state:
  DW_CFA_advance_loc: 4
  DW_CFA_def_cfa_offset: +16
  DW_CFA_offset: W29 -16
  DW_CFA_offset: W30 -8
  DW_CFA_nop:
  DW_CFA_nop:

  0x2ae40: CFA=WSP
  0x2ae44: CFA=WSP: reg34=1
  0x2ae48: CFA=WSP+16: W29=[CFA-16], W30=[CFA-8], reg34=1
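The three rows at the bottom of that dump can be reproduced by hand. The sketch below is a toy interpreter for just the instructions in this FDE, assuming the CIE starts with CFA=WSP+0 and a cleared RA_SIGN_STATE (reg34). All names (`Cfi`, `Row`, `run`, `example_fde`) are invented for illustration and are not gimli types:

```rust
/// The call frame instructions that appear in the FDE above (illustrative subset).
#[derive(Clone, Copy, Debug)]
enum Cfi {
    /// Move to a later code address; delta in bytes.
    AdvanceLoc(u64),
    /// Toggle bit 0 of the RA_SIGN_STATE pseudoregister (reg34).
    NegateRaState,
    /// CFA = stack pointer + offset.
    DefCfaOffset(u64),
    /// Register `reg` is saved at CFA + `cfa_offset`.
    Offset { reg: u8, cfa_offset: i64 },
    Nop,
}

/// One row of the unwind table: the rules valid from `pc` onwards.
#[derive(Clone, Debug, PartialEq)]
struct Row {
    pc: u64,
    cfa_offset: u64,
    ra_sign_state: u64,
    saved: Vec<(u8, i64)>,
}

/// Runs the program and emits a row each time the location advances.
/// Assumes the initial state is CFA=SP+0 with RA_SIGN_STATE cleared.
fn run(start_pc: u64, program: &[Cfi]) -> Vec<Row> {
    let mut current = Row { pc: start_pc, cfa_offset: 0, ra_sign_state: 0, saved: Vec::new() };
    let mut rows = Vec::new();
    for instr in program {
        match *instr {
            Cfi::AdvanceLoc(delta) => {
                // The rules collected so far apply up to the new address.
                rows.push(current.clone());
                current.pc += delta;
            }
            Cfi::NegateRaState => current.ra_sign_state ^= 1,
            Cfi::DefCfaOffset(offset) => current.cfa_offset = offset,
            Cfi::Offset { reg, cfa_offset } => current.saved.push((reg, cfa_offset)),
            Cfi::Nop => {}
        }
    }
    rows.push(current);
    rows
}

/// The instruction list from the llvm-dwarfdump output above.
fn example_fde() -> Vec<Cfi> {
    vec![
        Cfi::AdvanceLoc(4),
        Cfi::NegateRaState,
        Cfi::AdvanceLoc(4),
        Cfi::DefCfaOffset(16),
        Cfi::Offset { reg: 29, cfa_offset: -16 },
        Cfi::Offset { reg: 30, cfa_offset: -8 },
        Cfi::Nop,
        Cfi::Nop,
    ]
}
```

`run(0x2ae40, &example_fde())` gives three rows starting at 0x2ae40, 0x2ae44 and 0x2ae48, with `ra_sign_state` equal to 1 from the second row onwards, matching the `reg34=1` entries in the dump.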

I'd like to change the gimli dependency of nwind in the not-perf project.
But the build fails even without any changes to the sources - just after switching the dependency to my local dir.

NOT-PERF:

[user@pc asdf]$ git clone https://github.com/koute/not-perf.git
[user@pc not-perf]$ cargo build -p nwind

This builds successfully.

Since I wanted to change the gimli sources, I downloaded it:
GIMLI:

[user@pc asdf]$ git clone https://github.com/gimli-rs/gimli.git

As you can see, nwind's Cargo.toml points to version 0.25 of gimli:

not-perf/nwind/Cargo.toml:gimli = { version = "0.25", default-features = false, features = ["std", "read", "endian-reader"] }

So I checked out gimli at 0.25:

[user@pc gimli]$ git checkout -b ver_0.25.0 0.25.0

And changed nwind's Cargo.toml to point to the local gimli using the path argument:

not-perf/nwind/Cargo.toml:gimli = { version = "0.25", default-features = false, features = ["std", "read", "endian-reader"], path = "../../gimli" }

And build:

[user@pc not-perf]$ cargo build -p nwind

I can see some errors pointing to nwind :

error[E0277]: the trait bound `EndianReader<gimli::RunTimeEndian, BinaryDataSlice>: addr2line::gimli::Reader` is not satisfied
   --> nwind/src/address_space.rs:156:29
error[E0277]: the trait bound `R: addr2line::gimli::Reader` is not satisfied
    --> nwind/src/address_space.rs:190:50
error[E0277]: the trait bound `EndianReader<gimli::RunTimeEndian, BinaryDataSlice>: addr2line::gimli::Reader` is not satisfied
   --> nwind/src/address_space.rs:957:33

But all I did was use the same version (0.25) from a local checkout instead of the registry.

Cargo.lock in not-perf points to the same version:

[[package]]
name = "gimli"
version = "0.25.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f0a01e0497841a3b2db4f8afa483cce65f7e96a3498bd6c541734792aeac8fe7"
dependencies = [
 "fallible-iterator",
 "stable_deref_trait",
]

Did I do something wrong?

@koute
Owner

koute commented Jul 28, 2023

That's because gimli is used not only by nwind, but also by one of its dependencies. The cargo-tree tool is your friend here to check stuff like this. (You can install it with cargo install cargo-tree.)

$ cargo tree -i -p gimli
gimli v0.25.0
├── addr2line v0.16.0
│   └── nwind v0.1.0
│       └── nperf-core v0.1.1
└── nwind v0.1.0

As you can see both nwind and addr2line crates use gimli. You've made nwind use your local copy, but addr2line is still using the copy from crates.io.

There are two ways of handling it.

  1. Replace addr2line too and change it to use your local gimli.
  2. Instead of changing the dependencies override the dependency in the patch section of the toplevel Cargo.toml, e.g. pasting something like this in the toplevel Cargo.toml should work:
[patch.crates-io]
gimli = { path = "/path/to/your/local/gimli" }

Usually it's a lot more convenient to do (2), especially if you're replacing a dependency which is used by a bunch of stuff in the dependency tree.

@krzysiek6d
Author

Well, the way to go here is to make it handle it properly, not just ignore it. (: This error pops up for a reason, and AFAIK this DWARF instruction may be used when fetching the return address register, so if you make it ignore it you might get incorrect results later.

I patched bytehound with a modified gimli, where I treat this instruction the same way as a NOP.
I have a C++ app with LD_PRELOAD-ed bytehound (with the gimli changes), and sometimes it works, but not always - in some scenarios my app crashes after about 1 minute of running, with this bytehound error:

5e82 63eb WRN Slot 0x0000007F10B4CB38 contains 0x00000000009EDD8C instead of the trampoline address

So probably, as you said, it needs to be interpreted properly.

I checked the code you mentioned

Possibly relevant LLVM/libunwind pull request for inspiration.

But even though there are not that many changes, I really don't know how to implement that in gimli ;/

As you proposed, I wrote a test - it just prints the backtrace. Note that my libc.so.6 is built with branch protection.

WITHOUT branch-protection and WITHOUT gimli changes

CARGO_TARGET_AARCH64_UNKNOWN_LINUX_GNU_LINKER=aarch64-linux-gnu-gcc cargo build --target aarch64-unknown-linux-gnu
run output:

_ZN9backtrace15print_backtrace17h179264fd6fac714fE
_ZN9backtrace4main17he05c49d47fcbb15cE
_ZN4core3ops8function6FnOnce9call_once17h46dba0e61892f158E
_ZN3std10sys_common9backtrace28__rust_begin_short_backtrace17h542d0e15cf9ed7adE
_ZN3std2rt10lang_start28_$u7b$$u7b$closure$u7d$$u7d$17h59df1b4045d93c39E
_ZN3std2rt19lang_start_internal17hbc490604880ee546E
_ZN3std2rt10lang_start17h9a78013547d94901E
main

WITH branch protection AND WITHOUT gimli changes

CARGO_TARGET_AARCH64_UNKNOWN_LINUX_GNU_LINKER=aarch64-linux-gnu-gcc RUSTFLAGS="-Z branch-protection=pac-ret" cargo build --target aarch64-unknown-linux-gnu
run output:

_ZN9backtrace15print_backtrace17h179264fd6fac714fE

WITH branch protection and WITH gimli changes - treating this opcode as NOP

_ZN9backtrace15print_backtrace17hdf094bbe5a1463a2E
_ZN9backtrace4main17hdb971abd18887b9fE
_ZN4core3ops8function6FnOnce9call_once17h941ca8d18e94654cE
_ZN3std10sys_common9backtrace28__rust_begin_short_backtrace17h97528e8d20ba95a7E
_ZN3std2rt10lang_start28_$u7b$$u7b$closure$u7d$$u7d$17h6b614486d967554eE
_ZN3std2rt19lang_start_internal17hbc490604880ee546E
_ZN3std2rt10lang_start17h906b6df9afd56867E
main
__libc_start_main
_start

Now I'm stuck. Implementing the gimli changes properly is quite difficult, and as you can see, treating the opcode the same way as a NOP works for this example. But for bigger apps (in my case C++) where bytehound is LD_PRELOAD-ed, it (sometimes) crashes, and I don't know how to catch the problematic scenario - the app is killed, so I can't even grab a coredump.
Do you have any hints?

@koute
Owner

koute commented Aug 2, 2023

Yes, it's possible that the app will crash if this is not implemented properly, because the unwinding algorithm in Bytehound depends on the unwinding being correct.

Here's what libunwind does, and here's what gdb does when it encounters this instruction. So it looks like it toggles the special/hidden RA state register. Other DWARF instructions then could maybe use this value? I can't really give you much more detail than this because I'm not familiar with how exactly this is implemented and I don't have the time to research it.

It also looks like it could be possible (but I can be totally wrong here; this is just a result of 5 minutes of me googling) to treat this instruction as a NOP if pointer authentication is disabled with the arm64.nopauth kernel command line argument which can apparently be done since version 5.12.

@Bartosz89k

I'm not sure, but I guess there is an issue with the object file format during the linking process. Specifically, the linker (cc) reports relocations in a generic ELF file and complains about the format:

/usr/bin/ld: /home/pawluch/temp/bytehound/target/aarch64-unknown-linux-gnu/release/deps/bytehound.bytehound.1141c7cc-cgu.6.rcgu.o: Relocations in generic ELF (EM: 183)

There might be a mismatch between the target architecture specified during compilation and the actual architecture of the linked object files.

Maybe check that the dependencies (libmimalloc_sys, libtikv_jemalloc_sys, libnwind) were actually built for aarch64.

In my case, after editing and running the following:

$ rustup target add aarch64-unknown-linux-gnu
$ cargo clean
$ cat ~/.cargo/config
[target.aarch64-unknown-linux-gnu]
linker = "aarch64-linux-gnu-gcc"

$ cat env
#!/bin/sh
# rustup shell setup
# affix colons on either side of $PATH to simplify matching
case ":${PATH}:" in
    *:"$HOME/.cargo/bin":*)
        ;;
    *)
        # Prepending path in case a system-installed rustc needs to be overridden
        export PATH="$HOME/.cargo/bin:$PATH"
        ;;
esac

$ source "$HOME/.cargo/env"

I built it:

$ CC='aarch64-linux-gnu-gcc' cargo build --target aarch64-unknown-linux-gnu --release -p bytehound-preload
   Compiling bytehound-preload v0.11.0 (/home/bartoszkopczynski/src/bytehound/preload)
warning: variant `Glibc` is never constructed
  --> preload/src/smaps.rs:19:5
   |
16 | pub enum MapKind {
   |          ------- variant in this enum
...
19 |     Glibc = 2
   |     ^^^^^
   |
   = note: `MapKind` has a derived impl for the trait `Clone`, but this is intentionally ignored during dead code analysis
   = note: `#[warn(dead_code)]` on by default

warning: constant `STT_GNU_IFUNC` is never used
  --> preload/src/elf.rs:31:7
   |
31 | const STT_GNU_IFUNC: u8 = 10;
   |       ^^^^^^^^^^^^^

warning: type alias `ProgramHeader` is never used
  --> preload/src/elf.rs:37:6
   |
37 | type ProgramHeader = libc::Elf64_Phdr;
   |      ^^^^^^^^^^^^^

warning: type alias `Symbol` is never used
  --> preload/src/elf.rs:43:6
   |
43 | type Symbol = libc::Elf64_Sym;
   |      ^^^^^^

warning: struct `Dynamic` is never constructed
  --> preload/src/elf.rs:46:12
   |
46 | pub struct Dynamic {
   |            ^^^^^^^

warning: struct `ObjectInfo` is never constructed
  --> preload/src/elf.rs:51:12
   |
51 | pub struct ObjectInfo< 'a > {
   |            ^^^^^^^^^^

warning: associated function `each` is never used
  --> preload/src/elf.rs:58:12
   |
58 |     pub fn each< F, R >( mut callback: F ) -> Option< R > where F: FnMut( ObjectInfo ) -> ControlFlow< R, () > {
   |            ^^^^

warning: associated function `name_contains` is never used
  --> preload/src/elf.rs:62:12
   |
62 |     pub fn name_contains( &self, substr: impl AsRef< [u8] > ) -> bool {
   |            ^^^^^^^^^^^^^

warning: associated function `each_impl` is never used
  --> preload/src/elf.rs:67:8
   |
67 |     fn each_impl< R >( callback: &mut dyn FnMut( ObjectInfo ) -> ControlFlow< R, () > ) -> Option< R > {
   |        ^^^^^^^^^

warning: associated function `dlsym` is never used
   --> preload/src/elf.rs:120:12
    |
120 |     pub fn dlsym( &self, name: impl AsRef< [u8] > ) -> Option< *mut c_void > {
    |            ^^^^^

warning: associated function `dlsym_impl` is never used
   --> preload/src/elf.rs:124:8
    |
124 |     fn dlsym_impl( &self, name: &[u8] ) -> Option< *mut c_void > {
    |        ^^^^^^^^^^

warning: associated function `dlsym_gnu_hash` is never used
   --> preload/src/elf.rs:162:15
    |
162 |     unsafe fn dlsym_gnu_hash( &self, dt_strtab: *const u8, dt_symtab: *const Symbol, dt_gnu_hash: *const u32, name: &[u8] ) -> Option< *m...
    |               ^^^^^^^^^^^^^^

warning: associated function `dlsym_elf_hash` is never used
   --> preload/src/elf.rs:220:15
    |
220 |     unsafe fn dlsym_elf_hash( &self, dt_strtab: *const u8, dt_symtab: *const Symbol, dt_hash: *const u32, name: &[u8] ) -> Option< *mut c...
    |               ^^^^^^^^^^^^^^

warning: associated function `resolve_symbol` is never used
   --> preload/src/elf.rs:249:15
    |
249 |     unsafe fn resolve_symbol( &self, dt_strtab: *const u8, dt_symtab: *const Symbol, symtab_offset: usize, expected_name: &[u8] ) -> Opti...
    |               ^^^^^^^^^^^^^^

warning: associated function `check_address` is never used
   --> preload/src/elf.rs:269:8
    |
269 |     fn check_address( &self, address: usize ) -> bool {
    |        ^^^^^^^^^^^^^

warning: associated function `dynamic` is never used
   --> preload/src/elf.rs:278:8
    |
278 |     fn dynamic( &self ) -> impl Iterator< Item = &Dynamic > {
    |        ^^^^^^^

warning: `bytehound-preload` (lib) generated 16 warnings
    Finished release [optimized + debuginfo] target(s) in 18.21s

There also might be a problem with the system libraries i.e. issues can arise if the system's C/C++ libraries are not compatible with the cross-compilation target.

@krzysiek6d
Author

krzysiek6d commented Aug 2, 2023

I'll recompile it once again to be sure, but I'm quite sure that I passed everything in my latest checks, even details like the values of -march, -mcpu and -mtune. The CC and linker were taken from the SDK. Anyway, I'll do it once again.

Besides the changes in gimli, I also disabled shadow-based stack unwinding and it helped a lot; the crashes now happen much later and are probably connected with fork calls.

I'll also try to disable pointer authentication on the kernel command line, since my kernel is newer than 5.12 :)

Thanks

@krzysiek6d
Author

I went back to this issue and took the basic steps further, but I still can't fix it.

I see gimli added support for this opcode: gimli-rs/gimli#667
But it is in a newer version, which is not so easy to port to not-perf. Still, I think I did it fairly properly:
gimli_patch.zip
Anyway, this required some adaptations in not-perf. I only made the changes the compiler required, without touching any logic:
not_perf_patch.zip

I have a test, as you suggested, which can be run in an aarch64 docker container:

  1. Enable multiarch:
    sudo docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
  2. Run docker with the current directory mounted:
    sudo docker run -it --rm --name klops --privileged -v $(pwd):/knife arm64v8/ubuntu /bin/bash
  3. Build app (outside docker)
    CARGO_TARGET_AARCH64_UNKNOWN_LINUX_GNU_LINKER=aarch64-linux-gnu-gcc RUSTFLAGS="-Z branch-protection=pac-ret" cargo +nightly build --target aarch64-unknown-linux-gnu
  4. Run app in docker:
root@9308ad9759ad:/# knife/PycharmProjects/backtrace/target/aarch64-unknown-linux-gnu/debug/backtrace
d
input: d

Extracted .eh_frame address from .eh_frame_hdr for /knife/PycharmProjects/backtrace/target/aarch64-unknown-linux-gnu/debug/backtrace: 0x00000000001A2C68
Actual .eh_frame address: 0x00000000001A2C68
Loaded .eh_frame_hdr for '/knife/PycharmProjects/backtrace/target/aarch64-unknown-linux-gnu/debug/backtrace'
Extracted .eh_frame address from .eh_frame_hdr for /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1: 0x0000000000026DA0
Actual .eh_frame address: 0x0000000000026DA0
Loaded .eh_frame_hdr for '/usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1'
Extracted .eh_frame address from .eh_frame_hdr for /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30: 0x00000000001D2A20
Actual .eh_frame address: 0x00000000001D2A20
Loaded .eh_frame_hdr for '/usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30'
Extracted .eh_frame address from .eh_frame_hdr for /usr/lib/aarch64-linux-gnu/libgcc_s.so.1: 0x0000000000011760
Actual .eh_frame address: 0x0000000000011760
Loaded .eh_frame_hdr for '/usr/lib/aarch64-linux-gnu/libgcc_s.so.1'
Extracted .eh_frame address from .eh_frame_hdr for /usr/lib/aarch64-linux-gnu/libc.so.6: 0x000000000015C7B8
Actual .eh_frame address: 0x000000000015C7B8
Loaded .eh_frame_hdr for '/usr/lib/aarch64-linux-gnu/libc.so.6'
Extracted .eh_frame address from .eh_frame_hdr for /usr/lib/aarch64-linux-gnu/libm.so.6: 0x000000000007ABD0
Actual .eh_frame address: 0x000000000007ABD0
Loaded .eh_frame_hdr for '/usr/lib/aarch64-linux-gnu/libm.so.6'
Unwinding
DW_CFA_def_cfa
No address info
DW_CFA_AARCH64_negate_ra_state
RA_SIGN_STATE VALUE IS 0
No address info
DW_CFA_def_cfa_offset
No address info
DW_CFA_def_cfa_offset
DW_CFA_remember_state
DW_CFA_advance_loc2
Address info is present for address 179132
Register #31 [SP] at frame #1 is equal to 0x0000005502A01270
Unwinded!
_ZN9backtrace15print_backtrace17h3dc5cc0836609630E
root@9308ad9759ad:/# 

whereas when I compile it without RUSTFLAGS="-Z branch-protection=pac-ret", so that the command is:
CARGO_TARGET_AARCH64_UNKNOWN_LINUX_GNU_LINKER=aarch64-linux-gnu-gcc cargo +nightly build --target aarch64-unknown-linux-gnu

root@9308ad9759ad:/# knife/PycharmProjects/backtrace/target/aarch64-unknown-linux-gnu/debug/backtrace
s
input: s

Extracted .eh_frame address from .eh_frame_hdr for /usr/lib/aarch64-linux-gnu/libgcc_s.so.1: 0x0000000000011760
Actual .eh_frame address: 0x0000000000011760
Loaded .eh_frame_hdr for '/usr/lib/aarch64-linux-gnu/libgcc_s.so.1'
Extracted .eh_frame address from .eh_frame_hdr for /usr/lib/aarch64-linux-gnu/libc.so.6: 0x000000000015C7B8
Actual .eh_frame address: 0x000000000015C7B8
Loaded .eh_frame_hdr for '/usr/lib/aarch64-linux-gnu/libc.so.6'
Extracted .eh_frame address from .eh_frame_hdr for /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30: 0x00000000001D2A20
Actual .eh_frame address: 0x00000000001D2A20
Loaded .eh_frame_hdr for '/usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30'
Extracted .eh_frame address from .eh_frame_hdr for /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1: 0x0000000000026DA0
Actual .eh_frame address: 0x0000000000026DA0
Loaded .eh_frame_hdr for '/usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1'
Extracted .eh_frame address from .eh_frame_hdr for /knife/PycharmProjects/backtrace/target/aarch64-unknown-linux-gnu/debug/backtrace: 0x000000000019C788
Actual .eh_frame address: 0x000000000019C788
Loaded .eh_frame_hdr for '/knife/PycharmProjects/backtrace/target/aarch64-unknown-linux-gnu/debug/backtrace'
Extracted .eh_frame address from .eh_frame_hdr for /usr/lib/aarch64-linux-gnu/libm.so.6: 0x000000000007ABD0
Actual .eh_frame address: 0x000000000007ABD0
Loaded .eh_frame_hdr for '/usr/lib/aarch64-linux-gnu/libm.so.6'
Unwinding
DW_CFA_def_cfa
No address info
DW_CFA_def_cfa_offset
No address info
DW_CFA_def_cfa_offset
DW_CFA_remember_state
DW_CFA_advance_loc2
Address info is present for address 178072
Register #31 [SP] at frame #1 is equal to 0x0000005502A01270
DW_CFA_def_cfa
No address info
DW_CFA_def_cfa_offset
No address info
DW_CFA_remember_state
DW_CFA_advance_loc1
Address info is present for address 181239
Register #31 [SP] at frame #2 is equal to 0x0000005502A01340
DW_CFA_def_cfa
No address info
DW_CFA_def_cfa_offset
No address info
Address info is present for address 192307
Register #31 [SP] at frame #3 is equal to 0x0000005502A01360
DW_CFA_def_cfa
No address info
DW_CFA_def_cfa_offset
No address info
Address info is present for address 195723
Register #31 [SP] at frame #4 is equal to 0x0000005502A01390
DW_CFA_def_cfa
No address info
DW_CFA_def_cfa_offset
No address info
Address info is present for address 181927
Register #31 [SP] at frame #5 is equal to 0x0000005502A013C0
DW_CFA_def_cfa
No address info
DW_CFA_def_cfa_offset
No address info
DW_CFA_remember_state
DW_CFA_advance_loc2
Address info is present for address 1379851
Register #31 [SP] at frame #6 is equal to 0x0000005502A014F0
DW_CFA_def_cfa
No address info
DW_CFA_def_cfa_offset
No address info
DW_CFA_advance_loc1
Address info is present for address 181879
Register #31 [SP] at frame #7 is equal to 0x0000005502A01550
DW_CFA_def_cfa
No address info
DW_CFA_def_cfa_offset
DW_CFA_nop
DW_CFA_nop
Address info is present for address 181327
Register #31 [SP] at frame #8 is equal to 0x0000005502A01560
DW_CFA_def_cfa
No address info
DW_CFA_def_cfa_offset
DW_CFA_nop
DW_CFA_nop
DW_CFA_nop
Address info is present for address 160763
Register #31 [SP] at frame #9 is equal to 0x0000005502A01670
DW_CFA_def_cfa
No address info
DW_CFA_def_cfa_offset
No address info
No address info
No address info
No address info
DW_CFA_nop
DW_CFA_nop
Address info is present for address 160971
Register #31 [SP] at frame #10 is equal to 0x0000005502A016D0
DW_CFA_def_cfa
No address info
DW_CFA_undefined
Address info is present for address 106927
Register #31 [SP] at frame #11 is equal to 0x0000005502A016D0
Previous frame not found: failed to determine the return address of frame #11
Unwinded!
_ZN9backtrace15print_backtrace17h3dc5cc0836609630E
_ZN9backtrace4main17h4231a744fccbd353E
_ZN4core3ops8function6FnOnce9call_once17h7fc962f4989924a1E
_ZN3std10sys_common9backtrace28__rust_begin_short_backtrace17hb89dfebeac87f904E
_ZN3std2rt10lang_start28_$u7b$$u7b$closure$u7d$$u7d$17h759427ccc83827b3E
_ZN3std2rt19lang_start_internal17hbc490604880ee546E
_ZN3std2rt10lang_start17haae2c617740da669E
main
__libc_start_main
_start
root@9308ad9759ad:/# 

So the output is much better. This is what you suggested to do.
Note: without my gimli patch the app hangs.

The app I am running is:

use nwind::{LocalAddressSpace, UnwindControl, LocalUnwindContext};
use std::io;

fn print_backtrace() {
    let mut address_space = LocalAddressSpace::new().unwrap();
    address_space.use_shadow_stack( false );
    let mut ctx = LocalUnwindContext::new();
    let mut frames = Vec::new();
    println!("Unwinding");
    address_space.unwind( &mut ctx, |frame| {
        frames.push( frame.clone() );
        UnwindControl::Continue
    });
    println!("Unwinded!");
    let mut addresses = Vec::new();
    let mut symbols = Vec::new();
    for &address in frames.iter() {
        if let Some( symbol ) = address_space.decode_symbol_once( address ).name {
            symbols.push( symbol.to_owned() );
        }
        addresses.push( address );
    }
    for symbol in symbols.iter()
    {
        println!("{}", symbol);
    }
}


fn main() {
    let mut user_input = String::new();
    // Fail loudly instead of silently ignoring the io::Result.
    io::stdin().read_line(&mut user_input).expect("failed to read from stdin");
    println!("input: {}", user_input);
    print_backtrace();
}

Now I'm stuck. I hoped that fixing gimli would be enough, but it's not. I tried to understand the references you mentioned, and it seems that I need to add the authentication, but I don't know where...

I can see https://reviews.llvm.org/D123692
hint1

And there https://llvm.googlesource.com/libunwind/+/96fa50101690f48f0e7a7ffe363a5612d9ecac41%5E%21/
hint2
the hint asm calls, but I have no idea how or where to inject them. I read aarch64.rs and aarch_get_regs.s and I feel it should be somewhere there, but it's too complicated for me ;/
I also don't know how to connect the gimli information about the RA_SIGN_STATE pseudo-register with those asm calls. I feel there should be an if check guarding this authentication, but I don't even know how to get the value of this pseudo-register ;/

Any help would be appreciated!

@koute
Owner

koute commented Nov 20, 2023

Any help would be appreciated!

I'm not super familiar with how this works, but this page seems to explain the whole pointer authentication feature pretty well:

https://developer.arm.com/documentation/102433/0100/Return-oriented-programming

So, basically, when a function is entered the return address in the LR register is signed (that is, some of the bits which normally are guaranteed to be always zero for pointers are used to store the signature), and then pushed on the stack. And then on return that address is popped, and a special return instruction is used which 1) verifies that the signature is still correct, and 2) performs a return ignoring the bits clobbered with the signature.

Alternatively, there's also a backwards-compatible mode where this special return instruction is split into two instructions: the first verifies the signature and clears the signature bits from the address, and then a standard return instruction is used to return. (...and on older CPUs which don't support this feature the special instructions are treated as NOPs)
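
As a rough illustration of what "clearing the signature bits" means, here is a minimal sketch. It assumes 48-bit user-space virtual addresses, so that the bits above bit 47 of a return address would normally be zero. The names VA_BITS and strip_signature and the fake signature value are made up for this example; they are not part of Bytehound or nwind. A real unwinder would more likely rely on the XPACLRI instruction, which accounts for the actual address width of the running system.

```rust
/// Number of bits used for user-space virtual addresses.
/// ASSUMPTION: 48 is a common Linux/aarch64 configuration, but other
/// widths exist, so a real implementation must not hard-code this.
const VA_BITS: u32 = 48;

/// Clears every bit above the low `VA_BITS` bits, which is where the
/// signature of a signed user-space return address would be stored.
fn strip_signature(addr: u64) -> u64 {
    addr & ((1u64 << VA_BITS) - 1)
}

fn main() {
    // A plausible, unsigned return address.
    let real_lr: u64 = 0x0000_0055_02a0_13c0;

    // Pretend the CPU stored a signature in the otherwise unused upper bits.
    // The value 0x2a is arbitrary and purely illustrative.
    let signed_lr = real_lr | (0x2a_u64 << VA_BITS);

    // The signed value is not a usable address until the bits are cleared.
    assert_ne!(signed_lr, real_lr);
    assert_eq!(strip_signature(signed_lr), real_lr);

    println!("{:#018x} -> {:#018x}", signed_lr, strip_signature(signed_lr));
}
```

Masking an address that was never signed leaves it unchanged, so under the stated assumption the operation is harmless to apply to ordinary user-space return addresses.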

So, basically, two things need to be fixed here:

  1. When Bytehound reads the addresses from the stack it needs to clear the signature bits to get a valid pointer (since the pointer in memory is going to be mangled with the extra signature bits). So, as far as I can see, the DW_CFA_AARCH64_negate_ra_state DWARF instruction flips a bit in the DWARF state machine telling the unwinder whether the return address is mangled or not (or in other words, whether it needs to have those bits cleared), and Bytehound should use that to decide whether to clear the bits from the pointer.
  2. Bytehound's shadow-stack based unwinding is not going to work with this feature enabled because it replaces the return addresses on the stack, and those addresses won't have the signature the code is expecting, so the program will crash. In this case the shadow-stack based unwinding has to be disabled, but that will make things a lot slower, so it's really a lot better to just disable the pointer authentication feature instead (which, again, AFAIK can be disabled by adding a single parameter to the kernel command line arguments).

(...or for (2) the replacement pointers could maybe also be signed by Bytehound and put on the stack? That might or might not work; again, I'm not super familiar with how exactly this works so I'm not sure without trying it out)
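
To make point (1) a bit more concrete, here is a hedged sketch of how the toggle could be tracked. The opcode value 0x2d is 45 in decimal, which matches the DwCfa(45) in the error from the issue title. Everything else is hypothetical: RowState, apply and return_address are illustrative names, not gimli's or nwind's real types, and the fixed 48-bit ADDRESS_MASK is an assumption about the address width.

```rust
/// Opcode of the AArch64 vendor extension DW_CFA_AARCH64_negate_ra_state.
const DW_CFA_AARCH64_NEGATE_RA_STATE: u8 = 0x2d;

/// ASSUMPTION: 48-bit user-space addresses; the bits above hold the signature.
const ADDRESS_MASK: u64 = (1u64 << 48) - 1;

/// Hypothetical, heavily simplified unwind state for one function.
#[derive(Default)]
struct RowState {
    /// True while the saved return address is expected to be signed.
    ra_signed: bool,
}

impl RowState {
    /// Applies one call frame instruction. Only the negate opcode is
    /// modelled here; every other opcode is ignored in this sketch.
    /// (A real implementation would also save and restore this flag for
    /// DW_CFA_remember_state and DW_CFA_restore_state.)
    fn apply(&mut self, opcode: u8) {
        if opcode == DW_CFA_AARCH64_NEGATE_RA_STATE {
            self.ra_signed = !self.ra_signed;
        }
    }

    /// Returns a usable return address from the raw value read off the stack.
    fn return_address(&self, raw: u64) -> u64 {
        if self.ra_signed {
            raw & ADDRESS_MASK
        } else {
            raw
        }
    }
}

fn main() {
    let raw: u64 = 0x002a_0000_0000_1000; // made-up signed value
    let mut state = RowState::default();

    // Before the toggle the value is taken as-is.
    assert_eq!(state.return_address(raw), raw);

    // After the function's prologue signs LR, the CFI flips the state.
    state.apply(DW_CFA_AARCH64_NEGATE_RA_STATE);
    assert_eq!(state.return_address(raw), 0x1000);

    println!("stripped: {:#x}", state.return_address(raw));
}
```

The flag starts out as false and is flipped each time the instruction is seen, so a second occurrence (for example after the epilogue authenticates LR) switches the stripping back off.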


Have you actually tried to disable this feature in the kernel? That should, I think, make it work since the pointers won't be mangled nor authenticated anymore (assuming the DW_CFA_AARCH64_negate_ra_state is treated as a no-op).

@krzysiek6d
Author

Hmm, it seems that the easiest way is indeed to disable it.

so it's really a lot better to just disable the pointer authentication feature instead (which, again, AFAIK can be disabled adding a single parameter to the kernel command line arguments).

You're right, it can be changed via kernel command line arguments. I even wrote to Marc, the author of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/Documentation/admin-guide/kernel-parameters.txt?h=v5.12&id=f8da5752fd1b25f1ecf78a79013e2dfd2b860589
and yes, he confirmed that it can be turned off on the kernel command line.
But then I asked my colleagues how to do that, did not receive any response, and forgot about it.

Thanks for your help, I'll focus on turning it off instead of trying to fix it; that seems like something I can handle ;)
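
For anyone checking whether the parameter actually took effect after a reboot, a small sketch like the one below might help. It assumes the parameter documented by the linked commit is arm64.nopauth; I believe that is the name, but please verify it against kernel-parameters.txt for your kernel version. The function name cmdline_disables_pauth is made up for this example.

```rust
use std::fs;

/// ASSUMPTION: name of the kernel parameter that disables pointer
/// authentication; verify it against your kernel's documentation.
const NOPAUTH_PARAM: &str = "arm64.nopauth";

/// Returns true if the given kernel command line contains the parameter
/// as a standalone, whitespace-separated argument.
fn cmdline_disables_pauth(cmdline: &str) -> bool {
    cmdline.split_whitespace().any(|arg| arg == NOPAUTH_PARAM)
}

fn main() {
    // /proc/cmdline holds the command line the running kernel was booted with.
    match fs::read_to_string("/proc/cmdline") {
        Ok(cmdline) => println!(
            "{} present: {}",
            NOPAUTH_PARAM,
            cmdline_disables_pauth(&cmdline)
        ),
        Err(error) => eprintln!("could not read /proc/cmdline: {}", error),
    }
}
```

Note that inside a container /proc/cmdline reflects the host kernel, since containers share the host's kernel.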
