SIGSEGV when debugging large binary with DWARF data #22569

Open
mcd1992 opened this issue Feb 6, 2024 · 7 comments
@mcd1992
Contributor

mcd1992 commented Feb 6, 2024

Environment

Tue Feb  6 01:55:06 PM CST 2024
radare2 5.8.9 31646 @ linux-x86-64
birth: git.5.8.8-1043-g7d8bad5ba1 2024-02-05__14:32:26
commit: 7d8bad5ba11b19e4a3520d3aeaf1796ce6b4efd0
options: gpl -O? cs:5 cl:2 make
Linux x86_64

Also just updated and tested with latest commit below
radare2 5.8.9 31662 @ linux-x86-64
birth: git.5.8.8-1049-g098669591c 2024-02-06__13:58:11
commit: 098669591ca0327619fd2df572ca81d2dfe50ec0
options: gpl -O? cs:5 cl:2 make

Description

When opening a large (2.3G) ELF binary with DWARF symbols, radare2 consumes over 8G of RAM and gets OOM-killed. If I set a soft ulimit and run it again, it instead hits a SIGSEGV in a memcpy call; see the GDB notes at the bottom. I'm not sure why the debug symbols for libr_util.so aren't showing up; I'm guessing it's something weird with the macros in the hashtable C source? Let me know if there's any extra info I can get from GDB.

Test

The binary is large so I can't easily distribute it but it can be downloaded/made.
/palworld/Pal/Binaries/Linux/merged: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.10.93, BuildID[xxHash]=a37ec78630b980cc, with debug_info, not stripped

You'll need to get the old binaries for the Palworld dedicated server with something like https://github.com/SteamRE/DepotDownloader and get the old manifest files from https://steamdb.info/depot/2394012/history/?changeid=M:4603741190199642564

Then eu-unstrip the executable and the .debug ELF together and just run r2 merged.bin

GDB output

pwndbg> bt
#0  0x00007ffff5c491a1 in ?? () from /usr/lib/libc.so.6
#1  0x00007ffff7cff328 in ht_pp_insert_kv () from /home/unknown/Development/radare2/prefix/lib/libr_util.so
#2  0x00007ffff7cfef84 in internal_ht_grow () from /home/unknown/Development/radare2/prefix/lib/libr_util.so
#3  0x00007ffff7cff13c in check_growing () from /home/unknown/Development/radare2/prefix/lib/libr_util.so
#4  0x00007ffff7cff334 in ht_pp_insert_kv () from /home/unknown/Development/radare2/prefix/lib/libr_util.so
#5  0x00007ffff7d08d74 in sdb_ht_insert_kvp () from /home/unknown/Development/radare2/prefix/lib/libr_util.so
#6  0x00007ffff7d1470e in sdb_set_internal () from /home/unknown/Development/radare2/prefix/lib/libr_util.so
#7  0x00007ffff7d147bf in sdb_set () from /home/unknown/Development/radare2/prefix/lib/libr_util.so
#8  0x00007ffff7d13843 in sdb_add () from /home/unknown/Development/radare2/prefix/lib/libr_util.so
#9  0x00007ffff5e34f53 in add_sdb_addrline (s=0x555555723060, addr=111037128, file=0x555670fd3190 "Runtime/Core/Public\\Containers/BitArray.h", line=0, column=3, mode=2, 
    print=0x7ffff7e5a7b9 <r_cons_printf>) at dwarf.c:1023
#10 0x00007ffff5e35739 in parse_spec_opcode (bin=0x5555555a5020, 
    obuf=0x7fffc35c6497 "\005\030\006\003\271\016\202\005\b\006<\005\027\006m\005\004\003wX\005\021K\005\v9\005\003\006.\005$\006\003\024.\004\367\001\005/\003\333qf\005\n\006\202\004\221\001\0058\006\003\250\016t\005\027x\005\a\006.\005\032\006\003F<\004\236\001\005\004\003\217~.\006\003\333st\004\221\001\005\032\006\003\226\016\272\004\236\001\005\004\003\217~.\006\003\333sf\004\221\001\005$\006\003\311\016t\005", len=82079570, hdr=0x7fffffffd2c0, regs=0x7fffffffd280, opcode=102 'f', mode=2) at dwarf.c:1156
#11 0x00007ffff5e35f4e in parse_opcodes (bin=0x5555555a5020, obuf=0x7fffc35c6322 "\004\236\001", len=82079570, hdr=0x7fffffffd2c0, regs=0x7fffffffd280, mode=2) at dwarf.c:1326
#12 0x00007ffff5e362fb in parse_line_raw (a=0x5555555a5020, obuf=0x7fffc0533010 "WE\001", len=133014489, mode=2, be=false) at dwarf.c:1400
#13 0x00007ffff5e3a42e in r_bin_dwarf_parse_line (bin=0x5555555a5020, mode=2) at dwarf.c:2603
#14 0x00007ffff77d22a4 in bin_dwarf (core=0x7ffff5a8e010, pj=0x0, mode=2) at cbin.c:1161
#15 0x00007ffff77e1608 in r_core_bin_info (core=0x7ffff5a8e010, action=5263359, pj=0x0, mode=2, va=1, filter=0x0, chksum=0x0) at cbin.c:4750
#16 0x00007ffff77ce9c9 in r_core_bin_set_env (r=0x7ffff5a8e010, binfile=0x55555571f190) at cbin.c:316
#17 0x00007ffff7789f6f in r_core_file_load_for_io_plugin (r=0x7ffff5a8e010, baseaddr=18446744073709551615, loadaddr=0) at cfile.c:450
#18 0x00007ffff778a81a in r_core_bin_load (r=0x7ffff5a8e010, filenameuri=0x555555798480 "/home/unknown/srcds/palworld/Pal/Binaries/Linux/merged", baddr=18446744073709551615) at cfile.c:658
#19 0x00007ffff60f1161 in binload (r=0x7ffff5a8e010, filepath=0x555555798480 "/home/unknown/srcds/palworld/Pal/Binaries/Linux/merged", baddr=18446744073709551615) at radare2.c:547
#20 0x00007ffff60f4651 in r_main_radare2 (argc=2, argv=0x7fffffffdca8) at radare2.c:1488
#21 0x00005555555556fd in main (argc=2, argv=0x7fffffffdca8) at radare2.c:118
#22 0x00007ffff5b18cd0 in ?? () from /usr/lib/libc.so.6
#23 0x00007ffff5b18d8a in __libc_start_main () from /usr/lib/libc.so.6
#24 0x0000555555555135 in _start ()
pwndbg> ctx
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
────────────────────────────────────────────────────────────────────[ REGISTERS / show-flags off / show-compact-regs off ]────────────────────────────────────────────────────────────────────
*RAX  0x28
*RBX  0x5555555a5020 —▸ 0x55555557d3f0 ◂— '/home/unknown/srcds/palworld/Pal/Binaries/Linux/merged'
*RCX  0x5556efbc7c60 —▸ 0x5555860d8560 ◂— '0x4b6024b'
*RDX  0x28
*RDI  0x28
*RSI  0x5556efbc7c60 —▸ 0x5555860d8560 ◂— '0x4b6024b'
*R8   0xffffffff
*R9   0x0
*R10  0x55570dc40e50 ◂— 0x0
*R11  0x55570dc41000
*R12  0x0
*R13  0x7fffffffdcc0 —▸ 0x7fffffffe18d ◂— 'SHELL=/bin/bash'
*R14  0x7ffff7ffd000 (_rtld_global) —▸ 0x7ffff7ffe2d0 —▸ 0x555555554000 ◂— 0x10102464c457f
*R15  0x555555557c58 —▸ 0x5555555551b0 ◂— endbr64 
*RBP  0x7fffffffce40 —▸ 0x7fffffffcf00 —▸ 0x7fffffffcf20 —▸ 0x7fffffffcf60 —▸ 0x7fffffffcf90 ◂— ...
*RSP  0x7fffffffce08 —▸ 0x7ffff7cff328 (ht_pp_insert_kv+97) ◂— mov rax, qword ptr [rbp - 0x18]
*RIP  0x7ffff5c491a1 ◂— vmovdqu ymmword ptr [rdi], ymm0
─────────────────────────────────────────────────────────────────────────────[ DISASM / x86-64 / set emulate on ]─────────────────────────────────────────────────────────────────────────────
 ► 0x7ffff5c491a1    vmovdqu ymmword ptr [rdi], ymm0
   0x7ffff5c491a5    vmovdqu ymmword ptr [rdi + rdx - 0x20], ymm1
   0x7ffff5c491ab    vzeroupper 
   0x7ffff5c491ae    ret    
 
   0x7ffff5c491af    nop    
   0x7ffff5c491b0    cmp    edx, 0x10
   0x7ffff5c491b3    jae    0x7ffff5c491e2                <0x7ffff5c491e2>
    ↓
   0x7ffff5c491e2    vmovdqu xmm0, xmmword ptr [rsi]
   0x7ffff5c491e6    vmovdqu xmm1, xmmword ptr [rsi + rdx - 0x10]
   0x7ffff5c491ec    vmovdqu xmmword ptr [rdi], xmm0
   0x7ffff5c491f0    vmovdqu xmmword ptr [rdi + rdx - 0x10], xmm1
──────────────────────────────────────────────────────────────────────────────────────────[ STACK ]───────────────────────────────────────────────────────────────────────────────────────────
00:0000│ rsp 0x7fffffffce08 —▸ 0x7ffff7cff328 (ht_pp_insert_kv+97) ◂— mov rax, qword ptr [rbp - 0x18]
01:0008│-030 0x7fffffffce10 —▸ 0x55555561d548 —▸ 0x7ffff5c4b710 ◂— endbr64 
02:0010│-028 0x7fffffffce18 ◂— 0xdbb500ffffffff
03:0018│-020 0x7fffffffce20 —▸ 0x5556efbc7c60 —▸ 0x5555860d8560 ◂— '0x4b6024b'
04:0020│-018 0x7fffffffce28 —▸ 0x555641c41cd0 —▸ 0x5556c1107d00 —▸ 0x5555b700f8f0 —▸ 0x5555b3997750 ◂— ...
05:0028│-010 0x7fffffffce30 —▸ 0x5555bb54bed0 —▸ 0x5555726c5cf0 ◂— '0x5dc3f80'
06:0030│-008 0x7fffffffce38 ◂— 0x28 /* '(' */
07:0038│ rbp 0x7fffffffce40 —▸ 0x7fffffffcf00 —▸ 0x7fffffffcf20 —▸ 0x7fffffffcf60 —▸ 0x7fffffffcf90 ◂— ...
────────────────────────────────────────────────────────────────────────────────────────[ BACKTRACE ]─────────────────────────────────────────────────────────────────────────────────────────
 ► 0   0x7ffff5c491a1
   1   0x7ffff7cff328 ht_pp_insert_kv+97
   2   0x7ffff7cfef84 internal_ht_grow+230
   3   0x7ffff7cff13c check_growing+42
   4   0x7ffff7cff334 ht_pp_insert_kv+109
   5   0x7ffff7d08d74 sdb_ht_insert_kvp+44
   6   0x7ffff7d1470e sdb_set_internal+1039
   7   0x7ffff7d147bf sdb_set+54
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
pwndbg> frame 1
#1  0x00007ffff7cff328 in ht_pp_insert_kv () from /home/unknown/Development/radare2/prefix/lib/libr_util.so
pwndbg> ctx
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
────────────────────────────────────────────────────────────────────[ REGISTERS / show-flags off / show-compact-regs off ]────────────────────────────────────────────────────────────────────
*RAX  0x28
*RBX  0x5555555a5020 —▸ 0x55555557d3f0 ◂— '/home/unknown/srcds/palworld/Pal/Binaries/Linux/merged'
*RCX  0x5556efbc7c60 —▸ 0x5555860d8560 ◂— '0x4b6024b'
*RDX  0x28
*RDI  0x28
*RSI  0x5556efbc7c60 —▸ 0x5555860d8560 ◂— '0x4b6024b'
*R8   0xffffffff
*R9   0x0
*R10  0x55570dc40e50 ◂— 0x0
*R11  0x55570dc41000
*R12  0x0
*R13  0x7fffffffdcc0 —▸ 0x7fffffffe18d ◂— 'SHELL=/bin/bash'
*R14  0x7ffff7ffd000 (_rtld_global) —▸ 0x7ffff7ffe2d0 —▸ 0x555555554000 ◂— 0x10102464c457f
*R15  0x555555557c58 —▸ 0x5555555551b0 ◂— endbr64 
*RBP  0x7fffffffce40 —▸ 0x7fffffffcf00 —▸ 0x7fffffffcf20 —▸ 0x7fffffffcf60 —▸ 0x7fffffffcf90 ◂— ...
*RSP  0x7fffffffce10 —▸ 0x55555561d548 —▸ 0x7ffff5c4b710 ◂— endbr64 
*RIP  0x7ffff7cff328 (ht_pp_insert_kv+97) ◂— mov rax, qword ptr [rbp - 0x18]
─────────────────────────────────────────────────────────────────────────────[ DISASM / x86-64 / set emulate on ]─────────────────────────────────────────────────────────────────────────────
   0x7ffff7cff323 <ht_pp_insert_kv+92>	   call   0x7ffff7c547e0 <memcpy@plt>
 ► 0x7ffff7cff328 <ht_pp_insert_kv+97>     mov    rax, qword ptr [rbp - 0x18]
   0x7ffff7cff32c <ht_pp_insert_kv+101>    mov    rdi, rax
   0x7ffff7cff32f <ht_pp_insert_kv+104>    call   check_growing                <check_growing>
 
   0x7ffff7cff334 <ht_pp_insert_kv+109>    mov    eax, 1
   0x7ffff7cff339 <ht_pp_insert_kv+114>    jmp    ht_pp_insert_kv+121                <ht_pp_insert_kv+121>
 
   0x7ffff7cff33b <ht_pp_insert_kv+116>    mov    eax, 0
   0x7ffff7cff340 <ht_pp_insert_kv+121>    leave  
   0x7ffff7cff341 <ht_pp_insert_kv+122>    ret    
 
   0x7ffff7cff342 <insert_update>          push   rbp
   0x7ffff7cff343 <insert_update+1>        mov    rbp, rsp
   0x7ffff7cff346 <insert_update+4>        sub    rsp, 0x30
──────────────────────────────────────────────────────────────────────────────────────────[ STACK ]───────────────────────────────────────────────────────────────────────────────────────────
00:0000│ rsp 0x7fffffffce10 —▸ 0x55555561d548 —▸ 0x7ffff5c4b710 ◂— endbr64 
01:0008│-028 0x7fffffffce18 ◂— 0xdbb500ffffffff
02:0010│-020 0x7fffffffce20 —▸ 0x5556efbc7c60 —▸ 0x5555860d8560 ◂— '0x4b6024b'
03:0018│-018 0x7fffffffce28 —▸ 0x555641c41cd0 —▸ 0x5556c1107d00 —▸ 0x5555b700f8f0 —▸ 0x5555b3997750 ◂— ...
04:0020│-010 0x7fffffffce30 —▸ 0x5555bb54bed0 —▸ 0x5555726c5cf0 ◂— '0x5dc3f80'
05:0028│-008 0x7fffffffce38 ◂— 0x28 /* '(' */
06:0030│ rbp 0x7fffffffce40 —▸ 0x7fffffffcf00 —▸ 0x7fffffffcf20 —▸ 0x7fffffffcf60 —▸ 0x7fffffffcf90 ◂— ...
07:0038│+008 0x7fffffffce48 —▸ 0x7ffff7cfef84 (internal_ht_grow+230) ◂— add dword ptr [rbp - 0x94], 1
────────────────────────────────────────────────────────────────────────────────────────[ BACKTRACE ]─────────────────────────────────────────────────────────────────────────────────────────
   0   0x7ffff5c491a1
 ► 1   0x7ffff7cff328 ht_pp_insert_kv+97
   2   0x7ffff7cfef84 internal_ht_grow+230
   3   0x7ffff7cff13c check_growing+42
   4   0x7ffff7cff334 ht_pp_insert_kv+109
   5   0x7ffff7d08d74 sdb_ht_insert_kvp+44
   6   0x7ffff7d1470e sdb_set_internal+1039
   7   0x7ffff7d147bf sdb_set+54
   8   0x7ffff7d13843 sdb_add+76
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
@mcd1992
Contributor Author

mcd1992 commented Feb 6, 2024

Note that it doesn't happen immediately, and there are some debug messages a few minutes before it segfaults. I might add some extra R_LOG_DEBUGs to libr/bin/dwarf.c or shlr/sdb/src/ht.inc.c to see where it's happening. If I open with -nn it doesn't happen, but running oob afterwards then triggers it.

DEBUG: empty symbol name
DEBUG: Symbol name outside the strtab section
DEBUG: Truncated corrupted section name: conditional<false, const Eigen::CwiseUnaryOp<Eigen::internal::scalar_conjugate_op<double>, const Eigen::CwiseBinaryOp<Eigen::internal::scalar_sum_op<double, double>, const Eigen::CwiseBinaryOp<Eigen::internal::scalar_sum_op<double, double>, const Eigen::CwiseBinaryOp<Eigen::internal::scalar_product_op<double, double>, const Eigen::CwiseNullaryOp<Eigen::internal::scalar_constant_op<double>, const Eigen::Matrix<double, -1, 1, 0, -1, 1> >, const Eigen::Map<const Eigen::Matrix
DEBUG: Truncated corrupted section name: conditional<false, const Eigen::CwiseUnaryOp<Eigen::internal::scalar_real_op<double>, const Eigen::CwiseBinaryOp<Eigen::internal::scalar_sum_op<double, double>, const Eigen::CwiseBinaryOp<Eigen::internal::scalar_sum_op<double, double>, const Eigen::CwiseBinaryOp<Eigen::internal::scalar_product_op<double, double>, const Eigen::CwiseNullaryOp<Eigen::internal::scalar_constant_op<double>, const Eigen::Matrix<double, -1, 1, 0, -1, 1> >, const Eigen::Map<const Eigen::Matrix<doub
DEBUG: Truncated corrupted section name: conditional<false, Eigen::CwiseUnaryView<Eigen::internal::scalar_real_ref_op<double>, Eigen::CwiseBinaryOp<Eigen::internal::scalar_sum_op<double, double>, const Eigen::CwiseBinaryOp<Eigen::internal::scalar_sum_op<double, double>, const Eigen::CwiseBinaryOp<Eigen::internal::scalar_product_op<double, double>, const Eigen::CwiseNullaryOp<Eigen::internal::scalar_constant_op<double>, const Eigen::Matrix<double, -1, 1, 0, -1, 1> >, const Eigen::Map<const Eigen::Matrix<double, -1
DEBUG: Truncated corrupted section name: conditional<false, Eigen::CwiseUnaryOp<Eigen::internal::scalar_conjugate_op<double>, const Eigen::Transpose<const Eigen::CwiseBinaryOp<Eigen::internal::scalar_sum_op<double, double>, const Eigen::CwiseBinaryOp<Eigen::internal::scalar_sum_op<double, double>, const Eigen::CwiseBinaryOp<Eigen::internal::scalar_product_op<double, double>, const Eigen::CwiseNullaryOp<Eigen::internal::scalar_constant_op<double>, const Eigen::Matrix<double, -1, 1, 0, -1, 1> >, const Eigen::Map<co

@trufae
Collaborator

trufae commented Feb 7, 2024

Well, this is an out-of-memory problem caused by your kernel (Debian / Ubuntu?). Although we can optimize the memory usage of the DWARF import, I think it would be better to find alternative solutions:

  • don't load the DWARF info: -e bin.dbginfo=false
  • implement parsing of this info every time it is needed, instead of mapping it all into a hashtable in memory
  • use a better data storage than a hashtable
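
To make the last two bullets concrete, here is a minimal sketch of one possible alternative storage, not what radare2 implements. It assumes a hypothetical `LineRow` record and `line_table_sort` / `line_table_find` helpers (none of these are radare2 or sdb API). Rows live in one flat array sorted by address and are looked up by binary search, so each row has a small fixed size and file names are stored once in a separate table instead of once per address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical compact row; this is not radare2's actual layout. */
typedef struct {
	uint64_t addr;    /* start address covered by this row */
	uint32_t file_id; /* index into a separate, deduplicated file-name table */
	uint32_t line;    /* source line number */
} LineRow;

/* qsort comparator: order rows by ascending address. */
static int row_cmp(const void *a, const void *b) {
	const LineRow *ra = a, *rb = b;
	return (ra->addr > rb->addr) - (ra->addr < rb->addr);
}

/* Sort once after parsing so that lookups can use binary search. */
static void line_table_sort(LineRow *rows, size_t n) {
	qsort(rows, n, sizeof(LineRow), row_cmp);
}

/* Return the last row whose addr is <= the queried address,
 * or NULL if the address is below the first row. */
static const LineRow *line_table_find(const LineRow *rows, size_t n, uint64_t addr) {
	assert(rows != NULL || n == 0);
	size_t lo = 0, hi = n;
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		if (rows[mid].addr <= addr) {
			lo = mid + 1;
		} else {
			hi = mid;
		}
	}
	return lo ? &rows[lo - 1] : NULL;
}
```

The trade-off in this sketch is that inserts are cheap appends during parsing and the sort is paid once, while a hashtable pays per-entry key and value allocations for every address.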

The assembly you pasted here shows a NULL pointer plus a 0x28 offset, so it seems an allocation is failing, but it would be good to know which one. And since this depends on the system, I will probably not be able to reproduce it here.
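
In the register dump, `RDI` (the `memcpy` destination) and `RDX` (the length) are both `0x28`, which would be consistent with copying one record into a slot computed from a NULL base. The following is a minimal sketch of that failure pattern with a guard added. It assumes a hypothetical `Kv`, `Bucket` and `bucket_push` (this is not the actual sdb `ht` code, which is not shown in this thread), and takes the allocator as a parameter only so the failure path can be exercised.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical key/value record and bucket; not the actual sdb ht layout. */
typedef struct {
	void *key;
	void *value;
	size_t key_len;
	size_t value_len;
	unsigned int hash;
} Kv;

typedef struct {
	Kv *arr;      /* NULL until the first insert */
	size_t count; /* number of records stored in arr */
} Bucket;

/* Allocator hook so an out-of-memory condition can be simulated. */
typedef void *(*ReallocFn)(void *ptr, size_t size);

/* Append kv to the bucket. Returns false and leaves the bucket untouched
 * if growing fails, instead of copying into (NULL + offset). */
static bool bucket_push(Bucket *b, const Kv *kv, ReallocFn re) {
	Kv *grown = re(b->arr, (b->count + 1) * sizeof(Kv));
	if (!grown) {
		return false; /* out of memory: the caller decides how to recover */
	}
	b->arr = grown;
	memcpy(&b->arr[b->count], kv, sizeof(Kv));
	b->count++;
	return true;
}

/* Stand-in for an allocator that has hit the memory limit. */
static void *failing_realloc(void *ptr, size_t size) {
	(void)ptr;
	(void)size;
	return NULL;
}
```

Without the `if (!grown)` check, the `memcpy` destination would be `NULL + count * sizeof(Kv)`, which faults inside libc rather than at the allocation that actually failed.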

Can you upload that binary somewhere?

@mcd1992
Contributor Author

mcd1992 commented Feb 7, 2024

It looks like just the split debug file triggers it as well, no need to eu-unstrip. https://gofile.io/d/DNUA0G

This is on Arch's kernel 6.6.10-arch1-1. The whole reason I want to open this file, though, is the DWARF data, so I can generate zignatures / FLIRT for the debug-less versions. I could probably just dump the symbols+address and manually af name them before making zigs.

The same thing happens with a stock Unreal Engine 5 game, the .debug/DWARF file is 2G+ and will cause radare to OOM even on a system with 32G RAM.

@trufae
Collaborator

trufae commented Feb 8, 2024

I can't reproduce it. On Ubuntu the process gets killed, because the kernel is picky and just kills a process when it eats a lot of memory. But I can open this file without issues on macOS after it consumes 50GB of RAM, so this is not a SIGSEGV for me. Btw, after loading all the DWARF info, r2 eats about 16-20GB of RAM.
[Attached screenshot: photo_2024-02-08 02 08 28]

@mcd1992
Contributor Author

mcd1992 commented Feb 8, 2024

Ah, I haven't tested on a system with more than 32G. Using ulimit -Sv 8000000, or any value smaller than what radare needs to build the full hashtables, should trigger it.
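
For anyone who wants to see the same condition in isolation, here is a minimal sketch, assuming Linux, where `ulimit -Sv` corresponds to the soft `RLIMIT_AS` limit (the shell takes KiB, `setrlimit` takes bytes). The helper names `limit_address_space` and `can_allocate` are made up for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/resource.h>

/* Lower the soft address-space limit of the current process.
 * Roughly what `ulimit -Sv <KiB>` does in the shell on Linux,
 * except that the value here is in bytes. Returns 0 on success. */
static int limit_address_space(rlim_t soft_bytes) {
	struct rlimit rl;
	if (getrlimit(RLIMIT_AS, &rl) != 0) {
		return -1;
	}
	rl.rlim_cur = soft_bytes; /* the hard limit (rlim_max) is left as is */
	return setrlimit(RLIMIT_AS, &rl);
}

/* Try a single allocation and report whether it succeeded.
 * The volatile pointer keeps the compiler from optimizing the
 * malloc/free pair away. */
static bool can_allocate(size_t bytes) {
	void *volatile p = malloc(bytes);
	bool ok = p != NULL;
	free(p);
	return ok;
}
```

Once the limit is lower than what the process needs, allocations start returning NULL instead of the kernel OOM-killing the process, and any caller that does not check the result ends up dereferencing a near-NULL address as in the backtrace.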

I guess if it's just a side effect of how radare does DWARF parsing, the issue is more about your last 2 bullet points on rewriting the parser/hashtable implementation.

@trufae
Collaborator

trufae commented Feb 10, 2024

Agreed, the current storage method for DWARF info does not perform well for large files like this, but AFAIK the crash is not caused by a bug in r2 code. So it would be better to redesign the way this information is stored in r2 instead of depending on a hashtable. Fixing things by throwing more metal at them is not the way to go.

@trufae
Collaborator

trufae commented Feb 15, 2024

I've introduced a void* to hold private data storage that will replace the current hashtable approach without breaking the ABI promise. This way I can fix this without holding the release any longer, so I'm moving this ticket to 5.9.2 or so :)
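
As an illustration of the general opaque-pointer pattern described here (not the actual radare2 change, which is not shown in this thread), a minimal sketch could look like the following. `BinInfo`, `LineStore` and the `line_store_*` helpers are hypothetical names: the public struct only gains a `void *` slot, and the private type behind it can change freely between releases.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Public side: the layout is frozen by the ABI promise.
 * Hypothetical struct, not radare2's real RBin layout. */
typedef struct {
	int public_field; /* existing fields stay where they are */
	void *priv;       /* opaque slot reserved for private storage */
} BinInfo;

/* Private side: callers only ever see the void pointer. */
typedef struct {
	uint64_t *addrs;
	uint32_t *lines;
	size_t count;
	size_t capacity;
} LineStore;

/* Append one address/line pair; returns 0 on success, -1 on OOM. */
static int line_store_add(BinInfo *bi, uint64_t addr, uint32_t line) {
	if (!bi->priv) {
		bi->priv = calloc(1, sizeof(LineStore)); /* created lazily */
	}
	LineStore *ls = bi->priv;
	if (!ls) {
		return -1;
	}
	if (ls->count == ls->capacity) {
		size_t cap = ls->capacity ? ls->capacity * 2 : 16;
		uint64_t *a = realloc(ls->addrs, cap * sizeof(*a));
		if (!a) {
			return -1;
		}
		ls->addrs = a;
		uint32_t *l = realloc(ls->lines, cap * sizeof(*l));
		if (!l) {
			return -1;
		}
		ls->lines = l;
		ls->capacity = cap;
	}
	ls->addrs[ls->count] = addr;
	ls->lines[ls->count] = line;
	ls->count++;
	return 0;
}

/* Release the private storage and reset the slot. */
static void line_store_free(BinInfo *bi) {
	LineStore *ls = bi->priv;
	if (ls) {
		free(ls->addrs);
		free(ls->lines);
		free(ls);
	}
	bi->priv = NULL;
}
```

Because callers never dereference `priv`, the storage behind it can later be swapped for something more compact without touching the public struct again.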

@trufae trufae added this to the 5.9.2 - codename neatrunner milestone Feb 15, 2024
@trufae trufae self-assigned this Feb 15, 2024