Support uImage format and/or manual arch specification #3

Open
skochinsky opened this issue Dec 31, 2019 · 3 comments

@skochinsky

First, congrats on the awesome tool.

I decided to try it out and went to the OpenWRT release archive. The first one alphabetically was ARC and it failed:

  File "C:\Work\git\vmlinux-to-elf\vmlinux_to_elf\architecture_detecter.py", line 157, in guess_architecture
    raise ValueError('The architecture could not be guessed successfully')
ValueError: The architecture could not be guessed successfully

In fact, the uImage header already includes the architecture, load address and even entrypoint:

openwrt-18.06.4-arc770-generic-uImage: u-boot legacy uImage, ARC OpenWrt Linux-4.9.184, Linux/DesignWare ARC, OS Kernel Image (Not compressed), 4522192 bytes, Thu Jun 27 12:18:52 2019, Load Address: 0x80000000, Entry Point: 0x8000A000, Header CRC: 0xA11EF4A4, Data CRC: 0xAC4BE39B

Additionally, there is no need to know the architecture when not writing out the ELF file (e.g. when just dumping symbols), so this step could be skipped until required. You could also let the user specify it manually or just write 0 to e_machine.

Note: the uImage format may apply its own compression (I have seen at least gzip used).
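
To illustrate the point about the header, here is a minimal sketch of reading the 64-byte legacy uImage header in Python. It is not code from vmlinux-to-elf: the names `parse_uimage_header`, `UImageHeader`, `ARCH_NAMES` and `COMP_NAMES` are made up for this example, and the two lookup tables are deliberately partial (values as defined in U-Boot's `include/image.h`).

```python
import struct
import zlib
from typing import NamedTuple

UIMAGE_MAGIC = 0x27051956
HEADER_FORMAT = ">7I4B32s"  # big-endian, 64 bytes in total
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)

# Partial, illustrative tables; extend them from U-Boot's image.h as needed.
ARCH_NAMES = {2: "ARM", 3: "x86", 5: "MIPS", 7: "PowerPC", 22: "ARM64", 23: "ARC"}
COMP_NAMES = {0: "none", 1: "gzip", 2: "bzip2", 3: "lzma"}


class UImageHeader(NamedTuple):
    magic: int
    header_crc: int
    timestamp: int
    data_size: int
    load_address: int
    entry_point: int
    data_crc: int
    os: int
    arch: int
    image_type: int
    compression: int
    name: bytes


def parse_uimage_header(blob: bytes) -> UImageHeader:
    """Parse and validate a legacy uImage header at the start of blob."""
    if len(blob) < HEADER_SIZE:
        raise ValueError("input is shorter than a uImage header")
    header = UImageHeader(*struct.unpack_from(HEADER_FORMAT, blob))
    if header.magic != UIMAGE_MAGIC:
        raise ValueError("not a legacy uImage (bad magic)")
    # The header CRC is computed with the CRC field itself set to zero.
    zeroed = blob[:4] + b"\x00" * 4 + blob[8:HEADER_SIZE]
    if zlib.crc32(zeroed) & 0xFFFFFFFF != header.header_crc:
        raise ValueError("uImage header CRC mismatch")
    return header
```

With such a header in hand, `ARCH_NAMES.get(header.arch)` could seed the architecture, `header.load_address` the base address, and `header.compression` would say whether the payload must be unpacked first.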

@marin-m
Owner

marin-m commented Dec 31, 2019

Hello,

Kudos for your work on IDA too.

I can see multiple things that I could improve from your post:

  • Parsing the uImage header. This would be useful for the architecture, and also for correcting the offset that corresponds to the base address of the kernel: currently, that offset is assumed to be either the start of the raw input file, the start of the original ELF section contents, or the start of the compressed stream. However, I would note that I have rarely seen uncompressed uImage kernels in the wild.
  • Adding prologue detection for ARC (because why not).
  • Allowing the output e_machine field of the ELF header to be customized through command line arguments to vmlinux-to-elf. I could also set the detected architecture to a dummy value for the kallsyms-finder utility, which provides a text representation of the symbols. However, it is not entirely true that this would be without consequence: in recent kernels, the size of all fields except kallsyms_addresses, kallsyms_offsets and kallsyms_relative_base has been trimmed to 4 bytes (the size of a GNU Assembler .long, even on x64) for optimization purposes. As the addresses/offsets fields lie at the edge of the kallsyms table, it is complicated to guess their size (except through pattern frequency matching or something similar), so I rely in part on the known addressing bit size of the detected architecture. I guess that I could default to 32 bits in the absence of a detected architecture or extra flags (or require an explicit bit size).

In the end, the best option may be to add generic flags for information that the tool cannot be 100% sure to infer exactly (--kernel-offset, --base-address, --e-machine, --bit-size), even though the detection works well with my corpus of kernels.
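
As a rough illustration of that proposal, the flags could be wired up with `argparse` along these lines. This is only a sketch under assumptions: the flag names are the ones listed above, but the helper names (`build_parser`, `resolve_bit_size`, `parse_int`), the accepted value formats and the fallback order are hypothetical and may differ from what the tool ends up shipping.

```python
import argparse
from typing import Optional


def parse_int(text: str) -> int:
    """Accept decimal or prefixed (e.g. 0x...) integer literals."""
    return int(text, 0)


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: every override defaults to None ("not given").
    parser = argparse.ArgumentParser(description="override flags (sketch)")
    parser.add_argument("--kernel-offset", type=parse_int, default=None,
                        help="offset of the kernel within the input file")
    parser.add_argument("--base-address", type=parse_int, default=None,
                        help="virtual address the kernel is loaded at")
    parser.add_argument("--e-machine", type=parse_int, default=None,
                        help="value to write to the ELF e_machine field")
    parser.add_argument("--bit-size", type=int, choices=(32, 64), default=None,
                        help="addressing size used to read kallsyms tables")
    return parser


def resolve_bit_size(args: argparse.Namespace,
                     detected_bit_size: Optional[int] = None) -> int:
    """Explicit flag wins, then the detected value, then a 32-bit default."""
    if args.bit_size is not None:
        return args.bit_size
    if detected_bit_size is not None:
        return detected_bit_size
    return 32
```

Keeping every override at `None` by default lets the caller tell "not given" apart from a legitimate value of 0, which matters for `--e-machine` and `--kernel-offset`.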

I should get back to this soon. Other ideas are welcome.

Regards,

marin-m added a commit that referenced this issue Jan 1, 2020
marin-m added a commit that referenced this issue Jan 1, 2020
…ertain OpenWRT settings (from the sample in issue #3)
@marin-m
Owner

marin-m commented Jan 1, 2020

Hello,

For your information, your kernel now reconstructs well without extra arguments. Also, I have added support for the extra arguments that I mentioned in my previous message. These are documented in the README.md.

Regards,

@skochinsky
Author

Thanks!

FYI, I found an example of a compressed uImage which seems not to be handled out of the box: openwrt-18.06.4-lantiq-falcon-lantiq_easy98000-nand-squashfs-sysupgrade.bin
Also openwrt-18.06.4-ramips-rt305x-3g-6200n-initramfs-kernel.bin

However, no symbols were found even after manual decompression :( Making an ELF with just a code section may be useful, although without .bss the analysis will not be too great...
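
For reference, the manual decompression step can be sketched roughly as below. This is an illustration only, not what vmlinux-to-elf does: `extract_uimage_payload` and `DECOMPRESSORS` are hypothetical names, the naive magic scan can hit false positives, and LZMA streams found in firmware images may need extra handling that `lzma.decompress` does not provide. It also only covers unpacking, so it does not address the missing symbols.

```python
import bz2
import gzip
import lzma
import struct

UIMAGE_MAGIC = b"\x27\x05\x19\x56"
HEADER_SIZE = 64  # size of the legacy uImage header

# Compression codes as defined in U-Boot's image.h (partial table).
DECOMPRESSORS = {
    0: lambda data: data,  # not compressed
    1: gzip.decompress,
    2: bz2.decompress,
    3: lzma.decompress,
}


def extract_uimage_payload(blob: bytes) -> bytes:
    """Find the first uImage header in blob and return the unpacked payload."""
    start = blob.find(UIMAGE_MAGIC)
    if start < 0 or start + HEADER_SIZE > len(blob):
        raise ValueError("no complete uImage header found")
    # Data size is the 4th big-endian word; compression is the byte at +31.
    data_size = struct.unpack_from(">I", blob, start + 12)[0]
    compression = blob[start + 31]
    try:
        decompress = DECOMPRESSORS[compression]
    except KeyError:
        raise ValueError(f"unsupported compression code {compression}") from None
    # Slice by the declared size so trailing data (e.g. a rootfs) is ignored.
    payload_start = start + HEADER_SIZE
    return decompress(blob[payload_start:payload_start + data_size])
```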
