Python exception UnicodeDecodeError #39

Open
brucey13 opened this issue Mar 21, 2024 · 5 comments

Comments

@brucey13

brucey13 commented Mar 21, 2024

I'm attempting to add a new SoC, but I wasn't expecting to hit this.

Any direction would be greatly appreciated.

              ___            __      _
-.     .-.   | __|(+) _ _ _ _\ \    / /(+) _ _ ___    .-.     .-
  \   /   \  | _|  | | '_| '  \ \/\/ /  | | '_/ -_)  /   \   /
   '-'     '-|_|   | |_| |_|_|_\_/\_/   | |_| \___|-'     '-'
             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~   v1.1.0
                A  baseband  analysis  platform
                   https://github.com/FirmWire

[INFO] firmwire.loader: Reading firmware using MTKLoader (mtk) and args {'nv_data': PurePosixPath('mnt')}
[INFO] firmwire.vendor.mtk.loader: Found new file md1rom at 0x0/0x0 with length 0x1446cd0
[INFO] firmwire.vendor.mtk.loader: Found new file cert1 at 0x1446ed0/0x12345678 with length 0x6ad
[INFO] firmwire.vendor.mtk.loader: Found new file cert2 at 0x1447780/0x12345678 with length 0x3bd
[INFO] firmwire.vendor.mtk.loader: Found new file md1drdi at 0x1447d40/0x0 with length 0xe6200
[INFO] firmwire.vendor.mtk.loader: Found new file cert1 at 0x152e140/0x12345678 with length 0x6ad
[INFO] firmwire.vendor.mtk.loader: Found new file cert2 at 0x152e9f0/0x12345678 with length 0x3bd
[INFO] firmwire.vendor.mtk.loader: Found new file md1dsp at 0x152efb0/0x0 with length 0x68411c
Traceback (most recent call last):
  File "./firmwire.py", line 312, in <module>
    sys.exit(main())
  File "./firmwire.py", line 228, in main
    loader = firmwire.loader.load_any(
  File "/firmwire/firmwire/loader.py", line 142, in load_any
    obj = _do_load(loader_cls, path, workspace, **loader_args)
  File "/firmwire/firmwire/loader.py", line 184, in _do_load
    if obj.try_load():
  File "/firmwire/firmwire/vendor/mtk/loader.py", line 103, in try_load
    self.sections = {s.name: s for s in self.iter_section_info()}
  File "/firmwire/firmwire/vendor/mtk/loader.py", line 103, in <dictcomp>
    self.sections = {s.name: s for s in self.iter_section_info()}
  File "/firmwire/firmwire/vendor/mtk/loader.py", line 329, in iter_section_info
    name = contents[2][
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc2 in position 0: invalid continuation byte

panda is here:
commit 01c9989b835535f78f5bffec165a39462c361a9c
Merge: b265e4c305 e8c177eca7
Author: Marius Muench m.muench@vu.nl
Date: Wed Mar 8 11:27:28 2023 +0100

main repo is:
commit 8b540a6
Author: Grant Hernandez grant.h.hernandez@gmail.com
Date: Wed Aug 16 10:06:43 2023 -0400

@mariusmue
Contributor

Hi.

Can you provide a link to the firmware you are trying to load? It seems to be specific to the firmware loading part, so without firmware, we cannot reproduce the error.

That being said, it looks like a decoding error. I would add `import IPython; IPython.embed()` just before the failing line, then inspect the live state and section contents interactively to understand what is going wrong. Most likely the loader is treating some memory as UTF-8 encoded text rather than raw bytes, so explicitly loading it as a bytearray/binary may help, but someone would need to look into that in detail to find the fix.
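To illustrate the kind of failure described above, here is a minimal sketch of decoding a name field defensively. The helper `decode_section_name` is hypothetical and is not part of FirmWire; it assumes the name is a NUL-padded ASCII field, which is an assumption rather than something confirmed from the loader source.

```python
from typing import Optional


def decode_section_name(raw: bytes) -> Optional[str]:
    """Return a printable ASCII name from a NUL-padded field, or None.

    Hypothetical helper for illustration; not a FirmWire function.
    """
    # Keep only the bytes before the first NUL terminator.
    name = raw.split(b"\x00", 1)[0]
    try:
        text = name.decode("ascii")
    except UnicodeDecodeError:
        return None
    # Reject empty names and names containing control characters.
    if not text or not text.isprintable():
        return None
    return text


# 0xc2 opens a two-byte UTF-8 sequence, so a following byte that is not a
# continuation byte raises the same kind of error as in the traceback.
try:
    b"\xc2\x41".decode("utf-8")
except UnicodeDecodeError as exc:
    print(exc.reason)

print(decode_section_name(b"md1dsp\x00\x00"))
print(decode_section_name(b"\xc2\x41\x00"))
```

Returning `None` instead of raising lets the caller decide whether an unreadable name means "end of sections" or "unsupported image format".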

@brucey13
Author

Hi,

So I did some digging and, as you suspected, there is something strange about the file I used. Each iteration expects there to be a sane "header" at the next location (in this case 0x152efb0 + 0x68411c). I looked there and it appears to be pure binary data, definitely not the expected structure. I didn't collect that file myself, so I have question marks around its integrity.
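The offsets printed in the log can be cross-checked with a few lines of arithmetic. The sketch below assumes a 0x200-byte header in front of each payload and 16-byte alignment of the next header. Both values are inferred only from the log lines in this issue, not taken from the loader source, so treat them as assumptions.

```python
HEADER_SIZE = 0x200  # assumption: inferred from the log, not from the loader
ALIGNMENT = 0x10     # assumption: inferred from the log, not from the loader

# (name, header offset, payload length) copied from the log output above.
SECTIONS = [
    ("md1rom", 0x0, 0x1446CD0),
    ("cert1", 0x1446ED0, 0x6AD),
    ("cert2", 0x1447780, 0x3BD),
    ("md1drdi", 0x1447D40, 0xE6200),
    ("cert1", 0x152E140, 0x6AD),
    ("cert2", 0x152E9F0, 0x3BD),
    ("md1dsp", 0x152EFB0, 0x68411C),
]


def next_header_offset(offset: int, length: int) -> int:
    """Offset where the following header would start under the assumptions."""
    end = offset + HEADER_SIZE + length
    # Round up to the next multiple of ALIGNMENT.
    return (end + ALIGNMENT - 1) // ALIGNMENT * ALIGNMENT


# Check the assumed layout against every consecutive pair in the log.
for (_, off, length), (_, following, _) in zip(SECTIONS, SECTIONS[1:]):
    assert next_header_offset(off, length) == following

_, last_off, last_len = SECTIONS[-1]
print(hex(next_header_offset(last_off, last_len)))
```

If the inferred layout holds, the header after md1dsp would be expected at the printed offset, which includes the header size and alignment padding on top of the raw sum of offset and length. It may be worth inspecting the file at that offset as well before concluding that the image is corrupted.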

I do have an unrelated question: how important is "md1_dbginfo"? I have another couple of files which appear correct but are missing that section. Is this going to be a dealbreaker for emulating those files?

@dklischies

dklischies commented Mar 26, 2024

md1_dbginfo contains the symbol table (see https://github.com/FirmWire/FirmWire/blob/main/firmwire/vendor/mtk/loader.py#L110). The symbols are used throughout the emulation, mainly to hook specific functions. Without a symbol table you would have to figure out all addresses by hand. To get a feeling for how much work that is, search for symbol in https://github.com/FirmWire/FirmWire/blob/main/firmwire/vendor/mtk/machine.py and look at the pattern.py.

Depending on how important it is to you to work on this specific firmware, you could either do this by hand or maybe recover symbols using Ghidra and a FunctionID database from another firmware sample, hoping that the relevant functions were not changed, such that FunctionID can detect them.

Regarding the broken file headers: Mediatek changed the format of some of their log files during the 4G->5G transition. Maybe in this case they changed the format of their modem image, i.e., it might be that your file is not corrupted but uses a new format.

@grant-h
Contributor

grant-h commented Mar 26, 2024

FirmWire needs debug data or very good symbol patterns like in https://github.com/FirmWire/FirmWire/blob/main/firmwire/vendor/shannon/pattern.py to work properly. Some OEMs ship debug data, but others don't. To handle the others, you need to develop patterns for every symbol of interest on a per-ISA basis.
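As a rough idea of what a symbol pattern does, the sketch below searches a binary image for a byte sequence with wildcard positions and reports candidate addresses. The pattern syntax (space-separated hex bytes with `??` wildcards) and the function names are made up for illustration; this is not the format or API used by FirmWire's pattern.py.

```python
import re
from typing import List


def compile_pattern(pattern: str) -> "re.Pattern[bytes]":
    """Turn a hex string with ?? wildcards into a bytes regex.

    Hypothetical pattern syntax, for illustration only.
    """
    parts = []
    for token in pattern.split():
        if token == "??":
            parts.append(b".")  # wildcard: any single byte
        else:
            parts.append(re.escape(bytes([int(token, 16)])))
    # DOTALL so that "." also matches the newline byte 0x0a.
    return re.compile(b"".join(parts), re.DOTALL)


def find_candidates(image: bytes, pattern: str, base: int = 0) -> List[int]:
    """Return the address of every non-overlapping match in the image."""
    regex = compile_pattern(pattern)
    return [base + m.start() for m in regex.finditer(image)]


# Toy image: the pattern occurs once, two bytes into the buffer.
image = bytes.fromhex("00 11 de ad 01 02 be ef 22")
print([hex(a) for a in find_candidates(image, "de ad ?? ?? be ef", base=0x1000)])
```

A pattern that yields zero or several candidates is not usable as a symbol on its own, which is part of why patterns have to be developed and checked per firmware family and per ISA.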

@brucey13
Author

Luckily, (some of) the firmware I'm interested in has debug data. Unfortunately, not all the patterns match (I assume they need to). I've loaded it up in Ghidra (using your patches, because without them it is useless). I may have come to a dead end here, though, because analyse_mtk_image.py doesn't work: it appears that the SPRAM symbols don't exist in my debug info. I will perhaps persevere with the suggestions from dklischies. The issue is that I would really rather not disassemble by hand, but it seems no tool can do it.
