Reverse engineered ets_timers.o #285

Open
wants to merge 3 commits into
base: master

sheinz
Contributor

@sheinz sheinz commented Nov 17, 2016

Recreation of ets_timer.o from libmain.a
Preliminary testing passed.

We currently have two timer implementations in eor from Espressif: ets_timer.o and timers.o

ets_timer.o is based on the FRC2 hardware timer.
timers.o is based on the FreeRTOS timers implementation (which also uses the FRC2 timer underneath).

They both provide an identical interface, so they can be used interchangeably.

The recent version of the FreeRTOS SDK from Espressif completely removed timers.o from its libraries.

@ourairquality
Contributor

Great work. I agree that the unknown structures are dead code. We now have task notification support in FreeRTOS, which is claimed to be lighter, and the suggestions in ourairquality@0dcd801 use this feature and remove the dead code. I also tried just using the os_timer for these and it seems to work in a quick test. Do you think there is a real need for the ets_timer now? Was it more accurate, etc.?

@sheinz
Contributor Author

sheinz commented Nov 17, 2016

@ourairquality I also thought that task notification is a good fit for the purpose in this case.

I'm not sure about the accuracy, but I suspect it is not that great because of the event transfer from the ISR to the task and the task switching. But it should be better than FreeRTOS timers, as they have a resolution of 10 ms.

I agree that having two implementations of the same concept has no benefits and might be confusing for newcomers.

There's at least one drawback of the ets_timer.o implementation. Since it runs a task that handles timer processing, if a higher-priority task is consuming the CPU and not allowing other tasks to run,
the timer event will not be delivered. I'm not sure what the situation is with FreeRTOS timers in this case.

Update:
FreeRTOS timers operate in a similar way. They have a running task underneath that processes timers, and its priority in eor is configured as: configTIMER_TASK_PRIORITY ( tskIDLE_PRIORITY + 2 )

@sheinz
Contributor Author

sheinz commented Nov 17, 2016

Some test results in microseconds:

ets_timer.o:

13000 delay: 13033
13000 delay: 12983
13000 delay: 12997
13000 delay: 12997
53000 delay: 53037
13000 delay: 12997

timers.o:

13000 delay: 8408
13000 delay: 9981
13000 delay: 9997
13000 delay: 9997
53000 delay: 48432
13000 delay: 10203

So, ets_timer.o is definitely more accurate.
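
One plausible reading of the timers.o figures, given the 10 ms resolution mentioned earlier in the thread, is that the requested delay is rounded down to whole ticks. The sketch below only illustrates that arithmetic. The function names are hypothetical, and the round-down behaviour is an assumption, not something confirmed from the timers.o code.

```c
#include <stdint.h>

/* Round a requested delay down to a whole number of ticks.
 * ASSUMPTION: timers.o truncates to the tick period; tick_us = 10000
 * models the 10 ms resolution mentioned in the thread. */
uint32_t quantize_delay_us(uint32_t delay_us, uint32_t tick_us)
{
    return (delay_us / tick_us) * tick_us;
}

/* Signed difference between a measured delay and the requested one,
 * in microseconds (negative means the timer fired early). */
int32_t delay_error_us(uint32_t requested_us, uint32_t measured_us)
{
    return (int32_t)measured_us - (int32_t)requested_us;
}
```

Under this assumption a 13000 us request becomes 10000 us and a 53000 us request becomes 50000 us, which is roughly in line with the steady-state timers.o readings above. The first reading in each group is shorter still, which this simple model does not explain.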

@ourairquality
Contributor

Interesting.

There appears to be only one ets_timer remaining in the sdk, and it is converted to source code here: ourairquality@c3ff68a.

So we know it does not use the repeat argument and that times are in milliseconds, which would simplify the computation of the number of ticks to just a multiplication by 5000. We also know there is only one timer, and this could simplify the code a lot too.

So do we need a high priority timer with that level of accuracy for general use, or should we simplify it as much as possible to the single use remaining?

Switch from FreeRTOS queue to task notification.
Removed unknown/unused code.
ticks = (value * 5000000) / 1000000;
}
}

Contributor

The code was trying to manage the overflow, i.e. 859 * 5000000 = 0x100007FC0. Here's a suggestion:

    if (value_in_ms) {
        ticks = value * 5000;
    } else {
        ticks = value * 5;
    }
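
The overflow can be reproduced off-target with a small sketch. It assumes 32-bit unsigned arithmetic and the factor of 5 ticks per microsecond implied by the thread. The function names are hypothetical and are not taken from the pull request.

```c
#include <stdbool.h>
#include <stdint.h>

/* Form seen in the diff context: the 32-bit product wraps once value
 * reaches 859, because 859 * 5000000 = 0x100007FC0 needs 33 bits. */
uint32_t ticks_naive(uint32_t value)
{
    uint32_t product = (uint32_t)(value * 5000000u); /* wraps modulo 2^32 */
    return product / 1000000u;
}

/* Simplified form from the review suggestion: the factors are reduced
 * first, so small inputs no longer overflow the intermediate product. */
uint32_t ticks_simplified(uint32_t value, bool value_in_ms)
{
    return value_in_ms ? value * 5000u : value * 5u;
}
```

With these definitions, an input of 858 still converts correctly in the naive form, while 859 wraps to a product of 0x7FC0 and yields 0 ticks. The simplified form returns 4295 for 859 us and 4295000 for 859 ms.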

Contributor Author

That makes sense. Also, I made a mistake in the condition. It should be the other way around. I've just checked.

vPortExitCritical();
}

void sdk_ets_timer_arm(ets_timer_t *timer, uint32_t milliseconds,
Contributor

The sdk has the following that might as well be included. Then perhaps make sdk_ets_timer_arm_ms_us static.

void sdk_ets_timer_arm_us(ets_timer_t *timer, uint32_t useconds,
        bool repeat_flag)
{
    sdk_ets_timer_arm_ms_us(timer, useconds, repeat_flag,
            /*value in ms=*/false);
}

* This function is not called from the interrupt regardless of its name.
*/
void IRAM sdk_ets_timer_handler_isr()
{
Contributor

Perhaps fold this function into the timer task, which is static, as this is the guts of the timer task. That would avoid the confusing name too.

@pfalcon
Contributor

pfalcon commented Nov 17, 2016

@sheinz : I wonder, how was decompilation handled here?

@sheinz
Contributor Author

sheinz commented Nov 18, 2016

@pfalcon I just use xtobjdis to disassemble object files.
There's also a useful build option, FLAVOR = sdklike, to build the project with the same compile arguments that the original SDK libraries were built with. So, the output of the RE part can be compared at the assembler level.

I know that there's ScratchABit but unfortunately I haven't figured out how to use it.

I actually wanted to ask the same question of other people involved in RE, @ourairquality in particular. What are the best practices? What is the easiest and quickest way to do it?
It usually takes a lot of time for me :(

@vlad-ivanov-name
Contributor

vlad-ivanov-name commented Nov 18, 2016

radare2 supports xtensa analysis with emulation, and as a result it resolves some load/store locations (asm.cmtrefs) and register calls (https://asciinema.org/a/1n8wfswmadc5ly8jg9r9di5fn). I was hoping it would be useful with radeco, but unfortunately radeco-lib is in a very early stage of development, is written in Rust and doesn't produce any human-readable output.

Also, radare2 is kind of crashy and you may need to try several git revisions (perhaps with git bisect) if it crashes on your binary.

It would be great to partially automate the decompilation. Another idea I had was to translate xtensa code to mips code (since they are very similar) and then run it through Retargetable Decompiler, but I think the decompiler won't be happy about ABI changes.

@ourairquality
Contributor

I've just been using xtobjdis and manually translating to C code. It is difficult where the data structures and function arguments and results are unknown. Some things that help: data flow analysis, putting it into C blocks and looking for DAG patterns, diamond patterns, loops, etc.; work backwards, renaming the registers SSA style so uses can be connected to definitions more easily in big functions; note whether loads are doing sign or zero extension, and the signedness of comparisons; look for the result register being loaded and unused as a clue that a function returns a result; note which argument passing locations are live at the start to determine which arguments are used, which gives a good clue to the function arguments; look for pointer dereferencing to spot pointers; a memw indicates a load from a volatile location; match calls to their uses to get more clues about function results and arguments. Surely a lot of this could be automated, but are there tools to do this?

Another thought was whether we could convert to assembler source code. Can the output of xtobjdis be converted to valid assembler code relatively easily? For some files we might only need the C code from a few functions, but many more are needed due to references to static data, so could we convert most to assembler code mechanically and just a few to C code?

Could we embed the assembler code in C functions? There are examples in the source code already doing this, but I have not looked into how to pass the arguments and results, and might the function stack handling be trouble?

Could we mechanically translate instructions to C code, coercing each argument as needed, using goto and labels for branches? The code would be really ugly, but perhaps a more reliable start?

@pfalcon
Contributor

pfalcon commented Nov 18, 2016

> @pfalcon I just use xtobjdis to disassemble object files.

So, decompilation (to C code) is done manually, right? I kinda expected that to be the case, was just confused by things like if ((delta - 40) < 1) {.

> I know that there's ScratchABit but unfortunately I haven't figured out how to use it.

Note that ScratchABit is still a disassembler, not a decompiler. Regarding how to use it, @projectgus contributed this section to make it easier when he looked into it: https://github.com/pfalcon/ScratchABit#using-plugins . Btw, ScratchABit is finally (after 1.5 years of intermittent development) feature-complete regarding the core features I wanted it to have, keeping in mind that one of those features is being able to offload any adhoc functionality to a plugin. As such, I have actually started to use it for incremental (not try-and-throw-away) work: https://github.com/pfalcon/xtensa-subjects .

> It usually takes a lot of time for me :(

I fully agree. As this isn't my first RE project, and I have seen a lot from other people, I rejected the idea of manual decompilation (or adhoc RE in general) myself. That's why I set out to write ScratchABit, and I also work on a decompilation tool, https://github.com/pfalcon/ScratchABlock . But compared to ScratchABit, which I truly tried to make usable by other people (dunno how much I succeeded), ScratchABlock is a research tool and is unlikely to be useful in the hands of someone else so far. But here's the kind of decompilation output I achieved with one of the working versions:

From this PseudoC (which is C-like assembler, produced by ScratchABit, idea is due to Radare):

; Start of 'fun_40002f14' function
fun_40002f14:
    if ($a2 == 0x3) goto loc_40002f26
    if ($a2 == 0x6) goto loc_40002f2a
    $a4 = $a2 - 0xc
    $a3 = 0xd
    $a2 = 0x0
    if ($a4 == 0) $a2 = $a3
    return
loc_40002f26:
    $a2 = 0xb
    return
loc_40002f2a:
    $a2 = 0xc
    return
; End of 'fun_40002f14' function (as detected)

to:

if ($arg_a2 == 0x3) {
  $a2 = 0xb
} else if ($arg_a2 == 0x6) {
  $a2 = 0xc
} else if (($arg_a2 - 0xc) == 0) {
  $a2 = 0xd
} else {
  $a2 = 0x0
}

As you can see, it still doesn't do complete expression simplification, similar to the code you posted, but it was able to recover the switch-like semantics of the underlying code. Caveat: this needed handholding via a manually selected set of transformation passes.
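
For comparison, a mechanical goto-and-labels translation of the PseudoC above into C, next to a hand-structured equivalent, might look like the following sketch. The function names and the uint32_t register type are assumptions made for illustration. Neither function is actual ScratchABit or ScratchABlock output.

```c
#include <stdint.h>

/* Line-by-line translation of the PseudoC: each register becomes a
 * local variable and each branch becomes a goto. */
uint32_t fun_40002f14_goto(uint32_t a2)
{
    uint32_t a3, a4;

    if (a2 == 0x3) goto loc_40002f26;
    if (a2 == 0x6) goto loc_40002f2a;
    a4 = a2 - 0xc;
    a3 = 0xd;
    a2 = 0x0;
    if (a4 == 0) a2 = a3;   /* conditional move in the original */
    return a2;
loc_40002f26:
    a2 = 0xb;
    return a2;
loc_40002f2a:
    a2 = 0xc;
    return a2;
}

/* Structured form with the remaining expression simplified:
 * (arg - 0xc) == 0 becomes a plain case label. */
uint32_t fun_40002f14_switch(uint32_t arg)
{
    switch (arg) {
    case 0x3: return 0xb;
    case 0x6: return 0xc;
    case 0xc: return 0xd;
    default:  return 0x0;
    }
}
```

Both functions return the same value for every input, so the goto form can serve as a reference while the structured form is cleaned up by hand.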

Anyway, great work, guys, just wanted to make sure that I didn't miss everyone using some magic decompiler already.

@pfalcon
Contributor

pfalcon commented Nov 18, 2016

> Surely a lot of this could be automated, but are there tools to do this?

A lot. My projects-3rdparty/RevEng/Decompilers/ subdirectory contains 22 projects. After looking through them, I concluded that the only way someone can progress with the generic task of decompilation is to write another one from scratch (and I kinda consider duplicating effort to be the 8th mortal sin wrt Open Source). ScratchABlock does just what you write about, automatically. So far it's just a set of individual transformation passes, which need to be stitched together manually in a "decompilation script". Where I paused so far (last work session in August) is the global dataflow analysis for arguments/returns recovery.

> Could we mechanically translate instructions to C code, coercing each argument as needed, using goto and labels for branches? The code would be really ugly, but perhaps a more reliable start?

The ScratchABit disassembler does just that: it produces C-like assembler (well, not ScratchABit itself, which is completely agnostic to any assembler or assembler syntax, but an Xtensa disassembling plugin: https://github.com/pfalcon/ida-xtensa2 , a fork of https://github.com/themadinventor/ida-xtensa). A ScratchABlock pass can output that as a (would-be) valid C function. No proof has been done so far (compiling AES, disassembling it to such PseudoC, compiling again, running unittests, repeating with 100 other algorithms), but I would go there some time (perhaps when I retire; decompilation is a lifetime project).

@vlad-ivanov-name
Contributor

vlad-ivanov-name commented Nov 19, 2016

@pfalcon

> individual transformation passes

Are these passes xtensa-specific or generic? I wonder how this compares to radare2 decompilation where most transformations are abstracted from the target architecture details.

As a side note, it would be great if retdec had some kind of plugin API.

@sheinz
Contributor Author

sheinz commented Nov 19, 2016

@resetnow I tried radare2 and it seems like a really nice tool, but it doesn't resolve strings and function calls (or I'm missing something).
So xtobjdis still produces more readable output, at least for me.

here's how I used radare2

@vlad-ivanov-name
Contributor

vlad-ivanov-name commented Nov 19, 2016

@sheinz I tried your example and radare doesn't mark the string for me either. The address is resolved correctly, though: http://pastebin.com/raw/nbr1TiFU

Register call wasn't resolved either, which is interesting. It's possible a function pointer is used and it's initialized dynamically.

@sheinz
Contributor Author

sheinz commented Nov 21, 2016

@resetnow How did you get those comments to show up?

0x402126a8      21f8ff         l32r a2, 0x40212688        ; a2=0x40211570 -> 0x206d7200

@vlad-ivanov-name
Contributor

vlad-ivanov-name commented Nov 21, 2016

@sheinz e asm.emustr=false

You can also try assembly stepping (s) while in visual mode (make sure to enable io.cache).

@ourairquality
Contributor

Looks good.

Might it be appropriate to add ets_timer.o to lib/libmain.remove now? One problem would be if someone did not compile in the source code version, but is that enough of an issue to hold back removing these?

Btw: I am looking at some of the mentioned tools to help with some more required code translations, thanks.

@sheinz
Contributor Author

sheinz commented Nov 23, 2016

@ourairquality I would suggest not removing objects from sdk libs for now. Sometimes it's useful to check the behaviour without open implementation to rule out reverse engineering errors.

I also spent some time looking for other RE tools. radare2 looks pretty cool, but in my opinion the disassembly with comments is still best in xtobjdis. radare2 support for xtensa is not complete. radare2 can be handy for visually seeing the graph of branches in a function.

Another useful thing is running esp-gdbstub, which allows stepping through the assembly code and analysing the data in real time.

Rename sdk_ets_handler_isr to process_pending_timers
Add function for microseconds
Simplify time to ticks conversion
@Maijin

Maijin commented Dec 12, 2016

@sheinz that's probably a one-line fix CC @radare

@radare

radare commented Dec 13, 2016 via email

@ourairquality
Contributor

This seems a great piece of work to have, and it moves closer to completing the libmain. It is often useful to have the code to help understand problems, or just to see how these worked. I've been using it for some time now and it seems well ready to land.

@ourairquality
Contributor

This landed when the lwip v2 branch was merged, thank you.
