Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

quadratic blowup in export size due to tail call elimination #101

Open
comex opened this issue Oct 16, 2022 · 0 comments
Open

quadratic blowup in export size due to tail call elimination #101

comex opened this issue Oct 16, 2022 · 0 comments

Comments

@comex
Copy link

comex commented Oct 16, 2022

BinExport for Binary Ninja and IDA allows the same address to be processed independently as part of multiple functions, if they all contain jumps (not calls) to that address.

This behavior seems to be intentional, since in FlowGraph::CreateBasicBlocks, the check if (basic_block == nullptr) would only fail if a basic block has already been created for the same address in another function. And it could be beneficial in some cases – say, if two different functions have had a common tail merged.

But it's harmful in the presence of tail call elimination, which causes one function to end with an unconditional jump to another function. If func1 ends by jumping to func2, the export won't treat that as a call; instead it will include an entry for func1 with the instructions of both func1 and func2, followed by a separate entry for func2.

Attached is a pathological case containing 1000 functions, where each function jumps to the next one. BinExport creates an entry for func1 containing the instructions of functions 1 to 1000, an entry for func2 containing the instructions of functions 2 to 1000, an entry for func3 containing the instructions of functions 3 to 1000, etc. This results in quadratically many instructions being exported, increasing processing time and output file size (especially in text format, but also in binary format). Also included is a version with 10,000 functions, which demonstrates the output size bloat better.
quadratic.zip

(Note that on Binary Ninja, instruction duplication can happen not only because of tail call elimination, but also because of #99 and #100 creating false xrefs. That's how I noticed the issue in the first place. But duplication that is caused by tail call elimination is not specific to Binary Ninja.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant