investigate feasibility of a BinExport2 backend #1755

Open
williballenthin opened this issue Aug 23, 2023 · 4 comments · May be fixed by #1950
Labels
enhancement New feature or request question Further information is requested

Comments

@williballenthin
Collaborator

BinExport is an intermediate representation of disassembly produced by various tools, like IDA, Binary Ninja, Ghidra, etc. The data is stored in a ProtoBuf format: https://github.com/google/binexport/blob/main/binexport2.proto

It includes many of the things that capa needs:

  • instructions
  • operands
  • functions, basic blocks
  • strings

Some other things are missing:

  • section names
  • data references

Investigate the feasibility of building a backend that relies upon BinExport. Consider the tradeoffs of requiring the original file (to recover missing metadata, such as section names or data references) versus relying on a self-contained protobuf.
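To make the "alongside the original file" option concrete, here is a minimal sketch of how a backend might decide which container parser it could fall back on for the missing metadata. The helper name `sniff_container` is hypothetical and not part of capa or BinExport; it only checks the well-known leading magic bytes, so an `MZ` match suggests a PE rather than proving one.

```python
from typing import Optional


def sniff_container(buf: bytes) -> Optional[str]:
    """Guess the container format of the original sample from its magic bytes.

    Hypothetical helper for illustration only. Returns "pe", "elf", or None
    when the format is not recognized (in which case only the features
    available from the BinExport2 protobuf itself could be extracted).
    """
    if buf.startswith(b"MZ"):
        # DOS/PE header magic; a stricter check would also follow e_lfanew
        # and verify the "PE\0\0" signature.
        return "pe"
    if buf.startswith(b"\x7fELF"):
        # ELF identification bytes.
        return "elf"
    return None
```

With something like this, the backend could run in a degraded, protobuf-only mode when no original file is supplied or the format is unrecognized.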

@williballenthin williballenthin added enhancement New feature or request question Further information is requested labels Aug 23, 2023
@williballenthin
Collaborator Author

williballenthin commented Aug 23, 2023

  • instruction features
    • api ✓
    • number ✓
    • string and substring ✓
    • offset ✓
    • mnemonic ✓
    • operand ✓
    • namespace ?
    • class ?
    • property ?
    • bytes ✗ (but ✓ alongside PE/ELF)
  • function features
    • function-name ✓
  • file features
    • namespace ✓
    • class ✓
    • string and substring ✗ (but ✓ alongside PE/ELF)
    • export ✗ (but ✓ alongside PE/ELF)
    • import ✗ (but ✓ alongside PE/ELF)
    • section ✗ (but ✓ alongside PE/ELF)
  • global features
    • arch ✓
    • os ✗ (but ✓ alongside PE/ELF)
  • characteristic
    • loop ✓
    • recursive call ✓
    • calls from ✓
    • calls to ✓
    • tight loop ✓
    • stack string ✓
    • nzxor ✓
    • peb access ✓
    • fs access ✓
    • gs access ✓
    • cross section flow ✓
    • indirect call ✓
    • call $+5 ✓
    • embedded pe ✗ (but ✓ alongside PE/ELF)
    • forwarded export ✗ (but ✓ alongside PE/ELF)
    • unmanaged call ✗
    • mixed mode ✗
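For a quick tally, the checklist above can be transcribed into data. This is only a sketch: the `Support` enum and the `COVERAGE` table are my own encoding of the marks in this comment (✓, ?, ✗, and "✗ but ✓ alongside PE/ELF"), not anything that exists in capa.

```python
from collections import Counter
from enum import Enum


class Support(Enum):
    YES = "available from BinExport2 alone"
    NEEDS_FILE = "only alongside the original PE/ELF"
    UNKNOWN = "not yet determined"
    NO = "not available"


Y, F, U, N = Support.YES, Support.NEEDS_FILE, Support.UNKNOWN, Support.NO

# Transcription of the checklist; scope and feature names follow the list above.
COVERAGE = {
    "instruction": {
        "api": Y, "number": Y, "string and substring": Y, "offset": Y,
        "mnemonic": Y, "operand": Y, "namespace": U, "class": U,
        "property": U, "bytes": F,
    },
    "function": {"function-name": Y},
    "file": {
        "namespace": Y, "class": Y, "string and substring": F,
        "export": F, "import": F, "section": F,
    },
    "global": {"arch": Y, "os": F},
    "characteristic": {
        "loop": Y, "recursive call": Y, "calls from": Y, "calls to": Y,
        "tight loop": Y, "stack string": Y, "nzxor": Y, "peb access": Y,
        "fs access": Y, "gs access": Y, "cross section flow": Y,
        "indirect call": Y, "call $+5": Y, "embedded pe": F,
        "forwarded export": F, "unmanaged call": N, "mixed mode": N,
    },
}


def summarize(coverage):
    """Count how many features fall into each support level."""
    return Counter(s for feats in coverage.values() for s in feats.values())


def needs_original_file(coverage):
    """List 'scope/feature' names that require the original PE/ELF."""
    return sorted(
        f"{scope}/{name}"
        for scope, feats in coverage.items()
        for name, support in feats.items()
        if support is Support.NEEDS_FILE
    )
```

Calling `needs_original_file(COVERAGE)` gives the set of features that would be lost with a self-contained protobuf, which is the core of the tradeoff raised in the issue description.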

@r0ny123

r0ny123 commented Aug 26, 2023

Since the topic came up, maybe we could consider https://github.com/quarkslab/quokka too?

@williballenthin
Collaborator Author

Do you use Quokka or know of people that do? Seems very reasonable if so, though we don't want to maintain unused code.

@r0ny123

r0ny123 commented Aug 30, 2023

Unfortunately, no. You're right that it's still in the early stages and not widely used at this moment.

@williballenthin williballenthin linked a pull request Jan 26, 2024 that will close this issue
24 tasks