Generate RZIL operations #69

Merged
merged 15 commits into from
Mar 22, 2024

Member

@Rot127 Rot127 commented Jun 5, 2022

WIP

Requires: #67

The rzil-compile sub-module used here is at https://github.com/Rot127/rzil-compiler

closes #64

Member

@DMaroo DMaroo left a comment

Glanced through the handwritten code, and it looks pretty good to me. Didn't really have the time to read through it thoroughly. If you have any files which you would want me to review thoroughly, I'd be happy to do so. Great stuff, btw!

@Rot127
Member Author

Rot127 commented Sep 7, 2023

Looking at the Python files is fine. No need to read too deep into it, if you haven't already.
The hand-written C code can be reviewed in the Rizin PR IMHO.

@DMaroo More importantly, I would need feedback on the compiler (https://github.com/Rot127/rzil-compiler). Though only if you find time and are interested in using it in the future. I do not want to waste your time.

I designed it roughly in a way that we should be able to use it for other archs as well. "Roughly" means I have only used it for Hexagon so far and have not tested it with another arch yet.
Also, it is not a good compiler. It's pretty naively implemented and slow, but it does its job.

@DMaroo
Member

DMaroo commented Sep 8, 2023

Got it, thanks! I'll have a look at https://github.com/Rot127/rzil-compiler once I have time.

@thestr4ng3r
Member

Do you think it would be possible to further reduce the code size? For example, rizin/librz/analysis/arch/hexagon/il_ops/hexagon_il_A2_ops.c has 12k lines of code, which is absolutely massive. One option would be moving more logic from the rzil compiler/code generator to the runtime.

@Rot127
Member Author

Rot127 commented Sep 16, 2023

Yes. Already did it but haven't pushed yet. A2_ops is down to ~7700 lines. And it can be reduced a little further.
Also see: rizinorg/rz-rzilcompiler#16
Let me know if you have any more suggestions

@Rot127
Member Author

Rot127 commented Sep 16, 2023

So for A2 the line count is now at roughly 8k. But this is currently as good as it gets (of course everything but the effects could be inlined, but this is not really the point).

To tackle instructions like these, sub-routines need to be implemented.

It's not too difficult but needs time. So before that I'd like to get the current 1500 instructions tested.

@XVilka
Member

XVilka commented Sep 22, 2023

I have been thinking about supporting subroutines/functions in RzIL for a long time too. It would help to simplify some of the output and would also make it easier to spot things like inputs/outputs and to emulate those, if necessary.

@Rot127
Member Author

Rot127 commented Sep 22, 2023

Couldn't find a corresponding concept in BAP's IL. Hopefully I haven't just overlooked it.

But for now I had to implement something which could be used for them.
C syntax whose behavior has to be categorized as both Effect and Pure (think of i++ but also sub-routines) gets generated like this:

Effect *sub_routine_sequence = sub_routine(...); // Sets VARL("return_val_i") in its sequence
Pure *return_val = CAST(ret_width, ret_sign, VARL("return_val_i"));

Not smart, but this is how I would do those sub-routines for now.
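
The pattern above can be sketched on the generator side as follows. This is only an illustration in Python, not the actual rzil-compiler code: `SubRoutineCall`, `emit_sub_routine_call`, the `<name>_call_<id>` and `ret_val_<id>` C variable names, and the reading of the `_i` suffix as a per-call index are all assumptions made for the example. The sign argument is passed through verbatim as a C expression.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SubRoutineCall:
    """Hypothetical description of one call site (not the compiler's real data model)."""

    name: str        # C name of the sub-routine
    args: List[str]  # C expressions passed as arguments
    ret_width: int   # bit width of the returned value
    ret_sign: str    # C expression passed through as the CAST sign argument
    call_id: int     # index that keeps the local IL variable name unique


def emit_sub_routine_call(call: SubRoutineCall) -> Tuple[str, str]:
    """Return the (Effect line, Pure line) of C code for one call site."""
    arg_list = ", ".join(call.args)
    local_var = f"return_val_{call.call_id}"
    # The Effect runs the sub-routine, which stores its result in a local IL variable.
    effect = f"Effect *{call.name}_call_{call.call_id} = {call.name}({arg_list});"
    # The Pure reads that variable back and casts it to the declared return type.
    pure = (
        f"Pure *ret_val_{call.call_id} = "
        f'CAST({call.ret_width}, {call.ret_sign}, VARL("{local_var}"));'
    )
    return effect, pure


if __name__ == "__main__":
    # "IL_FALSE" is only a placeholder for the sign argument here.
    call = SubRoutineCall("sub_routine", ["a", "b"], 32, "IL_FALSE", 0)
    for line in emit_sub_routine_call(call):
        print(line)
```

Splitting the call into an Effect and a Pure keeps the two IL categories separate: the Effect has to be sequenced before any expression that uses the Pure.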

@Rot127 Rot127 marked this pull request as ready for review November 13, 2023 20:03
@Rot127 Rot127 force-pushed the uplifting branch 2 times, most recently from 0695f1f to 0eabc38 Compare November 21, 2023 20:02
@Rot127 Rot127 changed the base branch from master to main January 14, 2024 15:05
Add rzil-compiler as submodule

Add cmd option for rzil generation.

Add structs which hold IL ops, instructions and RZIL getter.

Generate `hexagon_il.h` with insn il op getter declarations.

Add reuse license info and ignore VSCode folder.

Use lists of insn names instead of objects.

Init rzil-compiler.

Get compiled instruction RZIL behavior.

Generate IL op getter.

Add option to skip pcpp step

Set il_ops structs for Duplex instructions.

Save getter along with rzil behavior

Build il files only if flag says so

Use getter saved in il_ops

Remove set il_ops from Duplex Instructions.

Fill complete HexILInsn struct

Fix: Actual declarations into hexagon_il.h not just names.

Update rizin files.

Fix type related build errors.

update rzil-compiler

Add progress bar while parsing/compiling instructions

Run black

Update rzil-compiler

Extract `set_il_op` for instruction to method.

Add class and isa_id to HexOp.

Set `isa_id` char in template.

Add ISA2REG function.

Add ISA2IMM

Refactor: Extract method to generate register enums.

Rewrite `get_reg_name` functions to use lookup tables and allow to return .new names.

Fix build errors.

Move lookup table names to PluginInfo; Fix off by one error.

Add helper function to generate doxygen

Add ALIAS2REG function.

Set to latest HEAD.

Generate flags correctly

Update rzil-compiler

Update generated files.

Remove leftover duplex code from rebase.

Add helper functions for IL op shuffle functionality.

Add "dont compile option."

Add shuffle function for IL ops.

This function brings the IL ops into the correct linear order.

Update rzil-compiler

Log not compiled instructions.

Temporarily remove compiler submodule.

Clone submodule rzil-compiler again

Filter pseudo instructions before compiling to show real insn count.

Update rzil-compiler

Update rzil-compiler

Update generated files.

Generate lookup table for IL getter in analysis plugin not asm plugin

Update docs

Move register names lookup tables into own header file and include it only in hexagon.c

Update generated files.

Update rzil-compiler

Update generated il ops files.

Set hardware loop in packet.

Add getter for il config

Introduce get_il_op

Add docs

Formatting, set alias C9 to PC, exclude op_builder

Update generated files.

Fix bug which added LLVM header to each IL file if IL files were skipped during generation.

* The skipped IL files were not added to the `unchanged_files` list and were not skipped when adding the LLVM header.

Clarify what --no-rzil-compiler means.

Add endloop RZIL ops

Fix il_config getter names.

Replace const with RZ_BORROW

Several cleanups, renaming etc.

Add HEX_REGFIELD macro.

Update rzil-compiler

Fix function generation: Remove code duplicates and fix getter names.

Update rzil compiler

Use only chars to identify the ISA registers and immediates, no strings.

Formatting

Fix variable names

Fix include of NOT_IMPLEMENTED. It was undefed before it was used.

Move NOT_IMPLEMENTED macro into rz-il-op-builder

Don't try to include deleted file.

Add uplifted extract64 and sextract64.

Include `hexagon_arch.h` for ISA to real imm/reg conversion functions

Don't define hexagon instruction if not in use.

Define Macros for decrement and increment

Update rzil-compiler

Add DEPOSIT64

update rzil-compiler

Move hex_get_rf_property_val to IL code

Fix typo from Il to IL

Allow to return _tmp from alias lookup tables

Implement get_npc

Let IL qemu functions return effects.

Print stats how many HVX and normal instructions were compiled.

Fix fall through cases

Add missing argument to macros

Update hand written RZIL functions

Don't init hi if it isn't used by instruction.

Check for hi variable before initializing it.

Fix typos

Fix endloops after empty sequences are allowed.

Create own file for static il getter table

Fix typos

Add else case so r is not left uninitialized

Fix visibility and read only properties

Fix include

Add clo and clz

Increase visibility of get_pkt

hex_write_pred

Update rz_hexagon_il_config

Add sync register getter

Check for invalid instruction. Not valid instructions

Always execute packet which is at VM init address.

Update rzil-compiler

Sort insn id enums and il getter lookup tables so they are aligned.

Fix build errors of redefined variable.

Make get_hic globally visible

Fix up Duplex il ops getter.

Replace rz-list with rz_vector.

Fix off by one

Replace old list code

Add NULL checks

Set isa_id from template

Pass reg class not type: Set correct size when init rz-vector; Formatting

Copy bug fixes from Rizin.

Update endloop instructions.

Generate register profile without overlapping registers.

Decrease address size to 32bits for now.

Always use PC instead of C9

Update endloop functions

Add NULL check

Implement proper il op switch. Fixes invalid read.

Check for operand type during ISA -> OP conversion.

Fix double free

Set unsigned of valid length.

Update endloops

Mark packet after jump as valid.

Fix invalid value types

Update helper functions.

Print compiled instructions in progress bar.

Fix typo

Update compiler

Remove rebase duplication.

Correct grammar

Update non instruction IL code

Add predicate written flags.

Update compiler

Add functions to write registers and update double regs.

Fix register write.

Reset Pred write flags after executed instructions

Fix C4 write

Ensure that tmp predicate register is written.

Update compiler

Update generated files.

Update access time of packet on request.

Add warnings if IL op failed.

Check for jumps before buffered packet is returned.

Check for _tmp predicate register names.

Remove PRED_WRITE function

Update compiler

Remove empty file

Remove specific clang-format version

Use main branch for rzil compiler..

Update RzIL-Compiler

Imported and LLVM encodings if both are present.

Run black

Update compiler

Update generated files

Rename p -> pkt and formatting

Set slot in HexInsn.

Handle slot_cancelled during runtime

Rename rzil_compiler package

Update rzil_compiler

Fix type name

Replace function call with macro

Update rzil_compiler

Fix macros

Update compiler

Update setuptools

Remove debug printf

Update handwritten code from Rizin commits

Parse shortcode before instruction parsing.

Move constants to typedefs.h

Add macros and function declarations for operand read/write

Add TransformedInstruction.

Moves a lot of IL instruction logic to the compiler.

Update hand-written files.

Add missing C20-C29 registers

Access rzil getter correctly

Shorten log message.

Add missing semicolon.

Remove wrong alternative names for C20-C29

Add alias_to_op and explicit_to_op implementations.

Add implementation for IL regs read/writes

Add macros.h again to hexagon.h and move the previously defined macros to this file.

Implement canceling slots

Remove const from pkt initialization.

Implement compilation of instructions only defined used in QEMU and arbitrary code.

Add known sub_routines to the compiler.

Add more sub-routines and add support for circ_add.

Set RxV in fcircadd

Pass the bundle to fcircadd

Fix tests

Add const qualifier remove bracket

Remove unused functions.

Change visibility to API since those functions need to be reachable from analysis.

Remove px_written pseudo registers. They are tracked in the packet.

Use tracked read/writes to check if reg needs a sync.

Update compiler

Finish implementation of register read/writes.

Rename to commit_packet and il_ops_stats

Log reads/writes from low registers as well.

Fix assert reached: Cast values before LOGAND

Move sub-routine definitions to compiler.

Use jump sub-routines

Set C9 if no jump did it.

Don't buffer invalid 0x00000000 instructions due to IO intransparency.

Append jump to packet end, if there is no direct jump instruction in the packet.

Use macro value

Check for invalid instructions to determine packet emu readiness.

Print correct error.

Simplify packet commit to not add additional EMPTY() every time.

Add a jump_flag to determine if the jump to the next packet should happen.

Only init op if needed.

Fix setup of final effect sequence.

The setup-order is now 0...n, n...0 which broke the register write tracing.

Replace output filepaths with Path objects and ignore them in git.

Add getter for N-regs

Fix flag check

Add syntax of instruction as comment for searching.

Add count leading zeros/ones as sub-routines.

Print missing functions

Remove unnecessary parameters from log read/write reg functions

Use correct shift amount for doub reg writes.

Remove unnecessary AND

Handle case when registers are read and written.

The returned value in this case must be the .new value after the first write.

Update compiler

Resolve register numbers from asm version to enum id before using them

Set correct flags.

Update rzil_compiler

Add operators like &= to syntax coloring

Track predicate writes for each slot.

Allow read of C9 (PC) which is not present in the VM.

Update compiler

Increase number of state packets since 8 are too few and break rz-tracetest.

Remove incorrect increment

Update compiler

Hopefully fix the annoying buffering problem for the last time.

Parse assignments like &= in asm string

Return NULL or U32 from get_reg_field

Update compiler

Add debug printing function to trace access to buffer of packets.

Use rizin mem read function to get int from buffer

Add more detail to debug state printing

Fix add to packet algorithm

The decision about which packet an instruction
should be added to was made too early. Instructions
which belonged to another packet were added to a stale one.

Remove PS_ instructions.

Remove unused function.

Handle MOD registers in il reg read/write

Handle writes to immutable and partly immutable registers.

Handle read/write of alias P3:0 register

Add resolve function for Mod to CS regs

Add struct members for floats.

Add macros for floats

Update compiler

Run black

Update license info

Several tiny syntactical and style fixes

Reduce indentation by inverting NULL check.

Add missing newlines before function definitions.

Replace string path with Path().

Fix rw overlap check by only performing it on x register.

Add RzIL tests.

Update generated files

Add missing asm tests

Return NULL if register name could not be resolved and mark instruction as invalid.

Decrease log level as it spams during aaa

Update analysis tests.

Update CC generation with upper case reg names

Fix UndefinedBehaviorSanitizer error for int is shifted by 31

Fix resource leaks

Reduce log level for invalid duplex classes.

Those get often hit by disassembling invalid instructions.

Only generate IL code if requested.

Enable load_align test.

Fix syntax: Remove trailing ; for invalid decodes with parse_bits == 0.

Add angle brackets type annotations.

Add NULL check for asm_toks

Add updated pcre2 regex patterns

Sync from rizin branch

Sync with C source.

Fix includes with new RzArch refactoring

Remove useless comments
@Rot127 Rot127 merged commit d9955c1 into rizinorg:main Mar 22, 2024
3 checks passed
@Rot127 Rot127 deleted the uplifting branch March 22, 2024 07:48
Development

Successfully merging this pull request may close these issues.

Virtual Python environment
4 participants