New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize shadow stack instruction sequences #928
base: master
Are you sure you want to change the base?
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #928 +/- ##
==========================================
- Coverage 81.79% 81.79% -0.01%
==========================================
Files 262 262
Lines 24302 24499 +197
==========================================
+ Hits 19878 20039 +161
- Misses 4424 4460 +36 ☔ View full report in Codecov by Sentry. |
These calls were necessary since previously fuse algorithms missed some important checks to filter out invalid instruction fusion with non-instruction value producers such as i32.const etc.
This fixes a bug with a niche value when rhs=i16::MIN since there is no symmetric equivalent for -i16::MIN that fits inside an i16 value. This made the optimization fail for rhs=i16::MIN. By always compiling isub as iadd with negated rhs value we avoid this situation gracefully.
Comment on why this has not yet been merged despite passing all CI tests and working: it introduces a lot of complexity into the translation pipelined compared to the gains we see and can verify. The runtime gains are mostly single digits and restricted to sets of benchmarks that actually make use of the shadow stack. The So all in all the question is whether the gains are worth the added complexities. |
The |
Closes #920.
TODOs
i32.add_imm
withglobal.set 0
global.get 0
withi32.add_imm
global.get 0
+i32.add_imm
withi32.local_tee
andglobal.set 0
i32.add
with a function local constantrhs
register.