Improve default optimizer steps sequence in via-ir
#14406
Comments
Just to elaborate a bit: the current sequence transforms to SSA and back in a loop. Based on that, we should generally be able to transform to SSA and stay in SSA up until code generation. However, that would require ensuring that none of the steps that perform meaningful optimizations destroy the SSA property. Actually changing any such steps to preserve SSA form I'd consider a separate task. This first task should merely investigate what can be achieved by modifying the sequence without actually touching the steps, keeping in mind that transforming back from SSA should no longer add value or contribute to "compilability" relative to the current code transform.
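To make the "to SSA and back" point concrete, here is a minimal sketch that counts how often a step sequence enters SSA form and later leaves it again. It assumes the single-letter abbreviations `a` (SSATransform) and `V` (SSAReverser) from the Solidity optimizer documentation; those should be verified against the compiler version in use. The sequences in the usage note are made up for illustration, and the sketch ignores the bracketed loops a real sequence can contain, so it only counts round trips as written, not as executed.

```python
# Assumed abbreviations (check the Solidity optimizer docs for your version).
SSA_TRANSFORM = "a"  # SSATransform
SSA_REVERSER = "V"   # SSAReverser


def ssa_round_trips(sequence: str) -> int:
    """Count how often the sequence enters SSA ('a') and later leaves it ('V').

    Brackets and all other steps are skipped, so loops are not unrolled.
    """
    in_ssa = False
    trips = 0
    for step in sequence:
        if step == SSA_TRANSFORM:
            in_ssa = True
        elif step == SSA_REVERSER and in_ssa:
            in_ssa = False
            trips += 1
    return trips
```

For example, `ssa_round_trips("xaEscVcul")` reports one round trip, while a hypothetical sequence that never reverses, such as `"xaEscul"`, reports none.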
While we're at it, we should double-check if the effects of the
For future reference, here are the initial results I showed last week:
And here's a bit more analysis:
I haven't analyzed it all yet so here are the plots and some casual observations. I still need to go over those plots in more detail.
Results
Contract with different sequences
Sequence with different contracts
Observations
Contract
Since the
Results look promising: a 17-46% decrease in total compilation time. Gas benchmarks also aren't too bad. While I can see up to a 25% increase in gas in a few tests, the majority are impacted very little, and benchmark diffs for external tests look much more reasonable: most of them are below 1.5% for both deployment and runtime. Also, in many cases costs actually decrease. ENS is an outlier with bytecode size increasing by 5%, though on the other hand ElementFI had bytecode decrease by 4%. Overall I think that we might be able to tweak this sequence enough to get results very close to the current default. It's also worth noting that the running time decrease we're seeing with this sequence may very likely be the bottom line of what we can achieve by tweaking the sequence. The plots show that most of the gains happen in that first pass so we'll need to keep most of it in one form or another.
And here's some analysis of the data I gathered so far. This is more or less what I presented on the call today.
Analysis
Conclusions
It seems that a single-pass sequence is the way to go. Such sequences are significantly faster than
Even if we don't manage to consistently beat
Some more loose observations from looking at seqbench results recently:
And here's the overall comparison of all the sequences I tested so far.
Remarks
Contract deposit_contract
Contract FixedFeeRegistrar
Contract prbmath_unsigned
Contract ramanujan_pi
Contract strings
Final summary for
| | default runtime gas | the-good-parts runtime gas | default bytecode size | the-good-parts bytecode size | default compilation time (optimized) | the-good-parts compilation time (optimized) |
|---|---|---|---|---|---|---|
| deposit_contract | -15.5% | -15.3% | -37.0% | -39.3% | 430 ms | 259 ms |
| FixedFeeRegistrar | -5.6% | -5.6% | -33.8% | -32.4% | 193 ms | 135 ms |
| prbmath_unsigned | -44.4% | -44.5% | -30.8% | -29.6% | 600 ms | 409 ms |
| ramanujan_pi | -67.2% | -67.2% | -39.2% | -42.9% | 622 ms | 233 ms |
| strings | -11.9% | -11.9% | -26.1% | -25.9% | 430 ms | 239 ms |
Conclusions
The new sequence clearly beats the single-pass sequence. Comparison with default is more mixed.
The outliers seem a bit concerning. Still, they're not extreme and go both ways so the new sequence is not consistently worse in any metric. In many cases it's a little better.
Overall results seem close enough that we're probably fine using it. I'm pretty sure the sequence can still be refined a little if we spend more time on it. All the sequences I tested have cases where they're better than default by a few percent, most of the time without doing as many repetitions. The trick is to get one that can do it consistently on most input.
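The compilation time columns of the summary table can be turned into relative reductions with a few lines of arithmetic. The sketch below only restates the numbers already listed for default and the-good-parts; the dictionary and function names are illustrative and not part of any benchmark tooling.

```python
# (default ms, the-good-parts ms), copied from the summary table.
compile_ms = {
    "deposit_contract": (430, 259),
    "FixedFeeRegistrar": (193, 135),
    "prbmath_unsigned": (600, 409),
    "ramanujan_pi": (622, 233),
    "strings": (430, 239),
}


def reduction_percent(before_ms: float, after_ms: float) -> float:
    """Relative decrease in compilation time, in percent of the baseline."""
    return 100.0 * (before_ms - after_ms) / before_ms


reductions = {
    name: reduction_percent(before, after)
    for name, (before, after) in compile_ms.items()
}

for name, pct in reductions.items():
    print(f"{name}: {pct:.1f}% less compilation time")
```

On these five contracts the reduction ranges from roughly 30% (FixedFeeRegistrar) to roughly 63% (ramanujan_pi).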
Effects of removing StackCompressor
The results are oddly mixed: from completely neutral to very negative. On one hand, I see almost no change in gas usage in our semantic tests. I also ran seqbench on the
On the other hand, results of external tests are terrible (
In some cases zero difference, but in others up to 10% increase in bytecode size and 6% increase in gas usage. And no cases where results improved. Finally, the change caused some new
| | default runtime gas | the-good-parts-mk2 runtime gas | the-good-parts runtime gas | default bytecode size | the-good-parts-mk2 bytecode size | the-good-parts bytecode size | default compilation time | the-good-parts-mk2 compilation time | the-good-parts compilation time |
|---|---|---|---|---|---|---|---|---|---|
| deposit_contract | -15.5% | -15.3% | -15.3% | -37.0% | -39.4% | -39.3% | 430 ms | 248 ms | 259 ms |
| erc20 | -4.2% | -4.2% | -4.1% | -37.0% | -35.9% | -34.2% | 149 ms | 113 ms | 94 ms |
| FixedFeeRegistrar | -5.6% | -5.6% | -5.6% | -33.8% | -32.8% | -32.4% | 193 ms | 120 ms | 135 ms |
| prbmath_unsigned | -44.4% | -44.5% | -44.5% | -30.8% | -29.6% | -29.6% | 600 ms | 403 ms | 364 ms |
| ramanujan_pi | -67.2% | -67.2% | -67.2% | -39.2% | -42.9% | -42.9% | 622 ms | 238 ms | 233 ms |
| strings | -11.9% | -11.9% | -11.9% | -26.1% | -27.2% | -25.9% | 430 ms | 236 ms | 239 ms |
Conclusions
The new sequence is the same as or better than the-good-parts in nearly all cases. Speed is also pretty much the same.
It's still not as good as default but also still much faster.
Comparison of
The default optimizer sequence we have at the moment is quite long and thus bloats compilation times.
The outcome should hopefully be a shorter sequence that achieves the same level of optimization while reducing compile times (in the context of the optimizer pipeline). Special attention should be paid to steps that destroy the SSA form of their input (the form better suited for optimizing) and thus cause subsequent steps to perform poorly.
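For anyone who wants to try a candidate sequence, one way is to pass it through the compiler's Standard JSON input, where the `settings.optimizer.details.yulDetails.optimizerSteps` field overrides the default sequence and `settings.viaIR` enables the IR pipeline. The helper below is a minimal sketch: the function name, the toy contract, and the step string are placeholders chosen for illustration, not any of the sequences benchmarked in this issue.

```python
import json


def standard_json_input(source_name: str, source_code: str, optimizer_steps: str) -> dict:
    """Build a Standard JSON input that compiles via IR with a custom Yul step sequence."""
    return {
        "language": "Solidity",
        "sources": {source_name: {"content": source_code}},
        "settings": {
            "viaIR": True,
            "optimizer": {
                "enabled": True,
                "details": {
                    "yul": True,
                    "yulDetails": {"optimizerSteps": optimizer_steps},
                },
            },
            "outputSelection": {"*": {"*": ["evm.bytecode.object"]}},
        },
    }


# Placeholder contract and placeholder step string (hypothetical, for illustration only).
candidate = standard_json_input(
    "C.sol",
    "contract C { function f() public pure returns (uint) { return 42; } }",
    "dhfoDgvulfnTUtnIf",
)
payload = json.dumps(candidate, indent=2)
```

The resulting `payload` can be fed to `solc --standard-json` on standard input. The same sequence can also be passed on the command line with `--yul-optimizations`.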