New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[seqbench] the-good-parts-mk3
sequence
#15030
base: develop
Are you sure you want to change the base?
Conversation
the-good-parts-mk3
sequence
|
File name | IR optimized | Legacy optimized | Legacy |
---|---|---|---|
array/create_memory_array.sol |
9% | ||
array/array_storage_push_empty_length_address.sol |
8% | ||
externalContracts/base64.sol |
8% | -0% | |
array/array_storage_index_boundary_test.sol |
7% | ||
array/delete/bytes_delete_element.sol |
7% | -0% | |
array/copying/cleanup_during_multi_element_per_slot_copy.sol |
6% | ||
array/copying/storage_memory_packed_dyn.sol |
5% | ||
array/copying/array_copy_cleanup_uint40.sol |
5% | ||
abiEncoderV2/calldata_array.sol |
3% | ||
inheritance/inherited_function_calldata_memory_interface.sol |
3% | ||
abiEncoderV2/abi_encode_calldata_slice.sol |
3% | -0% | |
abiEncoderV1/abi_encode_calldata_slice.sol |
3% | -0% | |
array/array_storage_push_empty.sol |
2% | ||
array/array_storage_push_pop.sol |
2% | ||
array/array_storage_index_zeroed_test.sol |
2% | ||
isoltestTesting/balance_other_contract.sol |
2% | ||
externalContracts/prbmath_unsigned.sol |
2% | ||
array/dynamic_arrays_in_storage.sol |
2% | ||
externalContracts/FixedFeeRegistrar.sol |
1% | +0% | |
various/erc20.sol |
1% | +0% | |
abiEncoderV2/abi_encode_v2_in_function_inherited_in_v1_contract.sol |
1% | -0% | |
events/event_indexed_string.sol |
1% | -0% | |
array/fixed_arrays_as_return_type.sol |
1% | -0% | |
various/selfdestruct_pre_cancun_multiple_beneficiaries.sol |
1% | ||
various/selfdestruct_post_cancun_multiple_beneficiaries.sol |
1% | ||
userDefinedValueType/erc20.sol |
1% | +0% | |
array/copying/array_of_struct_calldata_to_storage.sol |
1% | ||
array/copying/array_copy_clear_storage.sol |
1% | ||
array/copying/memory_dyn_2d_bytes_to_storage.sol |
1% | -0% | |
constructor/bytes_in_constructors_packer.sol |
+0% | 1% | |
events/event_dynamic_array_storage_v2.sol |
1% | +0% | |
events/event_dynamic_array_storage.sol |
1% | +0% | |
array/array_storage_index_access.sol |
1% | ||
externalContracts/prbmath_signed.sol |
1% | ||
array/push/push_no_args_bytes.sol |
1% | ||
various/value_complex.sol |
1% | ||
various/value_insane.sol |
+0% | ||
salted_create/salted_create_with_value.sol |
+0% | ||
byte_array_to_storage_cleanup.sol |
1% | -0% | |
array/push/array_push.sol |
+0% | ||
structs/memory_structs_nested_load.sol |
+0% | ||
externalContracts/snark.sol |
+0% | ||
array/copying/array_of_struct_memory_to_storage.sol |
+0% | ||
array/copying/storage_memory_nested_from_pointer.sol |
+0% | ||
array/copying/storage_memory_nested.sol |
+0% | ||
various/selfdestruct_post_cancun_redeploy.sol |
+0% | ||
various/selfdestruct_pre_cancun_redeploy.sol |
+0% | ||
array/copying/array_copy_storage_storage_different_base.sol |
+0% | ||
array/copying/copy_byte_array_to_storage.sol |
+0% | -0% | |
array/copying/array_copy_different_packing.sol |
+0% | ||
externalContracts/deposit_contract.sol |
+0% | +0% | |
structs/struct_memory_to_storage_function_ptr.sol |
+0% | ||
array/copying/array_copy_storage_storage_dynamic_dynamic.sol |
+0% | ||
structs/struct_delete_storage_with_array.sol |
+0% | ||
array/push/nested_bytes_push.sol |
+0% | -0% | |
array/constant_var_as_array_length.sol |
+0% | +0% | |
array/copying/array_of_structs_containing_arrays_calldata_to_storage.sol |
+0% | ||
array/dynamic_multi_array_cleanup.sol |
+0% | ||
structs/struct_copy_via_local.sol |
+0% | ||
abiEncoderV1/struct/struct_storage_ptr.sol |
+0% | ||
libraries/using_library_mappings_public.sol |
+0% | ||
array/fixed_arrays_in_constructors.sol |
+0% | ||
array/push/array_push_struct.sol |
+0% | ||
array/copying/nested_array_element_storage_to_storage.sol |
+0% | ||
viaYul/copy_struct_invalid_ir_bug.sol |
+0% | ||
userDefinedValueType/calldata.sol |
+0% | -0% | |
array/copying/array_copy_storage_to_memory_nested.sol |
+0% | +0% | |
abiEncoderV2/abi_encode_v2.sol |
+0% | +0% | |
storage/packed_storage_structs_bytes.sol |
+0% | ||
array/copying/calldata_array_dynamic_to_storage.sol |
+0% | -0% | |
array/copying/bytes_storage_to_storage.sol |
+0% | -0% | |
array/dynamic_array_cleanup.sol |
+0% | ||
immutable/use_scratch.sol |
+0% | ||
array/copying/array_nested_calldata_to_storage.sol |
+0% | ||
constructor/constructor_static_array_argument.sol |
+0% | -0% | |
events/event_dynamic_nested_array_storage_v2.sol |
+0% | +0% | |
various/destructuring_assignment.sol |
+0% | -0% | |
array/copying/array_copy_target_simple.sol |
+0% | ||
functionCall/creation_function_call_with_salt.sol |
+0% | ||
array/copying/array_copy_storage_storage_different_base_nested.sol |
+0% | ||
array/push/array_push_struct_from_calldata.sol |
+0% | -0% | |
array/copying/array_copy_including_array.sol |
+0% | ||
array/copying/storage_memory_nested_struct.sol |
+0% | ||
structs/copy_struct_array_from_storage.sol |
+0% | ||
functionCall/creation_function_call_with_args.sol |
+0% | ||
array/copying/array_nested_memory_to_storage.sol |
+0% | ||
array/copying/nested_dynamic_array_element_calldata_to_storage.sol |
+0% | ||
array/fixed_array_cleanup.sol |
+0% | ||
array/copying/array_copy_storage_storage_static_dynamic.sol |
+0% | ||
array/copying/elements_of_nested_array_of_structs_calldata_to_storage.sol |
+0% | ||
array/copying/nested_array_of_structs_calldata_to_storage.sol |
+0% | ||
abiencodedecode/abi_decode_simple_storage.sol |
+0% | -0% | |
structs/calldata/calldata_struct_with_nested_array_to_storage.sol |
+0% | +0% | |
array/bytes_length_member.sol |
+0% | ||
array/reusing_memory.sol |
+0% | ||
array/pop/array_pop_array_transition.sol |
+0% | ||
types/mapping/copy_from_mapping_to_mapping.sol |
+0% | +0% | |
array/copying/array_of_structs_containing_arrays_memory_to_storage.sol |
+0% | ||
array/push/array_push_nested_from_calldata.sol |
+0% | -0% | |
abiEncoderV1/abi_decode_v2_storage.sol |
+0% | +0% | |
array/copying/nested_array_of_structs_memory_to_storage.sol |
+0% | ||
array/push/push_no_args_2d.sol |
+0% | ||
array/copying/array_storage_multi_items_per_slot.sol |
+0% | ||
structs/struct_copy.sol |
+0% | ||
abiEncoderV2/calldata_overlapped_dynamic_arrays.sol |
+0% | -0% | |
array/copying/copy_function_internal_storage_array.sol |
+0% | ||
array/copying/array_copy_storage_storage_dyn_dyn.sol |
+0% | ||
various/staticcall_for_view_and_pure.sol |
-0% | ||
array/array_storage_length_access.sol |
-0% | ||
calldata/copy_from_calldata_removes_bytes_data.sol |
-0% | ||
array/copying/array_copy_storage_storage_static_static.sol |
-0% | ||
array/copying/bytes_inside_mappings.sol |
-0% | ||
array/copying/nested_array_of_structs_storage_to_storage.sol |
-0% | ||
array/invalid_encoding_for_storage_byte_array.sol |
-0% | -0% | |
array/copying/array_of_function_external_storage_to_storage_dynamic_different_mutability.sol |
-0% | +0% | |
constructor/no_callvalue_check.sol |
-0% | ||
array/copying/copy_byte_array_in_struct_to_storage.sol |
-0% | -0% | |
array/copying/array_copy_nested_array.sol |
-0% | -0% | |
array/array_memory_index_access.sol |
-0% | ||
array/copying/copy_removes_bytes_data.sol |
-0% | ||
various/many_subassemblies.sol |
-0% | ||
array/copying/elements_of_nested_array_of_structs_memory_to_storage.sol |
-0% | ||
immutable/multi_creation.sol |
-0% | ||
constructor/bytes_in_constructors_unpacker.sol |
-0% | +0% | |
array/copying/array_of_function_external_storage_to_storage_dynamic.sol |
-0% | -0% | |
array/copying/storage_memory_nested_bytes.sol |
-0% | +0% | |
structs/struct_containing_bytes_copy_and_delete.sol |
-0% | -0% | |
array/copying/array_copy_storage_storage_struct.sol |
-0% | ||
array/copying/function_type_array_to_storage.sol |
-0% | -0% | |
abiEncoderV2/storage_array_encoding.sol |
-0% | -0% | |
structs/copy_substructures_from_mapping.sol |
-0% | +0% | |
various/skip_dynamic_types_for_structs.sol |
-0% | -0% | |
array/copying/array_copy_target_simple_2.sol |
-0% | ||
storage/empty_nonempty_empty.sol |
-0% | -0% | |
array/copying/array_to_mapping.sol |
-0% | +0% | |
array/copying/array_elements_to_mapping.sol |
-0% | +0% | |
array/copying/calldata_array_to_mapping.sol |
-0% | ||
array/push/byte_array_push_transition.sol |
-0% | ||
structs/copy_to_mapping.sol |
-0% | +0% | |
structs/copy_from_mapping.sol |
-0% | +0% | |
array/pop/byte_array_pop_long_storage_empty_garbage_ref.sol |
-0% | ||
array/copying/array_copy_calldata_storage.sol |
-0% | -0% | |
array/pop/array_pop_uint16_transition.sol |
-0% | ||
array/copying/arrays_from_and_to_storage.sol |
-0% | ||
libraries/internal_types_in_library.sol |
-0% | ||
array/pop/array_pop_uint24_transition.sol |
-0% | ||
functionCall/gas_and_value_brace_syntax.sol |
-0% | ||
functionCall/gas_and_value_basic.sol |
-0% | ||
structs/copy_substructures_to_mapping.sol |
-0% | +0% | |
functionCall/external_call_to_nonexisting_debugstrings.sol |
-1% | +0% | |
constructor/arrays_in_constructors.sol |
-1% | +0% | |
structs/structs.sol |
-0% | ||
functionCall/mapping_array_internal_argument.sol |
-0% | ||
array/pop/byte_array_pop_masking_long.sol |
-0% | ||
structs/conversion/recursive_storage_memory.sol |
-0% | ||
externalContracts/strings.sol |
-0% | +0% | |
libraries/using_library_mappings_return.sol |
-1% | ||
array/copying/array_copy_target_leftover.sol |
-0% | -0% | |
various/address_code.sol |
-0% | -0% | |
array/byte_array_transitional_2.sol |
-1% | ||
array/arrays_complex_from_and_to_storage.sol |
-1% | +0% | |
various/create_calldata.sol |
-1% | ||
functionCall/external_call_to_nonexisting.sol |
-1% | 1% | |
array/pop/byte_array_pop_long_storage_empty.sol |
-1% | ||
inlineAssembly/transient_storage_selfdestruct.sol |
-1% | ||
array/function_array_cross_calls.sol |
-3% | 1% | |
externalContracts/ramanujan_pi.sol |
-5% |
Time benchmark
|
File | Pipeline | Bytecode size | Time | Exit code |
---|---|---|---|---|
verifier.sol |
legacy | 4899 bytes | 0.14 s | 0 |
verifier.sol |
via-ir | 4324 bytes | 0.35 s | 0 |
OptimizorClub.sol |
legacy | 0 bytes | 0.33 s | 1 |
OptimizorClub.sol |
via-ir | 22175 bytes | 1.79 s | 0 |
chains.sol |
legacy | 5869 bytes | 0.11 s | 0 |
chains.sol |
via-ir | 21401 bytes | 10.72 s | 0 |
Compared to the-good-parts-mk2
, timing is pretty much identical. Bytecode size differences are in single digits (but in favor of mk2).
default sequence for reference
File | Pipeline | Bytecode size | Time | Exit code |
---|---|---|---|---|
verifier.sol |
legacy | 4874 bytes | 0.15 s | 0 |
verifier.sol |
via-ir | 4351 bytes | 0.58 s | 0 |
OptimizorClub.sol |
legacy | 0 bytes | 0.47 s | 1 |
OptimizorClub.sol |
via-ir | 22193 bytes | 3.68 s | 0 |
chains.sol |
legacy | 5845 bytes | 0.16 s | 0 |
chains.sol |
via-ir | 23043 bytes | 21.72 s | 0 |
External test benchmark diffCI run on
|
project | bytecode_size | deployment_gas | method_gas |
---|---|---|---|
brink | 0% |
||
colony | 0% |
||
elementfi | 0% |
||
ens | 0% |
||
euler | |||
gnosis | |||
gp2 | 0% |
||
perpetual-pools | 0% |
0% |
-0.01% ✅ |
pool-together | 0% |
||
uniswap | 0% |
||
yield_liquidator | 0% |
-0% |
0% |
zeppelin |
ir-optimize-evm+yul
project | bytecode_size | deployment_gas | method_gas |
---|---|---|---|
brink | +0.61% ❌ |
||
colony | +0.13% ❌ |
||
elementfi | -1.87% ✅ |
||
ens | -1.2% ✅ |
-2.96% ✅ |
-0.01% ✅ |
euler | +1% ❌ |
||
gnosis | |||
gp2 | +0.5% ❌ |
||
perpetual-pools | -0.23% ✅ |
-0.52% ✅ |
+0.02% ❌ |
pool-together | -1.29% ✅ |
||
uniswap | +1.67% ❌ |
||
yield_liquidator | +0.84% ❌ |
-0.2% ✅ |
+0.14% ❌ |
zeppelin | -0.48% ✅ |
-1.3% ✅ |
-0.01% ✅ |
ir-optimize-evm-only
project | bytecode_size | deployment_gas | method_gas |
---|---|---|---|
brink | 0% |
||
colony | 0% |
||
elementfi | 0% |
||
ens | 0% |
0% |
0% |
euler | |||
gnosis | |||
gp2 | 0% |
||
perpetual-pools | 0% |
+0% |
-0% |
pool-together | 0% |
||
uniswap | 0% |
||
yield_liquidator | 0% |
+0% |
0% |
zeppelin | 0% |
legacy-no-optimize
project | bytecode_size | deployment_gas | method_gas |
---|---|---|---|
brink | 0% |
||
colony | 0% |
||
elementfi | 0% |
||
ens | 0% |
||
euler | 0% |
||
gnosis | 0% |
||
gp2 | 0% |
||
perpetual-pools | 0% |
-0% |
+0.01% ❌ |
pool-together | 0% |
||
uniswap | 0% |
||
yield_liquidator | 0% |
+0% |
0% |
zeppelin | 0% |
+0% |
-0.06% ✅ |
legacy-optimize-evm+yul
project | bytecode_size | deployment_gas | method_gas |
---|---|---|---|
brink | -0.03% ✅ |
||
colony | +0.13% ❌ |
||
elementfi | +0.11% ❌ |
||
ens | +0.21% ❌ |
-0.01% ✅ |
-0% |
euler | +0.16% ❌ |
||
gnosis | +0.19% ❌ |
||
gp2 | +0.33% ❌ |
||
perpetual-pools | -0.01% ✅ |
+0.01% ❌ |
+0.01% ❌ |
pool-together | +0.04% ❌ |
||
uniswap | +0.04% ❌ |
||
yield_liquidator | +0.06% ❌ |
+0.02% ❌ |
-0.01% ✅ |
zeppelin | +0.17% ❌ |
+0.18% ❌ |
+0.02% ❌ |
legacy-optimize-evm-only
project | bytecode_size | deployment_gas | method_gas |
---|---|---|---|
brink | 0% |
||
colony | 0% |
||
elementfi | 0% |
||
ens | 0% |
0% |
0% |
euler | 0% |
||
gnosis | 0% |
||
gp2 | 0% |
||
perpetual-pools | 0% |
+0% |
-0% |
pool-together | 0% |
||
uniswap | 0% |
||
yield_liquidator | 0% |
+0% |
0% |
zeppelin | 0% |
-0% |
+0.13% ❌ |
!V
= version mismatch
!B
= no value in the "before" version
!A
= no value in the "after" version
!T
= one or both values were not numeric and could not be compared
-0
= very small negative value rounded to zero
+0
= very small positive value rounded to zero
External test compilation time
Table includes also timing of the default sequence ( |
4f23db2
to
2703127
Compare
This is now reviewable as the final candidate to replace the default sequence. See #14406 (comment) for optimization quality and timing comparison between sequences. Specifically the default one and different variants of |
@bshastry Can you do some fuzzing here? |
Here's my analysis of hard and soft prerequisites for each step in the new sequence. Also, the same thing for the current sequence for comparison.
|
// let p := mload(0x40) | ||
// mstore(0x40, add(p, 96)) | ||
// let _1 := add(p, 128) | ||
// mstore(_1, 2) | ||
// mstore(0x40, 0x20) | ||
// for { } iszero(1) { } | ||
// { } | ||
// sstore(0, _1) | ||
// sstore(1, mload(0x40)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This strikes as quite a bad regression - not that the case wasn't fully artificial to begin with, but the test case doesn't really make sense with this result anymore...
And it is rather suspicious that suddenly none of the things like removing empty for loops or resolving memory accesses happen anymore here :-). So the question is: can you easily explain why that is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(The solution here may rather be changing the test case, rather than trying to tweak the sequence further, but it'd still be interesting to have an explanation for why this changes this drastically)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or well, maybe we don't even have to change the test case - we can still revisit this, if we're getting actual reports about real-world cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I investigated this regression for the previous sequence candidate: #14928 (review). I see that now the regression is a bit worse - that for
loop wasn't there back then - but the same tweak should still resolve it here.
I ultimately decided not to include that tweak though, because when I benchmarked it, it seemed to make both bytecode size and runtime gas slightly worse in external tests and gas diff: #14928 (comment). Same with a few other tweaks that solved regressions - see links in the PR description.
It wasn't that much worse, so I could bring it back if you think it's worth it but unfortunately the general pattern I see is that resolving these regressions on their own can often be detrimental, at least in the contracts we use for benchmarking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we instead change the test case, s.t. it optimizes well again :-D?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are artificial cases, so we shouldn't hand-tweak the optimizer for these over the external tests - but the test case is a bit weird - I mean, arguably the way it's now this is a failing test :-).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I assume that the artificial part is the loop? I see it and the if
above it were added in #5584 to test StructuralSimplifier. I removed it from the test and now the new sequence can deal with resolving the memory loads.
Still, I'm not sure if removing this is the right thing to do just because the new sequence can't handle it. It is a bit artificial, but it worked with the old sequence and the fact that it regressed is also potentially useful information.
I mean, maybe the case no longer makes sense with it (assuming it was meant to test memory load resolution - the name does not indicate that though), but then we could have a test specifically for the simplification:
{
if sub(2, 1) {
mstore(0x40, 0x20)
for {} sub(1, 1) {} {}
}
sstore(1, mload(0x40))
}
I noticed that even this is a problem for the new sequence, even though it's much simpler. And to me it seems much less artificial. I could see some production code actually reducing to something close to this after passing through other steps. For now I added it as fullSuite/structural_simplification.sol
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, I noticed that StructuralSimplifier (specifically the one used in the TOntnfDIul
segment) can deal with the loop if insert ExpressionSimplifier before it (TOnstnfDIul
). And it makes sense because it looks like StructuralSimplifier on its own can't deal with even simple constant expressions, e.g. for {} iszero(1) {} {}
. It needs the condition to be already resolved into false
. Given that, shouldn't ExpressionSimplifier be recommended before StructuralSimplifier?
Also, ExpressionSimplifier only really helps it deal with the loop. The memory load is not resolved because simplification runs after the last occurrence of LoadResolver. I was curious if running the resolver once more, after the simplifier helps, and it does (when paired with CSE). So making it TOnstcLnfDIul
lets the new sequence deal even with this artificial example. I did not add it to the sequence in the PR because it could again make things worse in other cases and would generally require rerunning all benchmarks. I can try it in a separate PR later though.
2703127
to
136540b
Compare
As requested, I compared the new sequence with the old one, looking for steps that used to always have some properties satisfied due to how the sequence was structured and now no longer do. This is the case for the following steps:
Those are the properties I could easily check. Note that "non-CSE code" is defined as in my previous comment, which I'm not sure is correct. But this property does not strike me like something that could have impact on correctness, and I only really checked it only because I already had the data at hand. By "weak expression-split form" I mean what's left of it after LiteralRematerialiser, ExpressionSimplifier and ForLoopConditionIntoBody run. Especially for the first one Daniel argues that the property is not destroyed in a meaningful way. I also updated the "diagrams" above to include this new information. Overall, after looking at those steps, I would not expect any of the above to cause problems. At least not ones that can be easily predicted based on how they operate. |
- The structural simplification here is pretty artificial while the main point of the test is resolution of memory loads. - In exchange add a simpler case covering structural simplification for the whole sequence.
…didate - `the-good-parts` variant found using seqbench. - Third refinement: - adjusted to improve results for the erc20.sol contract. - adjustments to address regressions visible in test output of the previous version.
136540b
to
e19abe4
Compare
Related to #14406
Depends on #15023 (only for benchmarking).MergedThis is a variant I built by addressing regressions that were still present in
the-good-parts-mk2
compared to the default sequence. It does not contain all the fixes - some of them make benchmarks worse, especially in external tests. I chose only ones that were positive (or at least neutral while removing some regressions):the-good-parts-mk2
sequence #14928 (comment) (r
beforep
)the-good-parts-mk2
sequence #14928 (comment) (M
aftergvif
)the-good-parts-mk2
sequence #14928 (comment) (x
before[scCTUt]
)the-good-parts-mk2
sequence #14928 (comment) (O
in cleanup)Those are the modifications I ultimately rejected:
the-good-parts-mk2
sequence #14928 (comment) (t
andc
beforeL
)the-good-parts-mk2
sequence #14928 (comment) (ciTFviu
;Tr
in cleanup)the-good-parts-mk2
sequence #14928 (comment) (Tr
in cleanup)the-good-parts-mk2
sequence #14928 (comment) (U
in cleanup)the-good-parts-mk2
sequence #14928 (comment) (U
+O
in cleanup)