Add missing dependencies and extra info to optimizer step docs #15054

cameel · 2024-04-25T17:12:17Z

~~Depends on #15053.~~ Merged.

The current optimizer docs are pretty incomplete when it comes to documenting prerequisites and recommended steps to run before/after. I need up to date info to verify that the new sequence (#15030) satisfies these requirements so I went over the docs and the docstrings of the steps in the code, compared them and amended the docs where necessary.

I also compiled a table listing them, which is what I will actually refer to when analyzing the sequence - doing it by reading the docs is too unwieldy.

cameel · 2024-04-25T17:15:43Z

docs/cheatsheet.rst

+=========================================== ===================================================== =============================================================== ===========================================
+Step                                        Hard prerequisites                                    Steps that should run before                                    Steps that should run after
+=========================================== ===================================================== =============================================================== ===========================================
+:ref:`a <ssa-transform>`                    Disambiguator, ForLoopInitRewriter                    ExpressionSplitter, CommonSubexpressionEliminator               UnusedAssignEliminator
+:ref:`C <conditional-simplifier>`           Disambiguator                                         SSATransform, DeadCodeEliminator
+:ref:`c <common-subexpression-eliminator>`  Disambiguator, ForLoopInitRewriter                    ExpressionSplitter, SSATransform                                UnusedPruner, UnusedAssignEliminator


Initially I wanted to just add the extra info to the table we already have in the optimizer docs, but our current CSS only allows very narrow tables. I even had to cut full names here to avoid the scrollbar hiding most of the info.

The best compromise I found was to create a new table in the cheatsheet. I don't like it very much but I like other options even less. I could maybe put it in the optimizer docs, but having two tables side by side seems weird. This is the kind of summarized information that actually fits the purpose of the cheatsheet, it's just that this is not the info users will be reaching for often, it will be mostly useful to us.

The nice thing is that here I can sort the steps alphabetically by their abbreviations, which is much more useful than the original order when decoding the sequences.

Now you have the large comment exactly on the first line I wanted to comment on :-D...
Why the SSATransform here? There could be different implementations for the data flow analyzer, which might mean that pseudo-SSA form helps - but currently, if anything, it may get more slower due to larger numbers of variables, won't it?

EDIT: Ah, it's in the previous docs about it. And I see the point made there - not sure that makes much of a difference in practice and in almost all cases having ever run an SSA transform will do that (rather than having to be in pseudo-SSA form), but alright.

Yeah, this table is not new information. It only summarizes info already available in step docs (in some cases amended with some extra info I pulled from the docstrings). This is what it says currently:

This step is especially efficient if the ExpressionSplitter is run
before. If the code is in pseudo-SSA form,
the values of variables are available for a longer time and thus we
have a higher chance of expressions to be replaceable.

So you're basically saying that there's some nuance missing in the current docs - because they do make it sound like you might want to use SSATransform here. In that case I'll try to adjust them based on what you said and then remove this from the table.

Now you have the large comment exactly on the first line I wanted to comment on :-D...

You know that you can make multiple separate review comments for the same line, right?

cameel · 2024-04-25T17:20:56Z

docs/internals/optimizer.rst

@@ -411,6 +411,8 @@ its visibility and it is impossible to reference variables defined in a differen
 The benefit of this stage is that function definitions can be looked up more easily
 and functions can be optimized in isolation without having to traverse the AST completely.

+Prerequisites: Disambiguator.


Note that not all steps list disambiguator as a prerequisite even though its own description above says that all of them depend on it.

BTW, not sure if listing prerequisites is all that useful since we have very few hard prerequisites and those are hard-coded anyway. Maybe it would be fine to remove this info. For now though we do list that info so I amended it for steps where it was available in the docstrings.

cameel · 2024-04-25T17:23:24Z

docs/internals/optimizer.rst

+
+Prerequisites: Disambiguator.
+
+Important: Can only be used on EVM code.


Not sure about usefulness of this either now that we only have an EVM backend, but since other steps already list this kind of info (e.g. Important: Introduces EVM opcodes and thus can only be used on EVM code for now for ControlFlowSimplifier) I added it where it was available for now.

The code only has that, because we haven't removed all of it when removing the EWASM parts :-).

I mean, this still info makes some sense unless we know for sure that there will never be backends other than EVM. I suspect it's incomplete so it may be not very useful even then, but I don't know for sure so I didn't want to remove it without some confirmation.

cameel · 2024-04-25T17:34:29Z

docs/internals/optimizer.rst

+Works best on code without literal arguments, which is the case when the code is in the
+:ref:`expression-split <expression-splitter>` form.


I based this addition on the docstrings, which says this:

solidity/libyul/optimiser/EqualStoreEliminator.h

Line 35 in fb11553

* Works best if the code is in SSA form - without literal arguments.

Since SSATransform on its own does nothing about literal arguments, this basically means that that ExpressionSplitter is recommended before, right?

cameel · 2024-04-25T17:41:35Z

docs/cheatsheet.rst

+:ref:`o <for-loop-init-rewriter>`           Disambiguator
+:ref:`p <unused-function-parameter-pruner>` Disambiguator, FunctionHoister                        LiteralRematerialiser                                           ExpressionInliner
+:ref:`r <unused-assign-eliminator>`         Disambiguator, ForLoopInitRewriter
+:ref:`S <unused-store-eliminator>`          Disambiguator, ForLoopInitRewriter                    SSATransform


Based on the discussion we had with @nikola-matic in #14779 (comment), I think that CommonSubexpressionEliminator should be recommended before UnusedStoreEliminator. Anyone can confirm/deny?

cameel · 2024-04-25T17:45:44Z

docs/internals/optimizer.rst

 The inliner should be run afterwards to make sure that all references to ``f2`` are replaced by
 ``f``.


I'm assuming this means ExpressionInliner, because it inlines functions of exactly this form that this step introduces. Is that correct?

No, that's just coincidence here - if you have a function with three return arguments and only one of them is unused, so two remain, only the full inliner can inline this.

Thanks. That's very useful to know. I'll update the docs here.

cameel · 2024-04-27T20:08:32Z

docs/internals/optimizer.rst

+
+To avoid unnecessary rewriting, it is recommended to run this step after StructuralSimplifier.
+
+May destroy the :ref:`expression-split <expression-splitter>` form.


I inferred the info about which steps preserve the expression-split and SSA forms from the current docs. Please double check that I did not make a mistake here. Since most steps seem to preserve the forms, I used the convention of stating it only if the step can destroy it.

For example this step (ForLoopConditionIntoBody) may destroy expression-split form because it would transform something like this:

for { ... } iszero(a) { ... } { ... }

into this:

for { ... } 1 { ... } { if iszero(iszero(a)) { break } }

ExpressionSimplifier also seems to mostly preserve it but also has some rules that don't. For example the exp rule will transform this:

let _0 := 0 let a := calldataload(_0) let b := not(_0) let c := exp(b, a) sstore(a, c)

into this:

let _0 := 0 let a := calldataload(_0) let b := 0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff let c := sub(iszero(and(a, 1)), and(a, 1)) sstore(a, c)

cameel · 2024-04-27T20:11:49Z

docs/internals/optimizer.rst

+During inlining, a heuristic is used to tell if the function call should be inlined or not.
+
+A function call will never be inlined if:
+
+- The function definition contains the ``leave`` opcode.
+- The function is large and inlining is not *aggressive* (details below).
+- The function definition contains any direct, recursive calls to itself or the call being inlined is recursive.
+  Note that indirect recursive calls (going through other functions) do not disqualify the function from inlining.
+- Argument expressions passed into the call can potentially have side effects.
+  This check is very broad, rejecting anything that is not simply a literal or an identifier, since
+  the condition is easy to satisfy by running ExpressionSplitter beforehand.
+
+Otherwise, if any of the following conditions is met, the call is inlined unconditionally:
+
+- The function is tiny.
+- The function is only ever called once.
+
+In the remaining cases the decision to inline or not depends on the size of the function.
+The size used here is not the actual bytecode size (which is not known to the optimizer), but rather
+a score calculated at the Yul level, which approximates that size.
+Inlining is performed if the size does not exceed a predefined threshold, which is a little higher
+if the inlining is aggressive and yet higher when the call has some constant arguments.

+The aggressive inlining is performed when:
+
+- The :ref:`yul-memoryguard` is present.
+- The function is not recursive (not even indirectly).
+- The legacy Yul->EVM transform is not used (i.e. the target is not the ``homestead`` :ref:`EVM version <evm-version>`).


This is a summary of what FullInliner::shallInline() does.

I really don't think it is good to document things into this level of detail :-). There is little merit in it, it only leads to people relying on some behaviour we want to be able to change and it's a maintenance hassle.

This is something I needed to know multiple times in the past and each time I just looked at the code of shallInline(), but it's nowhere as clear as having it spelled out in the docs and I never had time to dive into details so I actually never fully understood how it really worked. For example, that the m_noInlineFunctions check is simply a check for functions using leave. Or how some indirect recursive calls may actually be subject to inlining (there are three separate checks for recursive calls and it's not obvious how they differ). This time I needed it again to figure out why a function was not being inlined in one of my snippets. It always felt like something that should have been documented somewhere.

I could put it the docstring or inside FullInliner::shallInline(), but I really don't see much problem with having it in the docs proper. It's our main resource to understand how the optimizer works, not just for users but also for us and that fully justifies some extra effort spent keeping it up to date.

At the same time I don't think the extra detail really hurts users. It does not go into tiny details of how the inliner is implemented. It just describes the heuristic, which is useful to know when you're optimizing your code. But even then it omits all the specifics that could easily change, like numbers or details of the metric. There are step descriptions that are much more detailed than that. So if you're worried about users relying on them - we already have that problem. I don't think it's a good reason to avoid documenting things. I think that we in general lean too hard on that and I think it's one of the reasons why the code is so hard to get into.

In any case, I feel very strongly that the inliner heuristic should be documented.

ekpyron · 2024-05-06T21:01:09Z

The main problem is: do you really want to guarantee that, assuming we satisfy these ordering requirements, arbitrary sequences don't lead to buggy cases :-)? We did miss one such case before and filed it as low severity bug, since custom sequences are rarely used and a "you-need-to-know-what-you-do" case reflected in lack of docs. If we document this in detail now, the situation changes - and we don't test any of this for correctness (and also can't - there's no way we can cover the space here with tests - we do fuzz this, but the fuzzer here can't catch everything...)

I'll in any case, still have a look and check if anything strikes me as odd since this affects #15030.

We could also consider documenting this, but clearly stating that custom sequences are "at your own risk" anyways - even though that'd be a bit weird to start doing now out of the blue :-)... not sure yet :-).

ekpyron · 2024-05-06T21:19:23Z

docs/cheatsheet.rst

+:ref:`u <unused-pruner>`                    Disambiguator
+:ref:`V <ssa-reverser>`                     Disambiguator                                                                                                         CommonSubexpressionEliminator, UnusedPruner
+:ref:`v <equivalent-function-combiner>`     Disambiguator, FunctionHoister                                                                                        UnusedPruner
+:ref:`x <expression-splitter>`                                                                    ForLoopConditionIntoBody


While it may be true, it's a stretch to assume that this doesn't depend on the Disambiguator - the idea is that all steps do and there's no value on trying to be more precise than that. The same goes for the block flattener and the function hoister and the for loop init rewriter - basically the hgfo sequence we have due to that hardcoded and non-configurable in the sequence.
There's also not much point in omitting them for some steps as hard dependency, since disabling them is not a configuration option anyways :-).

And: slightly overapproximating is fully harmless here - while understating may lead to damage :-).

Note that the info about prerequisites is not something my PR introduces. It only fills in the missing bits from the docstrings and also organizes them in a table so that it's now easy to how spotty this is, but most of that info was already there. Anyone reading the docs now will already come away with an impression that some steps don't have this prerequisite, because otherwise, why would it be mentioned for some of them.

So I guess I should take it as a confirmation of my suggestion in #15054 (comment) that it would be better to remove this info instead and add a general note that all steps should be assumed to depend on hgfo?

cameel

The main problem is: do you really want to guarantee that, assuming we satisfy these ordering requirements, arbitrary sequences don't lead to buggy cases :-)?

I really don't see how you can think that this PR changes anything in that regard. This long and detailed page with descriptions of optimizer steps already exists. We even have documented options for supplying arbitrary sequences, we already mostly document prerequisites, etc. The PR merely makes the info more consistent and summarizes it in a table. It does add some more detailed info from docstrings, but not that much of it and some steps already have even more detailed descriptions.

We could also consider documenting this, but clearly stating that custom sequences are "at your own risk" anyways - even though that'd be a bit weird to start doing now out of the blue :-)... not sure yet :-).

Then we should just add such a warning. I think that the page is already more detailed than a typical user needs, but I don't think we should remove all the useful info from it. That would just hurt us. For me, for example, it's the primary resource I refer to when I need to understand how the optimizer works. Maybe it should not be a part of the main docs if that prevents us from documenting things, but having this info collected in one place like that is immensely useful. Thanks to this I know more about the optimizer than about some other components, e.g. the EVM transform. I wish we had even more pages about internals like that somewhere.

For the time being it's a part of the docs, so I think it should be kept up to date there until/if we decide to move it elsewhere.

cameel · 2024-05-07T14:24:08Z

docs/cheatsheet.rst

+:ref:`u <unused-pruner>`                    Disambiguator
+:ref:`V <ssa-reverser>`                     Disambiguator                                                                                                         CommonSubexpressionEliminator, UnusedPruner
+:ref:`v <equivalent-function-combiner>`     Disambiguator, FunctionHoister                                                                                        UnusedPruner
+:ref:`x <expression-splitter>`                                                                    ForLoopConditionIntoBody


Note that the info about prerequisites is not something my PR introduces. It only fills in the missing bits from the docstrings and also organizes them in a table so that it's now easy to how spotty this is, but most of that info was already there. Anyone reading the docs now will already come away with an impression that some steps don't have this prerequisite, because otherwise, why would it be mentioned for some of them.

So I guess I should take it as a confirmation of my suggestion in #15054 (comment) that it would be better to remove this info instead and add a general note that all steps should be assumed to depend on hgfo?

cameel · 2024-05-07T14:38:42Z

docs/cheatsheet.rst

+=========================================== ===================================================== =============================================================== ===========================================
+Step                                        Hard prerequisites                                    Steps that should run before                                    Steps that should run after
+=========================================== ===================================================== =============================================================== ===========================================
+:ref:`a <ssa-transform>`                    Disambiguator, ForLoopInitRewriter                    ExpressionSplitter, CommonSubexpressionEliminator               UnusedAssignEliminator
+:ref:`C <conditional-simplifier>`           Disambiguator                                         SSATransform, DeadCodeEliminator
+:ref:`c <common-subexpression-eliminator>`  Disambiguator, ForLoopInitRewriter                    ExpressionSplitter, SSATransform                                UnusedPruner, UnusedAssignEliminator


Yeah, this table is not new information. It only summarizes info already available in step docs (in some cases amended with some extra info I pulled from the docstrings). This is what it says currently:

This step is especially efficient if the ExpressionSplitter is run
before. If the code is in pseudo-SSA form,
the values of variables are available for a longer time and thus we
have a higher chance of expressions to be replaceable.

So you're basically saying that there's some nuance missing in the current docs - because they do make it sound like you might want to use SSATransform here. In that case I'll try to adjust them based on what you said and then remove this from the table.

Now you have the large comment exactly on the first line I wanted to comment on :-D...

You know that you can make multiple separate review comments for the same line, right?

cameel · 2024-05-07T14:45:19Z

docs/internals/optimizer.rst

+
+Prerequisites: Disambiguator.
+
+Important: Can only be used on EVM code.


I mean, this still info makes some sense unless we know for sure that there will never be backends other than EVM. I suspect it's incomplete so it may be not very useful even then, but I don't know for sure so I didn't want to remove it without some confirmation.

cameel · 2024-05-07T14:46:32Z

docs/internals/optimizer.rst

 The inliner should be run afterwards to make sure that all references to ``f2`` are replaced by
 ``f``.


Thanks. That's very useful to know. I'll update the docs here.

cameel · 2024-05-07T15:28:53Z

docs/internals/optimizer.rst

+During inlining, a heuristic is used to tell if the function call should be inlined or not.
+
+A function call will never be inlined if:
+
+- The function definition contains the ``leave`` opcode.
+- The function is large and inlining is not *aggressive* (details below).
+- The function definition contains any direct, recursive calls to itself or the call being inlined is recursive.
+  Note that indirect recursive calls (going through other functions) do not disqualify the function from inlining.
+- Argument expressions passed into the call can potentially have side effects.
+  This check is very broad, rejecting anything that is not simply a literal or an identifier, since
+  the condition is easy to satisfy by running ExpressionSplitter beforehand.
+
+Otherwise, if any of the following conditions is met, the call is inlined unconditionally:
+
+- The function is tiny.
+- The function is only ever called once.
+
+In the remaining cases the decision to inline or not depends on the size of the function.
+The size used here is not the actual bytecode size (which is not known to the optimizer), but rather
+a score calculated at the Yul level, which approximates that size.
+Inlining is performed if the size does not exceed a predefined threshold, which is a little higher
+if the inlining is aggressive and yet higher when the call has some constant arguments.

+The aggressive inlining is performed when:
+
+- The :ref:`yul-memoryguard` is present.
+- The function is not recursive (not even indirectly).
+- The legacy Yul->EVM transform is not used (i.e. the target is not the ``homestead`` :ref:`EVM version <evm-version>`).


This is something I needed to know multiple times in the past and each time I just looked at the code of shallInline(), but it's nowhere as clear as having it spelled out in the docs and I never had time to dive into details so I actually never fully understood how it really worked. For example, that the m_noInlineFunctions check is simply a check for functions using leave. Or how some indirect recursive calls may actually be subject to inlining (there are three separate checks for recursive calls and it's not obvious how they differ). This time I needed it again to figure out why a function was not being inlined in one of my snippets. It always felt like something that should have been documented somewhere.

I could put it the docstring or inside FullInliner::shallInline(), but I really don't see much problem with having it in the docs proper. It's our main resource to understand how the optimizer works, not just for users but also for us and that fully justifies some extra effort spent keeping it up to date.

At the same time I don't think the extra detail really hurts users. It does not go into tiny details of how the inliner is implemented. It just describes the heuristic, which is useful to know when you're optimizing your code. But even then it omits all the specifics that could easily change, like numbers or details of the metric. There are step descriptions that are much more detailed than that. So if you're worried about users relying on them - we already have that problem. I don't think it's a good reason to avoid documenting things. I think that we in general lean too hard on that and I think it's one of the reasons why the code is so hard to get into.

In any case, I feel very strongly that the inliner heuristic should be documented.

…nfo from step docstrings

… expression-split forms

cameel added documentation 📖 optimizer has dependencies The PR depends on other PRs that must be merged first labels Apr 25, 2024

cameel requested review from ekpyron and nikola-matic April 25, 2024 17:12

cameel self-assigned this Apr 25, 2024

cameel marked this pull request as draft April 25, 2024 17:12

cameel commented Apr 25, 2024

View reviewed changes

cameel changed the title ~~Optimizer docs add missing dependencies and info~~ Add missing dependencies and extra info to optimizer step docs Apr 25, 2024

cameel force-pushed the optimizer-docs-style-cleanup branch from af0af5c to 125225c Compare April 27, 2024 12:06

cameel force-pushed the optimizer-docs-add-missing-dependencies-and-info branch 2 times, most recently from a7498de to cd9c124 Compare April 27, 2024 19:54

cameel force-pushed the optimizer-docs-style-cleanup branch from 125225c to 6702af2 Compare April 27, 2024 19:54

cameel force-pushed the optimizer-docs-add-missing-dependencies-and-info branch from cd9c124 to 9747d70 Compare April 27, 2024 20:02

cameel commented Apr 27, 2024

View reviewed changes

cameel mentioned this pull request Apr 27, 2024

[seqbench] the-good-parts-mk3 sequence #15030

Open

ekpyron reviewed May 6, 2024

View reviewed changes

cameel commented May 7, 2024

View reviewed changes

cameel force-pushed the optimizer-docs-style-cleanup branch from 6702af2 to 9599d85 Compare May 8, 2024 15:17

cameel added 4 commits May 8, 2024 17:25

Optimizer docs: add missing prerequisites and missing bits of extra i…

bea4e25

…nfo from step docstrings

cheatsheet: List optimizer step dependencies

a93698d

Optimizer docs: Document LiteralRematerialiser

ab1a67c

Optimizer docs: More detailed description of FullInliner's heuristic

7b9540d

Optimizer docs: Add missing info about steps that destroy the SSA and…

098075d

… expression-split forms

cameel force-pushed the optimizer-docs-add-missing-dependencies-and-info branch from 9747d70 to 098075d Compare May 8, 2024 15:25

Base automatically changed from optimizer-docs-style-cleanup to develop May 8, 2024 15:57

cameel removed the has dependencies The PR depends on other PRs that must be merged first label May 10, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add missing dependencies and extra info to optimizer step docs #15054

Add missing dependencies and extra info to optimizer step docs #15054

cameel commented Apr 25, 2024 •

edited

cameel Apr 25, 2024 •

edited

ekpyron May 6, 2024 •

edited

cameel May 7, 2024

cameel Apr 25, 2024 •

edited

cameel Apr 25, 2024

ekpyron May 6, 2024

cameel May 7, 2024

cameel Apr 25, 2024 •

edited

cameel Apr 25, 2024

cameel Apr 25, 2024 •

edited

ekpyron May 6, 2024

cameel May 7, 2024

cameel Apr 27, 2024

cameel Apr 27, 2024

ekpyron May 6, 2024

cameel May 7, 2024 •

edited

ekpyron commented May 6, 2024

ekpyron May 6, 2024

ekpyron May 6, 2024

cameel May 7, 2024

cameel left a comment •

edited

cameel May 7, 2024

cameel May 7, 2024

cameel May 7, 2024

cameel May 7, 2024

cameel May 7, 2024 •

edited


		Prerequisites: Disambiguator.

		Important: Can only be used on EVM code.

		Works best on code without literal arguments, which is the case when the code is in the
		:ref:`expression-split <expression-splitter>` form.

		The inliner should be run afterwards to make sure that all references to ``f2`` are replaced by
		``f``.


		To avoid unnecessary rewriting, it is recommended to run this step after StructuralSimplifier.

		May destroy the :ref:`expression-split <expression-splitter>` form.

Add missing dependencies and extra info to optimizer step docs #15054

Are you sure you want to change the base?

Add missing dependencies and extra info to optimizer step docs #15054

Conversation

cameel commented Apr 25, 2024 • edited

cameel Apr 25, 2024 • edited

Choose a reason for hiding this comment

ekpyron May 6, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cameel Apr 25, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cameel Apr 25, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cameel Apr 25, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cameel May 7, 2024 • edited

Choose a reason for hiding this comment

ekpyron commented May 6, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cameel left a comment • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cameel May 7, 2024 • edited

Choose a reason for hiding this comment

cameel commented Apr 25, 2024 •

edited

cameel Apr 25, 2024 •

edited

ekpyron May 6, 2024 •

edited

cameel Apr 25, 2024 •

edited

cameel Apr 25, 2024 •

edited

cameel Apr 25, 2024 •

edited

cameel May 7, 2024 •

edited

cameel left a comment •

edited

cameel May 7, 2024 •

edited