Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make CommonSubExpression Eliminator consider gas costs as well as code size #15048

Draft
wants to merge 2 commits into
base: develop
Choose a base branch
from

Conversation

matheusaaguiar
Copy link
Collaborator

The CSE only checks for the code size when trying to decide if it should apply its optimizations.
This PR has the objective of improving this decision process by also considering the gas costs together with the code size.
Suggested here.

Copy link
Member

@ekpyron ekpyron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's exactly happening with the tests here?

Comment on lines +123 to +126
shouldReplace = (
optimisedChunk.size() < static_cast<size_t>(iter - orig) &&
!(runGas(AssemblyItems(orig, iter)) < runGas(optimisedChunk))
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not quite the same as actually happens in Assembly.cpp now, is it :-)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should extract this bit in Assembly.cpp into a helper and use the helper here. Otherwise we're not even testing the changes made by the PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stop suggesting to refactor things. This is an intentional copy of Assembly.cpp, as explicitly stated above - the fact that this is weird, doesn't mean we should start taking half the codebase apart refactoring things randomly. The PR is a minor fix in something that's generally obsolete, to prevent it from doing more harm than good before it's finally removed - it's literally a waste of time to do this kind of refactoring work on it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an intentional copy of Assembly.cpp

Except that it no longer is, you just said that yourself above.

I'm not suggesting a refactor for the sake of a refactor. I'm saying that due to this being a copy we're not testing the actual implementation here and IMO it's a problem if we're interested in it being correct. I mean, if the code is, as you say, obsolete then maybe we're not and maybe just recopying it is enough, but I don't think the suggestion was that outrageous. It shouldn't have been a copy to begin with.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this PR introduce a new copy of something that wasn't there before? Was the scope and intention of this PR fixing and refactoring the testing setup? It was neither. Hence what you're suggesting is extending the scope of the PR to something that's an entirely independent issue - and one at that, that we'd never individually have solved. That's the very definition of a distraction and wasting time. Not too outrageous in isolation, but this is a pattern that results in overall throughput being near zero.

Comment on lines +833 to +834
auto optimisedItemsCost = static_cast<bigint>(optimisedItems.size()) + runs * runGas(optimisedItems, KnownState{}).value;
auto itemsCost = static_cast<bigint>(m_items.size()) + runs * runGas(m_items, KnownState{}).value;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to try and be accurate about this, the code size component of the cost should be calculated similarly to

bigint inlinedDepositCost = GasMeter::dataGas(
, i.e. based on an estimate of the actual number of bytes required for each assembly item, times the actual cost of a byte of code...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we really should do that. I'd even go as far as to say that not doing it makes the calculation here broken. As is, the run cost will completely overshadow the data cost.

For example with the standard settings of 200 runs and assuming non-creation data (cost: 200 gas per byte) we'd expect each byte count as much as each unit of gas used by the instruction. But in the current implementation the instruction cost will count 200x more, making the data cost almost insignificant.

Comment on lines +788 to +789
auto runGas = [&](AssemblyItems const& items, KnownState _state) {
GasMeter gasMeter{std::make_shared<KnownState>(_state), _settings.evmVersion};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure you want to make a copy of the whole KnownState here? Two copies in fact, because make_shared() will allocate another one.

Suggested change
auto runGas = [&](AssemblyItems const& items, KnownState _state) {
GasMeter gasMeter{std::make_shared<KnownState>(_state), _settings.evmVersion};
auto runGas = [&](AssemblyItems const& items, std::shared_ptr<KnownState> _state) {
GasMeter gasMeter{_state, _settings.evmVersion};

Also, getKnownState() should probably be returning shared_ptr as well.

Comment on lines +833 to +834
auto optimisedItemsCost = static_cast<bigint>(optimisedItems.size()) + runs * runGas(optimisedItems, KnownState{}).value;
auto itemsCost = static_cast<bigint>(m_items.size()) + runs * runGas(m_items, KnownState{}).value;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we really should do that. I'd even go as far as to say that not doing it makes the calculation here broken. As is, the run cost will completely overshadow the data cost.

For example with the standard settings of 200 runs and assuming non-creation data (cost: 200 gas per byte) we'd expect each byte count as much as each unit of gas used by the instruction. But in the current implementation the instruction cost will count 200x more, making the data cost almost insignificant.

Comment on lines -820 to +835
if (optimisedItems.size() < m_items.size())
auto optimisedItemsCost = static_cast<bigint>(optimisedItems.size()) + runs * runGas(optimisedItems, KnownState{}).value;
auto itemsCost = static_cast<bigint>(m_items.size()) + runs * runGas(m_items, KnownState{}).value;
if (optimisedItemsCost < itemsCost)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we should keep the old condition here. It seems to me that the purpose of the check was not to ensure that new items are not more expensive - that will always be the case - but just to avoid bumping count if we did not replace anything. The old condition still accomplishes that and is much simpler (and cheaper to calculate).

In fact, the new condition could actually be wrong in some cases. KnownState you're passing in is different (empty here), so theoretically you could even get inconsistent results.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the way count is being bumped here seems wrong in the first place. The loop is bumping it for every replacement, which makes sense only if the condition will always pass when there are replacements. And bumping it again in the condition makes no sense - it's not an additional optimization, just an application of already performed optimizations. It's not broken in practice only because the outer loop only cares whether it's zero or not. I think we'd be better off removing the bump and the condition and do the assignment unconditionally.

Comment on lines +788 to +794
auto runGas = [&](AssemblyItems const& items, KnownState _state) {
GasMeter gasMeter{std::make_shared<KnownState>(_state), _settings.evmVersion};
GasMeter::GasConsumption gas;
for (auto const& item: items)
gas += gasMeter.estimateMax(item);
return gas;
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your implementation ignores GasConsumption.isInfinite so it will give us wrong results if any of the items has an infinite estimate. If you're assuming it won't happen, you should at least have an assert against it.

There's a correct implementation in Inliner.cpp:

/// @returns an estimation of the runtime gas cost of the AssemblyItems in @a _itemRange.
template<typename RangeType>
u256 executionCost(RangeType const& _itemRange, langutil::EVMVersion _evmVersion)
{
GasMeter gasMeter{std::make_shared<KnownState>(), _evmVersion};
auto gasConsumption = ranges::accumulate(_itemRange | ranges::views::transform(
[&gasMeter](auto const& _item) { return gasMeter.estimateMax(_item, false); }
), GasMeter::GasConsumption());
if (gasConsumption.isInfinite)
return std::numeric_limits<u256>::max();
else
return gasConsumption.value;
}

I think it would be to reuse it here too. It looks general enough that perhaps we could just move it to GasMeter.h.

@@ -77,6 +77,8 @@ class CommonSubexpressionEliminator
/// @returns the resulting items after optimization.
AssemblyItems getOptimizedItems();

KnownState const& getKnownState() const { return m_state; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
KnownState const& getKnownState() const { return m_state; }
KnownState const& knownState() const { return m_state; }

Comment on lines +123 to +126
shouldReplace = (
optimisedChunk.size() < static_cast<size_t>(iter - orig) &&
!(runGas(AssemblyItems(orig, iter)) < runGas(optimisedChunk))
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should extract this bit in Assembly.cpp into a helper and use the helper here. Otherwise we're not even testing the changes made by the PR.

Comment on lines -36 to +38
// gas irOptimized: 111395
// gas legacy: 112709
// gas legacyOptimized: 111852
// gas irOptimized: 111642
// gas legacy: 112944
// gas legacyOptimized: 112090
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This got more expensive. It's not supposed to happen with this kind of change, is it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should add some cases to test that show that the runtime cost an the runs parameter are now properly taken into account.

A test with something that gets an infinite estimate would not hurt either.

Copy link

github-actions bot commented May 9, 2024

This pull request is stale because it has been open for 14 days with no activity.
It will be closed in 7 days unless the stale label is removed.

@github-actions github-actions bot added the stale The issue/PR was marked as stale because it has been open for too long. label May 9, 2024
@matheusaaguiar matheusaaguiar removed the stale The issue/PR was marked as stale because it has been open for too long. label May 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

None yet

3 participants