Make CommonSubExpression Eliminator consider gas costs as well as code size #15048

matheusaaguiar · 2024-04-22T13:52:08Z

The CSE only checks code size when deciding whether to apply its optimizations.
This PR aims to improve that decision by also taking gas costs into account alongside code size.
Suggested here.

ekpyron

What exactly is happening with the tests here?

ekpyron · 2024-04-22T18:42:49Z

test/libevmasm/Optimiser.cpp

+			shouldReplace = (
+				optimisedChunk.size()  < static_cast<size_t>(iter - orig) &&
+				!(runGas(AssemblyItems(orig, iter)) < runGas(optimisedChunk))
+			);


This is not quite the same as what actually happens in Assembly.cpp now, is it :-)?

I think we should extract this bit in Assembly.cpp into a helper and use the helper here. Otherwise we're not even testing the changes made by the PR.

Stop suggesting to refactor things. This is an intentional copy of Assembly.cpp, as explicitly stated above - the fact that this is weird, doesn't mean we should start taking half the codebase apart refactoring things randomly. The PR is a minor fix in something that's generally obsolete, to prevent it from doing more harm than good before it's finally removed - it's literally a waste of time to do this kind of refactoring work on it.

This is an intentional copy of Assembly.cpp

Except that it no longer is, you just said that yourself above.

I'm not suggesting a refactor for the sake of a refactor. I'm saying that due to this being a copy we're not testing the actual implementation here and IMO it's a problem if we're interested in it being correct. I mean, if the code is, as you say, obsolete then maybe we're not and maybe just recopying it is enough, but I don't think the suggestion was that outrageous. It shouldn't have been a copy to begin with.

Does this PR introduce a new copy of something that wasn't there before? Was the scope and intention of this PR fixing and refactoring the testing setup? It was neither. Hence what you're suggesting is extending the scope of the PR to something that's an entirely independent issue - and one at that, that we'd never individually have solved. That's the very definition of a distraction and wasting time. Not too outrageous in isolation, but this is a pattern that results in overall throughput being near zero.

ekpyron · 2024-04-22T18:47:27Z

libevmasm/Assembly.cpp

+			auto optimisedItemsCost = static_cast<bigint>(optimisedItems.size()) + runs * runGas(optimisedItems, KnownState{}).value;
+			auto itemsCost = static_cast<bigint>(m_items.size()) + runs * runGas(m_items, KnownState{}).value;


If we want to try and be accurate about this, the code size component of the cost should be calculated similarly to

solidity/libevmasm/Inliner.cpp

Line 178 in 272892e

bigint inlinedDepositCost = GasMeter::dataGas(

, i.e. based on an estimate of the actual number of bytes required for each assembly item, times the actual cost of a byte of code...

Yeah, we really should do that. I'd even go as far as to say that not doing it makes the calculation here broken. As is, the run cost will completely overshadow the data cost.

For example, with the standard setting of 200 runs and assuming non-creation data (cost: 200 gas per byte), we'd expect each byte to count as much as each unit of gas used by the instruction. But in the current implementation the instruction cost will count 200x more, making the data cost almost insignificant.
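To make the arithmetic above concrete, here is a minimal standalone sketch. It does not use the real `GasMeter` API: `codeDepositGasPerByte`, `naiveCost` and `weightedCost` are hypothetical names, plain 64-bit integers stand in for `bigint`, and the 200 gas per byte figure is the one quoted in the comment above.

```cpp
#include <cstdint>

// Hypothetical constant: deposit cost of one byte of non-creation code,
// taken from the comment above (200 gas per byte).
constexpr std::uint64_t codeDepositGasPerByte = 200;

// Cost in the shape used by the PR: the size is counted in raw units
// and added to the execution gas multiplied by the number of runs.
constexpr std::uint64_t naiveCost(std::uint64_t _size, std::uint64_t _runGas, std::uint64_t _runs)
{
	return _size + _runs * _runGas;
}

// Cost with the size component converted to gas first, so that both
// terms are expressed in the same unit before being added.
constexpr std::uint64_t weightedCost(std::uint64_t _bytes, std::uint64_t _runGas, std::uint64_t _runs)
{
	return _bytes * codeDepositGasPerByte + _runs * _runGas;
}
```

Under these assumptions and with 200 runs, `weightedCost(10, 3, 200)` and `weightedCost(9, 4, 200)` both evaluate to 2600, so one byte saved trades evenly against one extra unit of execution gas. With `naiveCost` the same two candidates come out as 610 and 809, so the size saving is almost invisible.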

cameel · 2024-04-23T07:13:19Z

libevmasm/Assembly.cpp

+			auto runGas = [&](AssemblyItems const& items, KnownState _state) {
+				GasMeter gasMeter{std::make_shared<KnownState>(_state), _settings.evmVersion};


Are you sure you want to make a copy of the whole KnownState here? Two copies in fact, because make_shared() will allocate another one.

Suggested change

-			auto runGas = [&](AssemblyItems const& items, KnownState _state) {
-				GasMeter gasMeter{std::make_shared<KnownState>(_state), _settings.evmVersion};
+			auto runGas = [&](AssemblyItems const& items, std::shared_ptr<KnownState> _state) {
+				GasMeter gasMeter{_state, _settings.evmVersion};

Also, getKnownState() should probably be returning shared_ptr as well.
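The copying concern can be illustrated with a small standalone sketch. `State` is a hypothetical stand-in for `KnownState` that counts its copies, and `byValue` and `byPointer` are made-up names mirroring the two lambda signatures. None of this is the actual solidity code.

```cpp
#include <memory>

// Hypothetical stand-in for KnownState that counts how often it is copied.
struct State
{
	static inline int copies = 0;
	State() = default;
	State(State const&) { ++copies; }
};

// Mirrors the PR version: by-value parameter, then make_shared(_state),
// which copy-constructs a second object on the heap.
inline std::shared_ptr<State> byValue(State _state)
{
	return std::make_shared<State>(_state);
}

// Mirrors the suggested version: the shared_ptr is passed through as-is,
// so no State object is copied.
inline std::shared_ptr<State> byPointer(std::shared_ptr<State> _state)
{
	return _state;
}
```

When `byValue` is called with an existing (lvalue) `State`, the counter goes up by two: once for the parameter and once inside `make_shared`. `byPointer` leaves the counter unchanged and returns a pointer to the same object.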

cameel · 2024-04-23T08:39:34Z

libevmasm/Assembly.cpp

+			auto optimisedItemsCost = static_cast<bigint>(optimisedItems.size()) + runs * runGas(optimisedItems, KnownState{}).value;
+			auto itemsCost = static_cast<bigint>(m_items.size()) + runs * runGas(m_items, KnownState{}).value;


Yeah, we really should do that. I'd even go as far as to say that not doing it makes the calculation here broken. As is, the run cost will completely overshadow the data cost.

For example, with the standard setting of 200 runs and assuming non-creation data (cost: 200 gas per byte), we'd expect each byte to count as much as each unit of gas used by the instruction. But in the current implementation the instruction cost will count 200x more, making the data cost almost insignificant.

cameel · 2024-04-23T09:07:15Z

libevmasm/Assembly.cpp

-			if (optimisedItems.size() < m_items.size())
+			auto optimisedItemsCost = static_cast<bigint>(optimisedItems.size()) + runs * runGas(optimisedItems, KnownState{}).value;
+			auto itemsCost = static_cast<bigint>(m_items.size()) + runs * runGas(m_items, KnownState{}).value;
+			if (optimisedItemsCost < itemsCost)


I think that we should keep the old condition here. It seems to me that the purpose of the check was not to ensure that new items are not more expensive - that will always be the case - but just to avoid bumping count if we did not replace anything. The old condition still accomplishes that and is much simpler (and cheaper to calculate).

In fact, the new condition could actually be wrong in some cases. KnownState you're passing in is different (empty here), so theoretically you could even get inconsistent results.

Actually, the way count is being bumped here seems wrong in the first place. The loop is bumping it for every replacement, which makes sense only if the condition will always pass when there are replacements. And bumping it again in the condition makes no sense - it's not an additional optimization, just an application of already performed optimizations. It's not broken in practice only because the outer loop only cares whether it's zero or not. I think we'd be better off removing the bump and the condition and do the assignment unconditionally.

cameel · 2024-04-23T09:20:19Z

libevmasm/Assembly.cpp

+			auto runGas = [&](AssemblyItems const& items, KnownState _state) {
+				GasMeter gasMeter{std::make_shared<KnownState>(_state), _settings.evmVersion};
+				GasMeter::GasConsumption gas;
+				for (auto const& item: items)
+					gas += gasMeter.estimateMax(item);
+				return gas;
+			};


Your implementation ignores GasConsumption.isInfinite so it will give us wrong results if any of the items has an infinite estimate. If you're assuming it won't happen, you should at least have an assert against it.

There's a correct implementation in Inliner.cpp:

solidity/libevmasm/Inliner.cpp

Lines 47 to 59 in ebdce26

/// @returns an estimation of the runtime gas cost of the AssemblyItems in @a _itemRange.
template<typename RangeType>
u256 executionCost(RangeType const& _itemRange, langutil::EVMVersion _evmVersion)
{
	GasMeter gasMeter{std::make_shared<KnownState>(), _evmVersion};
	auto gasConsumption = ranges::accumulate(_itemRange | ranges::views::transform(
		[&gasMeter](auto const& _item) { return gasMeter.estimateMax(_item, false); }
	), GasMeter::GasConsumption());
	if (gasConsumption.isInfinite)
		return std::numeric_limits<u256>::max();
	else
		return gasConsumption.value;
}

I think it would be good to reuse it here too. It looks general enough that perhaps we could just move it to GasMeter.h.
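The point about `isInfinite` can be shown in isolation with a toy model. `Consumption` and `totalCost` below are hypothetical stand-ins, not the real `GasMeter::GasConsumption` or `executionCost`. The sketch assumes that adding an infinite estimate makes the whole sum infinite, and it uses 64-bit integers instead of `u256`.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Toy stand-in for a gas estimate: a value plus an "infinite" flag.
struct Consumption
{
	std::uint64_t value = 0;
	bool isInfinite = false;

	// Assumed semantics: once any summand is infinite, the sum is infinite
	// and the numeric value is no longer meaningful.
	Consumption& operator+=(Consumption const& _other)
	{
		isInfinite = isInfinite || _other.isInfinite;
		if (!isInfinite)
			value += _other.value;
		return *this;
	}
};

// Sums the estimates and maps an infinite total to the maximum value,
// so an unbounded chunk never compares as cheaper than a bounded one.
inline std::uint64_t totalCost(std::vector<Consumption> const& _estimates)
{
	Consumption total;
	for (auto const& estimate: _estimates)
		total += estimate;
	return total.isInfinite ? std::numeric_limits<std::uint64_t>::max() : total.value;
}
```

Reading only `.value`, as the lambda in the PR does, would treat a chunk with an infinite estimate as if it had a finite cost. Checking the flag before using the value avoids that.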

cameel · 2024-04-23T09:23:12Z

libevmasm/CommonSubexpressionEliminator.h

@@ -77,6 +77,8 @@ class CommonSubexpressionEliminator
 	/// @returns the resulting items after optimization.
 	AssemblyItems getOptimizedItems();

+	KnownState const& getKnownState() const { return m_state; }


Suggested change

-	KnownState const& getKnownState() const { return m_state; }
+	KnownState const& knownState() const { return m_state; }

cameel · 2024-04-23T09:29:39Z

test/libevmasm/Optimiser.cpp

+			shouldReplace = (
+				optimisedChunk.size()  < static_cast<size_t>(iter - orig) &&
+				!(runGas(AssemblyItems(orig, iter)) < runGas(optimisedChunk))
+			);


I think we should extract this bit in Assembly.cpp into a helper and use the helper here. Otherwise we're not even testing the changes made by the PR.

cameel · 2024-04-23T09:31:42Z

test/libsolidity/semanticTests/abiEncoderV2/calldata_overlapped_dynamic_arrays.sol

-// gas irOptimized: 111395
-// gas legacy: 112709
-// gas legacyOptimized: 111852
+// gas irOptimized: 111642
+// gas legacy: 112944
+// gas legacyOptimized: 112090


This got more expensive. It's not supposed to happen with this kind of change, is it?

cameel · 2024-04-23T09:33:34Z

test/libevmasm/Optimiser.cpp

You should add some test cases that show that the runtime cost and the runs parameter are now properly taken into account.

A test with something that gets an infinite estimate would not hurt either.

github-actions · 2024-05-09T12:04:42Z

This pull request is stale because it has been open for 14 days with no activity.
It will be closed in 7 days unless the stale label is removed.

matheusaaguiar self-assigned this Apr 22, 2024

matheusaaguiar added the optimizer label Apr 22, 2024

matheusaaguiar added 2 commits April 22, 2024 10:53

Make CSE block selection consider gas cost

ca02022

Use KnownState

66939ce

matheusaaguiar force-pushed the MakeCSEConsidersGasCost branch from c28bc90 to 66939ce Compare April 22, 2024 13:53

ekpyron reviewed Apr 22, 2024


cameel requested changes Apr 23, 2024


github-actions bot added the stale label May 9, 2024

matheusaaguiar removed the stale label May 9, 2024


