Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compare COutPoints lexicographically #30046

Closed

Conversation

josibake
Copy link
Member

@josibake josibake commented May 6, 2024

Broken out from #28122


BIP352 requires the smallest output lexicographically, which is my primary motivation for adding this as the default way of compare COutPoints (e.g.

auto smallest_outpoint = std::min_element(tx_outpoints.begin(), tx_outpoints.end());
)

Outside of needing this for #28122 , I think it makes more sense to compare COutPoints by their serialization for any use case that needs ordering since the serialization is what appears in the final transaction. I also didn't see any existing places in the code where we order outpoints, so this change doesn't conflict with any current use cases that I can see. If there is a need to order inputs by their vout field (e.g. for an RPC output), something like

std::sort(outpoints.begin(), outpoints.end(), [](const COutPoint& a, const COutPoint& b) { return a.vout < b.vout; });

seems preferable, rather than first comparing a.hash < b.hash.

@DrahtBot
Copy link
Contributor

DrahtBot commented May 6, 2024

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.
A summary of reviews will appear here.

Comment on lines 47 to 50
DataStream ser_a, ser_b;
ser_a << a;
ser_b << b;
return Span{ser_a} < Span{ser_b};
Copy link
Member

@darosior darosior May 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fail to see how it's not functionally equivalent? (but much less efficient)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See

"comment": "Outpoint ordering byte-lexicographically vs. vout-integer",
for an example of how the old method (a.hash < b.hash, a.vout < b.vout) will yield a different ordering than comparing outpoints lexicographically.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although, to your point, a.hash < b.hash is doing a lexicographic comparison:

constexpr int Compare(const base_blob& other) const { return std::memcmp(m_data.data(), other.m_data.data(), WIDTH); }

So maybe it would be more efficient to just serialize a.vout, b.vout as little-endian here, rather than serializing the whole COutPoint?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But 1 sorts before 256 lexicographically? So it's lexicographical serialized order, i see.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, good point. I added a comment to clarify that this is a lexicographic sort on the serialized outpoint. I think this is preferable to what we were doing before since its based on the protocol definition of an outpoint: <32-byte txid, little-endian>:<4-byte vout, little-endian>.

@Sjors
Copy link
Member

Sjors commented May 6, 2024

Don't forget to add a source comment that BIP352 depends on this exact ordering. Either in this PR or as a commit in #28122.

Comparing outpoints lexicographically is required for BIP352, but also
generally a less ambigious way to compare outpoints considering this
will match any orderings applied to the serialized transaction.
@josibake josibake force-pushed the lexicographically-sort-outpoints branch from e623d8a to e11d764 Compare May 6, 2024 12:52
@josibake
Copy link
Member Author

josibake commented May 6, 2024

Don't forget to add a source comment that BIP352 depends on this exact ordering. Either in this PR or as a commit in #28122.

Hm, I don't think we need a BIP352 specific comment here? I'd argue if we are going to sort outpoints, we should be sorting them based on their protocol definition, regardless of BIP352. In fact, when writing BIP352 we already assumed this was the case which is why we defined the outpoint sorting in BIP352 to be done lexicographically on the outpoint serialization. It was only recently that someone pointed out that Bitcoin Core wasn't sorting lexicographically based on the serialization and instead was comparing the txids lexicographically but comparing the vouts as integers.

@sipa
Copy link
Member

sipa commented May 6, 2024

This seems unnecessarily unnatural and inefficient to use as a default sort order for this type.

If BIP352 defines its own normative sort order, I think it's better to have a specialized BIP352Comparator class that implements it, and use that in the BIP352 code?

That also has the advantage of not tying the two together; someone might come in and change the order back to the current ordering for performance or simplicity reasons, and break the BIP352 code.

@josibake
Copy link
Member Author

josibake commented May 6, 2024

This seems unnecessarily unnatural and inefficient to use as a default sort order for this type.

If BIP352 defines its own normative sort order, I think it's better to have a specialized BIP352Comparator class that implements it, and use that in the BIP352 code?

That also has the advantage of not tying the two together; someone might come in and change the order back to the current ordering for performance or simplicity reasons, and break the BIP352 code.

Personally, I find it more confusing that we have three different ways we can compare outpoints, so lexicographic comparison based on the 36-byte serialization seemed the simplest default. Fair points regarding efficiency and not tightly coupling the default sort with a specific use case, though. Happy to close this in favor of a BIP352Comparator.

@josibake josibake closed this May 6, 2024
@josibake josibake deleted the lexicographically-sort-outpoints branch May 6, 2024 14:52
@bitcoin bitcoin deleted a comment from mannyg253 May 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants