New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Compare COutPoints
lexicographically
#30046
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code CoverageFor detailed information about the code coverage, see the test coverage report. ReviewsSee the guideline for information on the review process. |
src/primitives/transaction.h
Outdated
DataStream ser_a, ser_b; | ||
ser_a << a; | ||
ser_b << b; | ||
return Span{ser_a} < Span{ser_b}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fail to see how it's not functionally equivalent? (but much less efficient)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See
"comment": "Outpoint ordering byte-lexicographically vs. vout-integer", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although, to your point, a.hash < b.hash
is doing a lexicographic comparison:
Line 54 in e623d8a
constexpr int Compare(const base_blob& other) const { return std::memcmp(m_data.data(), other.m_data.data(), WIDTH); } |
So maybe it would be more efficient to just serialize a.vout, b.vout
as little-endian here, rather than serializing the whole COutPoint
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But 1 sorts before 256 lexicographically? So it's lexicographical serialized order, i see.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, good point. I added a comment to clarify that this is a lexicographic sort on the serialized outpoint. I think this is preferable to what we were doing before since its based on the protocol definition of an outpoint: <32-byte txid, little-endian>:<4-byte vout, little-endian>
.
Don't forget to add a source comment that BIP352 depends on this exact ordering. Either in this PR or as a commit in #28122. |
Comparing outpoints lexicographically is required for BIP352, but also generally a less ambigious way to compare outpoints considering this will match any orderings applied to the serialized transaction.
e623d8a
to
e11d764
Compare
Hm, I don't think we need a BIP352 specific comment here? I'd argue if we are going to sort outpoints, we should be sorting them based on their protocol definition, regardless of BIP352. In fact, when writing BIP352 we already assumed this was the case which is why we defined the outpoint sorting in BIP352 to be done lexicographically on the outpoint serialization. It was only recently that someone pointed out that Bitcoin Core wasn't sorting lexicographically based on the serialization and instead was comparing the txids lexicographically but comparing the vouts as integers. |
This seems unnecessarily unnatural and inefficient to use as a default sort order for this type. If BIP352 defines its own normative sort order, I think it's better to have a specialized BIP352Comparator class that implements it, and use that in the BIP352 code? That also has the advantage of not tying the two together; someone might come in and change the order back to the current ordering for performance or simplicity reasons, and break the BIP352 code. |
Personally, I find it more confusing that we have three different ways we can compare outpoints, so lexicographic comparison based on the 36-byte serialization seemed the simplest default. Fair points regarding efficiency and not tightly coupling the default sort with a specific use case, though. Happy to close this in favor of a BIP352Comparator. |
Broken out from #28122
BIP352 requires the smallest output lexicographically, which is my primary motivation for adding this as the default way of compare
COutPoint
s (e.g.bitcoin/src/common/bip352.cpp
Line 149 in 5b06ccf
Outside of needing this for #28122 , I think it makes more sense to compare
COutPoints
by their serialization for any use case that needs ordering since the serialization is what appears in the final transaction. I also didn't see any existing places in the code where we order outpoints, so this change doesn't conflict with any current use cases that I can see. If there is a need to order inputs by theirvout
field (e.g. for an RPC output), something likeseems preferable, rather than first comparing
a.hash < b.hash
.