New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
crypto: add NUMS_H
const
#30048
crypto: add NUMS_H
const
#30048
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code CoverageFor detailed information about the code coverage, see the test coverage report. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK
Found via $ git grep 5092929b74
that we already define this byte-string constant twice for the miniscript fuzz and unit tests (src/test/fuzz/miniscript.cpp
and src/test/miniscript_tests.cpp
, called NUMS_PK
there), that's a good opportunity to deduplicate.
Nice! Pushed a second commit to update the existing tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK
src/pubkey.cpp
Outdated
@@ -181,6 +181,9 @@ int ecdsa_signature_parse_der_lax(secp256k1_ecdsa_signature* sig, const unsigned | |||
return 1; | |||
} | |||
|
|||
static const std::vector<unsigned char> NUMS_H_DATA = {0x50, 0x92, 0x9b, 0x74, 0xc1, 0xa0, 0x49, 0x54, 0xb7, 0x8b, 0x4b, 0x60, 0x35, 0xe9, 0x7a, 0x5e, 0x07, 0x8a, 0x5a, 0x0f, 0x28, 0xec, 0x96, 0xd5, 0x47, 0xbf, 0xee, 0x9a, 0xce, 0x80, 0x3a, 0xc0}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a comment indicating where this comes from and why it's believed no one knows the discrete log?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a comment for where the point comes from. Tried to write a sentence for the "why" but don't really understand it well enough to articulate succinctly 😅.
From what I understand, because H
is derived from a known public value G
via a hash, we learn nothing about the discrete log of H
due to the hash function providing us a uniformly random value. But not sure if that's entirely correct; if you have a suggestion on how to better (or more correctly!) phrase this, happy to add!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, secp256k1-zkp has a straight forward description: https://github.com/BlockstreamResearch/secp256k1-zkp/blob/11af7015de624b010424273be3d91f117f172c82/src/modules/rangeproof/main_impl.h#L16
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perfect, will use this!
src/pubkey.cpp
Outdated
* | ||
* See https://github.com/bitcoin/bips/blob/master/bip-0341.mediawiki#constructing-and-spending-taproot-outputs | ||
*/ | ||
static const std::vector<unsigned char> NUMS_H_DATA = {0x50, 0x92, 0x9b, 0x74, 0xc1, 0xa0, 0x49, 0x54, 0xb7, 0x8b, 0x4b, 0x60, 0x35, 0xe9, 0x7a, 0x5e, 0x07, 0x8a, 0x5a, 0x0f, 0x28, 0xec, 0x96, 0xd5, 0x47, 0xbf, 0xee, 0x9a, 0xce, 0x80, 0x3a, 0xc0}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason to drop the ParseHex, which allows to write this as a string literal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seemed unnecessary? Representing this as a byte vector allows us to create an XOnlyPublicKey
directly without an extra function call. Is there a reason to prefer having this as a string literal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a style-nit, as it makes it easier to grep and compare the string literal. The function call could be moved to compile time, but I haven't implemented it yet. Calling it once at runtime should be fine also, for now?
Just a nit in any case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah cool, my slight aversion to ParseHex
was the call at runtime for a static const, but if we are going to move that to compile time and the string literal makes it easier to compare (e.g. with the value in BIP341, which is also a string literal), then happy to change it.
ed98b91
to
cb384b5
Compare
src/pubkey.cpp
Outdated
G2 = EllipticCurve ([F (0), F (7)]).lift_x(F(int(hashlib.sha256(G_DER.decode('hex')).hexdigest(),16))) | ||
print('%x %x' % G2.xy()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code snippet doesn't work for me on Sage 9.5 (see BlockstreamResearch/secp256k1-zkp#292). Also, we only need to output the x-coordinate here:
G2 = EllipticCurve ([F (0), F (7)]).lift_x(F(int(hashlib.sha256(G_DER.decode('hex')).hexdigest(),16))) | |
print('%x %x' % G2.xy()) | |
G2 = EllipticCurve ([F (0), F (7)]).lift_x(F(int(hashlib.sha256(bytes.fromhex(G_DER)).hexdigest(),16))) | |
print('%x' % G2.xy()[0]) |
(That said, I think it's also completely fine to not include any sage code here and instead just keep the first paragraph.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops! Should have ran it myself before including it. I have a slight preference for keeping the sage code, since its nice to be able to run that snippet and get the same value as a way of verifying. That being said, "be able to run that snippet" implies keeping the comment up to date, but I doubt that will be much (if any) of a maintenance burden.
Also think its more correct to output the x,y coordinate pair since in the comment we are referring to NUMS_H
as a point, whereas the x-only value is a byte encoding of a point. I did check that the y-coordinate printed out is even (i.e. int("31d3c6863973926e049e637cb1b5f40a36dac28af1766968c30c2313f3a38904", 16) % 2 == 0
), so the x-only encoding is equivalent to the point printed out by the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to fix the hex decoding. Went ahead and left the (x,y) print out, but can drop the y-value or drop the whole sage snippet if others feel strongly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree that keeping the sage code and printing out the full point makes sense w.r.t. your point vs byte encoding argument. (By the way, at least in our test-framework implementation of GE, lift_x
always produces a point with even y coordinate, but no idea if this is also the case in the GE of sage.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code-review ACK a5867a3
7040a20 doc: fix sage code for deriving alternative generator H (Sebastian Falbesoner) Pull request description: The line calculating H (in particular, the expression `G.decode('hex')`) fails with the following error message on Sage 9.5: ``` AttributeError: 'str' object has no attribute 'decode' ``` Fix that by converting the hex-string to bytes using `bytes.fromhex`. (Noticed while reviewing bitcoin/bitcoin#30048 which picks this code snippet comment up.) ACKs for top commit: josibake: ACK 7040a20 real-or-random: utACK 7040a20 Tree-SHA512: 0a44f399b103c2f5840056d163c1483a1d4f032bc0f8d3822507ac6da9d567f46e36caa79c7f5016aebcc8827b79e9aec7ebdb4f21c3c0242dc6875be140f289
src/pubkey.h
Outdated
@@ -300,6 +300,8 @@ class XOnlyPubKey | |||
SERIALIZE_METHODS(XOnlyPubKey, obj) { READWRITE(obj.m_keydata); } | |||
}; | |||
|
|||
extern const XOnlyPubKey NUMS_H; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Would it be better to make this a static const
member of the XOnlyPubKey
class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you elaborate? Not clear to me what the benefit would be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly that NUMS_H
is an XOnlyPubkey (rather than a CPubKey or the hex/bytes encoding of the point or something else), so maybe it would make sense to keep it tied to that class. Bit of a knee-jerk "does this really need to be in the global namespace" reaction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see. In my head, there is nothing about H
that requires it to be an XOnlyPubKey, but you're right that we are setting it that way (and thats the only way its used in our codebase).
Seems fine to make it a member, altho needing to call something like XOnlyPubKey{XOnlyPubKey::NUMS_H}
to use it seems kinda weird to me?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'd write builder.Finalize(XOnlyPubKey::NUMS_H);
which seems fine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh! I see , I misinterpreted what you were suggesting. This does seem better, will update.
tACK 57a0664 I checked that Sage produces the number, and ran the tests. It seems that BIP341 introduced H as just an example of a NUMS point:
The BIP got it from libsecp256k1 which IIUC only uses it for tests. As does Bitcoin Core so far. BIP352 gives it special status:
|
Code Review ACK 57a0664 The constant definition is consistent with BIP-341. PR removes code duplication and provides a single place to refer to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review 57a0664. I didn't verify with sage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re-ACK 57a0664
src/pubkey.cpp
Outdated
import hashlib | ||
F = FiniteField (0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F) | ||
G_DER = '0479be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798483ada7726a3c4655da4fbfc0e1108a8fd17b448a68554199c47d08ffb10d4b8' | ||
G2 = EllipticCurve ([F (0), F (7)]).lift_x(F(int(hashlib.sha256(bytes.fromhex(G_DER)).hexdigest(),16))) | ||
print('%x %x' % G2.xy()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it make sense to add a Python test which runs this exact script and checks that the result equals NUMS_H_DATA
?
Maybe there's something like this already, found parts of it, could you please check? In that case we wouldn't need to put code into a comment, we could just point to the test instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that would require adding sage to our test dependencies (which I don't think we should do).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see the value of running a test like this? The code here is not the definition of H
, just one way of calculating the value, based on the definition in BIP341. For example, if the test were to fail, that would just mean the comment is out of date, not that our value for H
is incorrect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if we just extract it to code like it was done with the rest of the sage files, and reference the file instead of the code (I don't like dead code in comments), e.g. https://github.com/bitcoin/bitcoin/blob/master/src/secp256k1/src/ecdsa_impl.h#L19 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going to leave this as is for now. The sage code in the comment is meant to provide background on where the constant comes from, so I'd prefer to leave it since it is more precise than a written out explanation. However, the code is not essential for checking the "correctness" of the value in that it would be sufficient for someone to simply check that this value matches BIP341. This value will never change, either, so I don't see any advantage to having a sage file in this repo for generating the constant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want, all the EC logic for point decompression is in test/functional/test_framework/crypto/secp256k1.py
, no need for Sage:
from test_framework.crypto.secp256k1 import G
from hashlib import sha256
print(sha256(G.to_bytes_uncompressed()).digest().hex())
(and to test it's actually an X coordinate):
from test_framework.crypto.secp256k1 import FE, G, GE
from hashlib import sha256
assert GE.lift_x(FE.from_bytes(sha256(G.to_bytes_uncompressed()).digest())) is not None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the code snippets @sipa . Seems like there is appetite for this, so added this script to test/functional/test_framework/crypto/secp256k1.py
as a unit test and removed the sage code from the comment.
ACK 57a0664 |
src/pubkey.cpp
Outdated
|
||
import hashlib | ||
F = FiniteField (0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F) | ||
G_DER = '0479be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798483ada7726a3c4655da4fbfc0e1108a8fd17b448a68554199c47d08ffb10d4b8' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not DER encoding; DER is only used for signatures (and once upon a time, for private keys in the wallet). The byte encoding used for secp256k1 public keys is defined in section 2.3.3 of SEC1 v2 (https://www.secg.org/sec1-v2.pdf), though I believe ANSI and/or NIST also standardized it (not for secp256k1 specifically, but for prime-ordered curves in general).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed!
Updated the comment to remove the sage code and replaced with a python unit test for calculating H per @paplorinc and @sipa 's suggestion. |
def test_H(self): | ||
H = sha256(G.to_bytes_uncompressed()).digest() | ||
assert GE.lift_x(FE.from_bytes(H)) is not None | ||
self.assertEqual(H.hex(), "50929b74c1a04954b78b4b6035e97a5e078a5a0f28ec96d547bfee9ace803ac0") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it possible to avoid hard coding? That kinda' defeats the purpose of the test I think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so? The purpose of adding this code is to give a precise definition of how the constant from BIP341 is calculated. The assert confirms that this code generates a point which matches the expected value from BIP341 and allows anyone else to verify that the code == constant in src/pubkey.cpp
== value from BIP341.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But we're not actually testing the production code this way, i.e. the test would still pass if NUMS_H_DATA
were to change (e.g. after an accidental local search/replace or whatever).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, but that's not the intent of the python code: it's merely to provide a more precise explanation of the definition of H
. A unit test on secp256k1 seemed like the most natural fit.
Seems like what you actually want is a unit test in the C++ code to generate the value of H
and then compare it to XOnlyPubKey::NUMS_H
. I've added that in 9408a04
@@ -344,3 +346,9 @@ def mul(self, a): | |||
|
|||
# Precomputed table with multiples of G for fast multiplication | |||
FAST_G = FastGEMul(G) | |||
|
|||
class TestFrameworkSecp256k1(unittest.TestCase): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should add secp256k1 to the unit test modules list (so it also gets executed on CI):
bitcoin/test/functional/feature_framework_unit_tests.py
Lines 15 to 17 in 7fcf4e9
# List of framework modules containing unit tests. Should be kept in sync with | |
# the output of `git grep unittest.TestCase ./test/functional/test_framework` | |
TEST_FRAMEWORK_MODULES = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
1e02108
to
8d4c70c
Compare
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the Possibly this is due to a silent merge conflict (the changes in this pull request being Leave a comment here, if you need help tracking down a confusing failure. |
re-ACK 9408a04 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code-review ACK 9408a04
HashWriter hw; | ||
hw.write(MakeByteSpan(G_uncompressed)); | ||
XOnlyPubKey H{hw.GetSHA256()}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, as one-liner (feel free to ignore):
HashWriter hw; | |
hw.write(MakeByteSpan(G_uncompressed)); | |
XOnlyPubKey H{hw.GetSHA256()}; | |
XOnlyPubKey H{(HashWriter{} << Span(G_uncompressed)).GetSHA256()}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, will add if I end up needing to retouch!
ACK 9408a04 |
Broken out from #28122
BIP341 defines a NUMS point
H
as H = lift_x(0x50929b74c1a04954b78b4b6035e97a5e078a5a0f28ec96d547bfee9ace803ac0) which is constructed by taking the hash of the standard uncompressed encoding of the secp256k1 base point G as X coordinate."Add this as a constant so it can be used in our codebase. My primary motivation is BIP352 specifies a special case for when taproot spends use
H
as the internal key, but outside of BIP352 it seems generally useful to haveH
in the codebase, for testing or other use cases.