Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

crypto: add NUMS_H const #30048

Merged
merged 2 commits into from May 17, 2024
Merged

crypto: add NUMS_H const #30048

merged 2 commits into from May 17, 2024

Conversation

josibake
Copy link
Member

@josibake josibake commented May 6, 2024

Broken out from #28122


BIP341 defines a NUMS point H as H = lift_x(0x50929b74c1a04954b78b4b6035e97a5e078a5a0f28ec96d547bfee9ace803ac0) which is constructed by taking the hash of the standard uncompressed encoding of the secp256k1 base point G as X coordinate."

Add this as a constant so it can be used in our codebase. My primary motivation is BIP352 specifies a special case for when taproot spends use H as the internal key, but outside of BIP352 it seems generally useful to have H in the codebase, for testing or other use cases.

@DrahtBot
Copy link
Contributor

DrahtBot commented May 6, 2024

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK paplorinc, theStack, achow101
Concept ACK ajtowns
Stale ACK Sjors, S3RK

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #28201 (Silent Payments: sending by josibake)
  • #28122 (Silent Payments: Implement BIP352 by josibake)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

Copy link
Contributor

@theStack theStack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

Found via $ git grep 5092929b74 that we already define this byte-string constant twice for the miniscript fuzz and unit tests (src/test/fuzz/miniscript.cpp and src/test/miniscript_tests.cpp, called NUMS_PK there), that's a good opportunity to deduplicate.

@josibake
Copy link
Member Author

josibake commented May 6, 2024

Found via $ git grep 5092929b74

Nice! Pushed a second commit to update the existing tests.

Copy link
Contributor

@ajtowns ajtowns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

src/pubkey.cpp Outdated
@@ -181,6 +181,9 @@ int ecdsa_signature_parse_der_lax(secp256k1_ecdsa_signature* sig, const unsigned
return 1;
}

static const std::vector<unsigned char> NUMS_H_DATA = {0x50, 0x92, 0x9b, 0x74, 0xc1, 0xa0, 0x49, 0x54, 0xb7, 0x8b, 0x4b, 0x60, 0x35, 0xe9, 0x7a, 0x5e, 0x07, 0x8a, 0x5a, 0x0f, 0x28, 0xec, 0x96, 0xd5, 0x47, 0xbf, 0xee, 0x9a, 0xce, 0x80, 0x3a, 0xc0};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment indicating where this comes from and why it's believed no one knows the discrete log?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment for where the point comes from. Tried to write a sentence for the "why" but don't really understand it well enough to articulate succinctly 😅.

From what I understand, because H is derived from a known public value G via a hash, we learn nothing about the discrete log of H due to the hash function providing us a uniformly random value. But not sure if that's entirely correct; if you have a suggestion on how to better (or more correctly!) phrase this, happy to add!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect, will use this!

src/pubkey.cpp Outdated
*
* See https://github.com/bitcoin/bips/blob/master/bip-0341.mediawiki#constructing-and-spending-taproot-outputs
*/
static const std::vector<unsigned char> NUMS_H_DATA = {0x50, 0x92, 0x9b, 0x74, 0xc1, 0xa0, 0x49, 0x54, 0xb7, 0x8b, 0x4b, 0x60, 0x35, 0xe9, 0x7a, 0x5e, 0x07, 0x8a, 0x5a, 0x0f, 0x28, 0xec, 0x96, 0xd5, 0x47, 0xbf, 0xee, 0x9a, 0xce, 0x80, 0x3a, 0xc0};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to drop the ParseHex, which allows to write this as a string literal?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seemed unnecessary? Representing this as a byte vector allows us to create an XOnlyPublicKey directly without an extra function call. Is there a reason to prefer having this as a string literal?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a style-nit, as it makes it easier to grep and compare the string literal. The function call could be moved to compile time, but I haven't implemented it yet. Calling it once at runtime should be fine also, for now?

Just a nit in any case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah cool, my slight aversion to ParseHex was the call at runtime for a static const, but if we are going to move that to compile time and the string literal makes it easier to compare (e.g. with the value in BIP341, which is also a string literal), then happy to change it.

@josibake josibake force-pushed the add-NUMS-H branch 2 times, most recently from ed98b91 to cb384b5 Compare May 7, 2024 10:03
@josibake
Copy link
Member Author

josibake commented May 7, 2024

Added a comment per @ajtowns 's feedback and reverted to using ParseHex on a string literal per @maflcko 's feedback.

src/pubkey.cpp Outdated
Comment on lines 195 to 196
G2 = EllipticCurve ([F (0), F (7)]).lift_x(F(int(hashlib.sha256(G_DER.decode('hex')).hexdigest(),16)))
print('%x %x' % G2.xy())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code snippet doesn't work for me on Sage 9.5 (see BlockstreamResearch/secp256k1-zkp#292). Also, we only need to output the x-coordinate here:

Suggested change
G2 = EllipticCurve ([F (0), F (7)]).lift_x(F(int(hashlib.sha256(G_DER.decode('hex')).hexdigest(),16)))
print('%x %x' % G2.xy())
G2 = EllipticCurve ([F (0), F (7)]).lift_x(F(int(hashlib.sha256(bytes.fromhex(G_DER)).hexdigest(),16)))
print('%x' % G2.xy()[0])

(That said, I think it's also completely fine to not include any sage code here and instead just keep the first paragraph.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops! Should have ran it myself before including it. I have a slight preference for keeping the sage code, since its nice to be able to run that snippet and get the same value as a way of verifying. That being said, "be able to run that snippet" implies keeping the comment up to date, but I doubt that will be much (if any) of a maintenance burden.

Also think its more correct to output the x,y coordinate pair since in the comment we are referring to NUMS_H as a point, whereas the x-only value is a byte encoding of a point. I did check that the y-coordinate printed out is even (i.e. int("31d3c6863973926e049e637cb1b5f40a36dac28af1766968c30c2313f3a38904", 16) % 2 == 0), so the x-only encoding is equivalent to the point printed out by the code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to fix the hex decoding. Went ahead and left the (x,y) print out, but can drop the y-value or drop the whole sage snippet if others feel strongly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree that keeping the sage code and printing out the full point makes sense w.r.t. your point vs byte encoding argument. (By the way, at least in our test-framework implementation of GE, lift_x always produces a point with even y coordinate, but no idea if this is also the case in the GE of sage.)

Copy link
Contributor

@theStack theStack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code-review ACK a5867a3

@DrahtBot DrahtBot requested a review from ajtowns May 8, 2024 09:31
real-or-random added a commit to BlockstreamResearch/secp256k1-zkp that referenced this pull request May 8, 2024
7040a20 doc: fix sage code for deriving alternative generator H (Sebastian Falbesoner)

Pull request description:

  The line calculating H (in particular, the expression `G.decode('hex')`) fails with the following error message on Sage 9.5:

  ```
  AttributeError: 'str' object has no attribute 'decode'
  ```

  Fix that by converting the hex-string to bytes using `bytes.fromhex`.

  (Noticed while reviewing bitcoin/bitcoin#30048 which picks this code snippet comment up.)

ACKs for top commit:
  josibake:
    ACK 7040a20
  real-or-random:
    utACK 7040a20

Tree-SHA512: 0a44f399b103c2f5840056d163c1483a1d4f032bc0f8d3822507ac6da9d567f46e36caa79c7f5016aebcc8827b79e9aec7ebdb4f21c3c0242dc6875be140f289
@josibake josibake mentioned this pull request May 8, 2024
15 tasks
src/pubkey.h Outdated
@@ -300,6 +300,8 @@ class XOnlyPubKey
SERIALIZE_METHODS(XOnlyPubKey, obj) { READWRITE(obj.m_keydata); }
};

extern const XOnlyPubKey NUMS_H;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Would it be better to make this a static const member of the XOnlyPubKey class?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate? Not clear to me what the benefit would be.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly that NUMS_H is an XOnlyPubkey (rather than a CPubKey or the hex/bytes encoding of the point or something else), so maybe it would make sense to keep it tied to that class. Bit of a knee-jerk "does this really need to be in the global namespace" reaction.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. In my head, there is nothing about H that requires it to be an XOnlyPubKey, but you're right that we are setting it that way (and thats the only way its used in our codebase).

Seems fine to make it a member, altho needing to call something like XOnlyPubKey{XOnlyPubKey::NUMS_H} to use it seems kinda weird to me?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'd write builder.Finalize(XOnlyPubKey::NUMS_H); which seems fine?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! I see , I misinterpreted what you were suggesting. This does seem better, will update.

@Sjors
Copy link
Member

Sjors commented May 10, 2024

tACK 57a0664

I checked that Sage produces the number, and ran the tests.

It seems that BIP341 introduced H as just an example of a NUMS point:

One example of such a point is H

The BIP got it from libsecp256k1 which IIUC only uses it for tests. As does Bitcoin Core so far.

BIP352 gives it special status:

The one exception is script path spends that use NUMS point H as their internal key

@DrahtBot DrahtBot requested a review from theStack May 10, 2024 07:50
@S3RK
Copy link
Contributor

S3RK commented May 10, 2024

Code Review ACK 57a0664

The constant definition is consistent with BIP-341. PR removes code duplication and provides a single place to refer to NUMS_H either as vector of bytes or x-only pubkey.

Copy link
Member

@theuni theuni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review 57a0664. I didn't verify with sage.

Copy link
Contributor

@theStack theStack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-ACK 57a0664

@DrahtBot DrahtBot requested a review from theStack May 11, 2024 04:35
src/pubkey.cpp Outdated
Comment on lines 192 to 196
import hashlib
F = FiniteField (0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F)
G_DER = '0479be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798483ada7726a3c4655da4fbfc0e1108a8fd17b448a68554199c47d08ffb10d4b8'
G2 = EllipticCurve ([F (0), F (7)]).lift_x(F(int(hashlib.sha256(bytes.fromhex(G_DER)).hexdigest(),16)))
print('%x %x' % G2.xy())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it make sense to add a Python test which runs this exact script and checks that the result equals NUMS_H_DATA?
Maybe there's something like this already, found parts of it, could you please check? In that case we wouldn't need to put code into a comment, we could just point to the test instead.

Copy link
Member

@Sjors Sjors May 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would require adding sage to our test dependencies (which I don't think we should do).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the value of running a test like this? The code here is not the definition of H, just one way of calculating the value, based on the definition in BIP341. For example, if the test were to fail, that would just mean the comment is out of date, not that our value for H is incorrect.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we just extract it to code like it was done with the rest of the sage files, and reference the file instead of the code (I don't like dead code in comments), e.g. https://github.com/bitcoin/bitcoin/blob/master/src/secp256k1/src/ecdsa_impl.h#L19 ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going to leave this as is for now. The sage code in the comment is meant to provide background on where the constant comes from, so I'd prefer to leave it since it is more precise than a written out explanation. However, the code is not essential for checking the "correctness" of the value in that it would be sufficient for someone to simply check that this value matches BIP341. This value will never change, either, so I don't see any advantage to having a sage file in this repo for generating the constant.

Copy link
Member

@sipa sipa May 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want, all the EC logic for point decompression is in test/functional/test_framework/crypto/secp256k1.py, no need for Sage:

from test_framework.crypto.secp256k1 import G
from hashlib import sha256

print(sha256(G.to_bytes_uncompressed()).digest().hex())

(and to test it's actually an X coordinate):

from test_framework.crypto.secp256k1 import FE, G, GE
from hashlib import sha256

assert GE.lift_x(FE.from_bytes(sha256(G.to_bytes_uncompressed()).digest())) is not None

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the code snippets @sipa . Seems like there is appetite for this, so added this script to test/functional/test_framework/crypto/secp256k1.py as a unit test and removed the sage code from the comment.

@paplorinc
Copy link
Contributor

ACK 57a0664

src/pubkey.cpp Outdated

import hashlib
F = FiniteField (0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F)
G_DER = '0479be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798483ada7726a3c4655da4fbfc0e1108a8fd17b448a68554199c47d08ffb10d4b8'
Copy link
Member

@sipa sipa May 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not DER encoding; DER is only used for signatures (and once upon a time, for private keys in the wallet). The byte encoding used for secp256k1 public keys is defined in section 2.3.3 of SEC1 v2 (https://www.secg.org/sec1-v2.pdf), though I believe ANSI and/or NIST also standardized it (not for secp256k1 specifically, but for prime-ordered curves in general).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

@josibake
Copy link
Member Author

Updated the comment to remove the sage code and replaced with a python unit test for calculating H per @paplorinc and @sipa 's suggestion.

def test_H(self):
H = sha256(G.to_bytes_uncompressed()).digest()
assert GE.lift_x(FE.from_bytes(H)) is not None
self.assertEqual(H.hex(), "50929b74c1a04954b78b4b6035e97a5e078a5a0f28ec96d547bfee9ace803ac0")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to avoid hard coding? That kinda' defeats the purpose of the test I think

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so? The purpose of adding this code is to give a precise definition of how the constant from BIP341 is calculated. The assert confirms that this code generates a point which matches the expected value from BIP341 and allows anyone else to verify that the code == constant in src/pubkey.cpp == value from BIP341.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But we're not actually testing the production code this way, i.e. the test would still pass if NUMS_H_DATA were to change (e.g. after an accidental local search/replace or whatever).

Copy link
Member Author

@josibake josibake May 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but that's not the intent of the python code: it's merely to provide a more precise explanation of the definition of H. A unit test on secp256k1 seemed like the most natural fit.

Seems like what you actually want is a unit test in the C++ code to generate the value of H and then compare it to XOnlyPubKey::NUMS_H. I've added that in 9408a04

@@ -344,3 +346,9 @@ def mul(self, a):

# Precomputed table with multiples of G for fast multiplication
FAST_G = FastGEMul(G)

class TestFrameworkSecp256k1(unittest.TestCase):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should add secp256k1 to the unit test modules list (so it also gets executed on CI):

# List of framework modules containing unit tests. Should be kept in sync with
# the output of `git grep unittest.TestCase ./test/functional/test_framework`
TEST_FRAMEWORK_MODULES = [

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@josibake josibake force-pushed the add-NUMS-H branch 2 times, most recently from 1e02108 to 8d4c70c Compare May 14, 2024 09:40
@DrahtBot
Copy link
Contributor

🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.

Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.

Leave a comment here, if you need help tracking down a confusing failure.

Debug: https://github.com/bitcoin/bitcoin/runs/24941511130

@paplorinc
Copy link
Contributor

re-ACK 9408a04

@DrahtBot DrahtBot requested review from Sjors, S3RK and theStack May 14, 2024 11:22
Copy link
Contributor

@theStack theStack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code-review ACK 9408a04

Comment on lines +371 to +373
HashWriter hw;
hw.write(MakeByteSpan(G_uncompressed));
XOnlyPubKey H{hw.GetSHA256()};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, as one-liner (feel free to ignore):

Suggested change
HashWriter hw;
hw.write(MakeByteSpan(G_uncompressed));
XOnlyPubKey H{hw.GetSHA256()};
XOnlyPubKey H{(HashWriter{} << Span(G_uncompressed)).GetSHA256()};

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, will add if I end up needing to retouch!

@achow101
Copy link
Member

ACK 9408a04

@achow101 achow101 merged commit 4877fcd into bitcoin:master May 17, 2024
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet