Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

generator: massively speed up serialization #294

Merged

Conversation

apoelstra
Copy link
Contributor

secp256k1_pedersen_commit_serialize would call _load (which does a sqrt to fully decompress the key, then a conditional negation based on the flag), then check the Jacobian symbol of the resulting y-coordinate, then re-serialize based on this.

Instead, don't do any of this stuff. Copy the flag directly out of the internal representation and copy the x-coordinate directly out of the internal representation.

Checked that none of the other _serialize methods in the modules do this.

Fixes #293

Copy link
Collaborator

@real-or-random real-or-random left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

Comment on lines 303 to 304
output[0] = 8 ^ (commit->data[0] & 1);
memcpy(&output[1], &commit->data[1], 32);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't you even memcpy the entire 33 bytes?

Copy link

@delta1 delta1 May 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't you even memcpy the entire 33 bytes?

I tried this, it does pass the tests :)

I see 8 ^ (commit->data[0] & 1) alternates between 1000 and 1001 binary for even/odd values of data[0].

Out of interest as a cryptography noob, what is the significance of this?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just a flag byte. 0x02, 0x03 (and a few others) are already used for public keys, so whoever wrote that code took 0x08 and 0x09 to distinguish commitments from public keys.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@real-or-random thanks!

@apoelstra
Copy link
Contributor Author

Changed to memcpy the entire 33 bytes. Also added a commit which does the same thing for parsing.

Previously, parsing decoded the whole point, then conditionally negated it, then extracted the jacobi symbol of the y-coordinate (which is entirely determined by the ge_set_xquad followed by conditional negation!), then normalized and re-serialized the x-coordinate (which was parsed directly using fe32_set_b32_limit so it was already normalized).

But all we need to do is (a) check that the first byte is 8 or 9; (b) check that the remainder is a valid x-coordinate, then (c) memcpy.

Step (b) is still pretty slow, compared to serialization which is just a memcpy, but this is a big improvement.

@apoelstra
Copy link
Contributor Author

CI failure appears to be some unrelated break in our macos image.

Copy link
Contributor

@jonasnick jonasnick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

This PR looks good.

As a sidenote (not this PR) the implementation of pedersen_commitment_load and save is quite different from pubkey_load and save. In contrast to pubkey loading, pedersen commitment loading is costly, which results in unnecessary sqrts when loading commitments in verify_tally.

Comment on lines 270 to 273
CHECK(secp256k1_pedersen_commit(CTX, &commits[i], &blinds[i * 32], values[i], secp256k1_generator_h));
CHECK(secp256k1_pedersen_commitment_serialize(CTX, result, &commits[i]));
CHECK(secp256k1_pedersen_commitment_parse(CTX, &parse, result));
CHECK(secp256k1_memcmp_var(&commits[i], result, 33) == 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should also check that parse is correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yeah, it should be checking commits[i] against parse rather than against result :).

This code works but only because of a bunch of C weirdness. Will fix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

`secp256k1_pedersen_commit_serialize` would call `_load` (which does a
sqrt to fully decompress the key, then a conditional negation based on
the flag), then check the Jacobian symbol of the resulting y-coordinate,
then re-serialize based on this.

Instead, don't do any of this stuff. Copy the flag directly out of the
internal representation and copy the x-coordinate directly out of the
internal representation.

Checked that none of the other _serialize methods in the modules do
this.

Fixes BlockstreamResearch#293
Copy link
Collaborator

@real-or-random real-or-random left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also consider overwriting the output with zeros (or anything which is initialized but not a valid commitment) when parsing fails. We do this in upstream for other parsing functions.

Not sure, it may be a bit arbitrary to do it here now. This should maybe go to a proper PR that does it consistently in all -zkp modules.

@@ -288,10 +288,8 @@ int secp256k1_pedersen_commitment_parse(const secp256k1_context* ctx, secp256k1_
!secp256k1_ge_set_xquad(&ge, &x)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
!secp256k1_ge_set_xquad(&ge, &x)) {
!secp256k1_ge_x_on_curve_var(&x)) {

and we can drop ge entirely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, nice!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Force-pushed to do this.

Similar to speeding up serialization; in our parsing logic we did a
bunch of expensive stuff then expensively inverted it. Drop everything
except the essential checks and then memcpy.
Copy link
Collaborator

@real-or-random real-or-random left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK 6361266

I think I know how to fix CI, I'll open a PR later today or tomorrow

@real-or-random
Copy link
Collaborator

The CI issue should be resolved once the GitHub Actions image is updated to have brew 4.3.1, which has the fix (Homebrew/brew#17336). The images are updated every week, so the issue should just disappear in a few days.

@jonasnick
Copy link
Contributor

I am surprised that the last 31 bytes of the pedersen commitment struct are uninitialized preventing something like

memcpy(&com1, &com2, sizeof(secp256k1_pedersen_commitment));

However, that is already the case without this PR.

Copy link
Contributor

@jonasnick jonasnick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 6361266

@jonasnick jonasnick merged commit 1683772 into BlockstreamResearch:master May 22, 2024
97 of 107 checks passed
@real-or-random
Copy link
Collaborator

I am surprised that the last 31 bytes of the pedersen commitment struct are uninitialized preventing something like

memcpy(&com1, &com2, sizeof(secp256k1_pedersen_commitment));

Yes, that's a bit strange. That means we can always switch to an uncompressed internal representation. I haven't thought about the performance implications, but uncompressed should usually be faster (unless you only deserialize and serialize again without doing actual computations, but why should you do this...)

This dates back to edb879f (see also the parent commit fca4c3b which does the same for secp256k1_generator), but I can't find the corresponding PR. @apoelstra Do you remember why you changed this?

@apoelstra
Copy link
Contributor Author

Yes, that's a bit strange. That means we can always switch to an uncompressed internal representation. I haven't thought about the performance implications, but uncompressed should usually be faster (unless you only deserialize and serialize again without doing actual computations, but why should you do this...)

This is a somewhat common thing to do with Elements transactions, where you parse the whole transaction (including decompressing the points, because this is how you obtain valid pubkey-type objects) and then only manipulate non-crypto parts before re-serializing.

With normal pubkeys serialization is (almost) just as fast with a compressed vs uncompressed representation because you just need to check the lsb of the y coordinate. But with secp-zkp keys, recompressing an uncompressed key requires recomputing the jacobi symbol :/.

This dates back to edb879f (see also the parent commit fca4c3b which does the same for secp256k1_generator), but I can't find the corresponding PR. @apoelstra Do you remember why you changed this?

Afraid not. And I also cannot find the corresponding PR.

@real-or-random
Copy link
Collaborator

I see, so you're saying it makes sense to keep the internal representation compressed for performance reasons?

If we keep this, then we should at least write all 64 bytes to avoid passing uninitialized memory to the user. Or simply switch back to an opaque type that has just 33 bytes. The latter is cleaner and simpler IMO.

@apoelstra
Copy link
Contributor Author

Honestly I'm tempted to switch to 65 bytes before switching to 33. So we'd have the uncompressed representation which can be efficiently manipulated by the actual crypto functions, and also an xquad flag bit (probably also useful for crypto actually) so we can quickly serialize.

@real-or-random
Copy link
Collaborator

real-or-random commented May 22, 2024

Ok, sure, that's also possible and nicer.

The issue with switching either to 33 or 65 is that it's a breaks ABI compatibility for any code copying or moving structs. We say this:

* The exact representation of data inside is implementation defined and not
* guaranteed to be portable between different platforms or versions. It is
* however guaranteed to be 64 bytes in size, and can be safely copied/moved.

Strictly speaking, all of this is in an experimental module, but yeah, not sure if we want to risk breaking things.

@apoelstra
Copy link
Contributor Author

Agreed. I think for now we should leave it alone. Maybe if/when we implement BP++ and switch to it in Elements, we can consider also breaking the generator API.

@apoelstra apoelstra deleted the 2024-04--fast-serialize branch May 22, 2024 14:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

secp256k1_pedersen_commitment_serialize decompresses the point then recompresses it
4 participants