Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Domain separators for key/value hashes in LeafOp #76

Open
hdevalence opened this issue Feb 23, 2022 · 4 comments
Open

Domain separators for key/value hashes in LeafOp #76

hdevalence opened this issue Feb 23, 2022 · 4 comments

Comments

@hdevalence
Copy link
Collaborator

Hi, we're working on adding ICS23 support for a variant of the Jellyfish Merkle Tree from Diem, modified slightly to be a generic, byte-oriented k/v store rather than a strongly typed object store (as in the original Diem code).

One issue we're running into is expressing the leaf node hashing using a LeafOp. Our leaf nodes are hashed as

SHA256( b"JMT::LeafNode" || SHA256(b"JMT::Key" || key) || SHA256(b"JMT::Value" || value) )

i.e., both key and value bytes are prehashed, with different domain separators.

We could almost express this as a LeafOp with hash = SHA256, prefix = b"JMT::LeafNode", prehash_key, prehash_value = SHA256, length = NO_PREFIX, but that will compute

SHA256( b"JMT::LeafNode" || SHA256(key) || SHA256(value) )

without the domain separators on the key/value pairs.

It seems like there are two possible resolutions:

  1. Remove the domain separators from the key and value hashes (this is less preferred, since domain separators are a good practice);

  2. Change the LeafOp proto to add new prefix fields:

message LeafOp {
    HashOp hash = 1;
    HashOp prehash_key = 2;
    HashOp prehash_value = 3;
    LengthOp length = 4;
    bytes prefix = 5;
    // NEW: fixed bytes that may optionally be included as a domain separator for prehash_key
    bytes prehash_key_prefix = 6;
    // NEW: fixed bytes that may optionally be included as a domain separator for prehash_value
    bytes prehash_value_prefix = 7;
}

The computation would then become

  hkey = prehashKey(prehash_key_prefix || key)
  hvalue = prehashValue(prehash_value_prefix || value)
  output = hash(prefix || length(hkey) || hkey || length(hvalue) || hvalue)

This change would be backwards-compatible, because protos allow missing fields, and if the prehash_*_prefix fields are missing, the behavior is exactly the same as the current implementation.

Does (2) seem sensible, or is there something we're missing?

@ethanfrey
Copy link
Contributor

It seems like there are two possible resolutions:

  1. Remove the domain separators from the key and value hashes (this is less preferred, since domain separators are a good practice);

One option is not to remove, them, but replace them with a length prefix. Varint, like protobuf does

Also, I don't see the use in prefixing something before hashing it. It doesn't separate anything. After hashing, they are all the same 32 byte length, so we know how to separate.

How does SHA256( b"JMT::LeafNode" || SHA256(b"JMT::Key" || key) || SHA256(b"JMT::Value" || value) )
Avoid some collision or confusion that exists in SHA256( b"JMT::LeafNode" || SHA256(key) || SHA256(value) )

Does (2) seem sensible, or is there something we're missing?

That you need every ibc enabled blockchain to update to some newer version with this support before you can use this proof format.

@ethanfrey
Copy link
Contributor

As a side note, I have realised that while every Merkle tree or trie I have seen could create proofs in some ics23 format with the same security and no significant performance hit, everyone likes to make their custom Merkle hashing algorithm. I mean Parity uses a bitmap to avoid storing all 16 child hashes in memory before hashing them. 🤯

I am close to giving up on the possibility of finding a universal format, as it feels like herding cats. I would propose we allow some Turing complete VM (like Wasm?) where people can upload their own ics23 verifiers.

@hdevalence
Copy link
Collaborator Author

One option is not to remove, them, but replace them with a length prefix. Varint, like protobuf does

Also, I don't see the use in prefixing something before hashing it. It doesn't separate anything. After hashing, they are all the same 32 byte length, so we know how to separate.

How does SHA256( b"JMT::LeafNode" || SHA256(b"JMT::Key" || key) || SHA256(b"JMT::Value" || value) ) Avoid some collision or confusion that exists in SHA256( b"JMT::LeafNode" || SHA256(key) || SHA256(value) )

A length prefix doesn't help here, because the point is to do domain separation, to ensure that keys are hashed with a different hash function than values, so that it is impossible to confuse a key hash with a value hash (or any other hash of any other data from any other protocol).

Does (2) seem sensible, or is there something we're missing?

That you need every ibc enabled blockchain to update to some newer version with this support before you can use this proof format.

Yes, that's correct, but it's a drop-in upgrade, because it's backwards-compatible with all existing proof specifications.

I have realised that while every Merkle tree or trie I have seen could create proofs in some ics23 format with the same security and no significant performance hit, everyone likes to make their custom Merkle hashing algorithm.

To be clear, we have a working implementation of ICS23 proofs for our Merkle tree with these changes. We just need to be able to express domain separators for keys and values.

@ethanfrey
Copy link
Contributor

Yes, that's correct, but it's a drop-in upgrade, because it's backwards-compatible with all existing proof specifications.

Yes, but it will take time to roll out.
I just cut 0.7 with support for SMT, which should roll out with ibc-go v3

You can check with the ibc-go team as to estimates when they would roll out newer versions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants