Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HexaryTrie] Can't use proof to compute updated root-hash on key removal #120

Open
mttbernardini opened this issue Oct 16, 2020 · 0 comments

Comments

@mttbernardini
Copy link
Contributor

mttbernardini commented Oct 16, 2020

  • Version: 2.0.0a4 (and below)
  • Python: 3.7
  • OS: linux

What was wrong?

Given a key k with value v, in some cases it's not possible to exploit only the information provided by get_proof(k) to determine a new root-hash of the trie where k is deleted.

(Conversely, I was not able to find a case where it's not possible to use get_proof(k) to determine a new root-hash after an update or an insertion of k.)

What did you expect it to do?

Authenticated Data Structures usually allow users to update values in the following fashion:

  1. User queries the ADS for the value of key k
  2. ADS returns value v and its proof p
  3. User computes the root-hash determined by v and p and compares it to their previously known root-hash r to authenticate the response. ^
  4. User sends an update message to the ADS for key k with new value v'
  5. User computes the new root-hash r' determined by k and v' and stores it for authenticating following queries.

(^ In Ethereum Tries this differs a tiny bit in the way that r and p are used to determine v and then values are compared instead of root-hashes)

Now, even though this library does not provide a direct method for computing a new root-hash starting from a proof and a new value, it is still possible to do so by building a "partial" trie with the nodes obtained by get_proof() (in a similar fashion the way get_from_proof() is implemented), perform the update on such trie and then retrieve its new root-hash.

Therefore it should still be possible, for a third party not storing the ADS, to perform any kind of update on k starting just from the proof obtained by get_proof(), whether a literal "update" (i.e. change of value), an insertion or a removal (obviously just in respect to k).

NB: I already have a initial patch for this issue (needs refining because it adds nodes also when it might not be needed), if welcome I can go ahead with a PR.

Code to reproduce the error

First, assume a function new_root_after_delete(root, key, proof) for retrieving the new root-hash
def new_root_after_delete(root, key, proof):
    trie = HexaryTrie({})
    for node in proof:
        trie._persist_node(node)
    trie.root_hash = root
    trie.delete(key)

    return trie.root_hash

I was only able to find two scenarios in which this bug arises, and in particular only when the nodes are long enough to get hashed.

1. k is targeted to a branch node B having only one child node C:

In this case the removal of k causes B to be removed and C to collapse upwards, therefore C should be provided in the proof for k, to be able to update its remaining key and update hashes upwards accordingly.

t = HexaryTrie({}, prune=True)
t[b"\x00\x00"] = b"a"*30
t[b"\x00"] = b"b"*30

# client side
prev_root = t.root_hash
key_to_del = b"\x00"
proof = t.get_proof(key_to_del)

new_root_after_delete(prev_root, key_to_del, proof)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 6, in new_root_after_delete
  File "/home/matteo/src/py-trie/trie/hexary.py", line 86, in wrapped
    fn(trie_self, *args)
  File "/home/matteo/src/py-trie/trie/hexary.py", line 356, in delete
    self._raise_missing_node(exc, key)
  File "/home/matteo/src/py-trie/trie/hexary.py", line 302, in _raise_missing_node
    raise MissingTrieNode(exception.args[0], self.root_hash, key, prefix=None) from exception
trie.exceptions.MissingTrieNode: Trie database is missing hash HexBytes('0x1fb7fdc5aae3c3f57d3e9cfa86bceddc19de02cf587c14a326895f451167c4af') needed to look up node at prefix None, when searching for key HexBytes('0x00') at root hash HexBytes('0x8374694098724fa740337bc9b496646c30219e9c5c87285b88e76d987d43a56d')

A visual representation will make it clear:

t proof proof after deletion
scenario 1 t scenario 1 proof case0 trie_after

In this example the node C corresponds to 1fb7fd, which is not included in proof, that then collapses to 13ef40 after the removal of k (which, for this trivial example it also corresponds to the whole trie).

2. k is targeted to a leaf node L which is child of a branch node B (with no value associated) and, in respect to B, L is the only sibling of some other node S:

In this case, the removal of k causes L to be removed, and since B has no value associated with it B is removed too, causing S to collapse upwards, recreating the first scenario. Therefore S should be provided in the proof for k.

t = HexaryTrie({}, prune=True)
t[b"\x00\x00"] = b"a"*30
t[b"\x00\x01"] = b"b"*30

# client side
prev_root = t.root_hash
key_to_del = b"\x00\x00"
proof = t.get_proof(key_to_del)

new_root_after_delete(prev_root, key_to_del, proof)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 6, in new_root_after_delete
  File "/home/matteo/src/py-trie/trie/hexary.py", line 86, in wrapped
    fn(trie_self, *args)
  File "/home/matteo/src/py-trie/trie/hexary.py", line 356, in delete
    self._raise_missing_node(exc, key)
  File "/home/matteo/src/py-trie/trie/hexary.py", line 302, in _raise_missing_node
    raise MissingTrieNode(exception.args[0], self.root_hash, key, prefix=None) from exception
trie.exceptions.MissingTrieNode: Trie database is missing hash HexBytes('0x873d300c30dc3b1d383643935e480a8ca36821ffd3803f78432a345a37e1a440') needed to look up node at prefix None, when searching for key HexBytes('0x0000') at root hash HexBytes('0xc773de0ac200c0f6f3809b2bb7b03ca1e2974068e75b1bab4b48a6a6a2a2d179')

Visual representation:

t proof proof after deletion
scenario 2 t scenario 2 proof case1 after

In this example the node S corresponds to 873d30 which is not included in proof. After the removal of k, S collapses to ef3798 (which as before, for this trivial example is also the whole trie).

I collapsed the code for convenience, since this issue is already quite long. I hope this helps, let me know if I can start a PR or provide any additional input.

Thanks

mttbernardini added a commit to mttbernardini/py-trie that referenced this issue Oct 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant