Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster verification of HexaryTrieFog.explore inputs #107

Open
carver opened this issue May 22, 2020 · 1 comment
Open

Faster verification of HexaryTrieFog.explore inputs #107

carver opened this issue May 22, 2020 · 1 comment

Comments

@carver
Copy link
Contributor

carver commented May 22, 2020

It seems like we should be able to do better if we modeled the data as a tree. IIUC the current approach is super-linear in runtime complexity.

tree = {}
for segment in sub_segments:
    sub_tree = tree
    for nibble in segment:
        sub_tree.setdefault(nibble, {})
        if not isinstance(sub_tree[nibble], collections.Mapping):
            ...  # we've detected a that `segment` is a superset of `sub_tree[nibble]`
        sub_tree = sub_tree[nibble]
    # once the segment terminates, we store the full segment at that position 
    # so that we can later detect segments that are prefixes of other segments.
    sub_tree[segment[-1]] = segment

I think the untested psuedocode above has O(n) complexity.

Originally posted by @pipermerriam in #95

@carver
Copy link
Contributor Author

carver commented May 22, 2020

Because this whole verification is for an unknown use case (exploring more than one node at a time, in parallel), it's hard to optimize for speed. Are there a huge number of sub_segments? Are there not so many, but each one has a really long segment? etc.

The whole validation check is skipped for known use cases, because the number of different-lengthed sub-segments is always 1 for branches, nodes, and extensions:

        all_lengths = set(len(segment) for segment in sub_segments)
        if len(all_lengths) > 1:
            # verification here

So the runtime of all known use cases is basically unaffected by this verification speed.

If someone does decide to speed this up, I'd want to see a hypothesis test to make sure all the cases are covered. But really I think it's fine to wait until we have a reason to speed it up (and hence, a way to test that it's actually faster).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant