Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add type hints and expose via PEP561 #69

Open
pipermerriam opened this issue Oct 17, 2018 · 14 comments
Open

Add type hints and expose via PEP561 #69

pipermerriam opened this issue Oct 17, 2018 · 14 comments

Comments

@pipermerriam
Copy link
Member

pipermerriam commented Oct 17, 2018

Largely copy/pasted from ethereum/py-evm#1398

Background

Type hints allow us to perform static type checking, among other things. They raise the security bar by catching bugs at development time that, without type support, may turn into runtime bugs.

This stackoverflow answer does a great job at describing their main benefits.

What is wrong?

This library currently does not have any type hints.

This needs to be fixed by:

  1. Add the eth-typing>=2.0.0,<3 library as a dependency.
  2. Adding all missing type hints.
  3. Enforcing (stricter) type checking in CI

How

There does exist tooling (monkeytype) to the generation of type hints for existing code bases. From my personal experience monkeytype can be helpful but does still require manual fine tuning. Also, manually adding these type hints does serve as a great boost to the general understanding of the code base as it forces one to think about the code.

  1. Run mypy --follow-imports=silent --warn-unused-ignores --ignore-missing-imports --no-strict-optional --check-untyped-defs --disallow-incomplete-defs --disallow-untyped-defs --disallow-any-generics -p trie

  2. Eliminate every reported error by adding the right type hint

Because this library supports older versions of python, the type hints will not be able to use the modern python3.6 syntax.

Definition of done

This issue is done when the following criteria are met:

  1. mypy is run in CI

Add a new command to the flake8 environment in the tox.ini file that runs:

mypy --follow-imports=silent --warn-unused-ignores --ignore-missing-imports --no-strict-optional --check-untyped-defs --disallow-incomplete-defs --disallow-untyped-defs --disallow-any-generics -p py_ecc`
  1. Usage of type: ignore (silencing the type checker) is minimized and there's a reasonable explanation for its usage

Stretch goals

When this issue is done, stretch goals can be applied (and individually get funded) to tighten type support to qualify:

  1. mypy --strict --follow-imports=silent --ignore-missing-imports --no-strict-optional -p trie
@6ug
Copy link
Contributor

6ug commented Oct 19, 2018

I like to start working on this issue, please consider my application. @pipermerriam

6ug added a commit to 6ug/py-trie that referenced this issue Oct 19, 2018
* Fixes first part of ethereum#69
TODO: Add all missing type hints
@pipermerriam
Copy link
Member Author

@6ug go for it.

@Bhargavasomu
Copy link
Contributor

@pipermerriam I would like to work on this, as there has been no progress till now. One doubt that I had was, is the type of node essentially bytes? If yes, then there is an issue in the following snippet

def get_node_type(node: bytes) -> int:
    if node == BLANK_NODE:
        return NODE_TYPE_BLANK
    elif len(node) == 2:
        key, _ = node
        nibbles = decode_nibbles(key)
        if is_nibbles_terminated(nibbles):
            return NODE_TYPE_LEAF
        else:
            return NODE_TYPE_EXTENSION
    elif len(node) == 17:
        return NODE_TYPE_BRANCH
    else:
        raise InvalidNode("Unable to determine node type")

Here we have key, _ = node. Then the type of key would become int which is not right for decode_nibbles, as this requires the argument to be explicitly bytes type.

@pipermerriam
Copy link
Member Author

pipermerriam commented Nov 12, 2018

No, I believe that node can be any of the following (you should confirm as it's been a while since I was in this codebase)

  • bytes (an empty node)
  • Tuple[bytes, bytes] -> a leaf node or an extension node
  • Tuple[Optional[bytes], Optional[bytes], Optional[bytes], ...<total-of-17-of-these>] -> a branch node.

Probably good to define these once and re-use them across the codebase.

Also, you've got the 👍 from me to work on this.

@Bhargavasomu
Copy link
Contributor

@pipermerriam @cburgdorf I started type hinting the codebase, but it turned out ugly and I feel that we could refactor the code with the usage of more classes like LeafNode, ExtensionNode, BranchNode. Also I was thinking that since the trie is used to represent the global state of the blockchain (as per my understanding, please correct me if I am wrong, still a noob :P), why don't we add benchmarks?

I would like to refactor the code first as part of another issue, and till then maybe this issue can be blocked. I would need help from you during the refactoring phase. Am I good to go for refactoring?

@Bhargavasomu
Copy link
Contributor

And one more doubt I had is the following

class Trie(HexaryTrie):
    def __init__(self, *args: Any, **kwargs: Any) -> None:
        warnings.warn(DeprecationWarning(
            "The `trie.Trie` class has been renamed to `trie.HexaryTrie`. "
            "Please update your code as the `trie.Trie` class will be removed in "
            "a subsequent release"
        ))
        super().__init__(*args, **kwargs)

Any updates or instructions based on the above code? Like should there be only HexaryTrie in the code?

@pipermerriam
Copy link
Member Author

@Bhargavasomu see this guide for a "better" way to link to existing code:

https://help.github.com/articles/creating-a-permanent-link-to-a-code-snippet/

@pipermerriam
Copy link
Member Author

The warning code you linked above is us maintaining backwards compatibility back a few months ago when it looked like we'd have a separate binary trie implementation in the codebase.

We can probably remove that support, though it will require us to do a major version bump (something you don't have to worry about). However, I don't see us needing to remove it other than for cleanup reasons so I'd suggest not getting sidetracked with that unless you have a specific reason for doing it.

@pipermerriam
Copy link
Member Author

@Bhargavasomu I think it could be an improvement to have the different node types have concrete classes. The devil is going to be in the details/implementation. I'd suggest looking into use of named tuples for this. In general, I support this concept and in general, cleanup of this codebase which hasn't been touched much in the last 1.5 years.

I'm hesitant to commit a bounty to this at this stage since it isn't clear yet whether the effort will produce the desired improvement. But, If this does work out well, I will work to get you compensated in some way (as I know you're a bit of a bounty hunter) and I like seeing people paid for good work.

@pipermerriam
Copy link
Member Author

@Bhargavasomu if you choose to undertake this task, can you open a new issue that provides a high level outline for your plan and we can track progress there.

@Bhargavasomu
Copy link
Contributor

@pipermerriam will come up with a good detailed implementation detail and open an issue in a couple of days.

@Bhargavasomu
Copy link
Contributor

@pipermerriam is there any documentation available apart from README, which would help me progress better with the codebase?

@pipermerriam
Copy link
Member Author

Not really. 😢

This codebase was written by 1) referencing the yellow paper which is very cryptic and not easy to read and 2) using the implementation that was already done in the pyethereum codebase which is also not going to be any easier to read.

@Bhargavasomu
Copy link
Contributor

@pipermerriam will go through the yellow paper.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants