Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add types to engine #1

Closed
wants to merge 9 commits into from
Closed

Conversation

qthequartermasterman
Copy link
Owner

@qthequartermasterman qthequartermasterman commented May 1, 2024

In an attempt to help with HypothesisWorks#3074, I have added type annotations to hypothesis/internal/conjecture/engine.py. I have checked these type hints with both mypy and pyright, and verified a few of the trickier-to-statically-determine types by checking their runtime types in the tests.

There are two instances that I'm still confused about.

  1. Can Overrun be a valid value for self.interesting_examples?

In all places but this one sample (lines 389 in this PR), all of the keys for self.interesting_examples are ConjectureResult. However, this assignment may possible be Overrun. Indeed, in one test, test_backend_can_shrink_bytes, this does occur. This seems (to me) to be the only place in the tests where this Overrun assignment occurs. If Overrun is a valid value, then this violates the typing in several other places where self.interesting_examples is accessed, but nothing seemed to ever trip up the asserts I had placed in those places (and since removed). If Overrun is indeed valid, then I will need to update the hints for self.interesting_examples, and I'd like to better understand the guarantees that ensure Overrun is never a value in other places (so the type narrowing asserts are legal). Many, but not all, of those references are in shrink_interesting_examples for example.

            if changed:
                self.save_buffer(data.buffer)
                self.interesting_examples[key] = data.as_result()  # type: ignore
                self.__data_cache.pin(data.buffer)
                self.shrunk_examples.discard(key)
  1. Potentially uninitialized variable trial_data because of a caught PreviouslyUnseenBehavior exception

In line 842, trial_data is defined inside of a try-except block. In the happy path, this variable is always defined. On the otherhand, unless I am misunderstanding something, if a PreviouslyUnseenBehavior is thrown, that variable will not be defined, meaning trial_data.observer.killed and a few other later instances will reference an uninitialized variable. This never happens in the tests, and I've certainly never experienced it while using hypothesis.

Am I misunderstanding the flow of the code here? Is this try-except superfluous? Or is there a genuine error in how the exception is handled?

                try:
                    trial_data = self.new_conjecture_data(
                        prefix=prefix, max_length=max_length
                    )
                    self.tree.simulate_test_function(trial_data)
                    continue
                except PreviouslyUnseenBehaviour:
                    pass

                # If the simulation entered part of the tree that has been killed,
                # we don't want to run this.
                assert isinstance(trial_data.observer, TreeRecordingObserver)
                if trial_data.observer.killed:
                    continue

                ...

On top of those two confusions I have, mypy also complains about some string literals as keys to a TypedDict (line 247), because it does not consider the concatenation of two literals to be a literal. See the code snippet below. Personally, I feel like this is an error in mypy--pyright does not have this complaint. If we would like mypy to be fully happy, I can either add a # type: ignore, or refactor the code to only use true literals. If we don't care that mypy complains, then no change is needed.

    def _log_phase_statistics(self, phase: Literal["reuse", "generate", "shrink"]) -> Generator[None, None, None]:
        ...
        finally:
            self.statistics[phase + "-phase"] = {  # The key here is the concatenation of two string literals.
                "duration-seconds": time.perf_counter() - start_time,
                "test-cases": list(self.stats_per_test_case),
                "distinct-failures": len(self.interesting_examples),
                "shrinks-successful": self.shrinks,
            }

Doubtless, I've missed or misunderstood some pieces. I am happy to revise and iterate!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
1 participant