Slow typechecking on nested TypedDict with union members #17231

julienp · 2024-05-10T10:08:32Z

Bug Report

For Pulumi, we are looking into generating types using TypedDict to model cloud APIs. For example, for Kubernetes we have a type representing a Deployment.

class DeploymentArgsDict(TypedDict):
    api_version: NotRequired[Input[str]]
    kind: NotRequired[Input[str]]
    metadata: NotRequired[Input['ObjectMetaArgsDict']]
    ...

Pulumi has a notion of inputs and outputs, and the Input type used in the above example looks like this:

from typing import Awaitable, Generic, TypeVar, Union

T = TypeVar("T")

class Output(Generic[T]):
    pass

Input = Union[T, Awaitable[T], Output[T]]

Output does a lot of things, but for the purposes of this repro all that matters is that it's a generic type.

The K8S types can nest pretty deeply, and I suspect a combination of having nested literals along with the Union via the Input type is causing slowness here.

Example:

d: DeploymentArgsDict = {
    "metadata": {
        "name": "nginx",
    },
    "spec": {
        "selector":{
            "match_labels": {}
        },
        "replicas": 1,
        "template": {
            "metadata": {
                "labels": {}
            },
            "spec": {
                "containers": [{
                    "name": "nginx",
                    "image": "nginx"
                }]
            }
        }
    }
}

If I drop Awaitable[T] from the union to reduce it to two members, typechecking completes in 2 seconds. With it present, it takes 40 seconds.

This is a simplified example, and the actual code has another union layered on top. In that case we run out of memory.
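A minimal, self-contained sketch of the shape described above may help when experimenting. It is not the actual generated Pulumi code: the nested type names (`ObjectMetaArgsDict`, `LabelSelectorArgsDict`, `DeploymentSpecArgsDict`) and their trimmed field sets are hypothetical, and `total=False` stands in for `NotRequired` so the file also runs on older Python versions.

```python
from typing import Awaitable, Dict, Generic, TypedDict, TypeVar, Union, get_args

T = TypeVar("T")


class Output(Generic[T]):
    """Stand-in for Pulumi's Output; only its genericity matters here."""


# Three-member union from the report. Removing Awaitable[T] gives the
# two-member variant that was reported to typecheck quickly.
Input = Union[T, Awaitable[T], Output[T]]


# Hypothetical, trimmed-down field sets (assumptions, not the real K8S types).
class ObjectMetaArgsDict(TypedDict, total=False):
    name: Input[str]


class LabelSelectorArgsDict(TypedDict, total=False):
    match_labels: Input[Dict[str, Input[str]]]


class DeploymentSpecArgsDict(TypedDict, total=False):
    replicas: Input[int]
    selector: Input[LabelSelectorArgsDict]


class DeploymentArgsDict(TypedDict, total=False):
    api_version: Input[str]
    kind: Input[str]
    metadata: Input[ObjectMetaArgsDict]
    spec: Input[DeploymentSpecArgsDict]


# Nested literal: every level has to be matched against a union of candidates.
d: DeploymentArgsDict = {
    "metadata": {"name": "nginx"},
    "spec": {"selector": {"match_labels": {}}, "replicas": 1},
}

# Each Input[...] annotation expands to one candidate per union member.
union_members = get_args(Input[str])
```

Running mypy on this file with and without `Awaitable[T]` in the union is one way to compare the two cases. This trimmed version may not show the full slowdown, since the types in the linked repro nest much deeper.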

To Reproduce

I have created a repro here https://github.com/julienp/typeddict-performance

Expected Behavior

It takes a second or two to typecheck.

Actual Behavior

It takes ~40 seconds on my machine.

Your Environment

Mypy version used: 1.10
Mypy command-line flags: none
Mypy configuration options from mypy.ini (and other config files): none
Python version used: 3.12.2

julienp · 2024-05-13T08:56:32Z

Ran some more tests with a larger set of types, and it looks like the issue might be memory related. I am seeing Python max out the memory on my system, causing heavy swapping, while the process sits at 100% CPU, probably GCing constantly.

julienp added the bug mypy got something wrong label May 10, 2024

JelleZijlstra added performance topic-typed-dict labels May 10, 2024
