Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Union[str, int] is treated as dynamic #49

Open
LuKuangChen opened this issue Oct 3, 2021 · 6 comments
Open

Union[str, int] is treated as dynamic #49

LuKuangChen opened this issue Oct 3, 2021 · 6 comments
Labels
sp-correctness static python correctness staticpython static python issues

Comments

@LuKuangChen
Copy link

What version of Static Python are you using?

6862bbd
2021-10-03

What program did you run?

from typing import Union

x: Union[int, str] = 1
reveal_type(x)

What happened?

The type of x is dynamic

compiler.static.errors.TypedSyntaxError:
reveal_type(x): 'dynamic',
'x' has declared type 'dynamic' and
local type 'Literal[1]'

What should have happened?

We expected the type of x to be Union[int, str] because union types are supported:

x: Any = 1
if True:
    x = "hello"
else:
    x = 42
reveal_type(x)

# compiler.static.errors.TypedSyntaxError:
# reveal_type(x): 'dynamic',
# 'x' has declared type 'dynamic' and
# local type 'Union[str, int]'
@LuKuangChen
Copy link
Author

We also found that Static Python understands Union[str, None] and Union[None, int], but not Union[int, str].

This bug suggests that some tests pass for a wrong reason. For example, the following test from the static python test suit is asserting that a union is a subtype of a broader union. But what really happened is that "the broader union" is read as dynamic. Other target unions work too (Union[str, B]) even though they shouldn't!

    def test_union_can_assign_to_broader_union(self):
        self.assertReturns(
            """
            from typing import Union
            class B:
                pass
            def f(x: int, y: str) -> Union[int, str, B]:
                return x or y
            """,
            "Union[int, str]",
        )

@carljm
Copy link
Contributor

carljm commented Oct 4, 2021

Hi! So the widening of Union annotations to dynamic is necessary for soundness at the moment because of the limitations of our runtime type checks. We special-case support for runtime checking optional types as a degenerate case of Union, but due to implementation complexity and concerns about pathological performance with large Unions, we don't support runtime checks of arbitrary Unions. This means we can't insert a cast to an arbitrary Union type, which means we can't support it as an annotated argument or variable type, even though the compiler's type system can represent arbitrary unions.

It turns out there is still value in the compiler support, because if we infer a union from e.g. x = y or z (where y: str and z: int), this is a type that we can safely trust; it doesn't require any runtime casts to enforce. So we can later narrow based on e.g. if isinstance(z, str): ... else: ... in both the if and else branches, and this is sound.

Thanks for pointing out the bad tests! Originally we supported union annotations until we realized the soundness hole, so these tests predate the widening of union annotations to dynamic.

The path forward here is of course to add runtime support for casts to union, though we remain concerned about the performance with large union types. We might need to keep some arbitrary limit on union size, and above that limit widen the union to dynamic. Not sure what the best UX is for this; we could I suppose make it a type error to specify such a large union as an annotation? This might be better than silently widening to dynamic? Again this is a case where our desire for Pyre/mypy compatibility would push us away from the explicit approach and towards the silent widening.

@carljm carljm added the sp-correctness static python correctness label Oct 6, 2021
facebook-github-bot pushed a commit that referenced this issue Oct 7, 2021
Summary:
As noted in #49 some of these union
tests became ineffective when we closed the union-annotations soundness hole by
widening all non-optional union annotations to dynamic. The tests are passing but for
the wrong reason; instead of testing union `can_assign_from`, they are effectively now
testing that anything can be assigned to dynamic.

Remove the bad tests and replace them with a single unit-test style test of
`UnionType.can_assign_from` which covers the cases which (for now) aren't testable
via real code.

Reviewed By: DinoV

Differential Revision: D31451895

fbshipit-source-id: f888f1b
@carljm
Copy link
Contributor

carljm commented Oct 7, 2021

The bad tests are addressed in 439b330

@bennn
Copy link
Contributor

bennn commented May 25, 2023

Unions should have static checks, right?

@vivaan2006 and I just ran a small program. We expected a typechecker error, but no:

def foo(i: float) -> None:
    return i

# python -m compiler --static main.py
# OK!

No error for int|str either.

@carljm
Copy link
Contributor

carljm commented May 25, 2023

@bennn This is explained in my comment above from Oct 4, 2021. It's not strictly a "static check" vs "runtime check" distinction. It's not enough to just say that we won't use union type information at runtime and when we emit a runtime check we will consider a union to be dynamic, because this can lead us to statically narrow a (not usable at runtime) union type to a simpler type that we do use at runtime, and then wrongly trust it. Consider:

def f(x: int|str) -> str:
    # we cannot verify the type of x is actually int|str at runtime when called from untyped code
    if not isinstance(x, int):
        # now our compiler believes x is type `str` and will trust this return to be safe
        return x
    return "foo"

Therefore we have to consider x to simply be dynamic in order to avoid unsoundness.

We could do better here if we had the ability in our compiler to track two parallel sets of types, "trusted" and "untrusted", and we used "untrusted" types for the purposes of emitting compile-time errors but never for runtime use. In that case we could record an "untrusted" type of int|str for x above, then narrow it to an "untrusted" type of str, then still emit a cast to str before the return. But we don't have that parallel-tracking facility in our compiler, and it's not a priority to add it, since effectively the "untrusted" types would just duplicate the static checks we already get from Pyre.

So the only time our compiler support for tracking unions actually comes into play is when a union type arises "organically" in a way where we can trust it, e.g. in a case like this:

def f(x: int, y: str) -> str:
    z = x or y  # z is trustable int|str
    if not isinstance(z, int):
        # now it is actually sound to consider z to be str
        return z
    return "foo"

So you'll see that you do get a static error from e.g. this variant of your example:

def f(x: int, y: str) -> None:
    z = x or y
    return z

Let me know if that doesn't make sense.

I won't claim this behavior is ideal when SP is viewed in isolation, but it makes a bit more sense when SP is considered in tandem with use of e.g. mypy or Pyre. In an ideal future system, I would want the parallel tracking of trusted and untrusted types in a single SP compiler (essentially rolling the functionality of pyre/mypy into the static compiler also.)

Also, more near term, we would ideally like to add support for arbitrary unions. The main question mark is the cost of verifying them at runtime, since that cost will be O(size of union) where our current casts are all O(1) cost (or technically, O(size of union * depth of inheritance tree) where today they are O(depth of inheritance tree)). But this may be acceptable, "avoid large unions in hot functions" would just become part of tuning guidance.

@bennn
Copy link
Contributor

bennn commented May 25, 2023

Ok, I was confused about the extent of compiler support for unions.

The examples help: there's support for those organic unions but everything else is dynamic.

The reveal_type(x) = dynamic from #42 is helpful too, in retrospect.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sp-correctness static python correctness staticpython static python issues
Projects
None yet
Development

No branches or pull requests

3 participants