Why does `1 + "foo"` type check? #52

LuKuangChen · 2021-10-14T20:32:47Z

6862bbd
2021-10-14

What program did you run?

def f():
    return 1 + "foo"

f()

What happened?

The program passes type checks.

What should have happened?

We expected a compile-time error complaining that we can't add up int and str.

BTW, the type of int's __add__ method is printed in an interesting way, what does that type mean?

reveal_type((42).__add__)
# compiler.static.errors.TypedSyntaxError:
# reveal_type(int(42).__add__):
# 'int.__add__'

The text was updated successfully, but these errors were encountered:

carljm · 2021-10-17T14:11:55Z

Yes, thanks for the report, this is another consequence of the fact that our priority has been on acquiring sound type information to optimize bytecode, not as much on ensuring that we catch all possible runtime type errors. This is sound because we consider the type of the add expression to be dynamic, but ideally we would error here.

The odd string representation of the type of int.__add__ is just basicaly a placeholder; because we haven't invested yet in support for higher-order functions as type annotations (i.e. Callable type), we also haven't yet built formatting of our internal callable type to a Callable string representation. (And Callable syntax can't represent all the information we carry about a callable signature, so like Pyre we might end up inventing an "extended Callable" notation to use in this output.)

LuKuangChen · 2021-10-17T22:00:07Z

Thanks for the explanation. To avoid unnecessary engineering effort, we want to process e1 + e2 as e1.__add__(e2). Currently these two forms are not equivalent in SP because the former doesn't specify the expected type of e2 while the latter does (See the example program below). If we do the pre-processing, would our model be consistent with a future Static Python?

from __future__ import annotations

class MyClass:
    n: int
    def __init__(self, n: int) -> None:
        self.n = n

    def __add__(self, c: MyClass) -> MyClass:
        return MyClass(self.n + c.n)

def f():
    # This doesn't error statically, but does at runtime.
    MyClass(1) + "foo"

    # This errors statically
    MyClass(1).__add__("foo")

#   File "add_is_dyn.py", line 16
#     MyClass(1).__add__("foo")
#                       ^
# compiler.static.errors.TypedSyntaxError: type mismatch:
# Exact[str] received for positional arg 'c',
# expected __main__.MyClass

LuKuangChen · 2021-10-17T22:00:12Z

BTW, does SP ever optimizes integer addition?

carljm · 2021-10-18T22:37:38Z

would our model be consistent with a future Static Python?

Yes, we intend to support the relationship between operators and their double-underscore methods in the future, but we don't currently.

does SP ever optimizes integer addition?

Not currently. We provide machine primitive integer types (__static__.int64 et al) which are final and thus easier to optimize, and so far we prefer to adopt these in any case where hot code involves integer arithmetic. It's certainly possible that we could optimize some arithmetic on Python integers in future, though the opportunities are mostly limited to cases where we have an exact type (e.g. from a constant), due to the possibility of int subclasses with arbitrary behavior in their __add__ etc. (Or we might be able to speculatively optimize and fall back if we detect such customization at runtime.)

bennn · 2021-10-21T14:55:27Z

How does SP guarantee that int64 etc. are final classes?

My first guess was that SP would check class definitions at runtime, to rule out things like class D(int64) in untyped code. But that definition seems to work --- even in typed code! I didn't get an error until trying an assignment x:int64 = D(1).

My second guess is that int64 isn't really a final class, but instead the type gets enforced with an "exact type" cast to prevent subclasses from breaking the optimizer. Is that right?

carljm · 2021-10-21T19:17:38Z

So the typed/untyped code thing is a bit subtle. The meaning of __static__.int64 is not the same in typed vs untyped code; in untyped code it is already just a subclass of regular Python int. (The reason for this is that we want static code to be runnable without Static Python compiler too; in that case int64 becomes just an alias for int.) So it's fine with us if you subclass it in untyped code; you're creating yet another subclass of int. If you pass this to a Static Python function taking int, that's fine, we know it might be a subclass and we don't optimize int anyway. If you pass it to a Static Python function taking int64, we'll implicitly unbox it at the call boundary and within Static Python it'll be an unboxed machine int; it will no longer be an instance of your int subclass at all. (I guess this is similar to the "exact cast" you mentioned, but it's a bit different, since it's not just an exact type cast, it's an implicit unbox from Python int to int64.) So I don't think we need to prevent subclassing __static__.int64 in untyped code (although we could, and maybe for simplicity we should.)

The fact that you can subclass int64 in typed code is an oversight that we should fix, although I don't think it can result in soundness problems, because as you discovered you won't be able to assign an instance of that type to anywhere that we treat as int64. In fact in the latest version of Static Python even D(1) fails with a compiler assertion error. But it should fail with a clear type error instead on class D(int64): ....

bennn · 2021-10-22T02:59:16Z

Ok, SP protects all its int64 annotations with casts & unboxes.

For modeling: what other classes have different meaning in typed & untyped code? (Or, where can we find a list?). I expect all the CTypes are like this. Are there any others?

For github: may I open a new issue for class D(int64) ?

carljm · 2021-10-22T04:55:55Z

what other classes have different meaning in typed & untyped code?

Only the CTypes: so the various sizes of int and uint, double, cbool... soon I think we'll be adding __static__.Enum to that list, which will be a normal Python Enum type (with integer values) in non-static code, but in static code will be treated as an unboxed machine int, to optimize away various costs of the Python-object Enum implementation. But aside from primitive types, everything else is the same PyObject* inside or outside of Static Python.

may I open a new issue for class D(int64) ?

Of course, that would be great, thank you!

carljm added sp-ux static python user experience issues staticpython static python issues labels Oct 17, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Why does `1 + "foo"` type check? #52

Why does `1 + "foo"` type check? #52

LuKuangChen commented Oct 14, 2021

carljm commented Oct 17, 2021 •

edited

LuKuangChen commented Oct 17, 2021

LuKuangChen commented Oct 17, 2021

carljm commented Oct 18, 2021

bennn commented Oct 21, 2021

carljm commented Oct 21, 2021

bennn commented Oct 22, 2021

carljm commented Oct 22, 2021

Why does 1 + "foo" type check? #52

Why does 1 + "foo" type check? #52

Comments

LuKuangChen commented Oct 14, 2021

What program did you run?

What happened?

What should have happened?

carljm commented Oct 17, 2021 • edited

LuKuangChen commented Oct 17, 2021

LuKuangChen commented Oct 17, 2021

carljm commented Oct 18, 2021

bennn commented Oct 21, 2021

carljm commented Oct 21, 2021

bennn commented Oct 22, 2021

carljm commented Oct 22, 2021

Why does `1 + "foo"` type check? #52

Why does `1 + "foo"` type check? #52

carljm commented Oct 17, 2021 •

edited