Dependent types: is it possible to have array sizes and dimensions declared in the types? #308

rcalsaverini · 2020-02-07T20:57:02Z

Is it possible to have array sizes in the types?

Something like:

layer: Model[Array2d[754], Array2d[10]] = ...

and have that type check the definitions of the models and composition of layers to guarantee everything fits?

For example, making this an explicit type error:

layer1: Model[Array2d[10], Array2d[10]] = ...
layer2: Model[Array2d[15], Array2d[10]] = ...

out = chain(layer1, layer2)     # this should be a type error in mypy

If thinc doesn't support this today, is it something that is even implementable in Python's type annotation system?

honnibal · 2020-02-14T06:49:13Z

Hey sorry for the delay replying to this (we've been travelling for PyCon Colombia). It's a good question and one we thought quite a lot about.

We'd love to have something like that working well, but we decided the complexity it would introduce with extra TypeVar, cast, etc. would be too much. A lot of user code would need to use type variables to type-check correctly, which seems undesirable. There's likely to be improved support for literals in future, so we thought we'd start with the simpler scheme, especially while most of the Python community is still getting used to the types (we are ourselves, too).
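To make the trade-off concrete, here is a rough sketch of how sizes could be carried in the types using Literal and type variables. The Array2d, Model and chain definitions below are simplified stand-ins written for this example, not thinc's actual classes, and the sketch assumes Python 3.8+ for typing.Literal. Nothing is enforced at runtime; the mismatch on the last line is only something a static checker such as mypy would be expected to flag.

```python
from typing import Generic, Literal, TypeVar

# Hypothetical names for illustration only; none of this is thinc's real API.
SizeT = TypeVar("SizeT")
InT = TypeVar("InT")
MidT = TypeVar("MidT")
OutT = TypeVar("OutT")


class Array2d(Generic[SizeT]):
    """A 2d array tagged, for the type checker only, with its width."""


class Model(Generic[InT, OutT]):
    def __init__(self, name: str) -> None:
        self.name = name


def chain(first: Model[InT, MidT], second: Model[MidT, OutT]) -> Model[InT, OutT]:
    # The shared MidT ties the output type of `first` to the input type of `second`.
    return Model(f"{first.name}>>{second.name}")


layer1: Model[Array2d[Literal[10]], Array2d[Literal[10]]] = Model("layer1")
layer2: Model[Array2d[Literal[15]], Array2d[Literal[10]]] = Model("layer2")
layer3: Model[Array2d[Literal[10]], Array2d[Literal[15]]] = Model("layer3")

ok = chain(layer3, layer2)   # widths line up: 15 out, 15 in
bad = chain(layer1, layer2)  # 10 out, 15 in: runs fine, but a checker should object
```

Every size has to be spelled Literal[10] rather than 10, and chain needs three type variables just to say that the shapes line up, which is the kind of overhead described above.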

Exploring the sizes in the arrays would be a nice side-project if someone wants to take it on. My suggestion would be to have the sizes specified in reverse order. So an array Floats2d[128, 32] would be an array of shape (32, 128). The reason is that the last dimension is the most likely to be important, so you can write Floats2d[128] and have that mean Floats2d[128, Literal[int]]. Otherwise you'll more often be writing Floats2d[Literal[int], 128].
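As a small illustration of the reversed ordering, the helper below maps sizes written last-dimension-first onto a conventional shape tuple. The function name and the use of None for "any size" are assumptions made for this sketch; it only shows the proposed convention and is not an existing thinc utility.

```python
from typing import List, Optional, Tuple


def shape_from_reversed_sizes(
    sizes: Tuple[int, ...], ndim: int
) -> Tuple[Optional[int], ...]:
    """Turn sizes given last-dimension-first into a shape tuple.

    Dimensions that were left out are reported as None (meaning "any size").
    """
    if len(sizes) > ndim:
        raise ValueError(f"Got {len(sizes)} sizes for a {ndim}d array")
    # Pad the missing leading dimensions, then flip into the usual order.
    padded: List[Optional[int]] = list(sizes)
    padded.extend([None] * (ndim - len(sizes)))
    return tuple(reversed(padded))


# Floats2d[128, 32] would describe shape (32, 128)
full = shape_from_reversed_sizes((128, 32), ndim=2)
# Floats2d[128] would describe shape (<any>, 128)
partial = shape_from_reversed_sizes((128,), ndim=2)
```

With this ordering the common case, fixing only the last dimension, needs a single argument.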

One option for the library would be to add slots for the sizes to the array types, without using them in layer signatures. This might allow us to avoid introducing complexity for users, while giving us the option of using the sizes ourselves or allowing users to use them. I'm not sure.

There are other ways we could have designed the type system. Possibly we'd be better off having a type variable for the floats vs ints distinction. Currently we have a problem where it's hard to say "If it takes a floats array, it returns a floats array (but of different dimension)". You have to use an overload for that, not a type variable. If the types also specified the sizes, this situation would arise even more often. So maybe we want Array2d['f', 128] instead. I do think it's better to write Array2d instead of Array['f', Literal[int], Literal[int]] though.
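A rough sketch of the difference, using made-up minimal classes rather than thinc's real array types: with one class per dtype and dimension, a function that changes the dimension needs one overload per dtype, whereas with the dtype as a type parameter a single signature covers both. The dtype is modelled here as a plain TypeVar, which is only one possible reading of the Array2d['f', 128] idea.

```python
from typing import Generic, List, TypeVar, Union, overload


# Scheme 1 (hypothetical stand-ins): one class per dtype and dimension.
class Floats1d:
    def __init__(self, data: List[float]) -> None:
        self.data = data


class Floats2d:
    def __init__(self, data: List[List[float]]) -> None:
        self.data = data


class Ints1d:
    def __init__(self, data: List[int]) -> None:
        self.data = data


class Ints2d:
    def __init__(self, data: List[List[int]]) -> None:
        self.data = data


# "Floats in, floats out (but a different dimension)" needs an overload per dtype.
@overload
def flatten(arr: Floats2d) -> Floats1d: ...
@overload
def flatten(arr: Ints2d) -> Ints1d: ...
def flatten(arr: Union[Floats2d, Ints2d]) -> Union[Floats1d, Ints1d]:
    if isinstance(arr, Floats2d):
        return Floats1d([x for row in arr.data for x in row])
    return Ints1d([x for row in arr.data for x in row])


# Scheme 2 (also hypothetical): the dtype is a type parameter.
DT = TypeVar("DT")


class Array1d(Generic[DT]):
    def __init__(self, data: List[DT]) -> None:
        self.data = data


class Array2d(Generic[DT]):
    def __init__(self, data: List[List[DT]]) -> None:
        self.data = data


def flatten_generic(arr: Array2d[DT]) -> Array1d[DT]:
    # One signature: whatever dtype comes in is the dtype that goes out.
    return Array1d([x for row in arr.data for x in row])


floats_out = flatten(Floats2d([[1.0, 2.0], [3.0]]))
ints_out = flatten(Ints2d([[1], [2]]))
generic_out = flatten_generic(Array2d([[1, 2], [3]]))
```

Adding sizes to scheme 1 would multiply the overloads further, which is the concern raised above.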

svlandeg added the "feat / types" label (Type hints and type checking) on Feb 10, 2020
