Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dependent types: is it possible to have array sizes and dimensions declared in the types? #308

Open
rcalsaverini opened this issue Feb 7, 2020 · 1 comment
Labels
feat / types Type hints and type checking

Comments

@rcalsaverini
Copy link

rcalsaverini commented Feb 7, 2020

Is it possible to have array sizes in the types?

Something like:

layer: Model[Array2d[754], Array2d[10]] = ...

and have that type check the definitions of the models and composition of layers to guarantee everything fits?

For example, making this an explicit type error:

layer1: Model[Array2d[10], Array2d[10]] = ...
layer2: Model[Array2d[15], Array2d[10]] = ...

out = chain(layer1, layed2)     # this should be a type error in mypy

If thinc doesn't support this today, is it something that is even implementable in Python's type annotation system?

@svlandeg svlandeg added the feat / types Type hints and type checking label Feb 10, 2020
@honnibal
Copy link
Member

honnibal commented Feb 14, 2020

Hey sorry for the delay replying to this (we've been travelling for PyCon Colombia). It's a good question and one we thought quite a lot about.

We'd love to have something like that working well, but we decided the complexity it would introduce with extraTypeVar, cast, etc would be too much. A lot of user code would need to use type variables to compile correctly, which seems undesirable. There's likely to be improved support for literals in future, so we thought we'd start with the simpler scheme, especially while most of the Python community is still getting used to the types (we are ourselves, too).

Exploring the sizes in the arrays would be a nice side-project if someone wants to take it on. My suggestion would be to have the sizes specified in reverse order. So an array Floats2d[128, 32] would be an array of shape (32, 128). The reason is that the last dimension is the most likely to be important, so you can write Floats2d[128] and have that mean Floats2d[128, Literal[int]]. Otherwise you'll more often be writing Floats2d[Literal[int], 128].

One option for the library would be to add spaces for the types, without using them in layer signatures. This might allow us to avoid introducing complexity for users, while giving us the option of using the types or allowing users to use them. I'm not sure.

There are other ways we could have done the type-system. Possibly we'd be better off having a type-variable for the floats vs ints distinction. Currently we have a problem where it's hard to say "If it takes a floats array, it returns a floats array (but of different dimension)". You have to use an overload for that, not a type variable. If the types also specified the sizes, this situation would arise even more. So maybe we want Array2d['f', 128] instead. I do think it's better to write Array2d instead of Array['f', Literal[int], Literal[int]] though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feat / types Type hints and type checking
Projects
None yet
Development

No branches or pull requests

3 participants