Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

too soon for py.typed? #486

Open
dimbleby opened this issue May 13, 2023 · 23 comments
Open

too soon for py.typed? #486

dimbleby opened this issue May 13, 2023 · 23 comments
Labels
typing Related to type hints

Comments

@dimbleby
Copy link

dimbleby commented May 13, 2023

I'm all in favour of typechecking in python and turn it on almost always. So in principle I love the inclusion of py.typed in the latest release of this library.

However, it seems that the type annotations that are being exported are somewhat incomplete / incorrect. I'm getting quite a lot of warnings that I think I simply can't do anything about. (Well I guess I can cast(), but it gets tedious pretty quickly.)

The main example that I hit boils down to this:

import galois

foo: galois.GF2 = galois.GF2([1])
bar: galois.GF2 = galois.GF2([0])
baz: galois.GF2 = foo + bar

where, even having given the typechecker as much help as I can there:

foo.py:5: error: Incompatible types in assignment (expression has type "ndarray[Any, dtype[bool_]]", variable has type "GF2")  [assignment]

because there is no suitable overload on FieldArray.__add__


Trying to typecheck this project itself, I get

$ mypy --strict src
...
Found 1528 errors in 38 files (checked 47 source files)

Certainly it's not a requirement to fix all of those before publishing py.typed - users of the library could not care less whether your internals typecheck, the API is all that matters. However this is perhaps a hint that the annotations aren't in such good shape just yet.


Conscious of showing up only with complaints: I'd like to add that I've really enjoyed using this library! While I've only been using it for some unimportant puzzle-solving, it has saved me a lot of trouble by implementing absolutely loads of useful stuff that I really wouldn't have wanted to struggle over myself. Thank you.

@mhostetter mhostetter added the typing Related to type hints label May 13, 2023
@mhostetter
Copy link
Owner

Thanks for the report. I usually release code into the void and never hear anything back. So it's nice to know someone discovered something after a release, even if it's a bug. Oops.

Type checking with mypy internally has been a pain in the ass. Partly because of subclassing np.ndarray and also using a metaclass. Perhaps I'm not doing something correctly.

The type error you found seems to be caused by __add__ on ndarray not supporting subclass types. If I add this to my FieldArray class, then mypy is satisfied externally. I shouldn't have to add these useless methods everywhere, though. Hmm.

    def __add__(self, other: Self) -> Self:
        return super().__add__(other)

It still complains inside that method with the same error: Expression of type "NDArray[bool_]" cannot be assigned to return type "Self@FieldArray" "NDArray[bool_]" is incompatible with "FieldArray". So I'll have to silence that.

Can users silence mypy checking on galois so that it worked as it did before py.typed? I'd rather not revert the addition of py.typed.

I will create a branch working on resolving some of these type issues. Perhaps you can then test it with some of your code and see if it complains relentlessly still?

@mhostetter
Copy link
Owner

Interestingly, when I run mypy --strict on the source code I get 189 errors, not 1528. Do you have any idea why? Below are my outputs.

PS C:\Users\matth\repos\galois> python3 -m mypy --strict src/       
src\galois\_helper.py:11: error: Function is missing a type annotation  [no-untyped-def]
src\galois\_helper.py:20: error: Incompatible types in assignment (expression has type "FrameInfo", variable has type "Optional[FrameType]")  [assignment]
src\galois\_helper.py:21: error: Value of type "Optional[List[str]]" is not indexable  [index]
...
src\galois\_lfsr.py:9: error: Skipping analyzing "numba": module is installed, but missing library stubs or py.typed marker  [import]
src\galois\_codes\_bch.py:8: error: Skipping analyzing "numba": module is installed, but missing library stubs or py.typed marker  [import]
src\galois\_codes\_bch.py:8: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
Found 189 errors in 20 files (checked 49 source files)
PS C:\Users\matth\repos\galois> python3 -m mypy --version
mypy 1.3.0 (compiled: yes)

@dimbleby
Copy link
Author

dimbleby commented May 13, 2023

I also told mypy not to worry about numba being untyped - perhaps that allows it to keep going. Like this in pyproject.toml:

[[tool.mypy.overrides]]
module = [
  "numba.*",
]
ignore_missing_imports = true

I agree that it's sad if ndarray subclasses are required to provide useless methods. I found some relevant issues on the numpy tracker: numpy/numpy#20072 seems like a starting point, it links directly or indirectly to several others.

I don't know of a way to be selective about disabling this sort of warning except by doing it on each individual instance.

I have rather a small amount of code and it turned out that all the warnings that I was getting are in the end derived from this single issue. So additional testing with my code isn't going to be very valuable: either this one problem is sorted or it isn't!

For much the same reason, you probably shouldn't rush to do anything in particular on my account. My code is pretty much done, in the end it's of very little consequence to me whether the warnings go away or not. Would be very reasonable to wait and see if anyone else shows up with views, or with better ideas.

@mhostetter
Copy link
Owner

Alright. Thanks again for reporting. This is the kind of feedback I'm looking for.

I'm going to work on getting the source code to appease mypy. I previously considered this, see #410. I'll also add those useless methods so that most FieldArray arithmetic doesn't complain. Maybe running mypy on my unit tests will benchmark how well the public API type hints work.

@amirebrahimi
Copy link

amirebrahimi commented Sep 26, 2023

We're seeing this issue, too, with mypy v1.4.1. I think it is with np.vstack and np.hstack, but those are just two examples. It's odd because FieldArray inherits from np.ndarray.

@mhostetter
Copy link
Owner

Would you mind providing an example snippet of code and the mypy complaint?

@amirebrahimi
Copy link

Would you mind providing an example snippet of code and the mypy complaint?

import numpy as np
from galois import GF2

a: GF2 = GF2([0, 1, 0])
b: GF2 = GF2([1, 0, 1])

c: GF2 = np.vstack((a, b))
# mypy output referencing the line above:
# error: Incompatible types in assignment (expression has type "ndarray[Any, dtype[Any]]", variable has type "GF2")  [assignment]

# but this assertion holds:
assert isinstance(c, GF2)

@mhostetter
Copy link
Owner

Thanks, I remember seeing this when OP first raised this issue.

I'm not sure how to resolve that. NumPy controls the return type signature from np.vstack(). I'm open to ideas, but I'm not sure how to make mypy understand that GF2 is a subclass of np.ndarray.

@amirebrahimi
Copy link

I'm not quite sure either, but using .view(GF2) seems to correct most issues. I get that we cannot control the type signature from np functions, but what about intrinsic operations like matmul or addition since those are overridden on the class?

@mhostetter
Copy link
Owner

Instead of using .view(), you also use typing.cast() to appease the type checker. However, that is a major hassle.

from typing import cast

import galois

x = galois.GF2([1, 0, 1])
y = galois.GF2([1, 1, 0])
z = x + y
z = cast(galois.GF2, z)

I'm working on a branch right now to appease the type checking for arithmetic operations.

@mhostetter
Copy link
Owner

@amirebrahimi would you mind testing #510 and see if it silences most of your issues (wrt to arithmetic operations)? Let me know if I missed any dunder arithmetic methods. Thanks.

@dimbleby
Copy link
Author

is there a reason for not having annotated the type of any of the other arguments?

@mhostetter
Copy link
Owner

I suppose I can annotate them as other: Any.

@dimbleby
Copy link
Author

I'd assume that the valid operations were more restricted than that?

eg wouldn't it be nice if the typechecker warned about galois.GF2([1]) + "nonsense"

@mhostetter
Copy link
Owner

Yes, that would be better. I'll add that tomorrow.

@mhostetter
Copy link
Owner

mypy is doing very strange things... I annotated the arithmetic methods like this:

    def __add__(self, other: Self) -> Self:
        return super().__add__(other)

    # Multiply is different since integers are treated as scalar multiplication
    def __mul__(self, other: int | npt.NDArray | Self) -> Self:
        return super().__mul__(other)

Here is a small example:

import numpy as np
from typing_extensions import reveal_type

import galois

GF = galois.GF(7)
x = GF([1, 0, 1])
y = GF([1, 2, 3])
# y = np.array([1, 2, 3], dtype=float)

z1 = x + y
reveal_type(z1)
z2 = x - y
reveal_type(z2)
z3 = x * y
reveal_type(z3)
z4 = x / y
reveal_type(z4)
z5 = x // y
reveal_type(z5)
z6 = x @ y
reveal_type(z6)
z7 = x % y
reveal_type(z7)

When y = GF([1, 2, 3]), I get this output:

$ mypy test_mypy.py 
test_mypy.py:14: note: Revealed type is "galois._fields._array.FieldArray"
test_mypy.py:16: note: Revealed type is "galois._fields._array.FieldArray"
test_mypy.py:18: note: Revealed type is "galois._fields._array.FieldArray"
test_mypy.py:20: note: Revealed type is "galois._fields._array.FieldArray"
test_mypy.py:22: note: Revealed type is "galois._fields._array.FieldArray"
test_mypy.py:24: note: Revealed type is "galois._fields._array.FieldArray"
test_mypy.py:26: note: Revealed type is "galois._fields._array.FieldArray"

When y = np.array([1, 2, 3], dtype=float), I would expect type errors, Instead, I get this output:

$ mypy test_mypy.py 
test_mypy.py:12: note: Revealed type is "numpy.ndarray[Any, numpy.dtype[numpy.bool_]]"
test_mypy.py:14: note: Revealed type is "numpy.ndarray[Any, numpy.dtype[Any]]"
test_mypy.py:16: note: Revealed type is "galois._fields._array.FieldArray"
test_mypy.py:18: note: Revealed type is "numpy.ndarray[Any, numpy.dtype[numpy.floating[numpy._typing._64Bit]]]"
test_mypy.py:20: note: Revealed type is "numpy.ndarray[Any, numpy.dtype[numpy.signedinteger[numpy._typing._8Bit]]]"
test_mypy.py:22: note: Revealed type is "numpy.ndarray[Any, numpy.dtype[numpy.bool_]]"
test_mypy.py:24: note: Revealed type is "numpy.ndarray[Any, numpy.dtype[numpy.signedinteger[numpy._typing._8Bit]]]"

For every case (except for multiplication) a regular ndarray is not a valid other type. Instead of failing this case, mypy seemds to search for the type annotation from the ndarray parent class. I don't know how to resolve that.

Note, if I annotate as other: Any, then my annotation is always used and FieldArray is always the output type.

Any thoughts or ideas would be appreciated.

@amirebrahimi
Copy link

@amirebrahimi would you mind testing #510 and see if it silences most of your issues (wrt to arithmetic operations)? Let me know if I missed any dunder arithmetic methods. Thanks.

Tried this branch locally with our project and it seems to handle the arithmetic operations mypy issues we were seeing.

@amirebrahimi
Copy link

For every case (except for multiplication) a regular ndarray is not a valid other type. Instead of failing this case, mypy seemds to search for the type annotation from the ndarray parent class. I don't know how to resolve that.

I'd be curious to see what you get if you print out type(zn) instead of reveal_type. I wonder if it is what we had in our example code where the object returned is still a FieldArray, but mypy thinks otherwise.

Note, if I annotate as other: Any, then my annotation is always used and FieldArray is always the output type.

Any thoughts or ideas would be appreciated.

One thought is to leave it as Any and then check in the method for an instance of a GF and raise a TypeError otherwise.

@mhostetter
Copy link
Owner

check in the method for an instance of a GF and raise a TypeError otherwise.

The code currently does this. Running the example with y = np.array([1, 2, 3], dtype=float) yields this error: TypeError: Operation 'add' requires both operands to be GF(7) arrays, not [GF([1, 0, 1], order=7), array([1., 2., 3.])].

I think I may have to leave the annotation as other: Any unless I can find a way to avoid the incorrect type assignment mypy came up with.

@dimbleby
Copy link
Author

dimbleby commented Oct 3, 2023

interesting. I have a theory but it is untested so likely wrong

I speculate that what mypy is doing is first trying __add__ on the galois types: and when it doesn't find a match it tries __radd__ on the numpy types. That allows it to find the unexpected solutions.

If that's right then

  • on the plus side, these annotations do successfully rule out some things eg fieldArray + string
  • on the minus side, it's confusing!

Continuing down this speculative path, I wonder if galois can add an overload for the other: Any case and use that to indicate some sort of error. (The idea being that we want __add__ always to match, avoiding the __radd__ fallback). But I'm not sure how to indicate an inference error in that overload. Maybe returning NoReturn?!? Still kinda confusing...

@mhostetter
Copy link
Owner

@dimbleby your suspicion was exactly correct. mypy was falling back to ndarray.__radd__().

The solution I'm targeting is this:

    def __add__(self, other: npt.NDArray) -> Self:
        return super().__add__(other)

It checks for crazy other types. For FieldArray plus any NumPy array, the return type will be FieldArray, which avoid the erroneous return type of NDArray. If the NumPy array is of the wrong type or wrong Galois field, a runtime error will be thrown.

Let me know if either of you have strong feelings. Otherwise, I'll merge #510 soon.

@dimbleby
Copy link
Author

sounds like a sensible solution to me

@mhostetter
Copy link
Owner

Improved array arithmetic type annotations released in v0.3.7.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
typing Related to type hints
Projects
None yet
Development

No branches or pull requests

3 participants