Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add static type annotations #11047

Open
sam-goodwin opened this issue Apr 11, 2024 · 4 comments
Open

Add static type annotations #11047

sam-goodwin opened this issue Apr 11, 2024 · 4 comments
Labels
good first issue Clearly described and easy to accomplish. Good for beginners to the project.

Comments

@sam-goodwin
Copy link

sam-goodwin commented Apr 11, 2024

I am having trouble with using Dask in a project that uses PyRight because Dask does not use static type annotations/

Take, da.from_array, for example:

dask/dask/array/core.py

Lines 3307 to 3317 in b2ec1e1

def from_array(
x,
chunks="auto",
name=None,
lock=False,
asarray=None,
fancy=True,
getitem=None,
meta=None,
inline_array=False,
):

Or array.rechunk:

dask/dask/array/rechunk.py

Lines 268 to 275 in b2ec1e1

def rechunk(
x,
chunks="auto",
threshold=None,
block_size_limit=None,
balance=False,
method=None,
):

PyRight complains if you use anything but a str for chunks:

image

Argument of type "tuple[Literal[1], Literal[-1], Literal[-1]]" cannot be assigned to parameter "chunks" of type "str" in function "from_array"
  "tuple[Literal[1], Literal[-1], Literal[-1]]" is incompatible with "str"

This happens because in from_array and rechunk, chunks="auto" infers to a str type.

Is there an awareness of this problem? Is there a known solution? If not, is there a plan to fix this?

@github-actions github-actions bot added the needs triage Needs a response from a contributor label Apr 11, 2024
@phofl
Copy link
Collaborator

phofl commented Apr 11, 2024

We are aware of the issue but there is currently no plan on addressing this from within the currently active contributors. We currently have other (higher) priorities to focus our effort on. Contributions are welcome though if you are interested in helping out.

@phofl phofl added good first issue Clearly described and easy to accomplish. Good for beginners to the project. and removed needs triage Needs a response from a contributor labels Apr 11, 2024
@fjetter
Copy link
Member

fjetter commented Apr 12, 2024

We've been adding type annotations continuously over the past 1-2 years but the code base is very large and ensuring that everything is typed is very difficult. This is rather an incremental process. As was already mentioned, contributions are welcome and I guess this is a comparatively easy topic for a first couple of contributions.

We're using mypy for type checking and the modules that are already fully typed are listed here (not many yet)

dask/pyproject.toml

Lines 180 to 184 in b2ec1e1

# Recent or recently overhauled modules featuring stricter validation
module = [
"dask.order",
]
allow_untyped_defs = false

FWIW we're already much further along for dask/distributed https://github.com/dask/distributed/blob/66ced13b33aeafe0d684b4424364c69f0b5a0d43/pyproject.toml#L201-L212

@benrutter
Copy link
Contributor

benrutter commented Apr 12, 2024

@sam-goodwin (or anyone else having issues with type hinting) - are there any specific functions causing problems for you aside from array.rechunk? I'd be up for spending a little time working through some type annotations at some point - this issues currently labelled 'Add static type annotations' implying for the whole project, which is a great goal, but would be awesome to have a couple priority areas to target.

@sam-goodwin
Copy link
Author

Hi @benrutter, I agree that we don't need to take on the whole repo and there is plenty of low hanging fruit. The errors I am running into are:

  • da.from_array - chunks is inferred as a str
  • array.rechunk - chunks is inferred as a str
  • dd.from_delayed returns DataFrame | Series and will always require a cast. Not sure how to fix that other than adding helpers.
  • map_blocks seems to not behave properly as a method in the eyes of pyright:
da.from_zarr(core.core_img).map_blocks(
    segment_tissue,
    dtype=np.uint8,
    drop_axis=0,
    downsample_factor=tissue_masks_params.downsample_factor,
)

Error:

Argument missing for parameter "func"

If you add explicit func=segment_tissue, then get an error about self:

Argument missing for parameter "self"

Perhaps something wacky with @wraps:

    @wraps(map_blocks)
    def map_blocks(self, func, *args, **kwargs):
        return map_blocks(func, self, *args, **kwargs)

map_blocks is of the following type. Looks like it is no longer recognized as a method or something.

map_blocks: _Wrapped[(func: Unknown, *args: Unknown, name: Unknown | None = None, token: Unknown | None = None, dtype: Unknown | None = None, chunks: Unknown | None = None, drop_axis: Unknown | None = None, new_axis: Unknown | None = None, enforce_ndim: bool = False, meta: Unknown | None = None, **kwargs: Unknown), Unknown, (self: Array, func: Unknown, *args: Unknown, **kwargs: Unknown), Unknown] | Any

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Clearly described and easy to accomplish. Good for beginners to the project.
Projects
None yet
Development

No branches or pull requests

4 participants