Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix python 3.7 and PL==1.4.0 compatibility bugs #233

Merged
merged 1 commit into from
Jul 29, 2021
Merged

Conversation

lebrice
Copy link
Owner

@lebrice lebrice commented Jul 28, 2021

  • Limit pytorch version to 1.8.X, from >= 1.8: PyTorch suddenly got
    really picky about type annotations on subclasses of IterableDataset
    for some unexplained reason, and that breaks most of the
    EnvDataset-related wrappers in Sequoia.

  • Fix compatibility issues with lru_cache taking one required argument
    in python 3.7

  • Fix PyTorch-Lightning==1.4.0 compatibility issues due to ModelSummary
    not having a MODE_DEFAULT anymore

  • Fix typing.get_origin bug with python 3.7 in typed_dict.py

Signed-off-by: Fabrice Normandin fabrice.normandin@gmail.com

- Limit pytorch version to 1.8.X, from >= 1.8: PyTorch suddenly got
  really picky about type annotations on subclasses of IterableDataset
  for some unexplained reason, and that breaks most of the
  EnvDataset-related wrappers in Sequoia.

- Fix compatibility issues with `lru_cache` taking one required argument
  in python 3.7

- Fix PyTorch-Lightning==1.4.0 compatibility issues due to ModelSummary
  not having a MODE_DEFAULT anymore

- Fix `typing.get_origin` bug with python 3.7 in typed_dict.py

Signed-off-by: Fabrice Normandin <fabrice.normandin@gmail.com>
@lebrice lebrice merged commit fc767ee into master Jul 29, 2021
@lebrice lebrice deleted the python3.7-hotfix branch July 29, 2021 17:55
@ejguan
Copy link

ejguan commented Aug 3, 2021

Hey @lebrice,
Just got a notice of the issue related to IterableDataset. We did introduced a typing system for IterableDataset due to a new design of DataLoader. We want to use the type annotation as a Dataset attribute to enable the lazy data loading. And, it did make IterableDataset having more strict requirements. But, since we are relying on the python typing to do the type evaluation, the same Error should be raised from any static type checker like mypy.
We are sorry that it prevents you to upgrade PyTorch to 1.9, do you mind sharing some examples to us (may be file an issue on GitHub) about the Error? Then, we can provide suggestion or fix bugs from our side.

@lebrice
Copy link
Owner Author

lebrice commented Aug 4, 2021

Hi @ejguan, thanks for reaching out!

Do you have a link to the documentation or rationale behind that change? Id like to learn more about this.

My current (uninformed) opinion is that this is probably the least pythonic thing I've ever come across, but I'm also open to being very wrong about this! 😛

@ejguan
Copy link

ejguan commented Aug 4, 2021

We have a general RFC for new DataLoader and IterDataPipe (alias of IterableDataset). pytorch/pytorch#49440
We want to make DataPipe modular and connecting each DataPipe instance into a graph to represent a Dataset. And, each DataPipe represents a function like shuffling, batching, loading from disk. But, to validate the prior DataPipe and construct DataPipe instance without loading the data from prior DataPipe, we need a way to figure the type of data from prior DataPipe. For example, unzip DataPipe needs to know the number of elements for each data from prior DataPipe, then construct corresponding unzipped DataPipe.

class Unzip(IterDataPipe):
    def __new__(self, datapipe):
        num = len(datapipe.type)
        ...
        return tuple(SplitDataPipe(self, n) for n in range(num))

And, the typing system for DataPipe does have a document here https://github.com/pytorch/pytorch/blob/master/torch/utils/data/typing.ipynb
Let me know you thought. Any feedback is appreciated.

@lebrice
Copy link
Owner Author

lebrice commented Aug 4, 2021

Hey @ejguan this looks very interesting! I'll take a closer look, and in the meantime I've created a new issue for this: #235 .

Thanks again for reaching out! :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants