Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

containers:apply_to_tensors fails to return (or test) the application result on PackedSequence #996

Open
crutcher opened this issue May 26, 2022 · 1 comment
Assignees
Labels
FSDP FullyShardedDataParallel (zero-3)

Comments

@crutcher
Copy link
Contributor

At this point in apply_to_tensors(), the PackedSequence case drops the result tensors, unlike the other cases
https://github.com/facebookresearch/fairscale/blob/main/fairscale/utils/containers.py#L27

and thus fully_sharded_data_parallel is going to fail to capture the tensors for hooks here:
https://github.com/facebookresearch/fairscale/blob/main/fairscale/nn/data_parallel/fully_sharded_data_parallel.py#L1545

or properly yield casting results here:
https://github.com/facebookresearch/fairscale/blob/main/fairscale/nn/data_parallel/fully_sharded_data_parallel.py#L2490

@crutcher crutcher self-assigned this May 26, 2022
@min-xu-ai min-xu-ai added the FSDP FullyShardedDataParallel (zero-3) label May 26, 2022
@min-xu-ai
Copy link
Contributor

This is a care that our unit tests failed to cover, right?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
FSDP FullyShardedDataParallel (zero-3)
Projects
None yet
Development

No branches or pull requests

2 participants