
Why does the PyTorch Lightning doc say "Model-parallel training (FSDP and DeepSpeed)"? I think there is something wrong. #19823

Closed
HaoyaWHL opened this issue Apr 29, 2024 · 2 comments
Labels
docs Documentation related needs triage Waiting to be triaged by maintainers

Comments

@HaoyaWHL

HaoyaWHL commented Apr 29, 2024

📚 Documentation

In this doc, https://lightning.ai/docs/pytorch/stable/advanced/model_init.html, PL lists FSDP under "model-parallel training".

But FSDP is a data-parallel method. For example, https://huggingface.co/docs/transformers/fsdp describes FSDP as data parallel, so I think there may be something wrong here.

cc @Borda

@HaoyaWHL HaoyaWHL added docs Documentation related needs triage Waiting to be triaged by maintainers labels Apr 29, 2024
@BrianF-tessera

FSDP is both model-parallel and data-parallel: each GPU holds only a shard of the model's parameters at a time, and it also sees only its own shard of the data. The doc is correct.
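This dual nature can be illustrated with a toy sketch in plain Python. This is not a real FSDP implementation (no `torch.distributed`, no real processes); the ranks, the sharding layout, and the `all_gather` helper below are all simulated for illustration only:

```python
# Toy sketch of the FSDP idea: parameters are *sharded* across ranks
# (model-parallel memory footprint), while each rank still trains on a
# *different* data batch (data-parallel compute).

WORLD_SIZE = 4
full_params = list(range(8))  # stand-in for a model's flat parameter list

# 1. Shard the parameters: each rank permanently stores only its slice.
shards = [full_params[r::WORLD_SIZE] for r in range(WORLD_SIZE)]

# 2. Before a layer's forward/backward, ranks all-gather the full parameters.
#    (Simulated here; real FSDP uses a collective communication op.)
def all_gather(shards):
    gathered = [None] * sum(len(s) for s in shards)
    for r, shard in enumerate(shards):
        for i, p in enumerate(shard):
            gathered[r + i * len(shards)] = p
    return gathered

# 3. Each rank then applies the full parameters to its *own* batch.
per_rank_batches = [[r * 10 + i for i in range(2)] for r in range(WORLD_SIZE)]

for rank in range(WORLD_SIZE):
    params = all_gather(shards)           # full model, briefly materialized
    assert params == full_params
    local_batch = per_rank_batches[rank]  # different data on every rank
    # ...compute forward/backward on local_batch, then free `params`,
    # keeping only shards[rank] resident (this is the memory saving).
```

Step 1 is what makes FSDP "model-parallel" (no single GPU stores the whole model between layers), while step 3 is ordinary data parallelism (every rank sees a different batch).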

@HaoyaWHL
Author

HaoyaWHL commented May 6, 2024

Oh, thanks for the reply. I get it now.

@HaoyaWHL HaoyaWHL closed this as not planned May 6, 2024