FSDP is both model-parallel and data-parallel: each GPU holds only a shard of the model's parameters at any given time, and each GPU also processes only its own shard of the data. So the doc is correct.
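To make the "both" claim concrete, here is a minimal pure-Python sketch of the idea (this is *not* the real `torch.distributed.fsdp` API; the function names `shard`, `all_gather`, and `reduce_scatter` here are illustrative stand-ins for the collectives FSDP issues):

```python
# Conceptual sketch of FSDP's sharding scheme, simulated on one process.
# Model-parallel aspect: each rank permanently owns only a shard of the
# flat parameter. Data-parallel aspect: each rank computes gradients on
# its own data batch, and gradients are averaged across ranks.

def shard(flat_param, world_size):
    """Split a flat parameter into one contiguous shard per rank."""
    per = (len(flat_param) + world_size - 1) // world_size
    return [flat_param[r * per:(r + 1) * per] for r in range(world_size)]

def all_gather(shards):
    """Each rank reconstructs the full parameter just-in-time for compute."""
    return [x for s in shards for x in s]

def reduce_scatter(grads_per_rank, world_size):
    """Sum gradients across ranks, then keep only each rank's own shard."""
    summed = [sum(g) for g in zip(*grads_per_rank)]
    return shard(summed, world_size)

world_size = 2
flat_param = [1.0, 2.0, 3.0, 4.0]

shards = shard(flat_param, world_size)   # rank 0 owns [1, 2], rank 1 owns [3, 4]
full = all_gather(shards)                # every rank sees [1, 2, 3, 4] for forward/backward
grads = [[0.1, 0.1, 0.1, 0.1],           # rank 0's gradient from its data batch
         [0.3, 0.3, 0.3, 0.3]]           # rank 1's gradient from a different batch
grad_shards = reduce_scatter(grads, world_size)  # each rank keeps its summed shard
```

The parameters are sharded like model parallelism, but every rank still runs the full model on its own data like data parallelism, which is why both labels apply.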
📚 Documentation
In https://lightning.ai/docs/pytorch/stable/advanced/model_init.html, PL lists FSDP under "model-parallel training".
But FSDP is commonly described as a data-parallel method.
For example, https://huggingface.co/docs/transformers/fsdp says FSDP is data parallel.
So I think something may be wrong here.
cc @Borda