
[Example Request] Minimal Example for Fine Tuning a LLM with FSDP utilizing the HuggingFace Trainer #4580

Open
HuBaX opened this issue Mar 1, 2024 · 0 comments

HuBaX commented Mar 1, 2024

Describe the use case example you want to see
I'm currently trying to figure out how to fine-tune an LLM with FSDP on a single instance with multiple GPUs, using the HuggingFace Trainer for training. Since I couldn't get it to work, I looked through the model_parallel examples in this repo and came away even more confused than before. The examples provided there are all so large that it's hard to see what I actually have to do to get FSDP working for my use case, especially since I'm quite new to SageMaker and have never used FSDP before. I also don't know how much of the FSDP setup the HuggingFace Trainer already handles for me. I'd be glad if someone could provide a minimal example for this use case.
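
For concreteness, here is roughly the shape of training script I'm hoping the example would show: a plain HuggingFace Trainer run where FSDP is switched on purely through `TrainingArguments`, with no manual `torch.distributed` or FSDP wrapping code. The model name, dataset, and wrapped layer class below are placeholders I picked for illustration, and I'm assuming the script gets launched with one process per GPU (e.g. via `torchrun`):

```python
# train.py -- minimal sketch, to be launched with one process per GPU,
# e.g.: torchrun --nproc_per_node=4 train.py
# Model, dataset, and layer class are placeholders for illustration.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "gpt2"  # placeholder; swap in the actual model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Placeholder dataset; in practice this would come from the S3 channel.
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

training_args = TrainingArguments(
    output_dir="/opt/ml/model",  # SageMaker uploads this directory to S3
    per_device_train_batch_size=1,
    num_train_epochs=1,
    # As far as I understand, the Trainer wires up FSDP itself when these
    # are set; "GPT2Block" is the transformer layer class for the
    # placeholder model above and depends on the model you use.
    fsdp="full_shard auto_wrap",
    fsdp_config={"transformer_layer_cls_to_wrap": ["GPT2Block"]},
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model()
```

Whether this is actually sufficient, or whether extra distributed setup is needed inside a SageMaker training job, is exactly what I'd like the example to confirm.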

How would this example be used? Please describe.
The example would be a reference for developers trying to get FSDP working with the HuggingFace Trainer.

Describe which SageMaker services are involved
Notebook Instances and Training Jobs

Describe what other services (other than SageMaker) are involved
S3 - for loading the dataset as well as storing the model weights
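
On the launcher side, this is the sort of notebook cell I imagine kicking off the training job. The instance type, version strings, role ARN, and S3 URI are placeholders, and I'm not sure whether `torch_distributed` or `pytorchddp` is the right `distribution` key for this setup; that's the kind of detail the example would clarify:

```python
# Sketch of launching the training job from a notebook instance.
# Role ARN, instance type, versions, and S3 URIs are placeholders.
from sagemaker.huggingface import HuggingFace

estimator = HuggingFace(
    entry_point="train.py",       # the script sketched above
    source_dir="scripts",
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder
    instance_type="ml.g5.12xlarge",  # single instance, 4 GPUs
    instance_count=1,
    transformers_version="4.36",  # check the supported-version matrix
    pytorch_version="2.1",
    py_version="py310",
    # Presumably this makes SageMaker start the script with torchrun,
    # one process per GPU; "pytorchddp" may be the alternative key
    # depending on the SDK version.
    distribution={"torch_distributed": {"enabled": True}},
)

# The channel is exposed to the script via SM_CHANNEL_TRAIN; whatever the
# script writes to /opt/ml/model is uploaded back to S3 automatically.
estimator.fit({"train": "s3://my-bucket/datasets/my-dataset"})  # placeholder
```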
