
GPU resources for pretraining and instruction tuning #3

Open
2000ZRL opened this issue Apr 7, 2024 · 2 comments

Comments


2000ZRL commented Apr 7, 2024

What excellent work! Could you please share the GPU requirements (number of GPUs and memory) for pretraining and instruction tuning? Thanks.

@KerolosAtef (Collaborator)

Hello @2000ZRL,
Thank you for your interest in our work.

For the video-text datasets:

For llama2: you can use an A100 (80GB) with batch size 4, or a V100 with batch size 1 (the minimum GPU RAM is 32GB).

For Mistral: you can only use an A100 (80GB) with batch size 1 (the minimum GPU RAM is 80GB).
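As a rough illustration (this helper is not part of the repository), the guidance above can be turned into a small check that picks a batch size from the detected GPU memory; the model names and thresholds simply restate the numbers in this comment:

```python
# Minimal sketch (not from this repository): picks a training batch size
# from the detected GPU memory, restating the numbers in the comment above.
import torch

def suggested_batch_size(model_name: str) -> int:
    """Return a batch size matching the guidance in this thread."""
    assert torch.cuda.is_available(), "a CUDA GPU is required for training"
    total_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3

    if model_name == "llama2":
        if total_gb >= 80:   # e.g. A100 80GB
            return 4
        if total_gb >= 32:   # e.g. V100 32GB (stated minimum)
            return 1
        raise RuntimeError("llama2 training needs at least 32GB of GPU RAM")
    if model_name == "mistral":
        if total_gb >= 80:   # A100 80GB is the stated minimum
            return 1
        raise RuntimeError("mistral training needs at least 80GB of GPU RAM")
    raise ValueError(f"unknown model: {model_name}")

print(suggested_batch_size("llama2"))
```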


2000ZRL commented Apr 13, 2024

Thanks for your reply. Could you please also tell me the training time for the different model variants, e.g., llama2/Mistral?
