accelerate launch train.py: parallel training is too slow; it seems accelerate model parallelism is not working #17

Open
lileishitou opened this issue Aug 8, 2023 · 0 comments

How should I use the command "accelerate config" to generate /home/duser/.cache/huggingface/accelerate/default_config.yaml?

I have tried the two configurations below; both let accelerate launch train.py run, but the model-parallel training does not seem to work (it is very slow). So how should accelerate be configured?

(1)

In which compute environment are you running?
This machine
Which type of machine are you using?
multi-GPU
How many different machines will you use (use more than 1 for multi-node training)? [1]:
Do you wish to optimize your script with torch dynamo? [yes/NO]: NO
Do you want to use DeepSpeed? [yes/NO]: NO
Do you want to use FullyShardedDataParallel? [yes/NO]: NO
Do you want to use Megatron-LM ? [yes/NO]: NO
How many GPU(s) should be used for distributed training? [1]:5
What GPU(s) (by id) should be used for training on this machine as a comma-seperated list? [all]:1,2,3,4,5
Do you wish to use FP16 or BF16 (mixed precision)?
bf16
accelerate configuration saved at /home/duser/.cache/huggingface/accelerate/default_config.yaml
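
For reference, a sketch of roughly what the saved default_config.yaml would look like for the answers in (1). This is an approximation, not verbatim output; key names vary slightly between accelerate versions.

```yaml
# Approximate contents of default_config.yaml for configuration (1).
# Key names may differ between accelerate versions; treat this as a sketch.
compute_environment: LOCAL_MACHINE
distributed_type: MULTI_GPU        # DistributedDataParallel across the listed GPUs
downcast_bf16: 'no'
gpu_ids: 1,2,3,4,5
machine_rank: 0
main_training_function: main
mixed_precision: bf16
num_machines: 1
num_processes: 5
rdzv_backend: static
same_network: true
use_cpu: false
```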

(2)
In which compute environment are you running?
This machine
Which type of machine are you using?
multi-GPU
How many different machines will you use (use more than 1 for multi-node training)? [1]:
Do you wish to optimize your script with torch dynamo? [yes/NO]: NO
Do you want to use DeepSpeed? [yes/NO]: NO
Do you want to use FullyShardedDataParallel? [yes/NO]: yes
What should be your sharding strategy?
FULL_SHARD
Do you want to offload parameters and gradients to CPU? [yes/NO]: yes
What should be your auto wrap policy?
NO_WRAP
What should be your FSDP's backward prefetch policy?
BACKWARD_PRE
What should be your FSDP's state dict type?
FULL_STATE_DICT
How many GPU(s) should be used for distributed training? [1]:5
Do you wish to use FP16 or BF16 (mixed precision)?
no
accelerate configuration saved at /home/duser/.cache/huggingface/accelerate/default_config.yaml
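
And a sketch of roughly what default_config.yaml would contain for the FSDP answers in (2), again approximate; the FSDP key names and value encodings have changed across accelerate versions (e.g. the sharding strategy may be stored as an integer rather than a string in older releases).

```yaml
# Approximate contents of default_config.yaml for configuration (2).
# FSDP keys/values are a sketch; older accelerate versions encode some of them differently.
compute_environment: LOCAL_MACHINE
distributed_type: FSDP             # fully sharded data parallel across 5 processes
downcast_bf16: 'no'
fsdp_config:
  fsdp_auto_wrap_policy: NO_WRAP
  fsdp_backward_prefetch_policy: BACKWARD_PRE
  fsdp_offload_params: true       # offload parameters and gradients to CPU
  fsdp_sharding_strategy: FULL_SHARD
  fsdp_state_dict_type: FULL_STATE_DICT
machine_rank: 0
main_training_function: main
mixed_precision: 'no'
num_machines: 1
num_processes: 5
rdzv_backend: static
same_network: true
use_cpu: false
```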
