FSDPStrategy how to set mixed_precision and other params of pytorch #1541

apachemycat · 2024-05-07T09:52:01Z

📚 The doc issue

model is on CPU before input to FSDP

model = FSDP(model,
    auto_wrap_policy=t5_auto_wrap_policy,
    mixed_precision=mp_policy,
    #sharding_strategy=sharding_strategy,
    device_id=torch.cuda.current_device())

Suggest a potential alternative/fix

give more examples of FSDPStrategy
model_wrapper_cfg=dict(type='MMFullyShardedDataParallel', cpu_offload=True,use_orig_params=False)

指定 FSDPStrategy 并配置参数

size_based_auto_wrap_policy = partial(
size_based_auto_wrap_policy, min_num_params=1e7)
strategy = dict(
type='FSDPStrategy',
model_wrapper=dict(auto_wrap_policy=size_based_auto_wrap_policy))

The text was updated successfully, but these errors were encountered:

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

FSDPStrategy how to set mixed_precision and other params of pytorch #1541

FSDPStrategy how to set mixed_precision and other params of pytorch #1541

apachemycat commented May 7, 2024

FSDPStrategy how to set mixed_precision and other params of pytorch #1541

FSDPStrategy how to set mixed_precision and other params of pytorch #1541

Comments

apachemycat commented May 7, 2024

📚 The doc issue

model is on CPU before input to FSDP

Suggest a potential alternative/fix

指定 FSDPStrategy 并配置参数