activation checkpoint
AdaScale
Optimizer wrapper for automatic LR scaling without losing model accuracy
better_eng
bug
Something isn't working
CLA Signed
This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
dependencies
Pull requests that update a dependency file
documentation
Improvements or additions to documentation
duplicate
This issue or pull request already exists
enhancement
New feature or request
FSDP + SSD offload
FSDP
FullyShardedDataParallel (zero-3)
good first issue
Good for newcomers
help wanted
Extra attention is needed
in_progress
This issue is being worked on
invalid
This doesn't seem right
MEVO
memory efficient vocab output
moe
offload_model
OSS
Optimizer State Sharding (zero-1)
Pipeline
Pipeline parallelism
question
Further information is requested
SDP
ShardedDataParallel (zero-2)
unit test
wontfix
This will not be worked on