GPT-NeoX 2.0

@Quentin-Anthony Quentin-Anthony released this 10 Mar 00:26
· 263 commits to main since this release
9610391

With GPT-NeoX 2.0, we now support upstream DeepSpeed. This enables the use of new DeepSpeed features such as Curriculum Learning, Communication Logging, and Autotuning.

For any changes in upstream DeepSpeed that are fundamentally incompatible with GPT-NeoX 2.0, we do the following:

  • Attempt to create a PR to upstream DeepSpeed
  • Stage the PR on DeeperSpeed 2.x, so that there is always a DeepSpeed version guaranteed to work with GPT-NeoX 2.x.

Therefore, we recommend using DeeperSpeed 2.x unless your use case relies on a specific upstream DeepSpeed feature that we have not yet merged into DeeperSpeed 2.x.

What's Changed

  • muP Support in #704
  • Bring deepspeed_main up-to-date in #746
  • Latest DeepSpeed Support in #663
  • Curriculum Learning Support in #695
  • Autotuning Support in #739

Full Changelog: v1.0...v2.0