
v0.2.0

@vchiley vchiley released this 04 Jul 05:36
d0efe55

🚀 LLM Foundry v0.2.0

LLM Foundry is an efficient codebase for training, evaluating, and deploying Large Language Models (LLMs), and serves as the training codebase for the MPT-7B and MPT-30B models. Our emphasis is on efficiency, scalability, and ease of use, to enable fast iteration and prototyping.

We are excited to share the release of v0.2.0, packed with support for new hardware, features, and tutorials.

📖 Tutorials

We have released new tutorial content and helper scripts for dataset preparation, pre-training, fine-tuning, and inference!

To start off, a basic walkthrough and answers to FAQs can be found in our Basic Tutorial.

Next, detailed guides for different workflows are linked below:

Training

  1. Part 1: LLM Pretraining
    1. Installation
    2. Dataset Preparation
    3. How to start single and multi-node pretraining
  2. Part 2: LLM Finetuning
    1. Using a dataset on the HuggingFace Hub
    2. Using a local dataset
    3. Using a StreamingDataset (MDS) formatted dataset locally or in an object store

In addition, for a more advanced and self-contained example of finetuning the MPT-7B model, see Finetune Example.

Inference

The inference tutorials cover several new features we've added that improve integration with the HuggingFace and FasterTransformer libraries.

Major Features

LLM Foundry now requires Composer v0.15.0 and Streaming v0.5.1 as minimum versions. For details on all the improvements, see the Composer and Streaming release notes.

⚠️ The new Streaming release includes a few API changes; see the Streaming v0.5 release notes for more details. Our API has also been updated to reflect these modifications.

  1. 🆕 Torch 2.0 support

    LLM Foundry is now Torch 2.0 compatible!

    Note: we have not tested torch.compile, but do not expect significant performance improvements.

  2. H100 Support

    We now support NVIDIA H100 systems! See our blog post on Benchmarking LLMs on H100 GPUs for initial performance and convergence details.

    To run LLM Foundry on NVIDIA H100 systems, be sure to use a Docker image with CUDA 11.8+ and PyTorch 2.0+.

    For example, mosaicml/pytorch:2.0.1_cu118-python3.10-ubuntu20.04 from our Docker Hub has been tested with NVIDIA H100 systems.

    No code changes should be required.
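
    The version floors above can be checked programmatically. A minimal stdlib-only sketch (the helper is illustrative and not part of LLM Foundry):

```python
# Illustrative helper: verify the CUDA / PyTorch version floors for H100 runs.
def meets_floor(version: str, floor: str) -> bool:
    """Return True if a dotted version string is >= the floor version."""
    parse = lambda v: tuple(int(p) for p in v.split(".")[:2])
    return parse(version) >= parse(floor)

# H100 support requires CUDA 11.8+ and PyTorch 2.0+.
assert meets_floor("11.8", "11.8")       # CUDA in the tested image
assert meets_floor("2.0.1", "2.0")       # PyTorch in the tested image
assert not meets_floor("11.7", "11.8")   # too old for H100 support
```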

  3. 📈 AMD MI250 GPU Support

    With the release of PyTorch 2.0 and ROCm 5.4+, we are excited to share that LLM training now works out of the box on AMD Datacenter GPUs! Read our blog post on Training LLMs with AMD MI250 GPUs for more details.

    Running with our stack was straightforward: use the ROCm 5.4 Docker image rocm/dev-ubuntu-20.04:5.4.3-complete, then install PyTorch for ROCm 5.4 and install Flash Attention.

    Modify your configuration settings:

    • attn_impl=flash instead of the default triton
      • Note: ALiBi is currently not supported with attn_impl=flash.
    • loss_fn=torch_crossentropy instead of the default fused_crossentropy.
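
    Applied to a training YAML, the two overrides above might look like the following sketch (the surrounding key layout is illustrative and may differ from your config):

```yaml
# Illustrative placement of the AMD-specific overrides; the exact
# nesting depends on your LLM Foundry training config.
model:
  attn_config:
    attn_impl: flash          # default: triton (ALiBi unsupported with flash)
  loss_fn: torch_crossentropy # default: fused_crossentropy
```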
  4. 🚧 LoRA finetuning (Preview)

    We have included a preview release of Low Rank Adaptation (LoRA) support for memory-efficient fine-tuning of LLMs (Hu et al., 2021).

    To use LoRA, follow the instructions found here.

    Note: this is a preview feature; please let us know any feedback! The API and support are subject to change.
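
    The memory savings come from training a low-rank update B·A (B: d×r, A: r×d) in place of a full d×d weight update. A back-of-the-envelope sketch with illustrative dimensions:

```python
# LoRA trains two low-rank factors B (d x r) and A (r x d) instead of a
# full d x d weight update, shrinking the trainable parameter count.
def lora_trainable_params(d: int, r: int) -> int:
    """Parameters in the low-rank update B @ A for a square d x d layer."""
    return d * r + r * d

d, r = 4096, 8                            # hidden size and LoRA rank (illustrative)
full = d * d                              # full-rank update: 16,777,216 params
low_rank = lora_trainable_params(d, r)    # low-rank update: 65,536 params
print(f"reduction: {full / low_rank:.0f}x")  # -> reduction: 256x
```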

  5. 🔎 Evaluation Refactor (#308)

    Our evaluation suite has been significantly refactored into our Model Gauntlet approach. This includes a number of breaking API changes to support multiple models:

    • Instead of model, use the models keyword and provide a list of models.
    • tokenizer is now model-specific.
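
    Under the new API, the models section of the eval YAML is a list with a per-entry tokenizer. A sketch of the shape (key names are illustrative; see eval/yamls/hf_eval.yaml for the authoritative format):

```yaml
# Illustrative multi-model eval config; field names may differ from
# the shipped eval/yamls/hf_eval.yaml.
models:
- model_name: mosaicml/mpt-7b
  model:
    name: hf_causal_lm
    pretrained_model_name_or_path: mosaicml/mpt-7b
  tokenizer:
    name: mosaicml/mpt-7b
# additional entries can be appended to evaluate several models in one run
```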

    For example, to run the gauntlet of various eval tasks with mosaicml/mpt-7b:

```bash
cd llm-foundry/scripts
composer eval/eval.py eval/yamls/hf_eval.yaml \
    model_name_or_path=mosaicml/mpt-7b
```

    This release also makes evaluation deterministic, even across different numbers of GPUs.

    For more details on all these changes, see #308

  6. ⏱️ Benchmarking Inference

    To better support the deployment of LLMs, we have included an inference benchmarking suite and results across different hardware setups and LLM models.
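
    At its core, an inference benchmark times repeated generation calls and reports latency and throughput. A minimal stdlib sketch of that loop (the dummy generate function stands in for a real model):

```python
import time

def generate(prompt: str, max_new_tokens: int) -> list:
    """Stand-in for a real model.generate call (illustrative only)."""
    return ["tok"] * max_new_tokens

def benchmark(num_trials: int = 5, max_new_tokens: int = 64) -> dict:
    """Time repeated generation calls and report latency and throughput."""
    latencies = []
    for _ in range(num_trials):
        start = time.perf_counter()
        generate("Once upon a time", max_new_tokens)
        latencies.append(time.perf_counter() - start)
    # Guard against a zero reading from very coarse timers.
    mean_latency = max(sum(latencies) / len(latencies), 1e-9)
    return {
        "mean_latency_s": mean_latency,
        "tokens_per_s": max_new_tokens / mean_latency,
    }

stats = benchmark()
print(sorted(stats))  # -> ['mean_latency_s', 'tokens_per_s']
```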

PR List

New Contributors

Full Changelog: v0.1.1...v0.2.0