v0.5.0

@irenedea released this 08 Feb 00:01
a667ebf

🚀 LLM Foundry v0.5.0

LLM Foundry is an efficient codebase for training, evaluating, and deploying Large Language Models (LLMs) and serves as the foundation for the MPT model series.

In addition to the usual bug fixes and performance improvements, we've added lots of new features!

New Features

LoRA Support (with FSDP!) (#886)

LLM Foundry now supports LoRA via an integration with the PEFT library. To use it, add a peft_config block to the model section of your config .yaml and run train.py:

model:
  ...
  peft_config:
    r: 16
    peft_type: LORA
    task_type: CAUSAL_LM
    lora_alpha: 32
    lora_dropout: 0.05
    target_modules:
      - q_proj
      - k_proj

Read more about it in the tutorial.
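To give a feel for what r and lora_alpha control in the config above, here is a minimal sketch in plain Python. It is not LLM Foundry or PEFT code: the helper names and the hidden_size value are hypothetical, and it assumes the standard LoRA formulation, in which a frozen weight receives a low-rank update scaled by lora_alpha / r.

```python
# Sketch only: illustrates what r and lora_alpha control; not LLM Foundry code.

def lora_scaling(r: int, lora_alpha: int) -> float:
    """Scaling factor applied to the low-rank update in standard LoRA."""
    return lora_alpha / r


def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one (d_out x d_in) linear layer.

    The update is B @ A, with A of shape (r, d_in) and B of shape (d_out, r).
    """
    return r * (d_in + d_out)


# r and lora_alpha come from the config above.
# hidden_size is a hypothetical example value, not taken from the release notes.
r, lora_alpha = 16, 32
hidden_size = 4096

scaling = lora_scaling(r, lora_alpha)
added = lora_param_count(hidden_size, hidden_size, r)
full = hidden_size * hidden_size

print(f"scaling = {scaling}")
print(f"LoRA params per projection = {added} ({added / full:.2%} of {full})")
```

With these values, each targeted projection trains well under 1% of the parameters of the full weight matrix.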

ALiBi for Flash Attention (#820)

We've added support for using ALiBi with Flash Attention (v2.4.2 or higher).

model:
  ...
  attn_config:
    attn_impl: flash
    alibi: True
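If you want to guard against an older Flash Attention install before enabling this option, a check along the following lines may help. This is a sketch with hypothetical helper names, not something LLM Foundry provides. It only compares version strings, which you would need to obtain yourself (for example from the installed package's metadata).

```python
# Sketch only: hypothetical helpers for checking the minimum version named above.

MIN_FLASH_ATTN_FOR_ALIBI = (2, 4, 2)


def parse_version(version: str) -> tuple:
    """Turn '2.4.2' (or '2.4.2.post1') into a tuple of its leading integers."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break  # stop at suffixes such as 'post1' or 'dev0'
        parts.append(int(piece))
    return tuple(parts)


def flash_attn_supports_alibi(version: str) -> bool:
    """True if the given Flash Attention version is v2.4.2 or higher."""
    return parse_version(version) >= MIN_FLASH_ATTN_FOR_ALIBI


for candidate in ("2.3.6", "2.4.2", "2.5.0"):
    print(candidate, flash_attn_supports_alibi(candidate))
```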

Chat Data for Finetuning (#884)

We now support finetuning on chat data, with automatic formatting applied using Hugging Face tokenizer chat templates.

Each sample requires a single key "messages" that maps to an array of message objects. Each message object in the array represents a single message in the conversation and must contain the following keys:

  • role : A string indicating the author of the message. Possible values are "system", "user", and "assistant".
  • content : A string containing the text of the message.

At least one message must have the role "assistant", and the last message in the "messages" array must have the role "assistant".

Here's an example .jsonl with chat data (the second sample is wrapped across lines for readability; in a real .jsonl file each sample occupies a single line):

{ "messages": [ { "role": "user", "content": "Hi, MPT!" }, { "role": "assistant", "content": "Hi, user!" } ]}
{ "messages": [
  { "role": "system", "content": "A conversation between a user and a helpful and honest assistant" },
  { "role": "user", "content": "Hi, MPT!" },
  { "role": "assistant", "content": "Hi, user!" },
  { "role": "user", "content": "Is multi-turn chat supported?" },
  { "role": "assistant", "content": "Yes, we can chat for as long as my context length allows." }
]}
...
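Before starting a run, it can be useful to check your data against the rules above. The following is a minimal sketch of such a check. The function validate_chat_sample is a hypothetical helper written for illustration, not part of LLM Foundry, and it covers only the requirements stated in this section.

```python
import json

ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_chat_sample(sample: dict) -> None:
    """Raise ValueError if a sample breaks the chat format rules listed above."""
    if "messages" not in sample:
        raise ValueError('sample is missing the "messages" key')

    messages = sample["messages"]
    if not isinstance(messages, list) or not messages:
        raise ValueError('"messages" must be a non-empty array')

    for i, message in enumerate(messages):
        if not isinstance(message, dict):
            raise ValueError(f"message {i} is not an object")
        for key in ("role", "content"):
            if key not in message:
                raise ValueError(f'message {i} is missing "{key}"')
        if message["role"] not in ALLOWED_ROLES:
            raise ValueError(f"message {i} has unknown role {message['role']!r}")
        if not isinstance(message["content"], str):
            raise ValueError(f'message {i} "content" must be a string')

    # The last message must come from the assistant, which also guarantees
    # that at least one assistant message exists.
    if messages[-1]["role"] != "assistant":
        raise ValueError('the last message must have the role "assistant"')


# Each line of a .jsonl file is one JSON document.
line = (
    '{"messages": [{"role": "user", "content": "Hi, MPT!"}, '
    '{"role": "assistant", "content": "Hi, user!"}]}'
)
validate_chat_sample(json.loads(line))
print("sample is valid")
```

To check a whole file, you could call the helper on json.loads(line) for every non-empty line.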

Safe Load for HuggingFace Datasets (#798)

We now provide a safe_load option when loading HuggingFace datasets for finetuning.

This restricts loaded files to .jsonl, .csv, or .parquet extensions to prevent arbitrary code execution.

To use, set safe_load to true in your dataset configuration:

  train_loader:
    name: finetuning
    dataset:
      safe_load: true
      ...
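The idea behind this restriction can be sketched as a simple extension allowlist. This is an illustration only, with hypothetical names and file paths. It is not the implementation from the linked pull request, which may handle disallowed files differently.

```python
from pathlib import PurePosixPath

# Extensions named in the release notes as safe to load.
SAFE_EXTENSIONS = {".jsonl", ".csv", ".parquet"}


def split_by_safety(paths):
    """Split file paths into (allowed, rejected) based on their extension."""
    allowed, rejected = [], []
    for path in paths:
        suffix = PurePosixPath(path).suffix.lower()
        (allowed if suffix in SAFE_EXTENSIONS else rejected).append(path)
    return allowed, rejected


# Hypothetical contents of a dataset repository.
files = ["data/train.jsonl", "data/eval.parquet", "dataset_script.py", "README.md"]
allowed, rejected = split_by_safety(files)

print("allowed:", allowed)
print("rejected:", rejected)
```

Here the Python loading script is rejected, which is the kind of file that could otherwise run arbitrary code when the dataset is loaded.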

New PyTorch, Composer, Streaming, and Transformers versions

As always, we've updated to new versions of the core dependencies of LLM Foundry, bringing better performance, new features, and support for new models (Mixtral in particular).

Deprecations

Support for Flash Attention v1 (#921)

Flash Attention v1 support is deprecated and will be removed in v0.6.0.

Breaking Changes

Removed support for PyTorch versions before 2.1 (#787)

We no longer support PyTorch versions before 2.1.

Removed Deprecated Features (#948)

We've removed features that have been deprecated for at least one release.

Full Changelog: v0.4.0...v0.5.0