Release v0.2.22

@merrymercy released this 01 Aug 17:26
· 363 commits to main since this release
  • Released Vicuna v1.5 based on Llama 2 with 4K and 16K context lengths. Download weights
  • Released Chatbot Arena Conversations, a dataset containing 33k conversations with human preferences. Download it here.
  • Serving
    • Add a multi-model worker that can host multiple models on a single GPU and share base weights for PEFT models. #1866 #1905
    • AWQ 4-bit quantization support. #2103
    • Support more models (Llama 2, Claude 2, ChatGLM 2, StarChat, Baichuan-13B, InternLM, airoboros, PEFT adapters).
    • Better support for AMD GPUs, Intel XPUs. #1954 #2052
  • Training
    • Support rope scaling. #2013
    • Support flash attention 2. #2059
    • Support xFormers. #1970
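
For readers unfamiliar with the rope scaling item above, here is a minimal sketch of one common variant, linear position interpolation, in plain Python. It is not taken from this release or from #2013: the function names `rope_angles` and `apply_rope`, the `scaling_factor` parameter, the default base of 10000, and the adjacent-pair rotation layout are all assumptions made for illustration. The idea is that dividing the position index by a factor lets a model trained on a shorter context see longer sequences inside the same angle range.

```python
import math


def rope_angles(position, head_dim, base=10000.0, scaling_factor=1.0):
    """Return the rotary angles for one token position (hypothetical helper).

    A scaling_factor greater than 1 applies linear position interpolation:
    the position index is divided by the factor before angles are computed.
    """
    if head_dim % 2 != 0:
        raise ValueError("head_dim must be even")
    scaled_position = position / scaling_factor
    return [
        scaled_position * base ** (-2.0 * i / head_dim)
        for i in range(head_dim // 2)
    ]


def apply_rope(vector, position, base=10000.0, scaling_factor=1.0):
    """Rotate adjacent pairs (x0, x1), (x2, x3), ... by the rotary angles.

    The adjacent-pair layout is an assumption; some implementations pair
    element i with element i + head_dim / 2 instead.
    """
    angles = rope_angles(position, len(vector), base, scaling_factor)
    rotated = []
    for i, theta in enumerate(angles):
        x, y = vector[2 * i], vector[2 * i + 1]
        rotated.append(x * math.cos(theta) - y * math.sin(theta))
        rotated.append(x * math.sin(theta) + y * math.cos(theta))
    return rotated


if __name__ == "__main__":
    query = [0.5, -1.0, 2.0, 0.25]
    # With a factor of 4, position 16000 is rotated like position 4000.
    print(apply_rope(query, 16000, scaling_factor=4.0))
    print(apply_rope(query, 4000))
```

With `scaling_factor=4.0`, every position in a 16K window maps into the angle range of a 4K window, which is why the two printed vectors match.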