
0.2.27

Latest
@marella marella released this 10 Sep 15:20

Changes

  • Skip re-evaluating tokens that were already evaluated in the past. This can significantly speed up prompt processing in chat applications that prepend previous messages to the prompt.
  • Deprecate the LLM.reset() method. Use the high-level API instead.
  • Add support for batching and beam search to the 🤗 model.
  • Remove the universal binary option when building for AVX2 and AVX on macOS.
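
The token-reuse change above can be sketched in miniature. This is a minimal illustration of the idea, not ctransformers' actual implementation: keep the token sequence evaluated on the previous call, find the longest common prefix with the new prompt's tokens, and only evaluate the suffix that differs. The `TokenCache` class and its methods are hypothetical names for illustration.

```python
def longest_common_prefix(a, b):
    """Length of the shared prefix of two token sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

class TokenCache:
    """Hypothetical sketch: tracks tokens already evaluated so that a
    prompt sharing a prefix with the previous one (e.g. a chat history
    prepended to each request) only pays for the new suffix."""

    def __init__(self):
        self.past = []

    def tokens_to_evaluate(self, tokens):
        # Reuse the cached prefix; only the unseen suffix needs evaluation.
        n = longest_common_prefix(self.past, tokens)
        self.past = list(tokens)
        return tokens[n:]

cache = TokenCache()
first = cache.tokens_to_evaluate([1, 2, 3])        # cold start: all tokens
second = cache.tokens_to_evaluate([1, 2, 3, 4, 5]) # chat turn: only [4, 5]
```

In a chat application, each request's prompt is the previous prompt plus a few new messages, so the common prefix covers almost the entire prompt and evaluation cost drops to roughly the length of the new messages.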