
0.2.27

Latest
@marella marella released this 10 Sep 15:20

Changes

  • Skip re-evaluating tokens that were already evaluated in the past. This can significantly speed up prompt processing in chat applications that prepend previous messages to the prompt.
  • Deprecate the LLM.reset() method. Use the high-level API instead.
  • Add support for batching and beam search to the 🤗 model.
  • Remove the universal binary option when building for AVX2 and AVX on macOS.
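
The token-reuse change above can be sketched in miniature. This is a minimal illustration of the idea, not ctransformers' actual implementation: keep the token sequence evaluated on the previous call, find the longest common prefix with the new prompt's tokens, and only evaluate the suffix that differs. The `TokenCache` class and its methods are hypothetical names for illustration.

```python
def longest_common_prefix(a, b):
    """Length of the shared prefix of two token sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

class TokenCache:
    """Hypothetical sketch: tracks tokens already evaluated so that a
    prompt sharing a prefix with the previous one (e.g. a chat history
    prepended to each request) only pays for the new suffix."""

    def __init__(self):
        self.past = []

    def tokens_to_evaluate(self, tokens):
        # Reuse the cached prefix; only the unseen suffix needs evaluation.
        n = longest_common_prefix(self.past, tokens)
        self.past = list(tokens)
        return tokens[n:]

cache = TokenCache()
first = cache.tokens_to_evaluate([1, 2, 3])        # cold start: all tokens
second = cache.tokens_to_evaluate([1, 2, 3, 4, 5]) # chat turn: only [4, 5]
```

In a chat application, each request's prompt is the previous prompt plus a few new messages, so the common prefix covers almost the entire prompt and evaluation cost drops to roughly the length of the new messages.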