v0.1.32
## New models
- WizardLM 2: a state-of-the-art large language model from Microsoft AI with improved performance on complex chat, multilingual, reasoning, and agent use cases.
  - `wizardlm2:8x22b`: large 8x22B model based on Mixtral 8x22B
  - `wizardlm2:7b`: fast, high-performing model based on Mistral 7B
- Snowflake Arctic Embed: a suite of text embedding models by Snowflake, optimized for performance.
- Command R+: a powerful, scalable large language model purpose-built for RAG use cases.
- DBRX: a large 132B open, general-purpose LLM created by Databricks.
- Mixtral 8x22B: the new leading Mixture of Experts (MoE) base model by Mistral AI.
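The new models can be pulled and run by tag with the standard `ollama run` command. A minimal sketch using the two WizardLM 2 tags listed above (interactive output omitted; larger variants need substantially more VRAM):

```shell
# Chat with the fast 7B WizardLM 2 variant (downloads the model on first run)
ollama run wizardlm2:7b

# The larger 8x22B variant, for machines with enough memory
ollama run wizardlm2:8x22b
```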
## What's Changed
- Ollama now makes better use of available VRAM, leading to fewer out-of-memory errors and improved GPU utilization
- When running larger models that don't fit into VRAM on macOS, Ollama will now split the model between GPU and CPU to maximize performance.
- Fixed several issues where Ollama would hang upon encountering an error
- Fixed an issue where using quotes in `OLLAMA_ORIGINS` would cause an error
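For context on the `OLLAMA_ORIGINS` fix: the variable holds a comma-separated list of origins allowed to make cross-origin requests to the Ollama server, and quoting the value is common shell practice. A minimal sketch (the origin URL is illustrative):

```shell
# Allow a local web app to call the Ollama API; a quoted value
# like this previously triggered an error and now works
OLLAMA_ORIGINS="http://localhost:3000" ollama serve
```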
## New Contributors
- @sugarforever made their first contribution in #3400
- @yaroslavyaroslav made their first contribution in #3378
- @Nagi-ovo made their first contribution in #3423
- @ParisNeo made their first contribution in #3436
- @philippgille made their first contribution in #3437
- @cesto93 made their first contribution in #3461
- @ThomasVitale made their first contribution in #3515
- @writinwaters made their first contribution in #3539
- @alexmavr made their first contribution in #3555
Full Changelog: v0.1.31...v0.1.32