Releases: ollama/ollama
v0.0.21
- Fixed an issue where empty responses would be returned if `template` was provided in the API, but not `prompt`
- Fixed an issue where the "Send a message" placeholder would show when writing multi-line prompts with `ollama run`
- Fixed an issue where multi-line prompts in `ollama run` wouldn't be submitted when pressing Return
Full Changelog: v0.0.20...v0.0.21
v0.0.20
What's Changed
- `ollama run` has a new & improved experience:
  - Models will now be loaded immediately, making even the first prompt much faster
  - Added hint text
  - Ollama will now fit words in the available width of the terminal for better readability
- `OLLAMA_HOST` now supports IPv6 hostnames
- `ollama run` will now automatically pull models if they don't exist when using a remote instance of Ollama
- Sending an empty `prompt` field to `/api/generate` will now load the model so the next request is fast (see the sketch after this list)
- Fixed an issue where `ollama create` would not correctly detect falcon model sizes
- Added a simple Python client to access Ollama in `api/client.py` by @pdevine
- Improvements to showing progress on `ollama pull` and `ollama push`
- Fixed an issue where `ollama create` would add empty layers
- Fixed an issue with running Ollama on Windows (compiled from source)
- Fixed an error when running `ollama push`
- Readable community projects by @jamesbraza
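A minimal sketch of the empty-prompt preload described above, assuming the default server address `http://localhost:11434` (the `preload` helper and the model name are just illustrations):

```python
import json
import urllib.request

# A request with an empty prompt returns once the model is loaded,
# so the next real generation starts from a warm model.
def preload(model: str, host: str = "http://localhost:11434") -> None:
    body = json.dumps({"model": model, "prompt": ""}).encode()
    req = urllib.request.Request(f"{host}/api/generate", data=body,
                                 headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req).read()

preload("llama2")
```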
New Contributors
- @jamesbraza made their first contribution in #550
Full Changelog: v0.0.19...v0.0.20
v0.0.19
What's Changed
- Updated Docker image for Ollama: `docker pull ollama/ollama`
- Ability to import and use GGUF file type models
- Fixed issue where `ollama push` would error on long-running uploads
- Ollama will now automatically clean up unused data locally
- Improve build instructions by @apepper
Full Changelog: v0.0.18...v0.0.19
v0.0.18
What's Changed
- New `ollama show` command for viewing details about a model:
  - See a system prompt for a model: `ollama show --system orca-mini`
  - View a model's parameters: `ollama show --parameters codellama`
  - View a model's default prompt template: `ollama show --template llama2`
  - View a Modelfile for a model: `ollama show --modelfile llama2`
- Minor improvements to model loading and generation time
- Fixed an issue where large prompts would cause `codellama` and similar models to show an error
- Fixed compatibility issues with macOS 11 Big Sur
- Fixed an issue where characters would be escaped in prompts, causing escaped characters like `&` in the output
- Fixed several issues with building from source on Windows and Linux
- Minor performance improvements to model loading and generation
- New sentiments example by @technovangelist
- Fixed `num_keep` parameter not working properly
- Fixed issue where `Modelfile` parameters would not be honored at runtime
- Added missing options params to the embeddings docs by @herrjemand
- Fixed issue where `ollama list` would error when there were no models to show
When building from source, Ollama will require running `go generate` to generate dependencies:

```
git clone https://github.com/jmorganca/ollama
cd ollama
go generate ./...
go build .
```

Note: `cmake` is required to build dependencies. On macOS it can be installed with `brew install cmake`, and on other platforms via the installer or well-known package managers.
New Contributors
- @callmephilip made their first contribution in #448
- @herrjemand made their first contribution in #472
Full Changelog: v0.0.17...v0.0.18
v0.0.17
What's Changed
- Multiple models can be removed together: `ollama rm mario:latest orca-mini:3b`
- `ollama list` will now show a unique ID for each model based on its contents
- Fixed bug where a prompt wasn't set by default, causing an error when running a model created with `ollama create`
- Fixed crash when running 34B parameter models on hardware without enough memory to run them
- Fixed issue where non-quantized f16 models would not run
- Improved network performance of `ollama push`
- Fixed issue where stop sequences (such as `\n`) wouldn't be honored (see the sketch below)
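A rough sketch of exercising the stop-sequence fix through `/api/generate`; the `options.stop` field follows Ollama's API docs, and the model name and prompt are placeholders:

```python
import json
import urllib.request

# /api/generate streams one JSON object per line until "done" is true.
body = json.dumps({
    "model": "llama2",
    "prompt": "List one color:",
    "options": {"stop": ["\n"]},  # halt generation at the first newline
}).encode()
req = urllib.request.Request("http://localhost:11434/api/generate", data=body,
                             headers={"Content-Type": "application/json"})
with urllib.request.urlopen(req) as resp:
    for line in resp:
        chunk = json.loads(line)
        if chunk.get("done"):
            break
        print(chunk["response"], end="", flush=True)
```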
Full Changelog: v0.0.16...v0.0.17
v0.0.16
What's Changed
- Ollama version can be checked by running `ollama -v` or `ollama --version`
- Support for 34B models such as `codellama`
- Model names or paths with `https://` in front of them will now work when running `ollama run`
New Contributors
- @rlbaker made their first contribution in #377. Also thanks to @jesjess243 for opening a PR to fix this
Full Changelog: v0.0.15...v0.0.16
v0.0.15
📍 Ollama model list is now available
Ollama now supports a list of models published on ollama.ai/library. We are working on ways to allow anyone to push models to Ollama. Expect more news on this in the future.
Please join the community on Discord if you have any questions or concerns, or just want to hang out.
What's Changed
- Target remote Ollama hosts with `OLLAMA_HOST=<host> ollama run llama2`
- Fixed issue where `PARAMETER` values weren't correctly set in Modelfiles
- Fixed issue where a warning would show when parsing a Modelfile comment
- Ollama will now parse data from ggml format models and use it to make sure your system has enough memory to run a model with GPU support
- Experimental support for creating fine-tuned models via `ollama create` with Modelfiles: use the `ADAPTER` Modelfile instruction
- Added documentation for the `num_gqa` parameter
- Added tutorials and examples for using LangChain with Ollama
- Ollama will now log embedding eval timing
- Updated llama.cpp to the latest version
- Added `context` to the API documentation for `/api/generate` (see the sketch after this list)
- Fixed issue with resuming downloads via `ollama pull`
- Using `EMBED` in Modelfiles will now skip regenerating embeddings if the input files have not changed
- Ollama will now use an already loaded model for `/api/embeddings` if it is available
- New example: `dockerit` – a tool to help you build and run your application in a Docker container
- Retry download on network errors
New Contributors
- @asarturas made their first contribution in #326
- @gusanmaz made their first contribution in #340
- @bmizerany made their first contribution in #262
Full Changelog: v0.0.14...v0.0.15
v0.0.14
What's Changed
- Ollama 🦜️🔗 LangChain Integration! https://python.langchain.com/docs/integrations/llms/ollama
- API docs for Ollama: https://github.com/jmorganca/ollama/blob/main/docs/api.md
- Llama 2 70B model with Metal support (at least 64GB of memory recommended): `ollama run llama2:70b`
- Uncensored Llama 2 70B model with Metal support: `ollama run llama2-uncensored:70b`
- New models available! For a list of models you can directly pull from Ollama, please see https://gist.github.com/mchiang0610/b959e3c189ec1e948e4f6a1f737a1fc5
- Embeddings can now be generated for a model with `/api/embeddings` (see the sketch after this list)
- Experimental `EMBED` instruction in the Modelfile
- Configurable rope frequency parameters
- `OLLAMA_HOST` can now specify the entire address to serve on with `ollama serve`
- Fixed issue where context was truncated incorrectly, leading to poor output
- `ollama pull` can now be run in different terminal windows for the same model concurrently
- Added an example on multiline input
- Fixed error not being checked on `ollama pull`
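A hedged sketch of the new embeddings endpoint, following the request/response shape in the API docs linked above (the model name and prompt are placeholders):

```python
import json
import urllib.request

# Request an embedding vector for a prompt from /api/embeddings.
body = json.dumps({"model": "llama2", "prompt": "The sky is blue"}).encode()
req = urllib.request.Request("http://localhost:11434/api/embeddings", data=body,
                             headers={"Content-Type": "application/json"})
with urllib.request.urlopen(req) as resp:
    embedding = json.loads(resp.read())["embedding"]
print(len(embedding), "dimensions")
```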
New Contributors
- @cmiller01 made their first contribution in #301
- @soroushj made their first contribution in #316
- @findmyway made their first contribution in #311
Full Changelog: v0.0.13...v0.0.14
v0.0.13
New improvements
- Using the Ollama CLI without Ollama running will now start Ollama
- Changed the buffer limit so that conversations continue until they are complete
- Models now stay loaded in memory automatically between messages, so series of prompts are extra fast!
- The white fluffy Ollama icon is back when using dark mode
- Ollama will now run on Intel Macs. Compatibility & performance improvements to come
- When running `ollama run`, the `/show` command can be used to inspect the current model
- `ollama run` can now take in multi-line strings:

  ```
  % ollama run llama2
  >>> """
  Is this a multi-line string?
  """
  Thank you for asking! Yes, the input you provided is a multi-line string. It contains multiple lines of text separated by line breaks.
  ```

- More seamless updates: Ollama will now show a subtle hint that an update is ready in the tray menu, instead of a dialog window
- `ollama run --verbose` will now show load duration times
Bug fixes
- Fixed crashes on Macs with 8GB of shared memory
- Fixed issues in scanning multi-line strings in a `Modelfile`
v0.0.12
New improvements
- You can now rename models you've pulled or created with `ollama cp`
- Added support for running k-quant models
- Performance improvements from enabling Accelerate
- Ollama's API can now be accessed by websites hosted on `localhost`
- `ollama create` will now automatically pull models in the `FROM` instruction that you don't have locally
Bug fixes
- `ollama pull` will now show a better error when pulling a model that doesn't exist
- Fixed an issue where cancelling and resuming downloads with `ollama pull` would cause an error
- Fixed formatting of different errors so they are readable when running `ollama` commands
- Fixed an issue where prompt templates defined with the `TEMPLATE` instruction wouldn't be parsed correctly
- Fixed error when a model isn't found