Releases: ollama/ollama

v0.0.21

23 Sep 04:01
  • Fixed an issue where empty responses would be returned if a template was provided in the API request but a prompt was not (see the example below)
  • Fixed an issue where the "Send a message" placeholder would show when writing multi-line prompts with ollama run
  • Fixed an issue where multi-line prompts in ollama run wouldn't be submitted when pressing Return
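
For reference, a minimal /api/generate request supplying both fields (model name and template are illustrative; default local address assumed):

curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?",
  "template": "{{ .System }} USER: {{ .Prompt }} ASSISTANT:"
}'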

Full Changelog: v0.0.20...v0.0.21

v0.0.20

22 Sep 22:04
c928ceb

What's Changed

  • ollama run has a new & improved experience:
    • Models are now loaded immediately, making even the first prompt much faster
    • Added hint text
    • Ollama will now wrap words to the available width of the terminal for better readability
  • OLLAMA_HOST now supports IPv6 hostnames
  • ollama run will now automatically pull models if they don't exist when using a remote instance of Ollama
  • Sending an empty prompt field to /api/generate will now load the model so the next request is fast (see the example below)
  • Fixed an issue where ollama create would not correctly detect falcon model sizes
  • Added a simple Python client for accessing Ollama in api/client.py by @pdevine
  • Improved progress reporting for ollama pull and ollama push
  • Fixed an issue where ollama create would add empty layers
  • Fixed an issue with running Ollama on Windows (compiled from source)
  • Fixed an error when running ollama push
  • Made the community projects list more readable by @jamesbraza
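
A quick sketch of the preload behavior (default local address assumed): a request with an empty prompt loads the model without generating a response, so the next call is fast:

curl http://localhost:11434/api/generate -d '{"model": "llama2", "prompt": ""}'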

Full Changelog: v0.0.19...v0.0.20

v0.0.19

11 Sep 20:39
45ac07c

What's Changed

  • Updated Docker image for Ollama: docker pull ollama/ollama
  • Ability to import and use GGUF file type models (see the example below)
  • Fixed issue where ollama push would error on long-running uploads
  • Ollama will now automatically clean up unused data locally
  • Improved build instructions by @apepper
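
A minimal sketch of the GGUF import flow (file and model names are placeholders). Create a Modelfile pointing at the file:

FROM ./my-model.gguf

Then create and run the model:

ollama create my-model -f Modelfile
ollama run my-model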

Full Changelog: v0.0.18...v0.0.19

v0.0.18

06 Sep 20:54
83c6be1

What's Changed

  • New ollama show command for viewing details about a model:
    • View the system prompt for a model: ollama show --system orca-mini
    • View a model's parameters: ollama show --parameters codellama
    • View a model's default prompt template: ollama show --template llama2
    • View a Modelfile for a model: ollama show --modelfile llama2
  • Minor improvements to model loading and generation time
  • Fixed an issue where large prompts would cause codellama and similar models to show an error
  • Fixed compatibility issues with macOS 11 Big Sur
  • Fixed an issue where characters would be escaped in prompts, causing escaped characters like &amp; in the output
  • Fixed several issues with building from source on Windows and Linux
  • New sentiments example by @technovangelist
  • Fixed the num_keep parameter not working properly
  • Fixed issue where Modelfile parameters would not be honored at runtime (see the sketch after this list)
  • Added missing options parameters to the embeddings docs by @herrjemand
  • Fixed issue where ollama list would error when there were no models to show
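
For reference, runtime parameters are declared in a Modelfile with PARAMETER instructions; a minimal sketch (model and values are illustrative):

FROM llama2
PARAMETER temperature 0.8
PARAMETER num_keep 24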

When building from source, Ollama requires running go generate to generate dependencies:

git clone https://github.com/jmorganca/ollama
cd ollama
go generate ./...
go build .

Note: cmake is required to build dependencies. On macOS it can be installed with brew install cmake, and on other platforms via the installer or well-known package managers.

Full Changelog: v0.0.17...v0.0.18

v0.0.17

30 Aug 18:21
f4432e1

What's Changed

  • Multiple models can be removed together: ollama rm mario:latest orca-mini:3b
  • ollama list will now show a unique ID for each model based on its contents
  • Fixed bug where a prompt wasn't set by default, causing an error when running a model created with ollama create
  • Fixed crash when running 34B parameter models on hardware with insufficient memory to run them
  • Fixed issue where non-quantized f16 models would not run
  • Improved network performance of ollama push
  • Fixed issue where stop sequences (such as \n) wouldn't be honored (see the sketch below)
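
A minimal sketch of declaring a stop sequence in a Modelfile (model and value are illustrative):

FROM llama2
PARAMETER stop "\n"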

New Contributors

  • @sqs made their first contribution in #415

Full Changelog: v0.0.16...v0.0.17

v0.0.16

26 Aug 01:49

What's Changed

  • Ollama version can be checked by running ollama -v or ollama --version
  • Support for 34B models such as codellama
  • Model names or paths with https:// in front of them will now work when running ollama run

New Contributors

Full Changelog: v0.0.15...v0.0.16

v0.0.15

18 Aug 00:52
c84bbf1

[Image: Ollama sitting in front of shelves of books, depicting a library of models available for download]

📍 Ollama model list is now available

Ollama now supports a list of models published on ollama.ai/library. We are working on ways to allow anyone to push models to Ollama. Expect more news on this in the future.

Please join the community on Discord if you have any questions or concerns, or just want to hang out.

What's Changed

  • Target remote Ollama hosts with OLLAMA_HOST=<host> ollama run llama2
  • Fixed issue where PARAMETER values weren't correctly parsed in Modelfiles
  • Fixed issue where a warning would show when parsing a Modelfile comment
  • Ollama will now parse data from ggml format models and use it to make sure your system has enough memory to run a model with GPU support
  • Experimental support for creating fine-tuned models via ollama create with Modelfiles: use the ADAPTER Modelfile instruction (see the sketch after this list)
  • Added documentation for the num_gqa parameter
  • Added tutorials and examples for using LangChain with Ollama
  • Ollama will now log embedding eval timing
  • Updated llama.cpp to the latest version
  • Added context to the API documentation for /api/generate
  • Fixed issue with resuming downloads via ollama pull
  • Using EMBED in Modelfiles will now skip regenerating embeddings if the input files have not changed
  • Ollama will now use an already loaded model for /api/embeddings if it is available
  • New example: dockerit – a tool to help you build and run your application in a Docker container
  • Retry download on network errors
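
A minimal sketch of the experimental ADAPTER instruction (base model and adapter path are placeholders):

FROM llama2
ADAPTER ./fine-tuned-adapter.bin

Then build the model:

ollama create my-finetune -f Modelfile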

Full Changelog: v0.0.14...v0.0.15

v0.0.14

10 Aug 22:46

What's Changed

  • Ollama 🦜️🔗 LangChain Integration! https://python.langchain.com/docs/integrations/llms/ollama
  • API docs for Ollama: https://github.com/jmorganca/ollama/blob/main/docs/api.md
  • Llama 2 70B model with Metal support (at least 64GB of memory recommended): ollama run llama2:70b
  • Uncensored Llama 2 70B model with Metal support: ollama run llama2-uncensored:70b
  • New models available! For a list of models you can directly pull from Ollama, please see https://gist.github.com/mchiang0610/b959e3c189ec1e948e4f6a1f737a1fc5
  • Embeddings can now be generated for a model with /api/embeddings (see the example below)
  • Experimental EMBED instruction in the Modelfile
  • Configurable rope frequency parameters
  • OLLAMA_HOST can now specify the entire address to serve on with ollama serve
  • Fixed issue where context was truncated incorrectly leading to poor output
  • ollama pull can now be run in different terminal windows for the same model concurrently
  • Added an example of multi-line input
  • Fixed error not being checked on ollama pull
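
A minimal sketch of the embeddings endpoint (model name is illustrative; default local address assumed):

curl http://localhost:11434/api/embeddings -d '{
  "model": "llama2",
  "prompt": "Here is a sentence to embed."
}'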

Full Changelog: v0.0.13...v0.0.14

v0.0.13

02 Aug 16:07

New improvements

  • Using the Ollama CLI when Ollama isn't running will now start Ollama
  • Changed the buffer limit so that responses continue until they are complete
  • Models now stay loaded in memory automatically between messages, so a series of prompts is extra fast!
  • The white fluffy Ollama icon is back when using dark mode
  • Ollama will now run on Intel Macs. Compatibility & performance improvements to come
  • When running ollama run, the /show command can be used to inspect the current model
  • ollama run can now take in multi-line strings:
    % ollama run llama2
    >>> """       
      Is this a
      multi-line
      string?
    """
    Thank you for asking! Yes, the input you provided is a multi-line string. It contains multiple lines of text separated by line breaks.
    
  • More seamless updates: Ollama will now show a subtle hint that an update is ready in the tray menu, instead of a dialog window
  • ollama run --verbose will now show load duration times

Bug fixes

  • Fixed crashes on Macs with 8GB of shared memory
  • Fixed issues in scanning multi-line strings in a Modelfile

v0.0.12

26 Jul 15:04

New improvements

  • You can now rename models you've pulled or created with ollama cp (see the example below)
  • Added support for running k-quant models
  • Performance improvements from enabling Accelerate
  • Ollama's API can now be accessed by websites hosted on localhost
  • ollama create will now automatically pull models in the FROM instruction you don't have locally
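
A quick sketch of a rename via copy (model names are illustrative):

ollama cp llama2 my-llama2
ollama rm llama2   # remove the old name to complete the rename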

Bug fixes

  • ollama pull will now show a better error when pulling a model that doesn't exist
  • Fixed an issue where cancelling and resuming downloads with ollama pull would cause an error
  • Fixed formatting of different errors so they are readable when running ollama commands
  • Fixed an issue where prompt templates defined with the TEMPLATE instruction wouldn't be parsed correctly
  • Fixed error when a model isn't found