Ollama `ps` command for showing currently loaded models #4327

pdevine · 2024-05-10T21:50:12Z

This change adds a rudimentary ps command which makes use of the new scheduler changes in the server.

The UX for this depends on whether you're using the CPU, GPU, or a hybrid of both and looks like:

NAME            ID              SIZE    PROCESSOR        UNTIL
mistral:latest  61e88e884507    5.4 GB  100% GPU         28 seconds from now

NAME            ID              SIZE    PROCESSOR        UNTIL
mistral:latest  61e88e884507    5.4 GB  48%/52% CPU/GPU  28 seconds from now

NAME            ID              SIZE    PROCESSOR        UNTIL
mistral:latest  61e88e884507    5.4 GB  100% CPU         28 seconds from now
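
The PROCESSOR column above could be derived from two numbers: the total size of the loaded model and the portion of it resident in GPU memory. The following is a minimal sketch under that assumption, not the code in this PR; `processorLabel` and its parameters are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// processorLabel is a hypothetical helper (not taken from the PR). It builds
// the PROCESSOR column from the model's total size and the part of it that
// sits in GPU memory, both in bytes.
func processorLabel(size, sizeVRAM int64) string {
	switch {
	case size <= 0:
		// Nothing sensible can be computed without a total size.
		return "Unknown"
	case sizeVRAM <= 0:
		return "100% CPU"
	case sizeVRAM >= size:
		return "100% GPU"
	default:
		// Hybrid: report the CPU share first, then the GPU share.
		cpuPercent := int(math.Round(float64(size-sizeVRAM) / float64(size) * 100))
		return fmt.Sprintf("%d%%/%d%% CPU/GPU", cpuPercent, 100-cpuPercent)
	}
}

func main() {
	fmt.Println(processorLabel(5400, 5400)) // fully on the GPU
	fmt.Println(processorLabel(5400, 2808)) // split between CPU and GPU
	fmt.Println(processorLabel(5400, 0))    // fully on the CPU
}
```

Computing the GPU share as the remainder of the rounded CPU share keeps the two percentages summing to 100.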

Additionally, there is a new --keepalive flag in the REPL which sets how long the model stays resident in memory after it has finished inference. It takes a duration string (e.g. 3m30s); we can switch this to also accept integers, similar to the API.
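
As an illustration of accepting both forms, here is a minimal sketch of a parser that takes either a Go duration string or a bare integer. `parseKeepAlive` is a hypothetical helper, and treating a bare integer as a number of seconds is an assumption made for the example, not something this PR implements.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// parseKeepAlive is a hypothetical helper (not taken from the PR). It accepts
// a bare integer, assumed here to mean seconds, or a Go duration string such
// as "3m30s".
func parseKeepAlive(s string) (time.Duration, error) {
	// Try the integer form first: time.ParseDuration rejects unitless
	// values such as "210".
	if secs, err := strconv.ParseInt(s, 10, 64); err == nil {
		return time.Duration(secs) * time.Second, nil
	}
	d, err := time.ParseDuration(s)
	if err != nil {
		return 0, fmt.Errorf("invalid keepalive %q: %w", s, err)
	}
	return d, nil
}

func main() {
	for _, in := range []string{"3m30s", "210", "soon"} {
		d, err := parseKeepAlive(in)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%q -> %s\n", in, d)
	}
}
```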

This also introduces a new /api/ps endpoint which returns a response similar to the /api/tags endpoint, with additional information. The size reported for a running model will not match the size reported by /api/tags for the same model, since a model can take additional memory when loaded onto the GPU or split between CPU and GPU.
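
A client could query the new endpoint roughly as follows. This is a sketch under stated assumptions: the server address `http://localhost:11434`, the `models` wrapper, and the JSON field names (`name`, `size`, `size_vram`, `expires_at`) are not given in this description and should be checked against the actual response. `runningModel`, `psResponse`, and `parsePS` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"time"
)

// runningModel mirrors one entry of the /api/ps response.
// The field names are assumptions; verify them against the server.
type runningModel struct {
	Name      string    `json:"name"`
	Size      int64     `json:"size"`
	SizeVRAM  int64     `json:"size_vram"`
	ExpiresAt time.Time `json:"expires_at"`
}

// psResponse assumes the entries are wrapped in a "models" array,
// as in the /api/tags response.
type psResponse struct {
	Models []runningModel `json:"models"`
}

// parsePS decodes a raw /api/ps response body.
func parsePS(data []byte) (psResponse, error) {
	var resp psResponse
	err := json.Unmarshal(data, &resp)
	return resp, err
}

func main() {
	client := &http.Client{Timeout: 5 * time.Second}
	// Assumed address of a locally running server.
	res, err := client.Get("http://localhost:11434/api/ps")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer res.Body.Close()

	body, err := io.ReadAll(res.Body)
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	ps, err := parsePS(body)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	for _, m := range ps.Models {
		fmt.Printf("%s\t%d bytes total\t%d bytes in VRAM\texpires %s\n",
			m.Name, m.Size, m.SizeVRAM, m.ExpiresAt.Format(time.RFC3339))
	}
}
```

Keeping the decoding in `parsePS` separates it from the network call, so the response handling can be exercised without a running server.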

Partially addresses #3902
Fixes #4013
Replaces #2359

jmorganca · 2024-05-13T23:36:45Z

cmd/cmd.go

@@ -324,6 +325,18 @@ func RunHandler(cmd *cobra.Command, args []string) error {
 	}
 	opts.Format = format

+	keepAlive, err := cmd.Flags().GetString("keepalive")


Suggested change

-	keepAlive, err := cmd.Flags().GetString("keepalive")
+	keepAlive, err := cmd.Flags().GetString("keep-alive")

I think this is ok as is, but would suggest keep-alive to be consistent with the api keep_alive – although the design of ollama's cli should be a matter of taste/usability over consistency

I thought about this, but we have --nowordwrap as one of the other options for run, so it felt weird going to kebab case here.

pdevine added 6 commits May 10, 2024 14:19

add ollama ps command

8d95c9b

humantime forever

334fdc7

add keepalive to ollama run

a8e6033

show cpu/gpu percentages

dd38c7e

fix sched unit tests

bc03ad8

feed the linter

d94da46

pdevine mentioned this pull request May 11, 2024

Add ollama ps command #2359

Closed

jmorganca reviewed May 13, 2024

jmorganca approved these changes May 13, 2024

pdevine merged commit 6845988 into main May 14, 2024
15 checks passed

pdevine deleted the pdevine/ps branch May 14, 2024 00:17
