Ollama token counts #1179
-
**Describe the bug**
Great product! Easy to set up without the overhead. Thanks!

**To reproduce**
N/A

**Additional information**
No response
-
Hi @aiseei, Ollama returns token counts at the end of the stream or together with the response when not streaming. Just tried this locally and it worked well.
You can then add these token counts to the generation object in Langfuse to track them (docs).
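To make that concrete, here is a minimal sketch of the non-streaming case. It assumes the official `ollama` JS client and the `langfuse` SDK; the `extractUsage` helper is hypothetical (not part of either library), and the model name is just an example:

```javascript
// Sketch: read Ollama token counts and report them to a Langfuse generation.
// Assumes the official `ollama` JS client and the `langfuse` SDK.
// `extractUsage` is a hypothetical helper, not part of either library.

// A non-streaming Ollama chat response carries the counts directly:
// prompt_eval_count -> input tokens, eval_count -> output tokens.
function extractUsage(response) {
  return {
    input: response.prompt_eval_count ?? 0,
    output: response.eval_count ?? 0,
  };
}

async function main() {
  const { default: ollama } = await import("ollama");
  const { Langfuse } = await import("langfuse");
  const langfuse = new Langfuse();

  const trace = langfuse.trace({ name: "ollama-chat" });
  const response = await ollama.chat({
    model: "llama3", // example model name
    messages: [{ role: "user", content: "Hello!" }],
  });

  // Attach the token counts to the generation object in Langfuse.
  trace.generation({
    name: "chat",
    model: "llama3",
    output: response.message.content,
    usage: extractUsage(response),
  });
  await langfuse.flushAsync();
}
```

The `?? 0` fallbacks are defensive: the count fields can be absent on some responses, so defaulting to zero keeps the usage object well-formed either way.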
-
I have another question: when using streaming, I only get `eval_count`, but `part.prompt_eval_count` is always `undefined`:

```javascript
for await (const part of chatResponse) {
  if (part.eval_count) {
    output_tokens = part.eval_count;
  }
  // ...
}
```
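As noted above, Ollama returns the counts at the end of the stream, so both `prompt_eval_count` and `eval_count` should be read from the final chunk, the one where `done` is `true`; earlier chunks only carry text deltas. A minimal sketch, assuming the `ollama` JS client's streaming iterator shape (the `consumeStream` helper is hypothetical):

```javascript
// Sketch: collect text and token counts from a streamed Ollama chat response.
// prompt_eval_count and eval_count are only expected on the final chunk,
// where `done` is true; earlier chunks omit them.
async function consumeStream(chatResponse) {
  let text = "";
  let usage = { input: 0, output: 0 };
  for await (const part of chatResponse) {
    text += part.message?.content ?? "";
    if (part.done) {
      // Final chunk: the token counts are available here.
      usage = {
        input: part.prompt_eval_count ?? 0,
        output: part.eval_count ?? 0,
      };
    }
  }
  return { text, usage };
}
```

If `prompt_eval_count` is still missing even on the final chunk, one known cause is the prompt being served entirely from Ollama's cache, in which case the field can be omitted; checking with a fresh prompt may help confirm that.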