
After 10-20 questions the chatbot app is no longer working #1083

Open
benoitf opened this issue May 13, 2024 · 6 comments

Comments

benoitf (Collaborator) commented May 13, 2024

Is your enhancement related to a problem? Please describe

Ask many questions, such as "What is Podman Desktop" (waiting each time for the answer).

After 10-20 questions you either get an error, or the application hangs and consumes all the CPU.

Describe the solution you'd like

Either no error at all, or being properly prompted to restart the app.

Describe alternatives you've considered

No response

Additional context

No response

@MichaelClifford

Do you get an error like this in the model server pod?

    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/llama_cpp/llama_chat_format.py", line 247, in _convert_text_completion_chunks_to_chat
    for i, chunk in enumerate(chunks):
  File "/opt/app-root/lib64/python3.11/site-packages/llama_cpp/llama.py", line 970, in _create_completion
    raise ValueError(
ValueError: Requested tokens (2098) exceed context window of 2048

If so, it may be due to the fact that the Memory module needs to be updated to prevent this from happening.
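For reference, the kind of Memory-module update mentioned above would presumably trim the oldest turns before the prompt exceeds the model's context window. A minimal sketch of that idea (the `trim_history` helper and the word-based token count are my own illustration, not the actual ai-lab-recipes code; a real implementation would count tokens with the model's tokenizer, e.g. `Llama.tokenize`):

```python
def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def trim_history(messages: list[dict], max_tokens: int) -> list[dict]:
    """Drop the oldest non-system messages until the conversation fits max_tokens.

    Keeping the system prompt while discarding old turns is one common way to
    avoid "Requested tokens exceed context window" errors on long chats.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    total = sum(count_tokens(m["content"]) for m in system + rest)
    while rest and total > max_tokens:
        dropped = rest.pop(0)  # discard the oldest user/assistant turn first
        total -= count_tokens(dropped["content"])
    return system + rest
```

With a budget below the model's `n_ctx` (minus room for the reply), calling `trim_history` before each completion request keeps the prompt bounded no matter how many questions have been asked.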

benoitf (Collaborator) commented May 13, 2024

I've had that error sometimes, but here I got

Llama.generate: prefix-match hit

in the logs

Llama.generate: prefix-match hit
INFO:     127.0.0.1:40230 - "GET / HTTP/1.1" 404 Not Found
INFO:     127.0.0.1:40232 - "GET / HTTP/1.1" 404 Not Found
INFO:     127.0.0.1:50622 - "GET / HTTP/1.1" 404 Not Found
INFO:     127.0.0.1:50638 - "GET / HTTP/1.1" 404 Not Found
INFO:     127.0.0.1:40224 - "POST /v1/chat/completions HTTP/1.1" 200 OK
INFO:     127.0.0.1:57906 - "GET / HTTP/1.1" 404 Not Found

and then all the CPU is consumed, but I never get an answer.

benoitf (Collaborator) commented May 13, 2024

But yes, at Summit I did hit the "Requested tokens exceed context window" error.

@MichaelClifford

OK, I think we can fix the context window issue. But I'm not sure about the "all-cpu-no-answer" issue. What hardware are you using?

benoitf (Collaborator) commented May 14, 2024

Forgot to hit send: the hardware is an Apple M2 Max,
but I think I only reached the 'no reply' state once.

I think also containers/ai-lab-recipes#497 is more important

@MichaelClifford

@benoitf please see containers/ai-lab-recipes#495
