
After 10-20 questions the chatbot app is no longer working #1083

Open
benoitf opened this issue May 13, 2024 · 6 comments

Comments

benoitf (Collaborator) commented May 13, 2024

Is your enhancement related to a problem? Please describe

Ask many questions, such as "What is Podman Desktop" (waiting each time for the answer).

After 10-20 questions you either get an error, or the application hangs and consumes all the CPU.

Describe the solution you'd like

Either no error at all, or being properly prompted to restart the app.

Describe alternatives you've considered

No response

Additional context

No response

@MichaelClifford

Do you get an error like this in the model server pod?

    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/llama_cpp/llama_chat_format.py", line 247, in _convert_text_completion_chunks_to_chat
    for i, chunk in enumerate(chunks):
  File "/opt/app-root/lib64/python3.11/site-packages/llama_cpp/llama.py", line 970, in _create_completion
    raise ValueError(
ValueError: Requested tokens (2098) exceed context window of 2048

If so, it may be due to the fact that the Memory module needs to be updated to prevent this from happening.
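For reference, the kind of Memory-module update mentioned above would presumably trim the oldest turns before the prompt exceeds the model's context window. A minimal sketch of that idea (the `trim_history` helper and the word-based token count are my own illustration, not the actual ai-lab-recipes code; a real implementation would count tokens with the model's tokenizer, e.g. `Llama.tokenize`):

```python
def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def trim_history(messages: list[dict], max_tokens: int) -> list[dict]:
    """Drop the oldest non-system messages until the conversation fits max_tokens.

    Keeping the system prompt while discarding old turns is one common way to
    avoid "Requested tokens exceed context window" errors on long chats.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    total = sum(count_tokens(m["content"]) for m in system + rest)
    while rest and total > max_tokens:
        dropped = rest.pop(0)  # discard the oldest user/assistant turn first
        total -= count_tokens(dropped["content"])
    return system + rest
```

With a budget below the model's `n_ctx` (minus room for the reply), calling `trim_history` before each completion request keeps the prompt bounded no matter how many questions have been asked.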

benoitf (Collaborator) commented May 13, 2024

I've had that error sometimes, but here I got

Llama.generate: prefix-match hit

in the logs

Llama.generate: prefix-match hit
INFO:     127.0.0.1:40230 - "GET / HTTP/1.1" 404 Not Found
INFO:     127.0.0.1:40232 - "GET / HTTP/1.1" 404 Not Found
INFO:     127.0.0.1:50622 - "GET / HTTP/1.1" 404 Not Found
INFO:     127.0.0.1:50638 - "GET / HTTP/1.1" 404 Not Found
INFO:     127.0.0.1:40224 - "POST /v1/chat/completions HTTP/1.1" 200 OK
INFO:     127.0.0.1:57906 - "GET / HTTP/1.1" 404 Not Found

and then all the CPU is consumed, but I never get an answer.

benoitf (Collaborator) commented May 13, 2024

But yes, at Summit I did hit the "Requested tokens exceed context window" error.

@MichaelClifford

OK, I think we can fix the context window issue. But I'm not sure about the "all-cpu-no-answer" issue. What hardware are you using?

benoitf (Collaborator) commented May 14, 2024

Forgot to hit send: the hardware is an Apple M2 Max,
but I think I only reached the 'no reply' state once.

I think also containers/ai-lab-recipes#497 is more important

@MichaelClifford

@benoitf please see containers/ai-lab-recipes#495
