llama-cpp: fix tabby when using llama-cpp with versions b2320+ (introduction of abort callbacks) #1751

Closed
ghthor opened this issue Apr 1, 2024 · 1 comment
ghthor commented Apr 1, 2024

Please describe the feature you want

Enable tabbyml to run with llama-cpp version b2320 and newer.

Additional context

I've run into an issue supporting tabbyml on NixOS. I was unable to get tabbyml to compile against the vendored copy of llama-cpp, so I opted to dynamically link against the existing llama-cpp package in nixpkgs instead of statically linking the vendored version included in this repo. When I initially merged support for tabby 0.8.3, the version of llama-cpp in nixpkgs was b2296 and everything worked.

Since then, the version of llama-cpp in nixpkgs has been upgraded multiple times. I recently upgraded my system and tabbyml broke. I narrowed this down to the version of llama-cpp it was being linked against: after bisecting, b2320 is the release that introduces the breaking change. b2319 works as expected with no issues. With anything newer than b2319, tabbyml starts up and finds the GPU, but then segfaults.

Apr 01 19:21:02 cryptnix systemd[1]: Started Self-hosted AI coding assistant using large language models.
Apr 01 19:21:03 cryptnix tabby[135490]: 2024-04-01T23:21:03.214601Z  INFO tabby::serve: crates/tabby/src/serve.rs:116: Starting server, this might take a few minutes...
Apr 01 19:21:03 cryptnix tabby[135490]: 2024-04-01T23:21:03.218108Z  INFO tabby::services::code: crates/tabby/src/services/code.rs:53: Index is ready, enabling server...
Apr 01 19:21:03 cryptnix tabby[135490]: ggml_init_cublas: GGML_CUDA_FORCE_MMQ:   no
Apr 01 19:21:03 cryptnix tabby[135490]: ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
Apr 01 19:21:03 cryptnix tabby[135490]: ggml_init_cublas: found 1 CUDA devices:
Apr 01 19:21:03 cryptnix tabby[135490]:   Device 0: NVIDIA GeForce RTX 2080 SUPER, compute capability 7.5, VMM: yes
Apr 01 19:21:04 cryptnix tabby[135490]: *** stack smashing detected ***: terminated
Apr 01 19:21:07 cryptnix systemd[1]: tabby.service: Main process exited, code=dumped, status=6/ABRT
Apr 01 19:21:07 cryptnix systemd[1]: tabby.service: Failed with result 'core-dump'.

I'm not sure exactly how this change (ggerganov/llama.cpp@4a6e2d6) introduces the segfault, but it does look like it could if tabbyml isn't providing some kind of abort callback to llama-cpp. I.e. I imagine that with further inspection we'd find llama-cpp expects that callback to be non-NULL, and that's why it segfaults.
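For reference, here is roughly what the caller side of that commit's API looks like. This is a minimal sketch, not tabby's actual binding code; it assumes the `abort_callback` / `abort_callback_data` fields on `llama_context_params` introduced in that commit, and the point is only that the fields need to be initialized (e.g. via `llama_context_default_params()`) rather than left as garbage when linking against a newer llama-cpp:

```cpp
#include "llama.h"

// Minimal sketch (not tabby's actual binding code): build a context with the
// abort-callback fields added in b2320+ explicitly initialized.
static llama_context * make_context(llama_model * model) {
    // Start from the library defaults so newly added fields, including
    // abort_callback / abort_callback_data, are zeroed rather than garbage.
    llama_context_params params = llama_context_default_params();

    // Optionally install an explicit callback; returning false means
    // "do not abort the current computation".
    params.abort_callback = [](void * /*data*/) -> bool { return false; };
    params.abort_callback_data = nullptr;

    return llama_new_context_with_model(model, params);
}
```

A capture-less lambda converts to the plain function pointer type that `ggml_abort_callback` expects, so no extra glue is needed.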

This isn't so much a feature request as a warning about a potential issue we may hit when updating the vendored version of llama-cpp in tabbyml. That said, I'm looking to implement a fix in tabbyml for this issue, since it has left the nixpkgs version of tabby broken with no clear path for users to fix the breakage.


Please reply with a 👍 if you want this feature.

@ghthor ghthor added the enhancement New feature or request label Apr 1, 2024
@wsxiaoys wsxiaoys added the good first issue Good for newcomers label Apr 2, 2024
wsxiaoys (Member) commented:

Seems the issue is gone when upgrading to b2715 in #1926

@wsxiaoys wsxiaoys added fixed-in-next-release and removed good first issue Good for newcomers labels Apr 22, 2024