Pull requests: ggerganov/llama.cpp

- ggml : rewrite silu and softmax for cpu (#7154, opened May 9, 2024 by jart)
- Server: Fix system_prompt handling (#7153, opened May 8, 2024 by ngxson)
- perplexity: add BF16 vs. FP16 results (#7150, opened May 8, 2024 by JohannesGaessler)
- tokenization: no double BOS tokens (#7107, opened May 6, 2024 by JohannesGaessler)
- Vulkan Bugfixes and Improvements (#7084, opened May 5, 2024 by 0cc4m)
- CUDA: generalize FP16 fattn vec kernel (#7061, opened May 3, 2024 by JohannesGaessler)
- Add token healing example (#7028, opened May 1, 2024 by mare5x) [draft]
- Fix flash attention for ROCm (#7011, opened Apr 30, 2024 by jdecourval) [draft]
- llama3 custom regex split (#6965, opened Apr 28, 2024 by jaime-m-p)
- server: avoid breaking KV cache when prompt >= n_ctx (#6958, opened Apr 28, 2024 by prfd)
- move ndk code to a new library (#6951, opened Apr 27, 2024 by eltonkola)
- Updated server_queue to delete tasks from queue when server is shutdown (#6941, opened Apr 27, 2024 by rahsuri) [labels: Feature Request #6421, demo (demonstrates a concept or idea, not intended to be merged)]
- Fix clip build on windows + clang (#6934, opened Apr 26, 2024 by dhiltgen)