Pull requests: ggerganov/llama.cpp

- ggml : rewrite silu and softmax for cpu (#7154, opened May 9, 2024 by jart)
- Server: Fix system_prompt handling (#7153, opened May 8, 2024 by ngxson)
- perplexity: add BF16 vs. FP16 results (#7150, opened May 8, 2024 by JohannesGaessler)
- tokenization: no double BOS tokens (#7107, opened May 6, 2024 by JohannesGaessler)
- Vulkan Bugfixes and Improvements (#7084, opened May 5, 2024 by 0cc4m)
- CUDA: generalize FP16 fattn vec kernel (#7061, opened May 3, 2024 by JohannesGaessler)
- Add token healing example (#7028, opened May 1, 2024 by mare5x) [draft]
- Fix flash attention for ROCm (#7011, opened Apr 30, 2024 by jdecourval) [draft]
- llama3 custom regex split (#6965, opened Apr 28, 2024 by jaime-m-p)
- server: avoid breaking KV cache when prompt >= n_ctx (#6958, opened Apr 28, 2024 by prfd)
- move ndk code to a new library (#6951, opened Apr 27, 2024 by eltonkola)
- Updated server_queue to delete tasks from queue when server is shutdown (#6941, opened Apr 27, 2024 by rahsuri) [labels: Feature Request #6421, demo (demonstrates a concept or idea, not intended to be merged)]
- Fix clip build on windows + clang (#6934, opened Apr 26, 2024 by dhiltgen)