Issues: huggingface/text-generation-inference
Issues list
how do I adjust the logging level when launching via the docker container?
#1872
opened May 8, 2024 by
bitsofinfo
2 of 4 tasks
llama3-70B-Instruct-AWQ causing CUDA error: an illegal memory access was encountered
#1871
opened May 8, 2024 by
anindya-saha
4 tasks
Cannot use Inference Endpoint: UnprocessableEntityError: Error code: 422 - {'error': 'Template error: template not found', 'error_type': 'template_error'}
#1870
opened May 8, 2024 by
rvoak
1 of 4 tasks
"docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data -e HUGGING_FACE_HUB_TOKEN={your_token} ghcr.io/huggingface/text-generation-inference:latest --model-id $model --num-shard $num_shard" showing error with my token id that "Unable to find image 'ghcr.io/huggingface/text-generation-inference:latest' locally latest: Pulling from huggingface/text-generation-inference docker: no matching manifest for linux/arm64/v8 in the manifest list entries. See 'docker run --help'."
#1868
opened May 7, 2024 by
anushka192001
4 tasks
Use pre-built FA2, vllm, quantization kernels in the dockerfiles
#1867
opened May 7, 2024 by
fxmarty
Install error when installing the vllm package.
#1862
opened May 6, 2024 by
for-just-we
2 of 4 tasks
Serverless inference API endpoints fail to return logprobs via chat completions
#1852
opened May 2, 2024 by
ggbetz
2 of 4 tasks
UserWarning: You are using a Backend <class 'text_generation_server.utils.dist.FakeGroup'> as a ProcessGroup. This usage is deprecated since PyTorch 2.0
#1847
opened May 2, 2024 by
fxmarty
2 of 4 tasks
Failing to start a TGI pod with 2 or more GPUs. Sharding fails.
#1838
opened Apr 30, 2024 by
jayteaftw
3 of 4 tasks
Cannot launch: error "exllamav2_kernels not installed".
#1837
opened Apr 30, 2024 by
coderaBruce
2 of 4 tasks
TGI crashes with complex json schemas provided as grammar without any information (on debug/trace level)
#1834
opened Apr 30, 2024 by
o1iv3r
2 of 4 tasks
Out of Memory Errors When Running text-generation-benchmark Despite Compliant Batch Token Limit
#1831
opened Apr 30, 2024 by
martinigoyanes
Loading a model in TGI consumes all available GPU memory
#1824
opened Apr 28, 2024 by
IdleIdiot
2 of 4 tasks
Python client: Extra slash in base_uri leads to failures in chat endpoint
#1823
opened Apr 27, 2024 by
kcarnold
2 of 4 tasks