Experimenting with: Tiny Large Language Models.
docker compose up
- http://localhost:8000/public/index.html
- http://localhost:8000/api/llm?prompt=abcdefghijklmnopqrstuvwxyz
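Because the prompt travels in the query string, any non-trivial prompt needs URL-encoding. A minimal client-side sketch (the `llm_url` helper is hypothetical, not part of the demo; the base URL matches the port above):

```python
from urllib.parse import urlencode

# Assumed base URL, matching the docker compose port mapping above.
BASE = "http://localhost:8000/api/llm"

def llm_url(prompt: str) -> str:
    # urlencode percent/plus-encodes spaces and punctuation so the
    # prompt survives the query string intact.
    return f"{BASE}?{urlencode({'prompt': prompt})}"

print(llm_url("Tell me a story"))
```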
- This is a simple example (NOT production-worthy): basic safeguards (input validation, parameter sanitization, and the like) are mostly omitted.
- Within the confines of this simple demo, there is little a malicious user can actually break.
- Still, do not reuse this design pattern in production as-is (without the usual production security mechanisms)!
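To make the omission concrete, here is the kind of input validation a real service would add in front of the prompt parameter. This is a hypothetical sketch (the function name and the 512-character limit are assumptions, not part of the demo):

```python
import re

MAX_PROMPT_LEN = 512  # assumed limit; tune to the model's context window

def validate_prompt(prompt: str) -> str:
    """Reject empty, oversized, or control-character-laden prompts.

    A sketch of a basic safeguard the demo omits; a real service would
    also add auth, rate limiting, and output filtering on top.
    """
    if not prompt or not prompt.strip():
        raise ValueError("prompt must be non-empty")
    if len(prompt) > MAX_PROMPT_LEN:
        raise ValueError(f"prompt exceeds {MAX_PROMPT_LEN} characters")
    # C0 control characters (except tab/newline/CR) have no place in a prompt.
    if re.search(r"[\x00-\x08\x0b\x0c\x0e-\x1f]", prompt):
        raise ValueError("prompt contains control characters")
    return prompt.strip()
```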
- The build can take upwards of 30 minutes due to large (many-GB) LLM models.
- Hugging Face will download and cache models--arnir0--Tiny-LLM into /root/.cache/huggingface/hub.
- Also, Docker Desktop 4.40 now supports Docker Model Runner.
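That cache directory name follows the Hugging Face hub convention: the repo id's "/" is replaced with "--" and the result is prefixed with "models--". A small sketch of that naming rule (the helper function itself is illustrative, not a hub API):

```python
def cache_dir_name(repo_id: str) -> str:
    # Hub cache convention: "org/name" -> "models--org--name"
    return "models--" + repo_id.replace("/", "--")

print(cache_dir_name("arnir0/Tiny-LLM"))  # models--arnir0--Tiny-LLM
```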
- The following commands will launch a Dockerized LLM model (somewhat similar to this one):
docker model pull ai/deepseek-r1-distill-llama
docker model run ai/deepseek-r1-distill-llama
- Docker Model Runner supports Hugging Face Models.