Siri LLama is an Apple shortcut that accesses locally running LLMs through Siri or the Shortcuts UI on any Apple device connected to the same network as your host machine. It uses LangChain 🦜🔗 and supports open-source models from both Ollama 🦙 and Fireworks AI 🎆.
- Install Ollama on your machine, then run the following in a terminal to start the server:

```
ollama serve
```

- Pull the models you want to use, for example:

```
ollama run llama3 # chat model
ollama run llava # multimodal
```

- Install LangChain and Flask:

```
pip install --upgrade --quiet langchain langchain-community
pip install flask
```

- In `ollama_models.py`, set `chat_model` and `vchat_model` to the models you pulled from Ollama.
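As a sketch of the step above (an assumption, not the project's actual file: we only know `ollama_models.py` must define `chat_model` and `vchat_model`), the configuration might look like this using LangChain's `ChatOllama`:

```python
# ollama_models.py — a minimal sketch, assuming the file simply exposes
# two LangChain chat models bound to the models pulled with `ollama run`.
from langchain_community.chat_models import ChatOllama

chat_model = ChatOllama(model="llama3")  # text chat model
vchat_model = ChatOllama(model="llava")  # multimodal (vision) chat model
```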
Alternatively, to use Fireworks AI:

- Install LangChain and Flask:

```
pip install --upgrade --quiet langchain langchain-fireworks
pip install flask
```

- Get your Fireworks API key and put it in `fireworks_models.py`.

- In `fireworks_models.py`, set `chat_model` and `vchat_model` to the models you want to use from Fireworks AI.
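Analogously, `fireworks_models.py` might look like the sketch below (the model IDs are illustrative examples, and setting the key via `FIREWORKS_API_KEY` is one common way LangChain picks it up — swap in whatever models and key handling you prefer):

```python
# fireworks_models.py — a minimal sketch; model IDs are examples only.
import os

from langchain_fireworks import ChatFireworks

# Hypothetical placement of the API key; "fw_..." is a placeholder.
os.environ["FIREWORKS_API_KEY"] = "fw_..."

chat_model = ChatFireworks(model="accounts/fireworks/models/llama-v3-8b-instruct")
vchat_model = ChatFireworks(model="accounts/fireworks/models/firellava-13b")
```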
- After setting the provider (Ollama / Fireworks), run the Flask app:

```
python3 app.py
```
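To illustrate what the Flask side does, here is a minimal sketch (not the project's actual `app.py`: the route, the form field name, and the `generate_reply` stand-in for the LangChain model call are all assumptions):

```python
# app.py — minimal sketch of a Flask endpoint the shortcut could POST to.
from flask import Flask, jsonify, request

app = Flask(__name__)

def generate_reply(prompt):
    """Stand-in for the real LangChain model call."""
    return f"echo: {prompt}"

@app.route("/", methods=["POST"])
def chat():
    # Read the prompt sent by the shortcut and return the model's reply.
    prompt = request.form.get("prompt", "")
    return jsonify({"response": generate_reply(prompt)})

if __name__ == "__main__":
    # Bind to all interfaces so other devices on the LAN can reach it.
    app.run(host="0.0.0.0", port=5000)
```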
- On your Apple device, download the shortcut from here.
- Run the shortcut through Siri or the Shortcuts UI. The first time you run the shortcut, you will be asked to enter your IP address and port number.
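If you are unsure which IP address to enter, a quick stdlib way to find the host machine's LAN address (a best-effort helper I'm adding for convenience, not part of the project) is:

```python
import socket

def local_ip():
    """Best-effort LAN IP discovery; falls back to loopback."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # UDP connect sends no packets; it just selects the outbound route.
        s.connect(("8.8.8.8", 80))
        return s.getsockname()[0]
    except OSError:
        return "127.0.0.1"
    finally:
        s.close()

print(local_ip())
```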
- Even though we access the Flask app (not the Ollama server directly), some Windows users who have Ollama installed under WSL have to make sure the Ollama server is exposed to the network. Check this issue for more details.
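To check whether the Ollama server is actually reachable from the network, a small stdlib probe (my own helper; Ollama's default port is 11434) can help when debugging the WSL case:

```python
import urllib.request

def ollama_reachable(base_url="http://localhost:11434", timeout=2):
    """Return True if an HTTP server answers at base_url."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            # A running Ollama server answers 200 with "Ollama is running".
            return resp.status == 200
    except OSError:
        return False

print(ollama_reachable())
```

Run it from another machine on the LAN with `base_url` set to your host's IP to confirm the server is exposed.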