GURPREETKAURJETHRA/Multimodal-AI-App-using-Llava-7B

Multimodal-AI-App-using-Llava-7B

A multimodal AI app built with Llava 7B and Gradio: an AI voice assistant that combines the multimodal LLM "Llava" with Whisper.

Description:

  • This project builds a voice assistant on the multimodal LLM "Llava 1.5 7B" for image and text understanding, and on OpenAI's Whisper model for speech-to-text conversion.
  • The models are integrated in a Gradio app, with the gTTS library providing text-to-speech so the assistant can speak its answers aloud.
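
The flow described above (speech in, image question answered, speech out) can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the names `VoiceAssistant`, `build_default_assistant`, and `launch_app` are hypothetical, the checkpoint id `llava-hf/llava-1.5-7b-hf` and the Whisper size `"base"` are assumptions, and the `image-to-text` pipeline task is assumed to be available in the installed `transformers` version (newer releases may expect a different task name). Heavy dependencies are imported lazily so the core wiring can be read and exercised without them.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class VoiceAssistant:
    """Hypothetical wrapper chaining speech-to-text, image Q&A, and text-to-speech."""

    transcribe: Callable[[str], str]     # audio file path -> transcribed question
    describe: Callable[[str, str], str]  # (image path, question) -> answer text
    speak: Callable[[str, str], str]     # (answer text, output path) -> audio path

    def run(self, audio_path: str, image_path: str,
            out_path: str = "reply.mp3") -> Tuple[str, str, str]:
        question = self.transcribe(audio_path).strip()
        answer = self.describe(image_path, question)
        return question, answer, self.speak(answer, out_path)


def build_default_assistant(llava_id: str = "llava-hf/llava-1.5-7b-hf",
                            whisper_size: str = "base") -> VoiceAssistant:
    """Assumed wiring of Whisper, Llava 1.5 7B, and gTTS (downloads model weights)."""
    import whisper                      # openai-whisper package
    from gtts import gTTS
    from PIL import Image
    from transformers import pipeline

    stt = whisper.load_model(whisper_size)
    vlm = pipeline("image-to-text", model=llava_id)

    def transcribe(path: str) -> str:
        return stt.transcribe(path)["text"]

    def describe(image_path: str, question: str) -> str:
        # Prompt template used by Llava 1.5 checkpoints.
        prompt = f"USER: <image>\n{question}\nASSISTANT:"
        out = vlm(Image.open(image_path), prompt=prompt,
                  generate_kwargs={"max_new_tokens": 200})
        # The pipeline echoes the prompt; keep only the assistant's part.
        return out[0]["generated_text"].split("ASSISTANT:")[-1].strip()

    def speak(text: str, out_path: str) -> str:
        gTTS(text=text, lang="en").save(out_path)
        return out_path

    return VoiceAssistant(transcribe, describe, speak)


def launch_app(assistant: VoiceAssistant) -> None:
    """Expose the assistant through a simple Gradio interface."""
    import gradio as gr

    gr.Interface(
        fn=assistant.run,
        inputs=[gr.Audio(type="filepath"), gr.Image(type="filepath")],
        outputs=[gr.Textbox(label="Question"),
                 gr.Textbox(label="Answer"),
                 gr.Audio(label="Spoken answer")],
        title="Llava + Whisper voice assistant (sketch)",
    ).launch()
```

Calling `launch_app(build_default_assistant())` would start the app under these assumptions. Passing the three stages in as plain callables keeps the sketch easy to adapt, for example to a quantized Llava model or a different text-to-speech library.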

Implementation Expert Guide:

Demo ▶️


©️ License 🪪

Distributed under the MIT License. See LICENSE for more information.


If you like this LLM project, please drop a ⭐ on this repo.

Follow me on LinkedIn and GitHub.