
[Question]: Approaches to searching documents requiring different kinds of embeddings (multimodal too) #13594

Open
coffepowered opened this issue May 20, 2024 · 3 comments
Labels
question Further information is requested

Comments

@coffepowered

Question Validation

  • I have searched both the documentation and discord for an answer.

Question

Hello,
I am reflecting on how to tackle the problem of searching over a knowledge base that may require different embedding modalities. To make this more concrete, my users typically upload:

  • business documents/reports (what the system was conceived to ingest)
  • documents containing code or SQL (I get bad performance on these!)
  • documents containing images
  • documents containing tables

I recall that llama_index has some approaches for images and tables, but is there a generally preferred pattern for this?
In general, I see some options to handle this:

  1. having a super-strong multimodal embedding (one able to interpret images, SQL, multilingual text, and whatever else). I am not aware of such an embedding model
  2. maintaining segregated indexes for each modality, then routing queries depending on the type of question
  3. a mix of 1 and 2.

Do you see more patterns to apply? Which would be the "llama-index" way of approaching the problem, if any?

@coffepowered added the question label May 20, 2024

dosubot bot commented May 20, 2024

To effectively search over a knowledge base with documents requiring different kinds of embeddings, including multimodal ones, LlamaIndex offers a robust approach that aligns well with the options you're considering. Here's a concise overview tailored to your needs:

  1. Multimodal Embeddings: LlamaIndex provides MultiModalEmbedding for handling various content types, such as text, images, and code. For images it uses CLIP embeddings by default, and for documents containing code or SQL you can plug in a code-aware embedding model, which improves retrieval accuracy on those files.

  2. Separate Indexes for Each Modality: The MultiModalVectorStoreIndex class maintains a separate vector store per modality. This keeps each kind of embedding in its own store, so every type of content is indexed and queried with the model that suits it (a minimal sketch follows this list).

  3. Combination of Approaches: By combining multimodal embeddings with separate indexes for each modality, LlamaIndex provides a comprehensive solution for indexing and querying a diverse knowledge base. This hybrid approach ensures that all types of documents, whether they contain text, images, code, or SQL, are effectively processed and made searchable.
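For concreteness, here is a minimal sketch of points 1 and 2 together: one MultiModalVectorStoreIndex backed by separate collections for text and images. The Qdrant backend, folder path, and collection names are illustrative assumptions (any supported vector store works), and the llama-index-embeddings-clip and llama-index-vector-stores-qdrant packages must be installed.

```python
# Minimal sketch: a multimodal index with one vector store per modality.
# Assumes: pip install llama-index llama-index-embeddings-clip \
#     llama-index-vector-stores-qdrant qdrant-client
import qdrant_client
from llama_index.core import SimpleDirectoryReader, StorageContext
from llama_index.core.indices import MultiModalVectorStoreIndex
from llama_index.vector_stores.qdrant import QdrantVectorStore

# Two separate collections: text and image embeddings never mix.
client = qdrant_client.QdrantClient(path="qdrant_mm_db")
text_store = QdrantVectorStore(client=client, collection_name="text_collection")
image_store = QdrantVectorStore(client=client, collection_name="image_collection")
storage_context = StorageContext.from_defaults(
    vector_store=text_store, image_store=image_store
)

# SimpleDirectoryReader loads both text files and images from the folder.
documents = SimpleDirectoryReader("./mixed_docs/").load_data()

# Text nodes go through the configured text embedding; image nodes
# go through CLIP (the default image embed model of this index).
index = MultiModalVectorStoreIndex.from_documents(
    documents, storage_context=storage_context
)

# Retrieval fans out to both stores and merges the results.
retriever = index.as_retriever(similarity_top_k=3, image_similarity_top_k=3)
nodes = retriever.retrieve("quarterly revenue trend")
```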

To implement these strategies, choose a suitable embedding model for each content type, integrate a language model for query processing, and build customizable ingestion pipelines (sketched below) so that large, diverse datasets are parsed and indexed appropriately.
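As a sketch of the ingestion-pipeline point: a dedicated pipeline for code-heavy documents, with a syntax-aware splitter and a per-modality embedding model. The splitter language, file name, and embedding model name are illustrative assumptions; CodeSplitter additionally requires the tree-sitter and tree-sitter-languages packages.

```python
# Sketch: a dedicated ingestion pipeline for code/SQL documents.
# Assumes: pip install llama-index llama-index-embeddings-huggingface \
#     tree-sitter tree-sitter-languages
from llama_index.core import Document
from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.node_parser import CodeSplitter
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Split on syntactic boundaries instead of raw character counts,
# then attach embeddings from a model chosen for this modality.
code_pipeline = IngestionPipeline(
    transformations=[
        CodeSplitter(language="python"),
        HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5"),
    ]
)

# "etl_job.py" is a placeholder for one of the uploaded code documents.
nodes = code_pipeline.run(
    documents=[Document(text=open("etl_job.py").read())]
)
# `nodes` now carry embeddings and can be inserted into a code-only index.
```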

For more detailed guidance and examples, you might want to explore the source code and documentation provided by LlamaIndex, which offers insights into the implementation of these strategies.


@logan-markewich
Collaborator

Basically it's option 2: create indexes for each type. The multimodal index is one specific use case of this, for images and text, but the general approach makes sense, since each modality requires a specific embedding model.
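For anyone landing here later, a minimal sketch of that approach: one VectorStoreIndex per document type, each with its own embedding model, fronted by a RouterQueryEngine that picks an index per question. The folder layout, embedding model names, and tool descriptions are illustrative assumptions, and as_query_engine() expects an LLM to be configured (OpenAI by default).

```python
# Sketch: one index per modality, routed by an LLM selector.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector
from llama_index.core.tools import QueryEngineTool
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Assumption: documents are pre-sorted into folders by type, and each
# index gets an embedding model suited to its modality (model names
# here are examples, not recommendations).
text_index = VectorStoreIndex.from_documents(
    SimpleDirectoryReader("./reports/").load_data(),
    embed_model=HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5"),
)
code_index = VectorStoreIndex.from_documents(
    SimpleDirectoryReader("./code_docs/").load_data(),
    embed_model=HuggingFaceEmbedding(
        model_name="flax-sentence-embeddings/st-codesearch-distilroberta-base"
    ),
)

# Descriptions drive routing: the selector matches the question to a tool.
tools = [
    QueryEngineTool.from_defaults(
        query_engine=text_index.as_query_engine(),
        description="Business documents and reports.",
    ),
    QueryEngineTool.from_defaults(
        query_engine=code_index.as_query_engine(),
        description="Documents containing code or SQL.",
    ),
]

# The router asks an LLM which index fits each incoming question.
router = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),
    query_engine_tools=tools,
)
response = router.query("How does the nightly ETL job deduplicate rows?")
```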

@coffepowered
Author

The response from dosubot is useful, but nothing beats a human. Thank you @logan-markewich!
