[Question]: How do I access a raptor pack variable that I created later on? #13540

Open
1 task done
sohilbhatia opened this issue May 16, 2024 · 3 comments
Labels
question Further information is requested

Comments

@sohilbhatia
Question Validation

  • I have searched both the documentation and discord for an answer.

Question

How do I access the raptor_pack variable later on? I am using this in a Flask application, and I want to build the query engine from raptor_pack.retriever because we noticed that it produced better results. I am trying to find a way to do this without creating a separate retriever and passing it into RetrieverQueryEngine, and without cookies, global variables, etc.

def createDepRaptors(path, document):
    client = chromadb.PersistentClient(path=f"./{path}")
    collection = client.get_or_create_collection(f"{path}")
    documents = document
    vector_store = ChromaVectorStore(chroma_collection=collection)

    raptor_pack = RaptorPack(
        documents,
        embed_model=OpenAIEmbedding(
            model="text-embedding-3-small"
        ),  # used for embedding clusters
        llm=OpenAI(model="gpt-3.5-turbo", temperature=0.1),  # used for generating summaries
        vector_store=vector_store,  # used for storage
        similarity_top_k=8,  # top k for each layer, or overall top-k for collapsed
        mode="collapsed",  # sets default mode
        transformations=[
            SentenceSplitter(chunk_size=400, chunk_overlap=50)
        ],  # transformations applied for ingestion
    )

    query_engine = RetrieverQueryEngine.from_args(
        raptor_pack.retriever,
        llm=OpenAI(model="gpt-3.5-turbo", temperature=0.1),
    )

@sohilbhatia added the "question" label (Further information is requested) on May 16, 2024

dosubot bot commented May 16, 2024

Hey there, @sohilbhatia! Fancy seeing you here again. 😄 Looks like you've got another intriguing puzzle for us. Let me dive into the depths of raptor_pack wisdom and I'll get back to you with something useful shortly.

@logan-markewich
Collaborator

Assuming you have already created the index once, you can rebuild the pack on top of the existing vector store:

raptor_pack = RaptorPack(
    [],
    embed_model=OpenAIEmbedding(
        model="text-embedding-3-small"
    ),  # used for embedding clusters
    llm=OpenAI(model="gpt-3.5-turbo", temperature=0.1),  # used for generating summaries
    vector_store=vector_store,  # used for storage
    similarity_top_k=8,  # top k for each layer, or overall top-k for collapsed
    mode="collapsed",  # sets default mode
    transformations=[
        SentenceSplitter(chunk_size=400, chunk_overlap=50)
    ],  # transformations applied for ingestion
)

query_engine = RetrieverQueryEngine.from_args(
    raptor_pack.retriever,
    llm=OpenAI(model="gpt-3.5-turbo", temperature=0.1),
)

Notice that an empty list is passed in for documents. With nothing to ingest, this should basically end up being a no-op, so it's fine to run this whenever needed.

If you want, you can also import and use RaptorRetriever directly:

from llama_index.packs.raptor import RaptorRetriever

retriever = RaptorRetriever(
    [],
    embed_model=embed_model,
    llm=llm,
    similarity_top_k=similarity_top_k,
    vector_store=vector_store,
    mode=mode,
    verbose=verbose,
    transformations=transformations,
)
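
If rebuilding the pack on every request is undesirable, one option (not from the thread, just a sketch) is to wrap the construction in a small cached getter keyed by the storage path. The names `make_cached_getter`, `fake_build`, and `get_engine` below are hypothetical, and the builder is a stand-in so the example runs without LlamaIndex or an API key; in the real application the builder would open the Chroma collection for `path` and return the query engine built from `RaptorPack([], ...)` as shown above.

```python
import threading
from typing import Callable, Dict, List, TypeVar

T = TypeVar("T")


def make_cached_getter(build: Callable[[str], T]) -> Callable[[str], T]:
    """Wrap build(path) so each path is built once and then reused."""
    cache: Dict[str, T] = {}
    lock = threading.Lock()

    def get(path: str) -> T:
        with lock:  # serialize first-time builds across request threads
            if path not in cache:
                cache[path] = build(path)
            return cache[path]

    return get


# Stand-in builder for demonstration only (hypothetical). A real builder would
# create the vector store for `path` and return the RAPTOR-backed query engine.
build_calls: List[str] = []


def fake_build(path: str) -> str:
    build_calls.append(path)
    return f"engine-for-{path}"


get_engine = make_cached_getter(fake_build)
first = get_engine("dept_a")
second = get_engine("dept_a")  # served from the cache, no second build
other = get_engine("dept_b")
```

The cache lives inside the closure rather than in a bare module-level variable, and it is per process, so each worker process would build its own copy on first use.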

dosubot bot commented May 16, 2024

To access and use the raptor_pack variable in your Flask application without re-creating indexes or using global variables, consider attaching raptor_pack to the Flask app object at startup. This way raptor_pack is initialized once and is reachable from any request through current_app. Here's how you can implement this:

  1. Initialize raptor_pack during Flask app startup:
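from flask import Flask
from llama_index.packs.raptor import RaptorPack

app = Flask(__name__)

with app.app_context():
    # RaptorPack requires a documents argument; pass [] to reuse an existing
    # index. vector_store is assumed to be the store used when indexing.
    app.raptor_pack = RaptorPack([], vector_store=vector_store)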
  2. Access raptor_pack in your route or view function:
from flask import current_app

@app.route('/use_raptor')
def use_raptor():
    raptor_pack = current_app.raptor_pack
    # Now you can use raptor_pack here
    return "Raptor Pack Accessed"

This approach stores the raptor_pack instance as an attribute on the Flask app object, making it available through current_app throughout the application lifecycle without module-level globals or passing the instance around.

For more detailed guidance, refer to the Flask documentation on application contexts.

If you're considering using RaptorRetriever directly for more control over the indexing and retrieval process, remember that RaptorPack provides a simplified, high-level interface for most use cases. However, direct use of RaptorRetriever might be beneficial if you require specific customizations beyond what RaptorPack offers.
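
A variation on the approach above, offered as a sketch under assumptions rather than a confirmed recipe, is to keep the pack in the app's `extensions` dictionary (a standard attribute on Flask apps) and create it lazily on first access. The helper `get_or_create_extension` and the `"raptor_pack"` key are hypothetical names; a `SimpleNamespace` stands in for the Flask app so the example runs without Flask installed.

```python
from types import SimpleNamespace
from typing import Any, Callable, List


def get_or_create_extension(app: Any, key: str, factory: Callable[[], Any]) -> Any:
    """Return app.extensions[key], building it with factory() on first access."""
    if key not in app.extensions:
        app.extensions[key] = factory()
    return app.extensions[key]


# Stand-in for a Flask app: only the `extensions` dict is needed here.
fake_app = SimpleNamespace(extensions={})
created: List[int] = []


def fake_factory() -> dict:
    # Hypothetical stand-in; a real factory would return RaptorPack([], ...).
    created.append(1)
    return {"name": "raptor_pack"}


pack_a = get_or_create_extension(fake_app, "raptor_pack", fake_factory)
pack_b = get_or_create_extension(fake_app, "raptor_pack", fake_factory)  # reused
```

Inside a view, the same helper could be called with `current_app` in place of `fake_app`. The helper is not locked, so two simultaneous first requests could each run the factory once; adding a lock would avoid that.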


2 participants