
Custom Model Guidance #1

Open
jwahnn opened this issue Feb 27, 2024 · 4 comments
Labels
question Further information is requested

Comments

@jwahnn

jwahnn commented Feb 27, 2024

Hi, thanks for sharing the code publicly. I have a few questions about using a custom model for RAPTOR:

  1. Do I create a new file for running the lines under "Setting Up RAPTOR" and "Adding Documents to the Tree", or is there a specific location to add these lines of code?
  2. How does setting up RAPTOR and adding documents differ with using a custom model? Do I just follow what's on the README page but ignore importing os and setting up my openai key?
  3. Would adding documents to the tree be the same regardless of the type of model (custom vs. baseline) I use?
  4. I assume the RetrievalAugmentation function is unique to RAPTOR, correct? In other words, using a different model still preserves the methods of RAPTOR, right?

Thanks in advance! I've been looking forward to this work :)

@parthsarthi03
Owner

1, 2, 3: Please refer to the detailed walkthrough in our Jupyter Notebook example, which is available at this link. In this notebook, we provide step-by-step instructions on how to set up RAPTOR, including the integration of a custom model.

4: Yes, setting up custom models will preserve the methods of RAPTOR, including the RetrievalAugmentation function. This means that the core functionalities of RAPTOR, such as document retrieval and augmentation, remain intact and operational even when you integrate a custom model.
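The point in 4 can be illustrated with a minimal sketch. This is not RAPTOR's actual implementation (that lives in the repo and the notebook); it is a stand-in mock showing the design idea: the `RetrievalAugmentation`-style class keeps the same public methods no matter which model object it is given, so swapping in a custom model does not change how you call it.

```python
# Illustrative mock only -- not RAPTOR's real code. It shows how an
# interchangeable model can be injected while the public interface
# (add_documents, answer_question) stays the same.

class DefaultQAModel:
    def answer_question(self, context: str, question: str) -> str:
        return f"default answer using {len(context)} chars of context"

class CustomQAModel:
    def answer_question(self, context: str, question: str) -> str:
        return f"custom answer using {len(context)} chars of context"

class MockRetrievalAugmentation:
    """Stand-in for RAPTOR's RetrievalAugmentation class."""

    def __init__(self, qa_model=None):
        # Fall back to a default model when none is supplied.
        self.qa_model = qa_model or DefaultQAModel()
        self.text = ""

    def add_documents(self, text: str) -> None:
        # RAPTOR would build its tree here; the mock just stores the text.
        self.text += text

    def answer_question(self, question: str) -> str:
        # Same call path regardless of which model was injected.
        return self.qa_model.answer_question(self.text, question)

ra = MockRetrievalAugmentation(qa_model=CustomQAModel())
ra.add_documents("some document text")
print(ra.answer_question("What is this?"))
```

The same two calls (`add_documents`, `answer_question`) work with either model class, which is the sense in which RAPTOR's methods are preserved.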

Thanks for your interest in RAPTOR, and I hope this helps! If you have any more questions, feel free to ask.

@parthsarthi03 parthsarthi03 added the question Further information is requested label Feb 27, 2024
@jwahnn
Author

jwahnn commented Feb 28, 2024

  1. Do I have to complete the sections "Building the tree" and "Querying from the tree" when working with custom language models?
  2. Does RAPTOR have a unique approach to summary generation and Q&A as well, or is it only the Retrieval Augmentation that is unique?
  3. It seems as if the code below only adds one document's worth of text. How does it build a tree?
    RA = RetrievalAugmentation()
    RA.add_documents(text)
  4. Running the demo code in the ipynb file with LLaMA seems to output the following: "This is a friendly reminder - the current text generation call will exceed the model's predefined maximum length (2048). Depending on the model, you may observe exceptions, performance degradation, or nothing at all." The answer printed is: `: [/MASKED]`. Is this because I am only feeding in one document? I guess this goes back to question 3.

@jwahnn
Author

jwahnn commented Mar 7, 2024

Hi, making another comment just in case the previous one got missed.

@parthsarthi03
Owner

  1. The sections "Building the tree" and "Querying from the tree" walk you through using an example document once you have initialized your RetrievalAugmentation class. If you want to use custom models, first define the models as in the "Using other Open Source Models for Summarization/QA/Embeddings" section of the notebook, and then initialize the RA class.

  2. You can look at how RAPTOR summarizes the text and does QA in raptor/SummarizationModels.py and raptor/QAModels.py respectively. If you want to define a custom prompt or do it another way, you can define your own Summarization and QA Models as shown in the Using other Open Source Models for Summarization/QA/Embeddings Section.
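As a rough sketch of what defining your own models looks like: in RAPTOR you would subclass the base classes defined in raptor/SummarizationModels.py and raptor/QAModels.py, as shown in that notebook section. The classes below are simplified stand-ins (no raptor import, model calls stubbed out), and the method names `summarize` and `answer_question` follow the pattern those files use; treat the exact signatures as assumptions to be checked against the notebook.

```python
# Hedged sketch: stand-ins for custom models. In RAPTOR proper these
# would subclass the base classes from raptor/SummarizationModels.py
# and raptor/QAModels.py; the model calls here are stubbed out.

class MySummarizationModel:
    """Custom summarizer with its own prompt/model (stubbed)."""

    def summarize(self, context: str, max_tokens: int = 150) -> str:
        # Replace this stub with a call to your own LLM, using any
        # custom prompt template you like.
        return "Summary: " + context[:50]

class MyQAModel:
    """Custom QA model (stubbed)."""

    def answer_question(self, context: str, question: str) -> str:
        # Replace with your own model call; the retrieved tree context
        # and the user's question both arrive as plain strings.
        return f"Answer based on {len(context)} chars of context"
```

Once defined, these objects are what you pass in when initializing the RA class, as the notebook demonstrates.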

  3. RAPTOR takes a single text file to build a tree. If you want to pass in multiple documents, concatenate them into a single string before passing it to RA.add_documents(text). We are working on better support for handling multiple documents and adding to the tree.
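Until that multi-document support lands, the concatenation step can be as simple as the sketch below (document strings shown inline; the commented-out RA calls are the ones from the thread above):

```python
def combine_documents(docs):
    """Join multiple documents into the single string RA.add_documents expects.

    A blank line between documents keeps sentences from separate
    documents from being glued together at the seam.
    """
    return "\n\n".join(docs)

docs = [
    "First document text ...",
    "Second document text ...",
    "Third document text ...",
]
text = combine_documents(docs)

# RA = RetrievalAugmentation()
# RA.add_documents(text)   # one call builds the whole tree
```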

  4. Can you share your custom Llama model class, and print out the context being provided to the model?
