
[FEAT] Code Interpreter for higher quality response and output, while summarizing the files and documents #856

Open
haseeb-heaven opened this issue Mar 4, 2024 · 9 comments
Labels
enhancement (New feature or request), feature request

Comments

@haseeb-heaven

How are you running AnythingLLM?

AnythingLLM desktop app

What happened?

I was trying to chat with my documents. It was a basic document with employee data (names and IDs), and it wasn't able to generate a high-quality response. Most of the time it gave me random data, and the output did not match the output of the tools that are already available for this.

I have tested:
1. Claude 2.1
2. Gemini Pro
3. Local models

I'm not trying to promote my product, but I compared the results with tools that are already available: code interpreters that can generate code and analyze all the files on the local system.

Link: Code-Interpreter. This is my tool; I tested it with the same models, such as Claude 2.1, and got better, more accurate results with it, as well as with other tools already available that are called code interpreters.

Are there known steps to reproduce?

You can try with a very basic file and ask about the data. It will try to generate something like a table, but the data will not be accurate all the time, even if you use the same models in the other code interpreter software that is available.

@haseeb-heaven haseeb-heaven added the possible bug Bug was reported but is not confirmed or is unable to be replicated. label Mar 4, 2024
@haseeb-heaven haseeb-heaven changed the title [BUG]: The accuracy is totally very low and output is not a high quality output. [BUG]: low quality response and output, while summarizing the files and documents Mar 4, 2024
@tylerfeldstein

I'm having the same issue. I've offloaded the embedding to Ollama using nomic-embed-text to see if that was the issue, since I saw the local embedder has this.embeddingMaxChunkLength = 1_00;, but I'm having the same results. I have also pushed the LLM to LM Studio to see the logs in the API call and noticed the context is VERY limited. Limited enough to make the LLM hallucinate quite often.
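If you want to poke at that value yourself, a minimal sketch of the one-line experiment I mean is below (untested; the class and field names are as I read them in server/utils/EmbeddingEngines/native/index.js, and 1_000 is just a guess to tune):

// server/utils/EmbeddingEngines/native/index.js (sketch, untested)
// Raise the hard-coded chunk length so each chunk carries more context.
class NativeEmbedder {
  constructor() {
    // was: this.embeddingMaxChunkLength = 1_00;
    this.embeddingMaxChunkLength = 1_000; // assumption: larger chunks reduce truncation
  }
}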

@haseeb-heaven
Author

Yes, we need to improve this quality and reduce hallucination.

@tylerfeldstein

Try to jam the BERT tokenizer over the existing one and see what you get.
In /collector/utils/tokenizer/index.js add:

// Importing the BertTokenizer class from bert-tokenizer module.
const { BertTokenizer } = require("bert-tokenizer");

// Instantiate the BERT tokenizer.
const tokenizer = new BertTokenizer();

// A function that tokenizes a string using the BERT's text encoding.
// If no string is provided, it defaults to an empty string.
function tokenizeString(input = "") {
  try {
    // Tokenize the input string and return the tokens.
    return tokenizer.tokenize(input);
  } catch (e) {
    // If an error occurs, log a message to the console and return an empty array.
    console.error("Could not tokenize string!");
    return [];
  }
}

// Export the tokenizeString function.
module.exports = {
  tokenizeString,
};

You will want to run yarn add bert-tokenizer from the /collector/ directory.

With some PDFs I'm able to get better context. Still testing on my side; a quick sanity-check script is sketched below.
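If it helps, a throwaway script like this (hypothetical, not part of the repo; run with node from /collector/) is how I sanity-check that the swapped tokenizer loads and produces tokens:

// check-tokenizer.js -- hypothetical one-off script for local testing
const { tokenizeString } = require("./utils/tokenizer");

const sample = "Employee records: Jane Doe, ID 1042; John Smith, ID 1043.";
const tokens = tokenizeString(sample);
console.log(`Token count: ${tokens.length}`);
console.log(tokens.slice(0, 10)); // peek at the first few tokens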

@tylerfeldstein

tylerfeldstein commented Mar 5, 2024

I should add:

I'm using

    this.model = "NomicAi/nomic-embed-text-v1_5";
    this.cacheDir = path.resolve(
      process.env.STORAGE_DIR
        ? path.resolve(process.env.STORAGE_DIR, `models`)
        : path.resolve(__dirname, `../../../storage/models`)
    );
    // this.modelPath = path.resolve(this.cacheDir, "Xenova", "all-MiniLM-L6-v2"); // DEFAULT
    // this.modelPath = path.resolve(this.cacheDir, "BAAI", "bge-small-en-v1_5"); // Test 1
    this.modelPath = path.resolve(
      this.cacheDir,
      "NomicAi",
      "nomic-embed-text-v1_5"
    );

as the embedder in server/utils/EmbeddingEngines/native/index.js. I will be jumping over to BERT embedding in a second to get it all to match (you will have to copy the files into the STORAGE_DIR models folder if you are going to use this), then Qdrant as the DB, and Mixtral 8x7b as the LLM in LM Studio.
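To be clear on where the files have to go, this little sketch just mirrors the path resolution in the snippet above (illustrative only; adjust to wherever your STORAGE_DIR points):

// Sketch: resolves the same model directory the constructor snippet builds.
const path = require("path");

const cacheDir = process.env.STORAGE_DIR
  ? path.resolve(process.env.STORAGE_DIR, "models")
  : path.resolve(__dirname, "../../../storage/models");
const modelPath = path.resolve(cacheDir, "NomicAi", "nomic-embed-text-v1_5");
console.log("Copy the downloaded model files into:", modelPath);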

@timothycarambat timothycarambat added enhancement New feature or request feature request and removed possible bug Bug was reported but is not confirmed or is unable to be replicated. labels Mar 6, 2024
@timothycarambat timothycarambat changed the title [BUG]: low quality response and output, while summarizing the files and documents [FEAT] Code Interperter for higher quality response and output, while summarizing the files and documents Mar 6, 2024
@timothycarambat timothycarambat changed the title [FEAT] Code Interperter for higher quality response and output, while summarizing the files and documents [FEAT] Code Interpreter for higher quality response and output, while summarizing the files and documents Mar 6, 2024
@tylerfeldstein

tylerfeldstein commented Mar 6, 2024

Still chugging along on this.

Things I've learned

  • bge-small-en-v1_5 embedder did decently. Still wanting better responses.
  • nomic-embed-text-v1_5 performed better, but the container gets shut down when embedding a large doc. Could probably slow it down based on doc size so this doesn't happen?
  • It can still talk gibberish and give back less-than-desirable responses. Would like to get the document TITLE sent in the context for referencing later.

Best Results So Far

  • BERT tokenizer from above
  • Ollama running nomic for embedding at a 2048 chunk size (can probably be done on LocalAI too); see the rough .env sketch after this list
  • LM Studio running Mixtral 8x7b
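Roughly the embedding-related settings I mean, expressed as a .env sketch (untested; the variable names are from memory and are assumptions to double-check against the repo's .env.example):

# .env sketch (untested; confirm key names against .env.example)
EMBEDDING_ENGINE='ollama'                    # offload embedding to Ollama
EMBEDDING_BASE_PATH='http://127.0.0.1:11434' # local Ollama endpoint
EMBEDDING_MODEL_PREF='nomic-embed-text'      # nomic embedder
EMBEDDING_MODEL_MAX_CHUNK_LENGTH=2048        # the 2048 chunk size from above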

@Tiberius1313

Step by step instructions would be much appreciated 🙏
Issues:

  • When I use Mixtral 8x7b I get this during embedding: Error: "1 document failed to add. Could not embed document chunks! This document will not be recorded."
    And in LM Studio I get: [ERROR] Unexpected endpoint or method. (POST /v1/embeddings). Returning 200 anyway
  • I cannot find a path that fits: "/collector/utils/tokenizer/"

@tylerfeldstein

tylerfeldstein commented Mar 13, 2024

Update, v2. Results have been subpar with different embedding models. The setup I am currently running is:

  • BERT Tokenizer from above
  • Embedding: nomic-embed-text
  • Vector storage: Qdrant
  • LLM: TheBloke/dolphin-2.7-mixtral-8x7b.Q6_K.gguf

I have been messing with the chunking, though, and have been getting better results. I am going to try different chunking and text-splitting methods and overlaps. I read that a parent/child method works pretty well and will give that a try at some point (rough sketch below).

I believe chunking occurs in ./server/utils/vectorDbProviders/<your db>/index.js, in the section below.

      const textSplitter = new RecursiveCharacterTextSplitter({
        chunkSize:
          getEmbeddingEngineSelection()?.embeddingMaxChunkLength || 1_000,
        chunkOverlap: 20,
      });
      const textChunks = await textSplitter.splitText(pageContent);

Even by just changing the overlap manually to 100 or so, I feel like I get better results. I also tried this with 400 and 40 just to see what it would do, and it performed alright. This wouldn't be great for lots of hits, though, because it would clutter the context in the LLM, and there's a hard-coded max that will be hit.
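In case anyone else wants to try the parent/child idea, here is the rough shape I have in mind, reusing the same splitter (untested sketch; the chunk sizes and the helper name are placeholders, and the import path may differ by langchain version):

// Untested sketch of parent/child chunking. Children are what get embedded
// and searched; each child keeps a pointer to its larger parent so the full
// passage can be handed to the LLM at query time.
const { RecursiveCharacterTextSplitter } = require("langchain/text_splitter");

async function parentChildChunks(pageContent) {
  const parentSplitter = new RecursiveCharacterTextSplitter({
    chunkSize: 2_000, // wide "parent" chunks for context (placeholder value)
    chunkOverlap: 100,
  });
  const childSplitter = new RecursiveCharacterTextSplitter({
    chunkSize: 400, // small "child" chunks for retrieval (placeholder value)
    chunkOverlap: 40,
  });

  const chunks = [];
  const parents = await parentSplitter.splitText(pageContent);
  for (const [parentId, parent] of parents.entries()) {
    for (const child of await childSplitter.splitText(parent)) {
      chunks.push({ parentId, parent, child });
    }
  }
  return chunks;
}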

@timothycarambat
Member

@tylerfeldstein fyi, Related issue! #490

We can prioritize this so you can mess with it more easily. Are you using Docker, Desktop, or local dev?

@tylerfeldstein

ACK. I'll jump over to that one.
I've switched to a local dev environment so I can test it in real time.
