
Any additional suggestion about performance? #6

Open
DrEight opened this issue Jan 5, 2024 · 6 comments

@DrEight commented Jan 5, 2024

Do you have any additional suggestions on how to optimize performance apart from what you wrote in the article?

@ADefWebserver (Contributor)

What performance issues are you having?

@DrEight (Author) commented Jan 5, 2024

I have an Azure DTU-based database with 200 DTUs. This is my typical scenario. The query takes: CPU time = 2235 ms, elapsed time = 1133 ms.
I tried:

  1. Converting the float to real. The storage is smaller, but the performance is unchanged.
  2. 'Quantizing' the float to an int16. Same results as #1.
  3. Calculating the vectors out of the JSON using only the table 'wikipedia_articles_embeddings', but the performance is horrible.
  4. Calculating the vector stored as a concatenated string, splitting it with STRING_SPLIT. Same as #3.
  5. Increasing the compression of the columnstore with DATA_COMPRESSION = COLUMNSTORE_ARCHIVE. No change in performance. (A sketch of attempts 1 and 5 follows this list.)
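
A minimal T-SQL sketch of attempts 1 and 5, assuming the row-per-dimension layout (article_id, vector_value_id, vector_value) used by the wikipedia_articles_embeddings sample; the table and column names here are illustrative, not copied from this thread:

```sql
-- Attempt 1: store each vector component as REAL (4 bytes) instead of FLOAT (8 bytes).
CREATE TABLE dbo.wikipedia_articles_embeddings_real
(
    article_id      INT  NOT NULL,
    vector_value_id INT  NOT NULL,  -- dimension index of the embedding
    vector_value    REAL NOT NULL   -- was FLOAT: halves storage, same query time
);

-- Attempt 5: archive-level compression on the clustered columnstore index.
CREATE CLUSTERED COLUMNSTORE INDEX cci_embeddings
    ON dbo.wikipedia_articles_embeddings_real
    WITH (DATA_COMPRESSION = COLUMNSTORE_ARCHIVE);
```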

@ADefWebserver (Contributor)

So 1133 ms is unacceptable performance? That is probably what I am getting too, but in my applications it has not been an issue. I always have 'other processes', like my calls to the LLM, going on that have their own delays before returning a result to the end user.

@DrEight (Author) commented Jan 5, 2024

I would like to have at least double the performance. My concern is related to the storage and the number of users connected to the database. 25,000 articles is a good amount of data for my use case, but I can't exclude that it will grow, and I'm assuming that performance is inversely proportional to the number of articles.

@hitec-lbmc

I am considering expanding the use of SQL Server for my production application, which is a RAG platform with about 80,000 pages of data. Our application's data needs are growing, and we're exploring options to increase our storage capacity significantly.

Given that SQL Server is our favorite DB, I'm interested in exploring it as the layer that stores the vector embeddings from our data.

Did you have success in optimizing performance?
Any insights or lessons learned from your experience would be greatly appreciated, especially if you've navigated similar challenges.

If you decided not to use SQL Server for storing vectors, I would love to learn why.

Thank you.

@Luigi-Ottoboni

Yes, I'm still using SQL Server, and I've done a lot of testing and made several improvements.
First, as I mentioned above, converting from float to real gave me a 3.2x performance improvement.
Then I noticed that Kernel Memory adds too many layers and abstractions, so I extracted just the code needed for my use case. I can't tell you exact numbers, but this was the biggest improvement in both performance and code maintenance.
Another rule: do not mix 'row queries' with 'column queries': when you query a table with a columnstore index, it must be a pure table scan (sketched below).
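
For illustration, a minimal sketch of a pure-columnstore dot-product query, assuming the row-per-dimension layout above and a @query table variable already loaded with the query embedding (all names are assumptions, not code from this repo):

```sql
-- Dot-product similarity as a single batch-mode columnstore scan + aggregation.
DECLARE @query TABLE (vector_value_id INT, vector_value REAL);
-- ...populate @query with the embedding of the search text...

SELECT TOP (10)
    e.article_id,
    SUM(e.vector_value * q.vector_value) AS dot_product  -- cosine similarity if both vectors are unit length
FROM dbo.wikipedia_articles_embeddings_real AS e
INNER JOIN @query AS q
    ON q.vector_value_id = e.vector_value_id
GROUP BY e.article_id
ORDER BY dot_product DESC;
```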
Then optimize the embeddings. I strongly suggest taking a look at the new text-embedding-3-large/text-embedding-3-small models. They support 'shortening', meaning you can reduce the size of the vector to find the best balance between performance and accuracy (see the sketch below).
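
On the SQL side, a hedged sketch of what shortening could look like: keep only the first 256 dimensions and renormalize, so the dot product still approximates cosine similarity (the cutoff, the 0-based dimension index, and all names are assumptions):

```sql
-- Truncate each embedding to its first 256 components and renormalize
-- the truncated vector to unit length.
SELECT
    e.article_id,
    e.vector_value_id,
    e.vector_value
        / SQRT(SUM(e.vector_value * e.vector_value)
               OVER (PARTITION BY e.article_id)) AS vector_value
INTO dbo.wikipedia_articles_embeddings_256
FROM dbo.wikipedia_articles_embeddings_real AS e
WHERE e.vector_value_id < 256;  -- assumes vector_value_id starts at 0
```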
Feel free to contact me privately if you need more info.
