Python: Redis Memory Connector Performance Issues when calling Memory.Search #6088

engt21 · 2024-05-01T19:35:09Z

There are several performance issues with the Python Redis Memory Connector. These issues were discovered by developing an application in C# that worked properly, then duplicating the application in Python and noticing significant differences.

As they are related and occurred in the same application we tested within our enterprise and are interconnected, and the fixes are partially connected, I am listing the issues here.

Within our enterprise, I have also developed a fix, and can raise a PR with the changes that worked for us. I have only included the problems below and not our functioning solution.

Describe the bug vs. Expected behavior

Issue No. 1: Incorrect relevancy score calculation
The correct behavior in C# is displayed in the below screenshots.

It should A) take the similarity score, and then B) as shown in the GetSimilarity helper, return 1 - vectorScore, which is the cosine distance, not the similarity, due to how redis reports the score/creates the index.

However, if you look at what Python does, it only gets the vector_score as is. This means that all the results will be returning the inverse of what it should be.

NB: When we ran it with our fix, we found the exact same performance as C#, which was the expected behavior.

Issue No. 2: Incorrect Results Order

After fixing Issue 1 above, we found that we were still getting poor performance. This is because the results from the redis query are default returning in descending order, while C# was not. Thus, we found that the lowest results were being returned at the top and only those results were coming back.

This is a quick fix with the return settings.

Issue No. 3: Error thrown while parsing documents
When retrieving attributes of the document - for example, in getting the vector_score, but also in the utils deserialize_document_to_record file, we get "Document object is not subscriptable" where the call to redis_key = doc["id"] happens.

The solution we developed worked while also allowing the previous functionality to stand.

As an aside, while we did not check this theory out, but we loaded our data initially to redis via C# SK, and then retrieved with Python SK. Not sure if this would cause the issue; however, our logic was that clients should be language agnostic and should not have issues retrieving documents regardless of the method of storing them in redis.

Similar issues would possibly come up with the call to split_redis_key, as it assumes there is only one instance of ":" in the key with its implementation. While this is fine if the user stored data with the SK redis connector, for some enterprises or applications that have separate data pipelines to store then use SK to retrieve, this would lead to parsing errors.

To Reproduce/Screenshots

Steps to reproduce the behavior:
You will need an existing redis instance. You should then create an instance of the Redis Memory Connector.
We used embeddings ada2.
You should then store some data in redis - you can use SK in Python or C#.
You should then do a memory.search on a test query to see similarity to your data.

NB: As this application is developed within my enterprise, and the code samples we ran, along with displayed output, has firm specific data, I cannot paste any screenshots or provide details on showing examples of how to reproduce.

While this was not the example our enterprise used, one would expect similar issues.

Storing book descriptions and their titles in redis, then providing sample user query about type of book they're interested and returning top books based off similarity of topics.

Platform

OS: Windows
IDE: Pycharm
Language: Python
Source: python-0.9.6b1

engt21 · 2024-05-01T19:43:32Z

Please let me know if/when/what next steps are before I can raise a PR. Thanks!

markwallace-microsoft added .NET Issue or Pull requests regarding .NET code python Pull requests for the Python Semantic Kernel triage labels May 1, 2024

github-actions bot changed the title ~~Python Redis Memory Connector Get Nearest Matches Performance Issues~~ .Net: Python Redis Memory Connector Get Nearest Matches Performance Issues May 1, 2024

engt21 changed the title ~~.Net: Python Redis Memory Connector Get Nearest Matches Performance Issues~~ Python: Redis Memory Connector Get Nearest Matches Performance Issues May 1, 2024

engt21 changed the title ~~Python: Redis Memory Connector Get Nearest Matches Performance Issues~~ Python: Redis Memory Connector Performance Issues when calling Memory.Search May 1, 2024

matthewbolanos added enhancement and removed triage labels May 7, 2024

engt21 mentioned this issue May 8, 2024

Python Redis Memory Connector should allow for filtering when searching #6159

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Python: Redis Memory Connector Performance Issues when calling Memory.Search #6088

Python: Redis Memory Connector Performance Issues when calling Memory.Search #6088

engt21 commented May 1, 2024 •

edited

engt21 commented May 1, 2024

Python: Redis Memory Connector Performance Issues when calling Memory.Search #6088

Python: Redis Memory Connector Performance Issues when calling Memory.Search #6088

Comments

engt21 commented May 1, 2024 • edited

engt21 commented May 1, 2024

engt21 commented May 1, 2024 •

edited