Smaller memory-consumption for arrays of numbers. (To improve vector search) #1087

Open
javiabellan opened this issue Sep 11, 2023 · 0 comments


javiabellan commented Sep 11, 2023

RedisJSON arrays use 24 bytes per element (Doc Reference).

This works well for arrays that mix several types, but it is very wasteful for storing a large number of vectors (embeddings), each of which is an array of floats.

RediSearch currently supports vector search on both HASH and JSON (Doc Reference):

  • Redis HASH: stores vectors efficiently (raw bytes, 4 bytes per number for float32), but cannot store nested elements (an array of vectors per document).
  • Redis JSON: stores vectors inefficiently (always 24 bytes per number), but can store nested elements (Doc Reference).
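
To put rough numbers on that gap, here is a minimal sketch that compares only the bytes spent on the numbers themselves. The 24-byte and 4-byte figures come from the text above. The workload (1,000,000 embeddings of dimension 768) and the helper name `array_payload_bytes` are assumptions made up for illustration, and per-key and per-array overhead is ignored.

```python
def array_payload_bytes(num_vectors: int, dim: int, bytes_per_number: int) -> int:
    """Bytes used by the numbers alone, ignoring per-key and per-array overhead."""
    return num_vectors * dim * bytes_per_number


JSON_BYTES_PER_NUMBER = 24    # per-element figure quoted above for RedisJSON arrays
FLOAT32_BYTES_PER_NUMBER = 4  # raw float32, as stored in a HASH field

# Hypothetical workload, chosen only for illustration.
num_vectors, dim = 1_000_000, 768

json_bytes = array_payload_bytes(num_vectors, dim, JSON_BYTES_PER_NUMBER)
raw_bytes = array_payload_bytes(num_vectors, dim, FLOAT32_BYTES_PER_NUMBER)

print(f"JSON arrays : {json_bytes / 1e9:.3f} GB")
print(f"raw float32 : {raw_bytes / 1e9:.3f} GB")
print(f"ratio       : {json_bytes / raw_bytes:.1f}x")
```

Under these assumptions the JSON representation needs six times the memory of the raw float32 representation, regardless of the dimension or the number of vectors.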

I would like the best of both worlds. Possible solutions are:

  1. A smaller memory footprint for JSON arrays of numbers (4 bytes per number would be ideal).
    • Store vectors as raw bytes (as Redis hashes do).
  2. Arrays of vectors in Redis hashes.
    • Store N vectors in a single hash field as concatenated bytes. If the embedding dimension is 10 and each number occupies 4 bytes, 1 vector is 40 bytes, 2 vectors are 80 bytes, and so on.
  3. Build my own efficient data type using Redis modules (maybe something similar to this), and make RediSearch aware of the new data structure so it can index it (it currently only supports ON HASH and ON JSON).
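
The byte layout proposed in option 2 can be sketched on the client side as follows. This shows only the encoding: N float32 vectors concatenated into one blob, which could then be written to a single hash field. Whether RediSearch can index several vectors from one hash field is exactly what this issue asks for, so nothing here implies that it does. The function names `pack_vectors` and `unpack_vectors` and the choice of little-endian byte order are assumptions.

```python
import struct

FLOAT32_SIZE = 4  # bytes per float32 number


def pack_vectors(vectors: list[list[float]]) -> bytes:
    """Concatenate equal-length vectors into one little-endian float32 blob."""
    if not vectors:
        return b""
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same dimension")
    flat = [x for v in vectors for x in v]
    return struct.pack(f"<{len(flat)}f", *flat)


def unpack_vectors(blob: bytes, dim: int) -> list[list[float]]:
    """Split a blob produced by pack_vectors back into vectors of length dim."""
    if dim <= 0 or len(blob) % (dim * FLOAT32_SIZE) != 0:
        raise ValueError("blob length is not a multiple of the vector size")
    count = len(blob) // FLOAT32_SIZE
    flat = struct.unpack(f"<{count}f", blob)
    return [list(flat[i:i + dim]) for i in range(0, count, dim)]


# Dimension 10, as in the example above; the values are exactly representable in float32.
v1 = [float(i) for i in range(10)]
v2 = [i * 0.5 for i in range(10)]

one = pack_vectors([v1])
two = pack_vectors([v1, v2])
print(len(one), len(two))
print(unpack_vectors(two, dim=10) == [v1, v2])
```

With dimension 10, one vector packs to 40 bytes and two vectors to 80 bytes, matching the arithmetic in option 2. The reader needs to know the dimension to split the blob again, so it would have to be stored alongside the blob or fixed by the index schema.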

javiabellan changed the title from "Smaller JSON arrays of numbers. (To improve vector search)" to "Smaller memory-consumption for arrays of numbers. (To improve vector search)" on Sep 11, 2023