We're considering moving @discourse to Typesense, and after doing a hardware requirement analysis I was surprised by how high the numbers came out.
Just tried dropping the embeddings field on my collections, and my memory requirement went down to just 23% of the original numbers 🤯.
I shared some findings on recall with different types in our use case at pgvector/pgvector#521, and on the big impact those types can have in making these features viable in more environments.
Description
AFAIK, Typesense currently stores embedding vectors for semantic search using `float[]`, which uses 32 bits for each dimension in the array. Recent research shows that the recall loss from quantizing to lower-precision formats is small enough that the trade-off is worth considering in most applications, and leading software solutions are following this trend:
- https://github.com/huggingface/text-embeddings-inference defaults to float16
- https://github.com/pgvector/pgvector now ships float16 and bit vectors
Some articles about the technique:
- https://huggingface.co/blog/embedding-quantization
- https://blog.pgvecto.rs/my-binary-vector-search-is-better-than-your-fp32-vectors
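The binary-quantization idea from the articles above can be sketched in a few lines of numpy (the 768-dimension vector is an assumption, chosen as a typical embedding size):

```python
import numpy as np

# Binary quantization sketch: keep only the sign bit of each dimension,
# then pack 8 dimensions into each byte.
vec = np.random.randn(768).astype(np.float32)  # fp32: 768 * 4 = 3072 bytes
bits = vec > 0                                 # 1 bit of information per dim
packed = np.packbits(bits)                     # 768 / 8 = 96 bytes, a 32x cut
print(packed.nbytes)
```

Search over such bit vectors typically uses Hamming distance, with an optional rescoring pass over the full-precision vectors, as the linked posts describe.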
Memory usage is always a point of attention for teams considering Typesense deployments, so offering types with a lower memory footprint would open Typesense up to more use cases. Even adopting only the fp16 option would halve this attribute's memory usage.
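A back-of-envelope check of the halving claim, assuming a 768-dimension embedding (any dimension gives the same ratio):

```python
import numpy as np

vec32 = np.random.rand(768).astype(np.float32)
vec16 = vec32.astype(np.float16)

# fp32: 768 * 4 = 3072 bytes; fp16: 768 * 2 = 1536 bytes per vector
print(vec32.nbytes, vec16.nbytes)

# The fp16 copy stays very close in direction to the original:
up = vec16.astype(np.float32)
cos = float(np.dot(vec32, up) / (np.linalg.norm(vec32) * np.linalg.norm(up)))
print(cos)  # close to 1.0
```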
Steps to reproduce
Create a collection with an `embedding` vector field.
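A minimal sketch of such a schema (the collection name "docs" and `num_dim: 384` are assumptions for illustration):

```python
# Typesense collection schema with a vector field. Today the only vector
# type is float[], i.e. 32 bits per dimension.
schema = {
    "name": "docs",
    "fields": [
        {"name": "title", "type": "string"},
        {"name": "embedding", "type": "float[]", "num_dim": 384},
    ],
}

# With the official Python client this would be created via:
# client.collections.create(schema)
```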
The `embedding` field stores each element using 32 bits.
Expected Behavior
Users can define a schema using a lower-precision type such as `float16`.
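For example, the requested schema could look like this (the `float16[]` type name is hypothetical — it does not exist in Typesense today, which is the point of this issue):

```python
# Hypothetical schema: same vector field, half the memory per dimension.
desired_schema = {
    "name": "docs",
    "fields": [
        {"name": "embedding", "type": "float16[]", "num_dim": 384},
    ],
}
```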
Actual Behavior
`float16` is not a valid type for a field.
Metadata
Typesense Version: 26
OS: Linux