We're considering moving @discourse to Typesense, and after doing a hardware requirement analysis I was surprised by how high the numbers came out.
Just tried dropping the embeddings field on my collections, and my memory requirement went down to just 23% of the original numbers 🤯.
I shared some findings on recall with different types in our use case at pgvector/pgvector#521, and on the big impact those types can have in making these features viable in more environments.
Description
AFAIK, Typesense currently stores embedding vectors for semantic search using `float[]`, which uses 32 bits for each dimension in the array. Recent research shows that the recall loss from quantizing to lower-precision formats is small enough that the trade-off is worth considering in most applications, and leading software solutions are following this trend:
- https://github.com/huggingface/text-embeddings-inference defaults to float16
- https://github.com/pgvector/pgvector now ships float16 and bit vectors
Some articles about the technique:
- https://huggingface.co/blog/embedding-quantization
- https://blog.pgvecto.rs/my-binary-vector-search-is-better-than-your-fp32-vectors
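The binary-quantization idea from the articles above can be sketched in a few lines of numpy (the 768-dimension vector is an assumption, chosen as a typical embedding size):

```python
import numpy as np

# Binary quantization sketch: keep only the sign bit of each dimension,
# then pack 8 dimensions into each byte.
vec = np.random.randn(768).astype(np.float32)  # fp32: 768 * 4 = 3072 bytes
bits = vec > 0                                 # 1 bit of information per dim
packed = np.packbits(bits)                     # 768 / 8 = 96 bytes, a 32x cut
print(packed.nbytes)
```

Search over such bit vectors typically uses Hamming distance, with an optional rescoring pass over the full-precision vectors, as the linked posts describe.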
Memory usage is always a point of attention for teams considering Typesense deployments, so offering types with a lower memory footprint would open Typesense up to more use cases. Even adopting only the fp16 option would halve this attribute's memory usage.
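A back-of-envelope check of the halving claim, assuming a 768-dimension embedding (any dimension gives the same ratio):

```python
import numpy as np

vec32 = np.random.rand(768).astype(np.float32)
vec16 = vec32.astype(np.float16)

# fp32: 768 * 4 = 3072 bytes; fp16: 768 * 2 = 1536 bytes per vector
print(vec32.nbytes, vec16.nbytes)

# The fp16 copy stays very close in direction to the original:
up = vec16.astype(np.float32)
cos = float(np.dot(vec32, up) / (np.linalg.norm(vec32) * np.linalg.norm(up)))
print(cos)  # close to 1.0
```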
Steps to reproduce
Create a collection with an `embedding` vector field.
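A minimal sketch of such a schema (the collection name "docs" and `num_dim: 384` are assumptions for illustration):

```python
# Typesense collection schema with a vector field. Today the only vector
# type is float[], i.e. 32 bits per dimension.
schema = {
    "name": "docs",
    "fields": [
        {"name": "title", "type": "string"},
        {"name": "embedding", "type": "float[]", "num_dim": 384},
    ],
}

# With the official Python client this would be created via:
# client.collections.create(schema)
```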
The `embedding` field stores each element using 32 bits.
Expected Behavior
Users can define a schema using a lower-precision type such as `float16`.
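For example, the requested schema could look like this (the `float16[]` type name is hypothetical — it does not exist in Typesense today, which is the point of this issue):

```python
# Hypothetical schema: same vector field, half the memory per dimension.
desired_schema = {
    "name": "docs",
    "fields": [
        {"name": "embedding", "type": "float16[]", "num_dim": 384},
    ],
}
```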
Actual Behavior
`float16` is not a valid type for a field.
Metadata
Typesense Version: 26
OS: Linux