
Lantern’s Performance vs. pgvector - Authenticity and Future Improvements #272

Open
Haskely opened this issue Jan 29, 2024 · 3 comments

Haskely commented Jan 29, 2024

Hi Lantern Team,

I noticed your commitment to performance excellence in the README. However, a comparison on tembo.io (https://tembo.io/blog/postgres-vector-search-pgvector-and-lantern) suggests Lantern falls behind pgvector in certain metrics.

[Screenshot: benchmark chart from the tembo.io post comparing Lantern and pgvector]

Could you comment on:

  • The validity of these findings?
  • Any planned improvements for Lantern?

Looking forward to your insights.

@Ngalstyan4 (Contributor) commented:

Hi @Haskely,

Thanks for your interest in Lantern and for raising this!
You are right that the benchmarks in the README are out of date.
Tembo's findings seem reasonable.

We are doing a major upgrade of Lantern with an improved storage layer (see PR1, PR2).
I think pgvector will soon release parallel index builds, which we can then compare against our external parallel index builds. pgvector has also merged improvements on this front since we last benchmarked it, and we plan to rerun all of those benchmarks after their next release.

So, we are working on improvements, but I won't promise anything here; I'll let the benchmarks speak for themselves once they are out (which should be within 10 days).

In particular, we are working on supporting the new hardware optimizations from usearch, enabling CPU-specific build flags, and adding vector quantization techniques.
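To make the quantization point concrete, here is a toy sketch of scalar (int8) quantization, one common family of vector quantization techniques. This is purely illustrative; the function names are hypothetical and this is not Lantern's or usearch's actual implementation.

```python
# Toy scalar (int8) quantization sketch -- illustrative only, not
# Lantern's or usearch's real API.

def quantize_int8(vector):
    """Scale float components into the int8 range [-127, 127]."""
    scale = max(abs(x) for x in vector) or 1.0
    return [round(x / scale * 127) for x in vector], scale

def dequantize_int8(codes, scale):
    """Approximately reconstruct the original floats."""
    return [c / 127 * scale for c in codes]

v = [0.12, -0.56, 0.98, 0.03]
codes, scale = quantize_int8(v)
approx = dequantize_int8(codes, scale)
# Each component is recovered to within one quantization step, while
# storage drops from 4 bytes (float32) to 1 byte per dimension.
assert all(abs(a - b) <= scale / 127 for a, b in zip(v, approx))
```

The trade-off is the usual one: roughly 4x smaller vectors (and better cache behavior) in exchange for a small, bounded loss of precision in distance computations.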

In the meantime, here are some of the reasons you might still want to consider Lantern:

  • Seamless embedding generation and maintenance - you just insert your text or image data into Postgres, and we create and maintain the corresponding embeddings using one of the supported open-source or proprietary embedding models
  • External index generation, which builds the index in parallel, outside the DB machine, without hogging its resources (this can be done with a couple of clicks in our cloud!)
  • HNSW index tuning experiments
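As a rough illustration of the external, parallel index-build idea in the second bullet (a toy sketch under assumed semantics, not Lantern's implementation): partition the vectors, build a per-partition structure in concurrent workers outside the normal query path, then merge the results.

```python
# Toy sketch of a parallel, out-of-band index build -- not Lantern's code.
from concurrent.futures import ThreadPoolExecutor

def build_chunk(chunk):
    # Stand-in for real per-chunk HNSW construction: precompute each
    # vector's Euclidean norm alongside its id.
    return [(vid, vec, sum(x * x for x in vec) ** 0.5) for vid, vec in chunk]

def parallel_build(vectors, workers=4):
    """Partition vectors round-robin, build each partition concurrently, merge."""
    chunks = [vectors[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(build_chunk, chunks)
    merged = [entry for part in parts for entry in part]
    merged.sort(key=lambda e: e[0])  # restore id order after the merge
    return merged

data = [(0, [3.0, 4.0]), (1, [1.0, 0.0]), (2, [0.0, 2.0])]
index = parallel_build(data, workers=2)
assert [norm for _, _, norm in index] == [5.0, 1.0, 2.0]
```

The point of doing this outside the database machine is that the expensive construction work never competes with query traffic for CPU; only the finished index is shipped back.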

@valentijnvenus commented:

@Ngalstyan4 thanks for the update:

In particular, we are working on supporting the new hardware optimizations from usearch, enabling CPU-specific build flags, and adding vector quantization techniques.

I guess it is not as simple as updating the third_party folder to point to a recent USearch?

@Ngalstyan4 (Contributor) commented:

Right.
For details on what it involves, see unum-cloud/usearch#335 (adding a storage interface) and #262 (building on that interface).
