
Algo updates #335

Open
10 of 15 tasks
erikbern opened this issue Mar 8, 2023 · 15 comments

Comments

@erikbern
Owner

erikbern commented Mar 8, 2023

I'm planning to re-run all benchmarks shortly. Going through the algos with some todos (checking boxes as I fix them).

Currently failing in CI:

  • diskann – something with clang
  • kgraph: something with boost during compilation – hasn't been updated since 2019, planning to drop it
  • mrpt – just needs scikit-learn not sklearn
  • nearpy: something with cython. But it's deactivated in algos.yml anyway – let's disable in CI
  • pynndescent: failing trying to compile numpy from source. should be easy to fix
  • rpforest: something with the compilation step. but it failed on most benchmarks last time and was extremely slow on others – let's deactivate in algos.yml and disable in CI (see the config sketch after these lists)
  • sptag: failing on some cuda thing – i can look into it
  • vespa: failing with some python dep – let's fix
  • vearch: also something with numpy compilation
  • qdrant: something with numpy/hdf5 – it's also pinned to an old version (0.11.3) so let's upgrade to the 1.0 series
  • qsgngt: something with the wheel, i'll take a look

Some algos we should probably disable because their performance is bad anyway:

  • nmslib balltree
  • elastiknn (although maybe it's worth keeping – a lot of people will start with this one)
  • puffinn (not terrible, but consistently among the worst)
  • ckdtree
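
For reference, a minimal sketch of what "deactivate in algos.yml" amounts to, assuming the nested point-type → metric → algorithm layout with a `disabled` flag (the exact schema may differ between versions):

```python
# Sketch only: walk algos.yml and list entries marked as disabled.
# Assumes a nested layout of point type -> metric -> algorithm -> settings
# with an optional "disabled" flag; the real schema may differ.
import yaml

with open("algos.yml") as f:
    config = yaml.safe_load(f)

for point_type, metrics in config.items():
    for metric, algos in metrics.items():
        for name, settings in algos.items():
            if isinstance(settings, dict) and settings.get("disabled", False):
                print(f"{name} is disabled for {point_type}/{metric}")
```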
@maumueller
Collaborator

I honestly feel that disabling these implementations is a bad idea. I would rather disable the 5+ implementations of the same idea (HNSW) instead of disabling unique approaches that might be performing a bit worse.

@erikbern
Owner Author

erikbern commented Mar 8, 2023

Yeah, I'm open to keeping some of those – are there any particular ones you think we should not disable?

Mostly just trying to reduce the clutter in the graphs and the total runtime of the benchmarks. But like you said, maybe a better approach would be to reduce some of the duplication. Is there a particular set of hnsw algos you think we could disable?

@maumueller
Collaborator

maumueller commented Mar 9, 2023

I think BallTree(nmslib) (a standard tree data structure) and puffinn (LSH) are worth keeping. As you already mentioned, elastiknn is going to be a default choice for many users, so it would be great if it were to show up in the benchmark.

With respect to decluttering, I would say (though I'm happy for the authors of these implementations to disagree):

  • SW-Graph < HNSW (disable SW-Graph)
  • faiss-ivf < faiss-ivfpqfs (disable faiss-ivf)
  • vamana(diskann) < vamana-pq(diskann) (disable vamana(diskann))
  • hnswlib, hnsw(faiss), hnsw(nmslib), hnsw(vespa), n2 are all HNSW. Pick one? (hnswlib is the reference implementation)

However, many users will use FAISS, so it is probably worth having their hnsw implementation in the benchmark as well.

I plan to run the benchmark locally on a server and record the total time spent for each implementation to get a bit more insight into where we actually spend most of the time.

@erikbern
Owner Author

Excellent points! I think my only counterpoint would be to maybe keep the hnsw(vespa) since I think that's the only vespa one?

Also agree about making sure no algo blows up the total benchmark time. I could potentially add some output for that. Open to subsampling any one that takes too much total time.

@WPJiang

WPJiang commented Mar 16, 2023

Hi @erikbern, it's good news that you are planning to re-run all benchmarks. Just yesterday we updated our algorithm qsgngt to a new version, which fixes the problem above ("qsgngt: something with the wheel, i'll take a look"). You may need to rebuild the Docker image for qsgngt. Please let me know if there are any issues with rerunning qsgngt, thanks a lot.

@erikbern
Owner Author

sounds great. i'll check in the next few days!

@WPJiang

WPJiang commented Mar 20, 2023

sounds great. i'll check in the next few days!

Hi @erikbern, I wonder if there has been any problem rerunning the qsgngt algorithm. We have run ann-benchmarks ourselves and found that it performs very well. We hope this algorithm can make it into your re-run. Thank you!

@jobergum

Excellent points! I think my only counterpoint would be to maybe keep the hnsw(vespa) since I think that's the only vespa one?

Thanks, I would be sad if Vespa was deleted from the runs.

@benwtrent
Contributor

hey y'all,

I have been thinking about this recently myself. I am concerned about ANY algorithm that uses network serialization in its run: Elasticsearch, OpenSearch, Weaviate, Qdrant. These aren't really "algorithms" in the same sense as scann or faiss.

Vespa addressed this issue by adding a special "benchmark" python client that directly wraps the native HNSW implementation. But this just isn't feasible for many typically HTTP driven solutions.

What do we think of just removing all "remote network call" algorithms? Or labeling them as such and comparing them that way? Once you start getting into very efficient algorithms, the network and serialization overhead becomes the dominating factor.

@benwtrent
Contributor

@erikbern ^ I will happily create a different issue for this discussion. Just seemed related to an overall update to the algos :)

@erikbern
Owner Author

I think it would make sense to break out the client-server based ones into a separate chart. That being said, I don't think the network/serialization overhead should be substantial, at least not in high-recall regimes.

@benwtrent
Contributor

That being said I don't think the network/serialization overhead should be substantial, at least not for high-recall regimes

For very high recalls, the constant overhead matters less. However, serializing over the network (whether RESTful, protobufs, etc.) will always carry a significant overhead compared to native memory access. The benchmark then becomes "who can best reduce the network serialization overhead" vs. "who has the better vector search algorithm".

Benchmarking things fairly, correctly, and realistically is a difficult thing. Thank you for all your work on this project @erikbern! It really does help the vector search community.

@maumueller
Collaborator

@benwtrent Good point. We implemented the "prepared query" mode to make it possible to hide this cost, but looking at the current implementations (e.g., https://github.com/erikbern/ann-benchmarks/blob/main/ann_benchmarks/algorithms/luceneknn/module.py#L101-L123), it seems it's not exactly used in this way. The idea was that at query time you only send a small signal to carry out the actual query, while all serialization/transfer is done in prepare_query and get_prepared_....
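
A rough sketch of the intended split for a client-server backend (the client methods are hypothetical, and the hook names other than prepare_query may not match the interface exactly):

```python
# Illustrative only: the intended "prepared query" split for a remote backend.
# client.store_query / client.execute are hypothetical service calls, and the
# method names other than prepare_query may not match the real interface.
class RemoteIndex:
    def __init__(self, client):
        self.client = client        # HTTP/gRPC client for the remote service
        self._handle = None
        self._results = None

    def prepare_query(self, q, n):
        # All serialization and network transfer happens here, outside the
        # timed section: ship the vector to the service, keep a handle.
        self._handle = self.client.store_query(vector=list(q), k=n)

    def run_prepared_query(self):
        # Inside the timed section only a small "execute" signal is sent.
        self._results = self.client.execute(self._handle)

    def get_prepared_query_results(self):
        return self._results
```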

I think it would be great to tag each implementation with information on whether it is an algorithm or a library, plus some info on the basic algorithm it implements (like our huge collection of HNSW implementations...). Can we just decorate the constructor(s) in each module?
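
One possible shape for that, purely as a sketch (the decorator and attribute names are invented):

```python
# Hypothetical decorator that attaches metadata to an implementation's
# constructor; the field names are invented for illustration.
def tag(kind, base_algorithm):
    def wrap(cls):
        cls.kind = kind                      # e.g. "library" vs. "service"
        cls.base_algorithm = base_algorithm  # e.g. "hnsw", "lsh", "ivf"
        return cls
    return wrap


@tag(kind="library", base_algorithm="hnsw")
class HnswLib:
    """Stand-in for a module's constructor class."""
```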

@benwtrent
Contributor

@maumueller

We implemented the "prepared query" mode ... The idea was that at query time you only send a small signal to carry out the actual query, but all serialization/transfer is done in the prepare_query and get_prepared_....

This assumes that the service implements a "store query & execute later" API. I wouldn't expect many vector search systems to implement this as it's not a common case (users of these systems just query with a vector and expect a result). It would be strange to me to make a special API in a production system just for benchmarking.

I think these things could be decorations on the constructor. Or, I was thinking of a new field in the algo definition file tagging the algorithm kind or something. That way it's easily filterable in runs, and result files can be annotated with those tags.
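
A small sketch of that alternative; the `kind` field and the example entries are hypothetical:

```python
# Hypothetical: parsed algos.yml entries extended with a "kind" field, used to
# pick which implementations to run and to annotate result files.
definitions = {
    "hnswlib": {"docker-tag": "ann-benchmarks-hnswlib", "kind": "library"},
    "qdrant":  {"docker-tag": "ann-benchmarks-qdrant",  "kind": "service"},
}

# Keep client-server systems out of this run (or route them to their own chart).
to_run = {name: d for name, d in definitions.items() if d.get("kind") != "service"}
print(sorted(to_run))  # -> ['hnswlib']
```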

@maumueller
Collaborator

Right, the config is probably a better place at this point in time.
