USearch Benchmarks

This set of benchmarks is meant to test USearch capabilities for billion-scale vector search. It provides an alternative to ann-benchmarks and big-ann-benchmarks, which generally operate on much smaller collections.

The main objective is to understand the scaling laws of USearch compared to FAISS. Supplementary adapters for other popular systems are also available under the index/ directory:

  • alternative HNSW implementations, like HNSWlib,
  • alternative CPU-based libraries, like SCANN,
  • vector databases, like Qdrant and Weaviate.

The primary dataset used for benchmarks is Deep1B: 1 billion 96-dimensional vectors, totalling 384 GB. Ground-truth nearest neighbors are provided to compute recall metrics.
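Recall@k is the fraction of the true top-k neighbors that the approximate search actually returns, averaged over all queries. A minimal sketch (the recall_at_k helper is ours, not part of this repo):

```python
def recall_at_k(approx_ids, exact_ids, k=10):
    """Fraction of true top-k neighbors found by the approximate search,
    averaged over all queries. Both arguments are lists of per-query
    neighbor-ID sequences."""
    total = 0.0
    for found, truth in zip(approx_ids, exact_ids):
        total += len(set(found[:k]) & set(truth[:k])) / k
    return total / len(approx_ids)

# A perfect match on one query and a half match on another:
print(recall_at_k([[1, 2], [3, 9]], [[1, 2], [3, 4]], k=2))  # → 0.75
```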

Setup

First of all, we recommend creating a conda environment to isolate the dependencies:

conda create -n usearch-benchmarks python=3.10
conda activate usearch-benchmarks

Then install the dependencies, getting an MKL-accelerated build of the FAISS library:

pip install usearch hnswlib scann lancedb qdrant-client weaviate-client psutil plotly kaleido
conda install -c pytorch faiss-cpu=1.7.4 mkl=2021 blas=1.0=mkl

To benchmark Qdrant, you need to run their Docker container:

docker run -d -p 6333:6333 -p 6334:6334 qdrant/qdrant

Finally, download the Deep1B dataset:

wget https://storage.yandexcloud.net/yandex-research/ann-datasets/DEEP/base.1B.fbin -P data
wget https://storage.yandexcloud.net/yandex-research/ann-datasets/DEEP/base.10M.fbin -P data # For smaller subset
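These .fbin files follow the binary layout used by the big-ann-benchmarks datasets: an 8-byte header of two little-endian uint32 values (vector count and dimensionality), followed by the raw float32 vector data. A stdlib-only reader sketch, assuming that layout and a little-endian host (the read_fbin name is ours, not part of this repo):

```python
import struct
from array import array

def read_fbin(path, max_vectors=None):
    """Read vectors from a .fbin file: a <count, dim> uint32 header,
    then count*dim float32 values (assumes a little-endian host)."""
    with open(path, "rb") as f:
        count, dim = struct.unpack("<II", f.read(8))
        if max_vectors is not None:
            count = min(count, max_vectors)
        data = array("f")  # flat row-major buffer of count*dim floats
        data.fromfile(f, count * dim)
        return count, dim, data

# Writing a tiny file and reading it back:
with open("/tmp/toy.fbin", "wb") as f:
    f.write(struct.pack("<II", 2, 3))  # 2 vectors, 3 dimensions
    array("f", [1, 2, 3, 4, 5, 6]).tofile(f)

count, dim, data = read_fbin("/tmp/toy.fbin")
print(count, dim, list(data))  # → 2 3 [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
```

The max_vectors argument makes it practical to sample the 10M subset from the full base.1B.fbin without reading all 384 GB.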

To run the ANN benchmarks, pass a configuration file:

python run.py configs/usearch_1B.json 1B # Outputs a stats/*.npz file
python utils/draw_plots.py # Exports to plots/*.png
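The exported stats archives can be inspected directly with NumPy. A sketch that lists each archive's array names and shapes, assuming only that run.py saves named arrays via np.savez (the actual key names depend on run.py, and the summarize_stats helper is ours):

```python
import glob
import os

import numpy as np

def summarize_stats(directory="stats"):
    """Map each .npz archive in the directory to the names and shapes
    of the arrays it stores."""
    summary = {}
    for path in sorted(glob.glob(os.path.join(directory, "*.npz"))):
        with np.load(path) as archive:
            summary[path] = {name: archive[name].shape for name in archive.files}
    return summary

# Example: print what each benchmark run produced.
for path, arrays in summarize_stats().items():
    print(path, arrays)
```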