data61/DyANN

Dynamic Approximate Nearest Neighbour Benchmarks

Install the framework

Unpack the code and set up a Python environment

> tar xvzf dyann.tar.gz
> cd dyann
> /apps/python/3.9.X/bin/python3 -m venv env
> source env/bin/activate
> pip install -r requirements.txt --no-cache-dir

Running the code

Quick test

> python download.py data=[datacol_quick]
> python run.py data=[datacol_quick] algo=[linear,hnsw]
> python plot-pareto.py data=[datacol_quick] algo=[linear,hnsw]
> python plot-algo.py data=[datacol_quick] algo=[hnsw]

Preload all datasets and pregenerate all ground truth (may take hours; ensure at least 30 GB of free disk space)

> python download.py data=[datacol,datacol_lerp,datacol_efreq,datacol_esfreq]
> python download.py data=[featlearn,featlearn_lerp,featlearn_efreq,featlearn_esfreq]

Generate all benchmarking results (can easily take days or weeks; best run in parallel with a job scheduler)

> python run.py data=[datacol,datacol_lerp,datacol_efreq,datacol_esfreq] algo=[linear,annoy,hnsw,ivfpq,scann,kdtree]
> python run.py data=[featlearn,featlearn_lerp,featlearn_efreq,featlearn_esfreq] algo=[linear,annoy,hnsw,ivfpq,scann,kdtree]
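One way to spread the full sweep across a job scheduler is to issue one run.py invocation per dataset and algorithm pair. The sketch below only builds the command strings. It assumes that run.py accepts single-element lists for both data and algo (the quick test uses that form for data) and that results from separate invocations can be combined afterwards; neither is confirmed here. The function name build_commands is hypothetical.

```python
from itertools import product

# Dataset and algorithm names copied from the commands above.
DATASETS = [
    "datacol", "datacol_lerp", "datacol_efreq", "datacol_esfreq",
    "featlearn", "featlearn_lerp", "featlearn_efreq", "featlearn_esfreq",
]
ALGOS = ["linear", "annoy", "hnsw", "ivfpq", "scann", "kdtree"]


def build_commands(datasets=DATASETS, algos=ALGOS):
    """Return one run.py command line per (dataset, algorithm) pair."""
    return [
        f"python run.py data=[{d}] algo=[{a}]"
        for d, a in product(datasets, algos)
    ]


if __name__ == "__main__":
    # Print the commands so each line can be submitted as a separate job.
    for cmd in build_commands():
        print(cmd)
```

Each printed line can then be wrapped in whatever submission command the local scheduler expects.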

Adding new datasets

A template file for new datasets is provided at ./dyann/data/template.py

Usage Instructions:

  1. Copy template.py and change the filename and class name for your new dataset
  2. Update ./dyann/data/proxy.py to include the names you have chosen
  3. Fill in each of the TODO items (refer to existing datasets for hints if needed)
  4. Create any number of configuration sets in ./conf/data/, each with the name property set to this filename and, optionally, the scale property providing a parameter sweep

Adding new ANN algorithms

A template file for new ANN algorithms is provided at ./dyann/algo/template.py

Usage Instructions:

  1. Copy template.py and change the filename and class name for your new ANN algorithm
  2. Update ./dyann/algo/proxy.py to include the names you have chosen
  3. Fill in each of the TODO items (refer to existing algorithms for hints if needed)
  4. Create both the build and search configuration files in ./conf/algo/ with the name property set to this filename; the lists of parameters given for the build and query properties will be swept
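Step 1 of both procedures (adding a dataset and adding an algorithm) amounts to copying the relevant template.py under a new filename. A minimal sketch of that copy is shown below; the helper name scaffold and its arguments are hypothetical and not part of the framework, and it only assumes the two template paths given above.

```python
import shutil
from pathlib import Path


def scaffold(kind, new_name, root="."):
    """Copy dyann/<kind>/template.py to dyann/<kind>/<new_name>.py.

    kind is "data" or "algo". Refuses to overwrite an existing file.
    """
    if kind not in ("data", "algo"):
        raise ValueError("kind must be 'data' or 'algo'")
    package = Path(root) / "dyann" / kind
    target = package / f"{new_name}.py"
    if target.exists():
        raise FileExistsError(target)
    # Plain file copy; the class name inside still has to be edited by hand.
    shutil.copyfile(package / "template.py", target)
    return target
```

Renaming the class, updating proxy.py, filling in the TODO items, and adding the configuration files still have to be done by hand.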

Benchmarks for Static ANN
