Bag of Indexes (BoI)

Bag of Indexes (BoI) is a multi-index hashing C++ library, used for solving Approximate Nearest Neighbors (ANN) search problems. In the belove figure, an example of BoI's working is presented.

LSH and multi-probe LSH are the only, at the moment, projection functions implemented.

Datasets

The original dataset files are converted in binary through the application of a simple C++ script:

After downloading the dat files you need to create a folder called dataset and then put in the uncompressed version.

Installation

Requirements:
- G++ 5.4 or greater
- openmp
Build: g++ -o BoI BoI.cpp -lstdc++fs -std=c++14 -fopenmp -O2

Test

./BoI SIFT1M 16 25 500 true ..

In the previous command the values of the parameters are:

δ = 16;
L = 25;
topN = 500;
fast re-ranking = true (loading descriptors from RAM).

Results

SIFT1M

The following table represents the results obtained through the application of BoI adaptive multi-probe LSH, trying to change some parameters:

L: number of hash tables;
δ: number of buckets per hash table (hash dimension);
topN: number of top elements to re-rank based on original distance (usually Euclidean distance).

In every test, the neighborhood is setted to 3.

Configuration	1	10	100	avg retrieval time
δ = 16, L = 25, top500	90.50%	91.57%	91.57%	6 msec
δ = 15, L = 50, top500	98.36%	99.19%	99.19%	18 msec
δ = 15, L = 50, top10k	99.20%	100%	100%	18 msec

The reduced average retrieval time is obtained through the multi-threading and the cut-off (an unsuperivesd reduction of the BoI structure). For more info see the paper cited in the Reference section. The reason behind the same retrieval time changing the top elements to re-rank is due to the speed of loading files from RAM memory.

GIST1M

Configuration	1	10	100	avg retrieval time
δ = 16, L = 100, top500	79.80%	80.80%	80.80%	60 msec
δ = 16, L = 100, top10k	97.40%	98.50%	98.50%	100 msec

All the experiments are evaluated following the Recall@R, that is the average rate of queries for which the 1-nearest neighbor is ranked in the top R positions.

Reference

@article{magliani2018efficient,
  title={Efficient Nearest Neighbors Search for Large-Scale Landmark Recognition},
  author={Magliani, Federico and Fontanini, Tomaso and Prati, Andrea},
  journal={arXiv preprint arXiv:1806.05946},
  year={2018}
}

Name		Name	Last commit message	Last commit date
Latest commit History 24 Commits
BoI.cpp		BoI.cpp
README.md		README.md
utils.h		utils.h

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BoI.cpp

BoI.cpp

README.md

README.md

utils.h

utils.h

Repository files navigation

Bag of Indexes (BoI)

Datasets

Installation

Test

Results

SIFT1M

GIST1M

Reference

About

Releases

Packages

Languages

fmaglia/BoI

Folders and files

Latest commit

History

Repository files navigation

Bag of Indexes (BoI)

Datasets

Installation

Test

Results

SIFT1M

GIST1M

Reference

About

Topics

Resources

Stars

Watchers

Forks

Languages