Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash using cosine similarity when calling search #40

Open
spullara opened this issue Apr 13, 2022 · 4 comments
Open

Crash using cosine similarity when calling search #40

spullara opened this issue Apr 13, 2022 · 4 comments

Comments

@spullara
Copy link

When indexing a small number of vectors I am getting this error when specifying cosine_similarity (euclidean works fine for instance):

thread 'hora_test' panicked at 'called `Option::unwrap()` on a `None` value', /Users/sam/.cargo/registry/src/github.com-1ecc6299db9ec823/hora-0.1.1/src/core/neighbor.rs:32:54
stack backtrace:
   0: rust_begin_unwind
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/std/src/panicking.rs:584:5
   1: core::panicking::panic_fmt
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/panicking.rs:143:14
   2: core::panicking::panic
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/panicking.rs:48:5
   3: core::option::Option<T>::unwrap
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/option.rs:752:21
   4: <hora::core::neighbor::Neighbor<E,T> as core::cmp::Ord>::cmp
             at /Users/sam/.cargo/registry/src/github.com-1ecc6299db9ec823/hora-0.1.1/src/core/neighbor.rs:32:9
   5: <hora::core::neighbor::Neighbor<E,T> as core::cmp::PartialOrd>::partial_cmp
             at /Users/sam/.cargo/registry/src/github.com-1ecc6299db9ec823/hora-0.1.1/src/core/neighbor.rs:38:14
   6: core::cmp::PartialOrd::le
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/cmp.rs:1129:19
   7: core::cmp::impls::<impl core::cmp::PartialOrd<&B> for &A>::le
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/cmp.rs:1505:13
   8: alloc::collections::binary_heap::BinaryHeap<T>::sift_up
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/alloc/src/collections/binary_heap.rs:562:16
   9: alloc::collections::binary_heap::BinaryHeap<T>::push
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/alloc/src/collections/binary_heap.rs:496:18
  10: hora::index::hnsw_idx::HNSWIndex<E,T>::search_layer::{{closure}}
             at /Users/sam/.cargo/registry/src/github.com-1ecc6299db9ec823/hora-0.1.1/src/index/hnsw_idx.rs:363:25
  11: <core::slice::iter::Iter<T> as core::iter::traits::iterator::Iterator>::for_each
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/slice/iter/macros.rs:211:21
  12: hora::index::hnsw_idx::HNSWIndex<E,T>::search_layer
             at /Users/sam/.cargo/registry/src/github.com-1ecc6299db9ec823/hora-0.1.1/src/index/hnsw_idx.rs:353:13
  13: hora::index::hnsw_idx::HNSWIndex<E,T>::search_knn
             at /Users/sam/.cargo/registry/src/github.com-1ecc6299db9ec823/hora-0.1.1/src/index/hnsw_idx.rs:433:25
  14: <hora::index::hnsw_idx::HNSWIndex<E,T> as hora::core::ann_index::ANNIndex<E,T>>::node_search_k
             at /Users/sam/.cargo/registry/src/github.com-1ecc6299db9ec823/hora-0.1.1/src/index/hnsw_idx.rs:615:55
  15: hora::core::ann_index::ANNIndex::search
             at /Users/sam/.cargo/registry/src/github.com-1ecc6299db9ec823/hora-0.1.1/src/core/ann_index.rs:93:9
  16: hora_c::hora_test
             at ./src/lib.rs:192:13
  17: hora_c::hora_test::{{closure}}
             at ./src/lib.rs:168:1
  18: core::ops::function::FnOnce::call_once
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/ops/function.rs:227:5
  19: core::ops::function::FnOnce::call_once
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/ops/function.rs:227:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.


failures:
    hora_test
@cnrpman
Copy link

cnrpman commented Jun 21, 2022

Same. Do you have any clue?

@spullara
Copy link
Author

spullara commented Jun 21, 2022

I ended up switching to brute force search for my use case so haven't revisited it.

https://github.com/spullara/bfes

@rangsikitpho
Copy link

I had the same issue and it's caused by this commit fca4516 which negates the output of the dot product resulting in calling sqrt() of a negative number when calculating the cosine distance. I'm not sure why the change was made but reverting it fixed CosineSimilarity for me though it may break other things. You can see the change I made here: rangsikitpho@0836f2c

@kernelsoe
Copy link

Thanks @rangsikitpho !
Removing negation in line 28 and 32 fixes this and top distance pairs look something like (0, -0.060707208), (3, -0.26921165), (1, -0.6891982), (2, -0.9331413)].
I just convert the distance.abs() to display score.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants