Recently, learned sparse representations have been developed that compute term weights with a neural model such as a transformer-based retriever and, together with document expansion, deliver strong relevance results.
A sparse vector looks like `((1, 0.2), (4, 0.3), (100, 0.4), (1000, 5.4), ...)`: the first field of each entry is the index (1, 4, 100, 1000), and the second is a floating-point weight.
A simple implementation of sparse vector search treats every index as a term and stores the weight in the postings; the similarity of two sparse vectors is then computed as a dot product.
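A minimal sketch of that idea in plain Rust (not tantivy's API): sorted `(index, weight)` pairs stand in for postings lists, and the dot product is computed with a two-pointer merge, the same access pattern as intersecting two postings lists.

```rust
// A sparse vector as (index, weight) pairs, sorted by index,
// mirroring "treat every index as a term, store the weight in postings".
type SparseVec = Vec<(u32, f32)>;

/// Dot-product similarity of two index-sorted sparse vectors,
/// computed with a two-pointer merge over the shared indices.
fn dot(a: &SparseVec, b: &SparseVec) -> f32 {
    let (mut i, mut j, mut score) = (0usize, 0usize, 0.0f32);
    while i < a.len() && j < b.len() {
        match a[i].0.cmp(&b[j].0) {
            std::cmp::Ordering::Less => i += 1,
            std::cmp::Ordering::Greater => j += 1,
            std::cmp::Ordering::Equal => {
                // Only matching indices contribute to the score.
                score += a[i].1 * b[j].1;
                i += 1;
                j += 1;
            }
        }
    }
    score
}

fn main() {
    let query: SparseVec = vec![(1, 0.2), (4, 0.3), (100, 0.4), (1000, 5.4)];
    let doc: SparseVec = vec![(4, 1.0), (100, 0.5), (2000, 2.0)];
    // Indices 4 and 100 overlap: 0.3 * 1.0 + 0.4 * 0.5
    println!("{}", dot(&query, &doc));
}
```

In a real inverted-index implementation, the per-document weights would live in the postings payload and the merge would run across the query's term postings rather than two in-memory vectors.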
Because k is usually large and the dimension of a sparse vector is often high (avg. > 100), Block-Max WAND performs even worse than an exhaustive OR search (i.e., without BMW). 2GTI looks like a good document-skipping algorithm for sparse vector search.
I'm wondering whether there is any plan for tantivy to support sparse vector search?
@cyccbxhl no plans at the moment. It is worth leaving the issue open, though. Several companies in that space are using tantivy, and this ticket could spark a discussion.