-
Hi Joan, thanks for the question. It's an interesting idea. There is no way to do this right now, but I do think it's possible to add. Here's a high-level outline of how:
- We would have to add the …
- And thread it down all the way to this method: …
- And use it to constrain the …

A couple of considerations: …
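For illustration only, here is a minimal sketch of what that change could look like at the Lucene level, assuming one field per hash table. The names (`buildHashQuery`, `queryL`, the `hash_i` field scheme) are hypothetical, not this project's actual code; the point is just that a query-time parameter bounds the loop over hash tables instead of always using all L of them.

```java
// Hypothetical sketch: a query-time parameter limiting how many of the L
// indexed hash tables are queried. Names and field scheme are illustrative.
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

final class LshQueryBuilder {

    /**
     * Builds a disjunction over only the first queryL hash tables.
     * Requires 1 <= queryL <= hashWords.length (the L used at index time).
     */
    static Query buildHashQuery(long[] hashWords, int queryL) {
        BooleanQuery.Builder builder = new BooleanQuery.Builder();
        // Bounded by the query-time parameter, not by hashWords.length == L.
        for (int i = 0; i < queryL; i++) {
            // Assumes one Lucene field per table, named "hash_0" .. "hash_{L-1}".
            Term term = new Term("hash_" + i, Long.toHexString(hashWords[i]));
            builder.add(new TermQuery(term), BooleanClause.Occur.SHOULD);
        }
        return builder.build();
    }
}
```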
-
Hi,
search performance (precision, recall, and time) is highly dependent on the values of K and L.
Once indexed, at search time the system needs to compute and search for L words of K bits each.
Increasing L improves recall, but search time also increases.
On large datasets it is difficult to know, a priori, what the effects of different K and L values will be, since the theoretical values assume a random distribution. Search times also depend on the size of the indices, memory, other fields, etc.
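As background for the "L words of K bits" point, here is a minimal self-contained sketch of one common LSH scheme (random hyperplanes); the hash family this project actually uses may differ. Each of the L tables packs K sign bits into one word, which is why query cost grows linearly with L while a larger K makes each word more selective.

```java
// Illustrative sketch of "L words of K bits": random-hyperplane LSH where
// each of the L tables produces one K-bit signature per vector (K <= 64).
import java.util.Random;

final class HyperplaneLsh {
    private final float[][][] planes; // [L][K][dim] random hyperplanes

    HyperplaneLsh(int L, int K, int dim, long seed) {
        Random rng = new Random(seed);
        planes = new float[L][K][dim];
        for (int l = 0; l < L; l++)
            for (int k = 0; k < K; k++)
                for (int d = 0; d < dim; d++)
                    planes[l][k][d] = (float) rng.nextGaussian();
    }

    /** Returns L signatures; signature l packs K sign bits into one long. */
    long[] hash(float[] vec) {
        long[] words = new long[planes.length];
        for (int l = 0; l < planes.length; l++) {
            long word = 0L;
            for (int k = 0; k < planes[l].length; k++) {
                double dot = 0.0;
                for (int d = 0; d < vec.length; d++) dot += planes[l][k][d] * vec[d];
                if (dot >= 0) word |= 1L << k; // one sign bit per hyperplane
            }
            words[l] = word;
        }
        return words;
    }
}
```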
From the implementation, I understand that internally there are as many Lucene fields as the value of L.
So, in order to speed up testing, I would like to index with a high value of L (e.g. 300) but then be able to search with a lower value of L, that is, using only the first L' indices, where L' <= L. That way I could compare results and build a table of execution time and precision for different L values without having to reindex millions of documents, which would help in deciding on the best L to use (for a given K).
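A sketch of the experiment this would enable, assuming a hypothetical `searchWithTables` method that accepts the lower L' at query time (per the reply above, this capability does not exist today):

```java
// Hypothetical experiment: index built once with L = 300, then recall and
// latency measured while only the first lPrime tables are queried.
import java.util.List;
import java.util.Set;

interface LshSearcher {
    /** Top-k document ids, querying only the first lPrime hash tables. */
    List<Integer> searchWithTables(float[] query, int k, int lPrime);
}

final class SweepL {
    static void sweep(LshSearcher searcher, float[] query, Set<Integer> groundTruth, int k) {
        for (int lPrime = 50; lPrime <= 300; lPrime += 50) { // 300 = indexed L
            long start = System.nanoTime();
            List<Integer> results = searcher.searchWithTables(query, k, lPrime);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            long hits = results.stream().filter(groundTruth::contains).count();
            double recall = (double) hits / groundTruth.size();
            System.out.printf("L'=%d  recall=%.3f  time=%dms%n", lPrime, recall, elapsedMs);
        }
    }
}
```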
It could also be useful in other scenarios: for example, in some searches where we only want very high-similarity results, or where we don't care much about recall, we could reduce L at query time.
So, my question/discussion is:
Is there any (easy) way to use a lower L at query time?