New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] Optimizer with negation incorrect output #4395
Comments
Hmm, my apologies, I actually have 170K more entries in the dataset which has totalSold below 100000. I will prepare a more concrete test case, please hold. |
OK, I have a better test case below: In bash, create 200,000 records:
Create index:
Then, execute the queries:
First one will return 0 items (since address is negated), but the second optimized query will return the record. |
If it helps, the query:
returns record 199999 and excluded 200000 as expected. Perhaps this bug is only triggered when the optimizer has to SkipTo the a record which is part of the negation, in which case it accepts the record incorrectly as a result. |
Hmm, not all the time. If I drop and re-create the index, [199999 200000] still includes record 200000, so it is dependent on where the document sits in the numeric index. Will keep testing and post here if I find anything else useful to fix the bug. |
OK, I have dug deeper into the code, and NOT iterator's NI_SkipTo() function returns INDEXREAD_NOTFOUND if it is not a match. But the loop in OPT_Read() does not actually check the results, and treats it as good if the docid's match, ie:
As a quick test, I updated the logic to be:
and this fixes the issue. Will keep testing to see this breaks other test cases which passed perviously. |
The fix above is not quite right. NOT iterators return INDEXREAD_NOTFOUND when the next record does not match, but for other iterators, it points to the next valid record when INDEXREAD_NOTFOUND is returned. To cater for this, the fix should be:
I am not sure what happens if the iterator is an intersect or union and the NOT iterator is further down. Will need to keep testing. |
Tested with NOT clause within a UNION (below), and it breaks the fix above... So, not too sure what is a general robust way to handle NOTs in the query:
Will keep playing around with it, but if anyone has some pointers to this bug, please let me know. |
I have abandoned the above fix, and found a simpler one, but with greater risks as it is further down at the iterator level. In index.c, we can see the NotInterator's skip to function NI_SkipTo():
Since the the Optimizer ignores INDEXREAD_NOTFOUND and INDEXREAD_OK and only reads the docId, I updated the above to:
Surprisingly, this has passed all the Optimizer test cases that failed before! I would like to ask if there is test suite in RediSearch which I can run to see the effect of the change above? And if so, how do I run it? |
Hi @keithchew |
Hi @meiravgri Thank you for your confirmation of the bug. I have a few updates. The docId + 1 fix above broke some test cases in redisearch's "make pytest", so that fix is not good, I thought it was too deep in anyway. I am currently testing the original fix above, but applying it in a few more places. In OPT_Read(), UI_SkipToHigh(), UI_SkipTo() and II_SkipTo(), I am checking if the type is NOT and only setting the minId on INDEXREAD_OK, ie:
So far, it seems to have passed all the test cases, but I am not comfortable with it as it touches quite a number of critical places in the code. Please advise here when a PR is ready, so I can help test the official fix and remove my changes. Thanks again. |
Hi @meiravgri Just wanted to note here another test case (can probably be classified as another bug since it is further down at the index level). In index.c for NI_SkipTo(), it says:
This logic might apply to other iterators, but for the NOT iterator, because it can either be INDEXREAD_OK or INDEXREAD_NOTFOUND if the childId is > dockId (you can verify this by by reading the code further down the snipper above). Here is a case where the Optimizer's OPT_Read() is requesting these SkipTos to the NOT iterator:
You can see above the first call's rc value is INDEXREAD_NOTFOUND but the third call with the same skip to docId returns INDEXREAD_OK, which is incorrect. |
Hi @meiravgri Do you have any ideas/suggestions on how to solve this. I had another good look at the code, and this appears to be non-trivial to fix. Without the fix, it is too dangerous to use, eg we can have a query like "-@type:{critical}" to purge non critical records, but the optimizer can return critical records that will be accidentally wiped out! And without using the optimizer, there can be a huge performance hit (up to 10x to 100x) depending on the dataset. Since I have had a good look into the code, let me know if you need me to try out some ideas/suggestions, or even contribute to the fix. |
@keithchew fixing this bug is on our roadmap. |
This issue is stale because it has been open for 60 days with no activity. |
Testing on v2.8.11
Not sure why I did not catch this in my earlier tests, but here it is...
Test case:
Queries:
Non optimized query above is correct:
The optimized version is incorrect:
I still need to dig a bit deeper to see why we are getting the error above, but looks like the optimizer does not like negations.
The text was updated successfully, but these errors were encountered: