This repository has been archived by the owner on May 27, 2020. It is now read-only.

Difference in query results (select * vs select col1,col2,...) #394

Open
rampeni opened this issue Jul 2, 2018 · 1 comment

@rampeni

rampeni commented Jul 2, 2018

We are facing a strange problem while testing the upgrade of our applications from Cassandra + Lucene plugin 2.2.4 to the latest 3.11.2 + plugin 3.11.1.0.

We get different results when querying a certain Lucene index. When selecting specific fields we get no result (0 rows found); when running the exact same query with select *, we get the expected rows.

Extra difficulty: the problem is hard to reproduce. Once it occurs for a certain table, it is reproducible on that table. However, dropping, recreating and refilling the same table makes the issue disappear.

We tried to recreate the issue on a sample table, but failed, probably due to the non-deterministic nature of the occurrence.
I've attached a sample statement (admittedly, the query can be improved, but we want to validate the upgrade first and touch the apps later).
Replacing the field list with a * makes it work (all via cqlsh).

We added some debug statements, the issue seems to be caused by "something" between searching and topping of the results in IndexPostProcessor.scala. After the collect, in both cases 50 rows are returned. After the top, in the case with fields we have no rows left.

Does this ring a bell with anyone? We've debugged further into the Lucene core but that's costing a lot of time.
So far, our investigation has reached the point where we think the difference is caused by the return value of org.apache.lucene.search.TermQuery's getTermsEnum method.
In the working case, it returns a Term, which is then used to define a scorer and obtain results.
In the faulty case, it returns null, no scorer is assigned and the story ends.
The org.apache.lucene.index.TermContext's internal TermState[] has been initialised, but nothing has been registered in this case so it only contains null.

Additional hint: the above actually causes us to end up in the topDocs method of org.apache.lucene.search.TopDocsCollector, where the authors added an if with the comment "Don't bother to throw an exception, just return an empty TopDocs in case the parameters are invalid or out of range. TODO: shouldn't we throw IAE if apps give bad params here so they dont have sneaky silent bugs?". An exception would have been nice, because in the faulty case the if is true and an empty collection is returned... for sure sneaky and silent :)
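
To make the silent empty result concrete, here is a minimal Java sketch of the kind of guard described above. It is a simplified model and not the Lucene source: the class name TopDocsGuardSketch, the method signature and the plain list of doc ids are assumptions made for illustration only. It shows that when nothing was collected (as happens when no scorer is assigned), any requested page comes back empty without an exception.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the range guard; not the Lucene implementation.
public class TopDocsGuardSketch {

    // Returns the requested page of collected doc ids, or an empty list when
    // the parameters are invalid or out of range (no exception is thrown).
    static List<Integer> topDocs(List<Integer> collected, int queueCapacity,
                                 int start, int howMany) {
        // Only hits that were actually collected count, capped by the queue size.
        int size = Math.min(collected.size(), queueCapacity);

        // The guard: bad or out-of-range parameters yield an empty result.
        if (start < 0 || start >= size || howMany <= 0) {
            return new ArrayList<Integer>();
        }

        int end = Math.min(size, start + howMany);
        return new ArrayList<Integer>(collected.subList(start, end));
    }

    public static void main(String[] args) {
        // Working case: 50 hits were collected, so the first page has 50 rows.
        List<Integer> hits = new ArrayList<Integer>();
        for (int i = 0; i < 50; i++) {
            hits.add(i);
        }
        System.out.println(topDocs(hits, 50, 0, 50).size()); // prints 50

        // Faulty case: nothing was collected, so start >= size and the
        // result is silently empty.
        List<Integer> none = new ArrayList<Integer>();
        System.out.println(topDocs(none, 50, 0, 50).size()); // prints 0
    }
}
```

In this model the empty result is a symptom rather than the cause: the guard only fires because zero hits reached the collector in the first place.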

Any ideas are helpful, either to solve the issue or to confirm that the "workaround" of using select * always returns the correct result.
broken.txt

@rampeni

rampeni commented Jul 2, 2018

Extra strangeness/hints: when we add the fields we search on to the query, we get different results.
vat_search and fuzzy_company_search: 50
exact_company_search: 0
company_search_prio1: 9, ...prio2: 41, ...prio3: 42...

So the combination field / query type probably has an influence.
I realise it is hard to comment, but we've gone a long way trying to provide a reproducible test case and have failed...

@rampeni rampeni changed the title Difference in query results (select * vs select coL1,col2,...) Difference in query results (select * vs select col1,col2,...) Jul 2, 2018