fix: Exclude characters with special meaning in Lucene Query Parser syntax from searchbylabel search (DEV-1446) #2269
✅ Linked to Bug DEV-1446 · DSP-API: searchbylabel doesn't find strings with "-"
Codecov Report
Base: 86.85% // Head: 86.99% // Increases project coverage by +0.13%

Coverage Diff
            main     #2269    +/-
Coverage    86.85%   86.99%   +0.13%
Files       241      242      +1
Lines       27967    28066    +99
Hits        24292    24415    +123
Misses      3675     3651     -24
LGTM (and you beat me to it - I was about to comment on the duplicated test data file when you removed it ^^)
- val searchString =
+ val sparqlEncodedSearchString =
    stringFormatter.toSparqlEncodedString(
      searchval,
      throw BadRequestException(s"Invalid search string: '$searchval'")
    )
Is this step still necessary if we replace some special characters anyway? I haven't checked, but I thought toSparqlEncodedString just replaces some characters too.
toSparqlEncodedString returns an error when it is handed a string containing a newline or an empty string. Also, the characters it replaces differ from the ones that are replaced because of the Lucene Query Parser syntax. I thought it better to keep the two steps separate, although there is some overlap (\" and \\).
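To illustrate the second step mentioned above, here is a minimal sketch of replacing characters that have special meaning in the Lucene Query Parser syntax. It is not the code from this PR: the object name `LuceneLabelSearchSketch`, the method name `stripLuceneSpecialChars`, and the choice to replace each special character with a space are all assumptions made for illustration. As a simplification, `&` and `|` are treated as special on their own, although Lucene only reserves them as `&&` and `||`.

```scala
object LuceneLabelSearchSketch {

  // Characters with special meaning in the Lucene classic query parser syntax.
  // Assumption: '&' and '|' are handled individually for simplicity.
  private val luceneSpecialChars: Set[Char] =
    Set('+', '-', '&', '|', '!', '(', ')', '{', '}', '[', ']',
        '^', '"', '~', '*', '?', ':', '\\', '/')

  // Hypothetical helper: replaces every special character with a space,
  // then collapses runs of whitespace and trims the result.
  def stripLuceneSpecialChars(input: String): String =
    input
      .map(c => if (luceneSpecialChars.contains(c)) ' ' else c)
      .split("\\s+")
      .filter(_.nonEmpty)
      .mkString(" ")
}
```

With this sketch, a label search for `Heinrich-Heine` would be sent to the index as the two terms `Heinrich Heine`, so the `-` is no longer interpreted as a prohibit operator.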
- ?resource <http://jena.apache.org/text#query> "@searchTerm.generateLiteralForLuceneIndexWithoutExactSequence" .
+ ?resource <http://jena.apache.org/text#query> (rdfs:label "@searchTerm.generateLiteralForLuceneIndexWithoutExactSequence") .
What effect does this change have? Has our search by label been defective all along?
The update doesn't change the behavior. According to the Apache Jena documentation, it is good practice to name the indexed property explicitly, so I thought I'd add it.
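As a rough illustration of the explicit form, the sketch below builds the triple pattern with the property named in the argument list. The object and method names are hypothetical, and the sketch assumes the query string has already been SPARQL-encoded and stripped of Lucene special characters.

```scala
object TextQueryPatternSketch {

  // Hypothetical helper: renders a Jena text query pattern that names
  // rdfs:label explicitly instead of relying on the index's default field.
  // Assumption: luceneQuery is already safe to embed in a SPARQL literal.
  def textQueryPattern(variable: String, luceneQuery: String): String =
    s"""?$variable <http://jena.apache.org/text#query> (rdfs:label "$luceneQuery") ."""
}
```

Calling `TextQueryPatternSketch.textQueryPattern("resource", "Heine")` yields a pattern of the same shape as the added line in the diff above.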
Issue Number: DEV-1446