New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consider brackets within wildcard as regular characters #19973
base: master
Are you sure you want to change the base?
Conversation
I had this PR in draft for a few months, and I had the following in sandbox, see the comment: // The order of the replacements below is important <-- this comment
// Replace "\?" (which has been regex-escaped above) with "."
wildcard.replace(u"\\?"_s, u"."_s);
// Replace "\*" (which has been regex-escaped above) with ".*"
// Consecutive wildcards are merged to avoid catastrophic backtracking
wildcard.replace(QRegularExpression(u"(?:\\\\*)+"_s), u".*"_s); On second thought, I cannot figure out any case where switching the replacement order would cause issues. But maybe I'm overlooking something I had thought of a few months ago. Does anyone think of any edge case? |
So, no one has anything to say about it? |
Mmmm, not sure about the number of backslashes here:
|
53ae771
to
ff37f4d
Compare
ff37f4d
to
095bf83
Compare
Force-pushed to (probably) fix the number of backslashes, as described above. |
There is a build failure, but it is unrelated to this PR, see here:
|
This PR is stale because it has been 60 days with no activity. This PR will be automatically closed within 7 days if there is no further activity. |
Coincidentally, I had seen the 5.0 beta release, and I was about to "bump" this PR. |
By the way, someone might be interested to submit a PR for Qt's wildcardToRegularExpression, to prevent regex catastrophic backtracking, if there are multiple consecutive It is as simple as adding the following in the case '*':
rx += settings.starEscape;
+++ while (i < wclen && wc[i] == u'*') {
+++ ++i;
+++ }
break; |
For context, in #4170 we added support for « ? » and « * » glob wildcards in search patterns. But a side-effect is that « [ » and « ] » also become special characters, which is rather unexcepted and undesirable.
This PR is a new attempt at #16965, which was broken and had to be reverted in #17820. I still don't know why #16965 was failing…
Here I'm taking a new approach: we fully regex-escape the provided pattern, then we replace
\?
and\*
with.
and.*
respectively. This approach should be more robust, and is even simpler actually. See this question on SO that led me to this approach.I'm taking the opportunity to point out that at the time of #4170 (2015), only plaintext search was available, hence the need for these wildcards. In 2018 the powerful regex search has been added: #9255. Ideally, we should be able to choose between plaintext, wildcard and regex searches, see this draft.