[web] Make search insensitive to differences in punctuation #1713

Open
hacketiwack opened this issue Jan 28, 2024 · 3 comments

@hacketiwack
Collaborator

Current situation
When searching for a song whose title contains punctuation characters, the search string must match those punctuation characters exactly.

Example
Let's say you have a song named L’Hymne à l’amour in your database and you want to search for it.
Entering the search term l'hymne returns no results, because the apostrophe stored in the database is ’ (U+2019) rather than ' (U+0027).
It would be great if the search matched regardless of which punctuation character is used.

For example, the search in Firefox takes this into account: searching for ’ or ' gives the same results.
In addition, other punctuation characters affect the search, e.g., « and », among others.

@ejurgensen
Member

Yes, I guess this is related to #1390, and I imagine the solution here is also some modification of the db collation. However, I'm not sure how to actually do that; much of what is done in that part of sqlext.c is black magic to me.

@hacketiwack
Collaborator Author

@ejurgensen indeed, it is certainly related. I will try to find more information on how to do that.
So far, my searches have not turned up anything useful.
I'll keep you posted.

@hacketiwack
Collaborator Author

I thought about this again, and I was wondering whether uc_is_punct could be used to ignore punctuation characters.

Maybe we could add a test on this line to skip punctuation characters, like below.

if ( !uc_is_punct(uString) && sqlite3Fts5UnicodeFold(uString, 1) != sqlite3Fts5UnicodeFold(uPattern, 1) )

However, I'm not sure what negative effects this would have.
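
To make the idea easier to reason about, here is a minimal, self-contained sketch of a comparison that skips punctuation on both sides. It is not the code in sqlext.c: the names utf8_decode, is_punct_cp, next_significant and equal_ignoring_punct are hypothetical, is_punct_cp is a small hand-written stand-in for uc_is_punct that only knows ASCII punctuation, « », and U+2018 to U+201F, and case folding is ASCII-only rather than sqlite3Fts5UnicodeFold. It also tests whole-string equality, not the pattern matching a real search would need.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one UTF-8 code point starting at s, store it in *cp and return the
 * number of bytes consumed. Malformed sequences are consumed as one byte. */
static size_t utf8_decode(const unsigned char *s, uint32_t *cp)
{
  if (s[0] < 0x80) { *cp = s[0]; return 1; }
  if ((s[0] & 0xE0) == 0xC0 && (s[1] & 0xC0) == 0x80) {
    *cp = ((uint32_t)(s[0] & 0x1F) << 6) | (s[1] & 0x3F);
    return 2;
  }
  if ((s[0] & 0xF0) == 0xE0 && (s[1] & 0xC0) == 0x80 && (s[2] & 0xC0) == 0x80) {
    *cp = ((uint32_t)(s[0] & 0x0F) << 12) | ((uint32_t)(s[1] & 0x3F) << 6)
          | (s[2] & 0x3F);
    return 3;
  }
  if ((s[0] & 0xF8) == 0xF0 && (s[1] & 0xC0) == 0x80 && (s[2] & 0xC0) == 0x80
      && (s[3] & 0xC0) == 0x80) {
    *cp = ((uint32_t)(s[0] & 0x07) << 18) | ((uint32_t)(s[1] & 0x3F) << 12)
          | ((uint32_t)(s[2] & 0x3F) << 6) | (s[3] & 0x3F);
    return 4;
  }
  *cp = s[0];
  return 1;
}

/* Hypothetical stand-in for uc_is_punct: ASCII punctuation (as in C's
 * ispunct), the guillemets, and the quotation marks U+2018..U+201F. */
static bool is_punct_cp(uint32_t cp)
{
  if (cp < 0x80)
    return (cp >= 0x21 && cp <= 0x2F) || (cp >= 0x3A && cp <= 0x40)
        || (cp >= 0x5B && cp <= 0x60) || (cp >= 0x7B && cp <= 0x7E);
  return cp == 0x00AB || cp == 0x00BB || (cp >= 0x2018 && cp <= 0x201F);
}

/* Advance *p past any punctuation and return the next code point,
 * lowercased if it is an ASCII letter. Returns 0 at end of string. */
static uint32_t next_significant(const unsigned char **p)
{
  while (**p) {
    uint32_t cp;
    *p += utf8_decode(*p, &cp);
    if (is_punct_cp(cp))
      continue;
    if (cp >= 'A' && cp <= 'Z')
      cp += 'a' - 'A';
    return cp;
  }
  return 0;
}

/* True if a and b are equal once punctuation and ASCII case are ignored. */
bool equal_ignoring_punct(const char *a, const char *b)
{
  const unsigned char *pa = (const unsigned char *)a;
  const unsigned char *pb = (const unsigned char *)b;
  for (;;) {
    uint32_t ca = next_significant(&pa);
    uint32_t cb = next_significant(&pb);
    if (ca != cb)
      return false;
    if (ca == 0)
      return true;
  }
}
```

With this sketch, L’Hymne and l'hymne compare equal, which is the behaviour asked for above. It also shows one possible negative effect: a string made up only of punctuation compares equal to the empty string, and titles that differ only in punctuation can no longer be told apart.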
