sqlite: add full text search index #4767

Open

brunnre8
Member

Our search did a linear scan over the message text data.
This is very inefficient for large databases, to the point of being barely usable.
We can add an FTS index, trading storage for speed.

As it is set up, this only supports English, but in return we get
stemming, so that, say, "work" matches itself as well as "working".

This could be reduced to just stripping funny characters (no stemming),
at the cost of worse search results for English.
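
To illustrate the trade-off described above, here is a minimal sketch, not the code from this PR. It assumes Python's stdlib sqlite3 is linked against an SQLite build with FTS5 enabled, and the table names `msg_stemmed` and `msg_plain` are hypothetical. It compares the FTS5 `porter` tokenizer (English stemming) with plain `unicode61`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables for illustration only; the PR's real schema may differ.
# 'porter unicode61' wraps the unicode61 tokenizer with the English Porter stemmer.
conn.execute(
    "CREATE VIRTUAL TABLE msg_stemmed USING fts5(text, tokenize = 'porter unicode61')"
)
# Plain unicode61: splits on punctuation and whitespace, no stemming.
conn.execute(
    "CREATE VIRTUAL TABLE msg_plain USING fts5(text, tokenize = 'unicode61')"
)

rows = [("I am working late",), ("that should work",), ("unrelated message",)]
for table in ("msg_stemmed", "msg_plain"):
    conn.executemany(f"INSERT INTO {table}(text) VALUES (?)", rows)


def search(table, term):
    """Return the matching message texts in insertion order."""
    sql = f"SELECT text FROM {table} WHERE {table} MATCH ? ORDER BY rowid"
    return [r[0] for r in conn.execute(sql, (term,))]


stemmed_hits = search("msg_stemmed", "work")  # stemming also finds "working"
plain_hits = search("msg_plain", "work")      # only the exact token "work"
print(stemmed_hits)
print(plain_hits)
```

With stemming, "work" finds both messages. Without it, only the message containing the exact word is returned.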

@brunnre8
Member Author

Now, this is the bare-bones version that only supports normal phrases: no wildcards, no prefix matches, no AND/OR logic, etc.

FTS5 supports all of that, though, so it's just a matter of coming up with a sane syntax and writing the parser for it:

https://www.sqlite.org/fts5.html
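
As a rough sketch of what "normal phrases only" can look like next to the raw FTS5 syntax: the helper `to_fts5_phrase` and the table `msg` below are hypothetical and not taken from this PR. The helper wraps user input in an FTS5 string literal so that operators in the input are treated as ordinary words. Again, this assumes an SQLite build with FTS5.

```python
import sqlite3


def to_fts5_phrase(user_input: str) -> str:
    """Wrap raw input in an FTS5 string literal so it is read as one phrase.

    FTS5 strings are double-quoted; embedded double quotes are escaped by doubling.
    """
    return '"' + user_input.replace('"', '""') + '"'


conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE msg USING fts5(text)")  # default tokenizer
conn.executemany(
    "INSERT INTO msg(rowid, text) VALUES (?, ?)",
    [(1, "working late again"), (2, "lunch at noon"), (3, "work or play")],
)


def match(query):
    """Return rowids of messages matching an FTS5 query string."""
    sql = "SELECT rowid FROM msg WHERE msg MATCH ? ORDER BY rowid"
    return [r[0] for r in conn.execute(sql, (query,))]


# Quoted: "OR" is just a word inside a three-word phrase, so nothing matches.
phrase_hits = match(to_fts5_phrase("lunch OR work"))
# Raw FTS5 syntax that a future parser could expose to users:
or_hits = match("lunch OR work")
and_hits = match("work AND play")
prefix_hits = match("wor*")
# An unbalanced quote in user input stays a valid query once escaped.
quote_hits = match(to_fts5_phrase('say "hi'))
print(phrase_hits, or_hits, and_hits, prefix_hits, quote_hits)
```

Quoting everything keeps arbitrary user input from being parsed as query syntax. A richer search syntax would have to translate user input into the raw forms deliberately.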

@brunnre8 added the "Type: Feature" and "Status: needs-review" labels on Jul 29, 2023
@brunnre8
Member Author

Don't merge this as is quite yet: it will not merge cleanly with the current master branch, due to a security feature IIRC.

It needs a rebase, and the checks need to be relaxed.

@brunnre8 added the "Meta: Do Not Merge" label and removed the "Status: needs-review" label on Jan 28, 2024
@brunnre8
Member Author

brunnre8 commented Jan 28, 2024

So, I fixed the obvious bugs. @MaxLeiter, would you mind re-running it on your machine and activating the storage cleaner with "everything" for a bit?

That's just to see whether pragma trusted_schema is needed or not.

(Running the cleaner executes the on delete triggers, in case it wasn't obvious why I'm asking.)
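
For readers unfamiliar with the setup, here is a minimal sketch of delete triggers keeping an FTS5 index in sync with a base table. It follows the external-content pattern from the FTS5 documentation. The table and trigger names (`messages`, `messages_fts`, `messages_ai`, `messages_ad`) are hypothetical, and the PR's actual schema may differ. The sketch leaves pragma trusted_schema at the connection default, so it does not settle whether the PR needs to change it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema for illustration; not the schema used in this PR.
conn.executescript(
    """
    CREATE TABLE messages(id INTEGER PRIMARY KEY, text TEXT);

    -- External-content FTS5 table: the index refers back to rows in messages.
    CREATE VIRTUAL TABLE messages_fts USING fts5(
        text, content = 'messages', content_rowid = 'id'
    );

    -- Keep the index in sync when rows are added.
    CREATE TRIGGER messages_ai AFTER INSERT ON messages BEGIN
        INSERT INTO messages_fts(rowid, text) VALUES (new.id, new.text);
    END;

    -- Remove index entries when rows are deleted (e.g. by a cleanup job).
    CREATE TRIGGER messages_ad AFTER DELETE ON messages BEGIN
        INSERT INTO messages_fts(messages_fts, rowid, text)
        VALUES ('delete', old.id, old.text);
    END;
    """
)

conn.executemany(
    "INSERT INTO messages(id, text) VALUES (?, ?)",
    [(1, "hello world"), (2, "hello there")],
)


def match(term):
    """Return rowids of indexed messages matching the term."""
    sql = "SELECT rowid FROM messages_fts WHERE messages_fts MATCH ? ORDER BY rowid"
    return [r[0] for r in conn.execute(sql, (term,))]


before = match("hello")
conn.execute("DELETE FROM messages WHERE id = 1")  # fires the delete trigger
after = match("hello")
print(before, after)
```

If the delete trigger could not write to the FTS table, the DELETE would raise an error here. That is roughly what running the storage cleaner would surface on a real database.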

@brunnre8 added the "Status: needs-review" label and removed the "Meta: Do Not Merge" label on Jan 28, 2024
Labels
Status: needs-review, Type: Feature