Speed-up Xapian searches by preloading indexes #617

Open
kelson42 opened this issue Aug 29, 2021 · 4 comments

@kelson42

#418 has shown that the typical steps for a search are:

  1. Read the ZIM file (to be able to locate the Xapian index in it): Cold: 7.44s | Warm: 0.12s
  2. Open the Xapian database (internal Xapian code): Cold: 0.09s | Warm: 0.003s
  3. Set the enquire on the database: Cold: 0.02s | Warm: 0.0004s
  4. Run the enquire and get a set of (ranged) results from the enquire (internal Xapian code): Cold: 3.74s | Warm: 1.5s

Here is when each step currently happens:

  1. Once at file opening
  2. At the first search request, then cached
  3. At each search
  4. At each search

In an attempt to speed up searches (in particular the first one), the idea would be to have the following workflow:

  1. Once at file opening
  2. Once at file opening (optional?) and then cached
  3. At file opening, then keep one (or more?) ready to go at all times
  4. At each search
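
A minimal sketch of this workflow, assuming steps 2 and 3 can be started from a background thread when the file is opened. It is written in Python for brevity rather than in libzim's C++, and `Archive`, `open_database` and `make_enquire` are hypothetical stand-ins, not libzim or Xapian APIs. It keeps a single enquire ready and reuses it, which may or may not be safe with the real objects.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def open_database(path):
    """Hypothetical stand-in for step 2 (opening the Xapian database)."""
    time.sleep(0.05)  # simulate the I/O cost of opening the index
    return {"path": path}


def make_enquire(database):
    """Hypothetical stand-in for step 3 (setting the enquire)."""
    return {"database": database}


class Archive:
    """Toy archive that preloads its search state when it is opened."""

    def __init__(self, path):
        self.path = path
        self._executor = ThreadPoolExecutor(max_workers=1)
        # Steps 2 and 3 start immediately but run off the calling thread,
        # so the constructor returns without waiting for them.
        self._enquire = self._executor.submit(self._preload)

    def _preload(self):
        return make_enquire(open_database(self.path))

    def search(self, query):
        # Blocks only if the preload has not finished yet.
        enquire = self._enquire.result()
        # Step 4 would run the real query here; this just echoes its input.
        return (enquire["database"]["path"], query)

    def close(self):
        self._executor.shutdown(wait=True)
```

With this shape, opening the file is not delayed by the preload, and the first search only waits for whatever part of the preload is still outstanding.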

Here are the related questions on my side:

  • Can we ensure that 2. does not slow down the opening of the file (so it should run in another thread)?
  • Can we ensure that 3. does not slow down the searches (so it should run in another thread)?
  • I guess the whole search system is protected so that two search requests cannot run at the same time. While this is safe in a multithreaded context, it will cause massive slowdowns if many search requests happen at the same time. Would it be possible/reasonable to have a pool of "searchers"?
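
To make the last question concrete, here is a rough sketch of what a pool of searchers could look like, assuming each searcher can be created up front and used by one thread at a time. It is in Python for brevity, and `SearcherPool` and `FakeSearcher` are hypothetical names, not part of libzim or Xapian.

```python
import queue
from contextlib import contextmanager


class SearcherPool:
    """Fixed-size pool of pre-created searchers (hypothetical objects)."""

    def __init__(self, make_searcher, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(make_searcher())

    @contextmanager
    def borrow(self):
        searcher = self._idle.get()  # blocks while all searchers are busy
        try:
            yield searcher
        finally:
            self._idle.put(searcher)  # always hand the searcher back


class FakeSearcher:
    """Stand-in for a real searcher; only echoes the query."""

    def search(self, query):
        return f"results for {query}"


pool = SearcherPool(FakeSearcher, size=2)
with pool.borrow() as searcher:
    print(searcher.search("kiwix"))  # prints "results for kiwix"
```

With a pool of size N, up to N searches can run concurrently instead of being serialised behind one lock. The cost is N open searchers held in memory.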
@kelson42

@mgautierfr @maneeshpm What do you think? Is that a proper approach?

@kelson42 kelson42 changed the title Speed-up Xapian searches Speed-up Xapian searches by preloading Xapian indexes Aug 29, 2021
@kelson42 kelson42 changed the title Speed-up Xapian searches by preloading Xapian indexes Speed-up Xapian searches by preloading indexes Aug 29, 2021
@kelson42 kelson42 pinned this issue Sep 2, 2021
@kelson42

Any update here?

@kelson42 kelson42 added this to the 7.1.0 milestone Sep 27, 2021
@kelson42

kelson42 commented Jan 2, 2022

@mgautierfr @maneeshpm We have started the dev of 7.2.0. Do we agree on this approach?

@kelson42 kelson42 removed their assignment Jan 2, 2022
@kelson42 kelson42 modified the milestones: 8.2.0, 8.3.0 Mar 18, 2023
@Jaifroid

Just to add to the documented issue here, Xapian-based search in the WASM version of libzim is basically unusable on Android, due to excessive I/O generated by libzim on startup. See openzim/javascript-libzim#42.

@kelson42 kelson42 modified the milestones: 9.0.0, 9.1.0 Sep 26, 2023
@kelson42 kelson42 modified the milestones: 9.1.0, 10.0.0 Nov 1, 2023
@kelson42 kelson42 unpinned this issue Dec 26, 2023