Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create and use a separate table (parts_by_lcsc) to index FTS5 parts table by rowid to speed up. #444

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

gyohng
Copy link
Contributor

@gyohng gyohng commented Apr 21, 2024

The purpose of this PR is to increase the speed of any operation that involves list rebuilding, which is basically any operation, such as part selection, sorting, hiding/showing BOM parts, or even invoking the plugin window.

The database increase due to this is +3.3%, which I think is a reasonable trade-off until a better way is found to integrate it with FTS5 directly while maintaining the speed.

The problem is illustrated in this YouTube video (the principal part of the commit is initially commented out and then uncommented to show the difference in performance impact).

Youtube video with demonstrating the problem

The commit includes the following changes:

  1. Creation of a New Table (parts_by_lcsc): A new table is created in the database to hold mappings between part identifiers (partsId) and the LCSC Part numbers. This table is designed to optimise lookup times by indexing parts based on their LCSC Part number rather than through direct queries on the parts table, which seems to be slow.

  2. Modification in get_part_details Function:

    • The method has been updated to first attempt fetching partsId values from the new parts_by_lcsc table using the provided LCSC Part numbers.
    • If the new method finds all required partsId, it then fetches the parts details directly using these IDs, which is much faster as it leverages the row IDs for direct access instead of scanning through text fields.
    • In case of any issues (e.g., the new table not having all the entries), it falls back to the original method of fetching data directly using the LCSC Part numbers from the parts table.
  3. Database Indexing in the download Method:

    • A significant process added in the download method is the population of the parts_by_lcsc table and the removal of an old index if it exists.
    • It iterates through every row in the parts table, inserts or updates the mapping in the parts_by_lcsc table, and updates a progress indicator, which communicates the operation's progress to the user interface.
    • After filling up the new table, it creates an index on the lcsc column of the parts_by_lcsc table to speed up the lookups.
    • This indexing is expected to significantly reduce the query time by utilising a quicker, more efficient index.
  4. User Interface Updates: The code updates a gauge control that reflects the progress of the database indexing operation. On a modern M1 MacBook Pro it finishes under 5 seconds and is significantly faster than the downloading.

…id instead of `LCSC Part` to significantly speed up performance.

+3.3% DB size.
@chmorgan
Copy link
Collaborator

chmorgan commented Apr 21, 2024

Can we avoid rebuilding the list to reduce the impact of list rebuilding being expensive?

I get having a separate index table and considered such things. Imo it isn't worth the complexity if the cost of a few seconds is only when users alter the sort list of a component list with 100+ components. But again, I'm likely biased.

@gyohng
Copy link
Contributor Author

gyohng commented Apr 22, 2024

Can we avoid rebuilding the list to reduce the impact of list rebuilding being expensive?

I don't think it's possible if we want to keep the component list synchronised to the database. Even at a cost of a major refactor, you'll still need to do it every time the plugin starts, which is quite annoyingly slow right now.

In the view of the FTS5 change, which expanded the database four-fold (and if you download the second time, watch for the multi-gigabyte bak file), I think a db size increase of 3.3% and an extra SELECT is not such a big thing.

@whmountains
Copy link
Collaborator

whmountains commented Apr 24, 2024

I second the observation that plugin startup has been very slow since the fts5 change. The plugin also freezes for about 20 seconds whenever I select a part which sort of defeats the purpose of fts5 in the first place. Search as you type is no fun if you then have to wait for the changes to commit.

I was just in the process of doing some profiling to troubleshoot the issue but it looks like @gyohng may have found it already.

(I'm using MacOS and KiCAD 8 BTW)

@chmorgan
Copy link
Collaborator

chmorgan commented Apr 24, 2024 via email

@whmountains
Copy link
Collaborator

whmountains commented Apr 24, 2024

The slowness seems to come from the populate_footprint_list function. I added some log statements to print the execution time whenever the function was called. Here are some measurements I got in standalone mode:

Time taken in the populate_footprint_list function:

  • Startup - 15.94s
  • Click "Select part" from the part picker dialog - 13.74s
  • Sort by stock in the main window - 13.40s.

@chmorgan
Copy link
Collaborator

chmorgan commented Apr 24, 2024 via email

@whmountains
Copy link
Collaborator

whmountains commented Apr 24, 2024

I think #360 is related. The issue was actually fixed in 5eb59e2, with the comment "massive startup time improvement" but it never got closed. We lost this index with the move to FTS5, which caused startup time to once again take a lot longer.

@gyohng
Copy link
Contributor Author

gyohng commented Apr 25, 2024

I think #360 is related. The issue was actually fixed in 5eb59e2, with the comment "massive startup time improvement" but it never got closed. We lost this index with the move to FTS5, which caused startup time to once again take a lot longer.

Hi! :) I am actually a fan of your previously submitted PR with LIKE optimisation, but now FTS5 took over...

This PR is essentially a replica of 5eb59e2. The main problem with 5eb59e2 (indexing by LCSC Part) now is that there's no way to reintroduce it with FTS5 - the FTS5's own index isn't quite as efficient, and sqlite won't accept index creation for an FTS table.

I went even as far as creating an index on the internal parts_content table for the FTS structure (it didn't help). If I had to choose, I'd probably keep the FTS5 part separately just for the part search and otherwise resort to the old table with index approach for everything else, even despite the data growth.

@whmountains
Copy link
Collaborator

Hi @gyohng! I thought I already replied but I can't find the comment so I'll reply again.

I think this comes down to workflow. You and I make heavy use of parametric search, so #439 was really all we needed and FTS5 has a high bar to meet. @chmorgan and probably a lot of other people prefer a single search-box interface hence the project going in that direction.

But I think we all agree that the UI freezing for 15 seconds or more every time a part is selected is not acceptable to anyone's workflow. You may be interested in the discussion under #451, where we are trying to figure out where the slowdown is coming from. And #450 which is a totally diffferent idea for a parts database.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants