Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Operations are slower with fts5 database #451

Open
chmorgan opened this issue Apr 24, 2024 · 9 comments
Open

Operations are slower with fts5 database #451

chmorgan opened this issue Apr 24, 2024 · 9 comments
Assignees

Comments

@chmorgan
Copy link
Collaborator

The slowness seems to come from the populate_footprint_list function. I added some log statements to print the execution time whenever the function was called. Here are some measurements I got in standalone mode:

Time taken in the populate_footprint_list function:

Startup - 15.94s
Click "Select part" from the part picker dialog - 13.74s
Sort by stock in the main window - 13.40s.
@chmorgan
Copy link
Collaborator Author

@whmountains thought we could track the performance stuff here.

@chmorgan
Copy link
Collaborator Author

@whmountains it would be neat if we had these profiling calls built into the library, I'm having to add them here but if we call these functions rarely the overhead of the profiling calls (using time.time() difference here) should be very low.

@chmorgan
Copy link
Collaborator Author

@whmountains, I'm seeing things take much less time here, M3 MacBook:

/Users/kicad/remoteroot/workspace/build-macos-kicad-release/kicad-mac-builder/build/kicad/src/kicad/pcbnew/action_plugin.cpp(163): assert "PgmOrNull()" failed in register_action().
starting jlcpcbtools standalone mode...
duration 1.013922929763794
cmorgan@MacBook-Pro plugins % 

Opening 'select part' is ~1 second.

Sorting takes a handful of seconds.

@whmountains
Copy link
Collaborator

Thanks for checking into that @chmorgan !

I wonder what could be the source of the speed difference? The only thing I can think of is that your M3 MacBook is definitely faster than my 2020 MBP. Still, a factor of 10 seems like a lot.

Are we talking about the same function? The slowness I am observing is with populate_footprint_list in mainwindow.py, not populate_part_list in partselector.py. I'm guessing you already know that but I just wanted to check since you mentioned opening 'select part' which is not something I even benchmarked. I'm sort of grasping at straws here. The big slowdown I saw here was upon closing 'select part'.

@chmorgan
Copy link
Collaborator Author

diff --git a/mainwindow.py b/mainwindow.py
index 9cfa916..2261925 100644
--- a/mainwindow.py
+++ b/mainwindow.py
@@ -559,6 +559,7 @@ class JLCPCBTools(wx.Dialog):
 
     def populate_footprint_list(self, *_):
         """Populate/Refresh list of footprints."""
+        start = time.time()
         if not self.store:
             self.init_store()
         self.footprint_list.DeleteAllItems()
@@ -626,6 +627,8 @@ class JLCPCBTools(wx.Dialog):
                         break
 
             self.footprint_list.AppendItem(part)
+        end = time.time()
+        print(f"duration {end - start}");
 
     def OnSortFootprintList(self, e):
         """Set order_by to the clicked column and trigger list refresh."""

@whmountains
Copy link
Collaborator

Looks like we are benchmarking the same thing with the same technique.

Funny thing happened while testing today. While testing #452 I noticed it was a lot faster and added the timing code. Sure enough populate_footprint_list was taking exactly 2.00s. Every single time. (I formatted it to two decimal places so there could have been some slight variation beyond that.)

So I went back to the latest from the official repo, and am seeing over 14s and a spinning beachball again:

starting jlcpcbtools standalone mode...
duration 14.247678995132446
2024-04-25 10:25:55.205 Python[32583:7910929] This application is trying to draw a very large combo box, 30 points tall.  Vertically resizable combo boxes are not supported, but it happens that 10.4 and previous drew something that looked kind of sort of okay.  The art in 10.5 does not break up in a way that supports that drawing.  This application should be revised to stop using large combo boxes.  This warning will appear once per app launch.
duration 14.376977920532227

There must be some hidden variable which is causing the huge speed difference.

@whmountains
Copy link
Collaborator

For consistency's sake, let's define a standard way of testing. I propose the following:

  1. Check out c37d15e (currently the latest commit), and make sure git status shows no modified files like settings.json.
  2. Apply the patch you listed above. git apply ../add_timing.patch
  3. Delete the global data folder. rm -r <plugin_dir>/jlcpcb
  4. Delete the project-specific data folder. (working directory in standalone mode) rm -r <workdir>/jlcpcb
  5. Open the plugin in standalone mode and wait for db download.
  6. If an exception occurs while downloading the DB, make note of it but continue on. (I am now hitting Plugin sometimes stops working after combining part files. #453 almost every time so this is the only way I can proceed.)
  7. Relaunch the plugin in standalone mode, select R1, and click on "Assign LCSC number".
  8. Select C1002 from the list, search for it if neccesary, and click "select part"
  9. Observe the logged duration in the console.

@whmountains
Copy link
Collaborator

After following those steps many times trying out different configurations, I have some observations:

  1. When running in standalone, the first call to that function only takes 0.0025s. I can quit and relaunch as many times as I want but this first call is consistently 1-3ms. I suspect this is more related to the standalone mode stubbing than
  2. First time I select a part it takes 7s
  3. Additional part selections only take 1.7s
  4. After relaunching, the first call on startup takes 1.7s as well now, I never get the 0.0025ms startup again once I have selected one part.

I also sporadically observed:

  • A 9s performance band, correlated with running ANALYZE on the parts db.
  • A 13-15s performance band, correlated with deleting the project-level jlcpcb directory but not the global databases.

I could not reproduce either of these reliably. But I can say anecdotally that the 13-15s performance band is what I was stuck in on my last real-world project using this tool. Was quite annoying for sure!

In summary I think we can say the following:

  • Performance of the populate_footprint_list is highly variable.
  • It seems to have something to do with sqlite, rather than network speed or the python interpreter.
  • Performance is pretty consistent over repeated test runs, but when you stop testing and do something else, you will generally come back to find it in a different performance band.
  • It is an open question whether these performance bands have to do with os-level disk caching or the sqlite query planner.

I'm currently suspecting disk cache. It makes sense that the DB would be cached after the first query, and a 10x speedup in table scan speed would not be unexpected if the database file happened to be cached in RAM. I have 32Gb of RAM on my MBP so a lot of caching is certainly happening.

@whmountains
Copy link
Collaborator

whmountains commented Apr 25, 2024

Update: I now am quite certain that this is related to the database file getting cached in RAM.

Steps to reproduce on MacOS:

  1. Open Activity Monitor, switch to memory tab, and keep an eye on the "Cached Files" number.
  2. Run sudo purge in the terminal. You should see the "cached files" value go down.
  3. Immediately run the JLCPCB tools plugin in standalone and observe startup time. In my case it is over 17s with a cold disk cache. @chmorgan, you will probably not see it quite so slow because of the blazing fast SSD in an M3 MacBook.
  4. Close and re-launch the program again. Note how the startup time has gone down significantly.
  5. Note the amount of "Cached Files" in activity monitor.

If disk cache is indeed the issue, it will be hard to reproduce this consistently because it is up to the OS to decide when to cache and when not to cache. On my system, and probably yours as well @chmorgan, the OS has figured out that the parts DB is a hot file and aggressively caches it. On the system of a new user, or on any system with a lot of other apps open, it would be much less likely to get cached.

Something in populate_footprint_list is forcing a full table scan, but our high-end computers are stepping in and sparing us the inefficiency. I really hope we can figure out the proper optimizations such that it's not necessary to read through 4Gb of data every time a part is selected. And hopefully not at startup either. 🤪

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants