Drawing many annotations is very slow #66

Open
Azeirah opened this issue Mar 22, 2023 · 2 comments

Comments

@Azeirah
Contributor

Azeirah commented Mar 22, 2023

I tried converting my private diary, which has over 250 pages of handwritten text. Converting this file to PDF takes well over two minutes on my high-end PC.

The primary bottleneck lies in drawing the line segments, see the "update_annotations" call below.

[image]

See this code: https://github.com/lucasrla/remarks/blob/master/remarks/conversion/drawing.py#L178

@Azeirah
Contributor Author

Azeirah commented Mar 22, 2023

Luckily, I already found a way to optimize this heavy bottleneck, and it runs waayyy faster.

My theory was that the "update_annot" call has a lot of overhead, so my idea was to batch the calls.

How the code works right now:

If you have 1000 lines on a page (which is not unusual for handwritten pages like those in a diary), there will be 1000 calls to add_ink_annot and annot.update(). This is very slow because every call carries its own overhead.

My alternative idea is to batch the lines per tool configuration. For example, if the user drew 960 lines with the "pen" tool at width=medium and 40 with the "pen" tool at width=large, there would be only two calls to add_ink_annot and annot.update():

  • Batch every line the user has drawn with the same tool-configuration (taking into account tool-type, stroke_width, color and opacity)
    • Set up a new fitz annotation per batch
    • Configure the tool
    • Finalize by calling update
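The steps above could be sketched roughly as follows. This is a minimal illustration, not the actual remarks code: `ToolConfig`, `batch_strokes` and `draw_batches` are hypothetical names, and `page` is assumed to be a PyMuPDF (fitz) page whose `add_ink_annot` accepts a list of strokes, each stroke being a list of points.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class ToolConfig:
    """Hypothetical batch key: everything that affects how a stroke is rendered."""
    tool: str
    stroke_width: float
    color: Tuple[float, float, float]
    opacity: float


def batch_strokes(
    strokes: Iterable[Tuple[ToolConfig, List[Point]]]
) -> Dict[ToolConfig, List[List[Point]]]:
    """Group the point lists of all strokes that share the same tool configuration."""
    batches: Dict[ToolConfig, List[List[Point]]] = defaultdict(list)
    for config, points in strokes:
        batches[config].append(points)
    return dict(batches)


def draw_batches(page, batches: Dict[ToolConfig, List[List[Point]]]) -> int:
    """Create one ink annotation per batch and return how many were created.

    `page` is assumed to behave like a PyMuPDF page object.
    """
    for config, point_lists in batches.items():
        # One annotation holds every stroke in the batch.
        annot = page.add_ink_annot(point_lists)
        annot.set_border(width=config.stroke_width)
        annot.set_colors(stroke=config.color)
        annot.set_opacity(config.opacity)
        # update() is called once per batch instead of once per line.
        annot.update()
    return len(batches)
```

With the 960 medium plus 40 large pen lines from the example, `batch_strokes` would yield two batches, so `draw_batches` makes two `add_ink_annot` and two `update()` calls. One possible trade-off to keep in mind: grouping by configuration may change the order in which overlapping strokes of different tools are drawn.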

I simulated this approach without the per-tool batching, so I just defaulted all lines to the same pen configuration and the performance improved drastically! This is a very promising speedup.

[image]

@Azeirah
Contributor Author

Azeirah commented Mar 22, 2023

This one was run with the actual batching:

Performance difference between the simulated batching and the real batching is negligible, which is great.

[image]
