Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Busy loop in GCODE analysis thread causes server to get blocked #4993

Closed
foosel opened this issue Apr 17, 2024 · 4 comments
Closed

Busy loop in GCODE analysis thread causes server to get blocked #4993

foosel opened this issue Apr 17, 2024 · 4 comments
Assignees
Labels
approved Issue has been approved by the bot or manually for further processing bug Issue describes a bug bugfix release relevant Issue should be looked at for backporting into the next bugfix release done Done but not yet released
Milestone

Comments

@foosel
Copy link
Member

foosel commented Apr 17, 2024

Problem

As found thanks to this forum thread, OctoPrint's GCODE analysis currently blocks the whole server, which is especially an issue when analysing a whole bunch of files on server startup, and less so an issue when processing a freshly uploaded single file.

The culprit seems to be this:

try:
# let's wait for stuff to finish
while p.returncode is None:
if self._aborted:
# oh, we shall abort, let's do so!
p.commands[0].terminate()
raise AnalysisAborted(reenqueue=self._reenqueue)
# else continue
p.commands[0].poll()
finally:
p.close()

Using poll here returns immediately, other than obviously understood when implementing things as they are it is not blocking. This leads to the whole thing being a busy loop with no sleep, meaning the CPU is completely taken up by this single thread and there's next to no chance for the system to schedule other threads.

Solution

This loop needs to be made non busy. One options is to instead of poll simply use wait, but that has the disadvantage that we won't be able to quickly abort an analysis in the middle of it, e.g. on start of a print or when the file gets deleted.

Any further points

Similar code snippets are also used in the implementation of octoprint.util.commandline.CommandlineCaller and the by now hopefully pretty much unused update-octoprint.py helper script of the software updater.

The former sees use in installing and updating plugins and such, and therefore needs to be investigated. The issue with using wait here is the need to report back any generated stdout or stderr lines to the user in intervals. The issue has possibly been made less problematic here by the associated readlines calls already containing a half second timeout each.

@foosel foosel added bug Issue describes a bug approved Issue has been approved by the bot or manually for further processing bugfix release relevant Issue should be looked at for backporting into the next bugfix release labels Apr 17, 2024
@foosel foosel added this to the 1.10.x milestone Apr 17, 2024
@foosel foosel self-assigned this Apr 17, 2024
@GitIssueBot
Copy link

This issue has been mentioned on OctoPrint Community Forum. There might be relevant details there:

https://community.octoprint.org/t/python-looping-after-installing-preheat-button/57933/7

foosel added a commit that referenced this issue Apr 17, 2024
foosel added a commit that referenced this issue Apr 17, 2024
Should *not* lead to a busy loop thanks to the blocking I/O.

See #4993
@foosel
Copy link
Member Author

foosel commented Apr 17, 2024

Fixed by the above commit, and I also took care of a refactoring to make CommandlineCaller thing clearer. The update-script turned out to not be affected either thanks to blocking I/O inside, it really looks like it was only the gcode analysis queue doing a busy loop.

@foosel foosel added the done Done but not yet released label Apr 17, 2024
@foosel
Copy link
Member Author

foosel commented Apr 17, 2024

Currently this is only available on maintenance, but the goal is to backport the two commits to 1.10.1 once that's going out.

foosel added a commit that referenced this issue Apr 25, 2024
Should *not* lead to a busy loop thanks to the blocking I/O.

See #4993
@foosel
Copy link
Member Author

foosel commented Apr 25, 2024

Backported to 1.10.1.

@foosel foosel closed this as completed in 8018abd May 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Issue has been approved by the bot or manually for further processing bug Issue describes a bug bugfix release relevant Issue should be looked at for backporting into the next bugfix release done Done but not yet released
Projects
Status: Done
Development

No branches or pull requests

2 participants