Restart worker after idle period #9031

Open
4 tasks done
prhbrt opened this issue May 17, 2024 · 0 comments

prhbrt commented May 17, 2024

I'll work around this myself, so this is just a suggestion to help the AI comrades, whose tasks are often long-running and have heavy hardware/memory requirements.

  • I have checked the issues list for similar or identical feature requests.
  • I have checked the pull requests list for existing proposed implementations of this feature.
  • I have checked the commit log to find out if the same feature was already implemented in the main branch.
  • I have included all related issues and possible duplicate issues in this issue (If there are none, check this box anyway).

Related Issues and Possible Duplicates

  • None

Related Issues

  • None

Possible Duplicates

  • None

Brief Summary

To keep (AI/LLM) models loaded in (GPU) memory while they are in use, and still free that memory when they are not, an option similar to worker_max_tasks_per_child that kills a worker after a period of idleness would be helpful. This naturally gives caching: the model stays loaded while tasks keep arriving and is reloaded only when needed.

Design

Architectural Considerations

I couldn't quite grasp the Celery consumer loop, so I'm going to mimic this behavior by spinning off a thread that checks the idle time and releases memory once the worker has been idle for too long.
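
A minimal sketch of that workaround thread, using only the standard library. The class name `IdleWatchdog`, its methods, and the `on_idle` callback are all hypothetical names chosen for illustration; nothing here is Celery API. It assumes the application calls `task_started()` and `task_finished()` around each task, for example from handlers connected to Celery's `task_prerun` and `task_postrun` signals, though whether that fits a given pool type is untested here.

```python
import threading
import time
from typing import Callable


class IdleWatchdog:
    """Call `on_idle` once after `max_idle` seconds without any running task."""

    def __init__(self, max_idle: float, on_idle: Callable[[], None],
                 poll_interval: float = 1.0) -> None:
        self.max_idle = max_idle            # seconds of idleness before firing
        self.on_idle = on_idle              # hypothetical hook, e.g. unload a model
        self.poll_interval = poll_interval  # how often the thread wakes up
        self._last_activity = time.monotonic()
        self._busy = 0                      # number of tasks currently running
        self._fired = False                 # avoid firing repeatedly while idle
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        # Joining from inside the watchdog thread itself would raise.
        if threading.current_thread() is not self._thread:
            self._thread.join()

    def task_started(self) -> None:
        with self._lock:
            self._busy += 1
            self._fired = False             # new work re-arms the watchdog

    def task_finished(self) -> None:
        with self._lock:
            self._busy = max(0, self._busy - 1)
            self._last_activity = time.monotonic()

    def _run(self) -> None:
        # Event.wait returns False on timeout, True once stop() was called.
        while not self._stop.wait(self.poll_interval):
            with self._lock:
                idle_for = time.monotonic() - self._last_activity
                should_fire = (self._busy == 0 and not self._fired
                               and idle_for >= self.max_idle)
                if should_fire:
                    self._fired = True
            if should_fire:
                self.on_idle()              # called outside the lock
```

The callback runs outside the lock so that a slow release step (freeing GPU memory, say) does not block tasks from reporting their start.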

However, my best guess is that this is the consumer loop. My next guess is that it uses some sort of condition variable to wait for jobs. I'd suggest adding a timeout to that wait, so that the worker wakes up periodically and checks worker_max_idle.
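
The timeout idea can be illustrated with a generic loop. This is not Celery's consumer loop, which is only guessed at above; it is a standard-library sketch in which `consume`, `check_interval`, and the `None` shutdown sentinel are assumptions made for the example, and `worker_max_idle` is the option proposed in this issue rather than an existing setting.

```python
import queue
import time
from typing import Callable, Optional


def consume(jobs: "queue.Queue[Optional[Callable[[], None]]]",
            worker_max_idle: float,
            check_interval: float = 1.0) -> str:
    """Run jobs until the loop has been idle for `worker_max_idle` seconds.

    Returns "idle" when the idle limit is reached, or "shutdown" when a
    None sentinel is received.
    """
    last_activity = time.monotonic()
    while True:
        try:
            # A blocking wait with a timeout: wake up regularly even
            # when no job arrives, so the idle time can be checked.
            job = jobs.get(timeout=check_interval)
        except queue.Empty:
            if time.monotonic() - last_activity >= worker_max_idle:
                return "idle"       # caller would kill or restart the worker
            continue
        if job is None:
            return "shutdown"
        job()
        last_activity = time.monotonic()  # idle time counts from job completion
```

In this sketch the loop simply returns; what happens next (exiting the process so a supervisor or pool replaces it, or only releasing memory) is left to the caller.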

Proposed Behavior

After a worker has been idle for at least worker_max_idle, the worker is either killed or restarted.

Proposed UI/UX

worker_max_idle=120
