Tasks are not allowed to start subprocesses #1709
This has not changed between 3.0 and 3.1, so I'm not sure why you would get this error now and not before.
This is how the error can be reproduced. app.py:
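The original snippet was not preserved; a minimal sketch of what app.py plausibly contained, with illustrative names (the broker URL and task body are assumptions):

```python
# app.py -- illustrative reconstruction, not the original snippet
import multiprocessing
from celery import Celery

app = Celery('app', broker='amqp://guest@localhost//')  # broker URL is an assumption

@app.task
def my_task():
    # spawning a child process from inside a task is what triggers
    # "daemonic processes are not allowed to have children"
    p = multiprocessing.Process(target=print, args=('hello from child',))
    p.start()
    p.join()
```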
sendtask.py:
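Likewise a sketch for sendtask.py, under the same assumptions:

```python
# sendtask.py -- illustrative reconstruction
from app import my_task

my_task.delay()
```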
I run the worker using the following command: … With Celery 3.0.24 the task succeeds:
With Celery 3.1.5 it does not:
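The tracebacks were lost in extraction; per the rest of the thread, the failure under 3.1.5 is the standard assertion that multiprocessing raises when a daemonic process tries to fork:

```
AssertionError: daemonic processes are not allowed to have children
```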
My understanding of the issue is the following: … And it seems that I'm not alone with this problem: http://stackoverflow.com/questions/20149421/threads-in-celery-3-1-5
One difference is that the worker process is now a subclass of Process, where before it was created with the target function argument. multiprocessing and old versions of billiard set daemon = True on the pool's worker processes, and it's the same in the latest version.
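A self-contained illustration of why that daemon flag matters (plain multiprocessing, no Celery involved):

```python
import multiprocessing

def grandchild():
    print('hello from grandchild')

def child():
    # because the parent was started with daemon=True, this raises:
    # AssertionError: daemonic processes are not allowed to have children
    multiprocessing.Process(target=grandchild).start()

if __name__ == '__main__':
    p = multiprocessing.Process(target=child)
    p.daemon = True  # this is what the prefork pool does to its workers
    p.start()
    p.join()
```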
I run the worker using: celery worker --app=tasks -Q wb -l info --concurrency=1. But when I use the celeryd script to start a worker, I get this exception: … I think that the task process being a daemon presents a serious limitation for implementing tasks.
I figured out what caused the change in behaviour. To my understanding, there was a bug prior to version 3.1 (tasks were allowed to create subprocesses, which could result in an orphaned state) and now this bug has been fixed.
The decision to not allow python daemon processes to fork seems rather arbitrary to me. While I recognize the good faith of it, I feel I should have full control over this behavior if I choose to. Being bound to one process per task seems like a serious limitation to me. Thoughts?
I wonder why that limitation is there in the first place; a warning I can understand, but outright disallowing it seems silly when you are perfectly able to fork processes using other means.
@ask, would it be possible to initialize the celery worker process with the daemon flag set to False? Or make this configurable?
@ilyastam seems we were commenting at the same time.

I agree that it seems like an arbitrary limitation, but I wish I knew the rationale behind adding it in the first place. This is a well-known pitfall in posix systems, but it's still allowed. You may clean up child processes in a signal handler, though that does not protect you against SIGKILL.

I think we should remove the limitation from billiard, even though that would diverge from the multiprocessing behavior. You can still create child processes using the subprocess module.
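A sketch of the signal-handler cleanup idea mentioned above (names are illustrative; as noted, nothing can run on SIGKILL):

```python
import multiprocessing
import signal
import sys

children = []

def reap_children(signum, frame):
    # best-effort cleanup on SIGTERM; a SIGKILL'd parent never reaches this
    for child in children:
        child.terminate()
        child.join()
    sys.exit(0)

signal.signal(signal.SIGTERM, reap_children)

def spawn(target, *args):
    # track every child we start so the handler can reap them
    child = multiprocessing.Process(target=target, args=args)
    child.start()
    children.append(child)
    return child
```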
@ilyastam Should be able to just remove the raise statement; we don't have to make the processes "non-daemon". That is, daemon processes will be allowed to create child processes even if they will not be able to reap them, …
Btw, note that this is not a …
billiard 3.3.0.11 is on PyPI, including this change.
@ask thank you. Any idea what version of celery will see this improvement?
This limitation is documented and I don't think that it is a good idea for Celery to silently monkey-patch multiprocessing. I can think of the following example (it may seem a bit contrived, though):
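The example itself was lost in extraction; judging from the next paragraph (three subprocesses left hanging), it was presumably something along these lines:

```python
import multiprocessing
import time

def wait_forever():
    time.sleep(1000)

def example():
    # start three children and return without joining them;
    # under a daemonized worker they end up orphaned
    for _ in range(3):
        multiprocessing.Process(target=wait_forever).start()
```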
Being run as a plain Python function, this code works correctly. But being run as a Celery task (using Celery version 3.0.*), it leaves three subprocesses that will hang forever; when the Celery worker quits, these subprocesses become orphaned.
It doesn't explain why; it just states the unix behavior that you would expect when starting a child-child process. Even though it's an infamous limitation in unix, it doesn't stop people from doing it. This is no different from … The way to do your example:
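The accompanying snippet didn't survive; presumably it showed the children being joined (or terminated) before the task returns, e.g.:

```python
import multiprocessing
import time

def wait_forever():
    time.sleep(1000)

def example():
    children = [multiprocessing.Process(target=wait_forever) for _ in range(3)]
    for child in children:
        child.start()
    try:
        pass  # do the actual work here
    finally:
        for child in children:
            child.terminate()  # make sure no child outlives the task
            child.join()
```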
To kill (-9) this you would have to also kill -9 the child processes, but that is something you will have to live with. Not that I advocate creating a Pool for every task, but I don't see why users who know what they're doing shouldn't be allowed to start processes from a task. Also, we don't monkey-patch anything; this is a change in billiard only.
By "monkey patching" I mean this assignment, which replaces I agree that there is nothing wrong with starting child-child processes if they are handled right (like in your example). My point is that |
@aromanovich It cannot be written any other way; it's not a limitation of multiprocessing, it's a limitation of unix. It sets _current_process so that the logging module's processName format field works.
And btw, you would have to use billiard for the limitation to be lifted; using multiprocessing will still raise the exception.
Could also fix this issue using this approach: …
I get this error when calling a @parallel fabric task from within a celery task.
@frodopwns use ENV …
@xiaods I think I solved that issue with something like this:
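The snippet was lost; the workaround that circulated for this problem clears the daemon flag on the pool child via the worker_process_init signal. It touches a private attribute (here as in Python 3.4+ / recent billiard; older versions used a different name), so treat it as a hack:

```python
from celery.signals import worker_process_init

@worker_process_init.connect
def allow_children(**kwargs):
    import multiprocessing
    # private API: clear the daemon flag on the pool child so that
    # Process.start() no longer refuses to fork
    multiprocessing.current_process()._config['daemon'] = False
```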
Problem

I have a task which calculates some data and loads a scikit-learn classifier to make predictions based on that data. When I run the task by itself, everything is OK, but when I run it using Celery, I get an error when the task attempts to load the pickled classifier:
To reproduce

Create an empty classifier and save it as a pickle:
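The snippet was not preserved; a sketch of the step described (the classifier type and filename are assumptions):

```python
# make_pickle.py -- illustrative reconstruction
import pickle
from sklearn.svm import SVC

# an empty (unfitted) classifier, saved as a pickle
with open('classifier.pickle', 'wb') as f:
    pickle.dump(SVC(), f)
```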
Create a simple app:
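A sketch of such an app, assuming the task simply unpickles the classifier (all names and the broker URL are illustrative):

```python
# tasks.py -- illustrative reconstruction
import pickle
from celery import Celery

app = Celery('tasks', broker='amqp://guest@localhost//')  # broker URL is an assumption

@app.task
def predict():
    # loading (and later using) the pickled model is where the
    # reporter hit the daemonic-process error
    with open('classifier.pickle', 'rb') as f:
        clf = pickle.load(f)
    return repr(clf)
```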
Start the celery worker:
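The exact invocation wasn't preserved; presumably something like:

```
$ celery -A tasks worker --loglevel=info
```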
Run the app:
Error message:
Solution

I think there should be an option to "monkeypatch" Celery to allow tasks to start sub-processes, especially if such a "feature" existed in the past. Right now, people are simply moving to other frameworks when they encounter this problem: http://stackoverflow.com/questions/27904162/using-multiprocessing-pool-from-celery-task-raises-exception. Here is another example of this error: http://stackoverflow.com/questions/22674950/python-multiprocessing-job-to-celery-task-but-attributeerror. This issue should be re-opened...
Fortunately, I found this issue while trying to run an ansible playbook in a Celery task. Are there any solutions or other implementations that allow running multiple processes in a task?
@HeartUnchange: recently we have been working hard on a big data project, and we wish to use celery as the distributed component. With your guide, we were lucky enough to solve the problem. See the task configuration:
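The configuration itself wasn't preserved; given the rest of the thread, the working approach is presumably to use billiard in place of multiprocessing inside the task, along these lines (all names and the broker URL are illustrative):

```python
# tasks.py -- a sketch, not the poster's actual configuration
import billiard  # celery's multiprocessing fork; allows daemonic parents to fork
from celery import Celery

app = Celery('tasks', broker='amqp://guest@localhost//')  # broker URL is an assumption

def run_playbook(path):
    print('running', path)  # stand-in for the real ansible invocation

@app.task
def play(path):
    child = billiard.Process(target=run_playbook, args=(path,))
    child.start()
    child.join()
```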
The solution is OK! We began the project in 2017.1 and now the prototype is finished! Nine months have passed! I owe my thanks to you! And my thanks are beyond expression!
Hi, I have a pretty standard set-up: Django + RabbitMQ + celery-4.0.2 + python-2.7 + CentOS-7. I am trying to spawn a process using the standard python multiprocessing module in celery. Daemon processes are not allowed to create child processes and, as a result, tasks that use the multiprocessing package are not working:
What could be the reason for not spawning the process?
Thank you.
billiard removes multiprocessing's limitation on daemonic processes creating child processes. Ref: celery/celery#1709 (comment)
try master and report if it's still an issue
It still produces the error. I tried to use a subprocess with:
but with the celery master branch it says:
EDIT: I replaced multiprocessing with billiard and it works!
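In other words (a sketch, since the poster's actual snippet wasn't preserved), the fix is a one-line import swap:

```python
# from multiprocessing import Pool   # raises the AssertionError inside a task
from billiard import Pool            # billiard has lifted that restriction

def square(x):
    return x * x

def compute():
    pool = Pool(2)
    try:
        return pool.map(square, [1, 2, 3])
    finally:
        pool.close()
        pool.join()
```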
Starting with Celery 3.1.0 the processes pool (`celery.concurrency.prefork`, formerly `celery.concurrency.processes`) uses daemon processes to perform tasks. Daemon processes are not allowed to create child processes and, as a result, tasks that use the `multiprocessing` package are not working: