Can preforked worker processes ignore TERM signals? #4795

Closed
croth1 opened this issue Jun 4, 2018 · 3 comments

@croth1

croth1 commented Jun 4, 2018

General

Affected version: latest (4.1.1)

Expected behavior

Preforked workers ignore the SIGTERM signal; only the parent process, which is responsible for the warm shutdown, reacts to it and gracefully shuts down the child processes.

Current behavior

When a preforked worker process receives a SIGTERM (e.g. `kill <pid>`), it shuts down immediately:

celery_1        | [2018-06-04 09:19:57,376: ERROR/MainProcess] Process 'ForkPoolWorker-2' pid:26 exited with 'signal 15 (SIGTERM)'

When a SIGTERM reaches the parent worker process, a warm shutdown is performed. But if the preforked workers also received a TERM signal in the meantime, their running tasks are simply killed.

Relevance

I cannot find a way to shut down Celery properly in a Docker container with an init system
(see mailing list: https://groups.google.com/forum/#!topic/celery-users/9UF_VyzRt8Q). It appears that I have to make sure the TERM signal only reaches the parent process, but not any of the preforked workers. This seems very difficult when running a bash script with several Celery instances (e.g. beat and two queues).

Possible solution

Is it possible to add a feature that allows preforked worker processes to ignore SIGTERM?
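
For reference, here is a minimal sketch of what such a workaround could look like today (not a built-in Celery feature, just an illustration). It uses Celery's `worker_process_init` signal, which fires in each freshly forked pool child, to ignore SIGTERM there so only the parent handles the warm shutdown:

```python
import signal

from celery.signals import worker_process_init


@worker_process_init.connect
def ignore_sigterm_in_pool_child(**kwargs):
    # Runs inside every freshly forked prefork pool worker process.
    # Ignoring SIGTERM here leaves the warm shutdown to the parent.
    signal.signal(signal.SIGTERM, signal.SIG_IGN)
```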

@georgepsarakis
Contributor

In order to have proper signal propagation when starting multiple processes inside a Docker container, you probably need a process manager such as supervisord.
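
As an illustration, a minimal `supervisord.conf` sketch along those lines (the `proj` app module, program names and queues are placeholders). supervisord delivers the stop signal only to the Celery parent processes, which then perform the warm shutdown of their pools:

```ini
[supervisord]
nodaemon=true

[program:worker_a]
command=celery -A proj worker -Q queue_a --loglevel=info
stopsignal=TERM
stopwaitsecs=600
autorestart=true

[program:worker_b]
command=celery -A proj worker -Q queue_b --loglevel=info
stopsignal=TERM
stopwaitsecs=600
autorestart=true

[program:beat]
command=celery -A proj beat --loglevel=info
stopsignal=TERM
autorestart=true
```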

@xirdneh
Member

xirdneh commented Jun 4, 2018

For more information, here's an overview of how Docker stops containers: https://www.ctl.io/developers/blog/post/gracefully-stopping-docker-containers/#docker-stop

Also, if you're already using Docker I would recommend running one worker per container.
It's easier to manage, and containers don't add that much overhead. It's also easier to do autoscaling with tools like k8s.

One last thing. If you really want to roll your own bash script to manage multiple workers, you would have to fire up each worker and then sleep indefinitely until a SIGTERM comes along.
When that happens you can look up the workers' PIDs and gracefully stop each one (see the sketch below).
Which is basically what supervisord does, as @georgepsarakis pointed out.
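
A rough entrypoint sketch of that idea (the `proj` app module and queue names are placeholders), where SIGTERM is forwarded only to the Celery parent processes so each can warm-shut-down its pool:

```bash
#!/usr/bin/env bash
# Start beat and two workers, then wait; on SIGTERM, signal only the
# Celery parent processes so each performs a warm shutdown of its pool.

celery -A proj worker -Q queue_a --loglevel=info &
worker_a=$!
celery -A proj worker -Q queue_b --loglevel=info &
worker_b=$!
celery -A proj beat --loglevel=info &
beat=$!

shutdown() {
    kill -TERM "$worker_a" "$worker_b" "$beat" 2>/dev/null
    wait "$worker_a" "$worker_b" "$beat"
}
trap shutdown TERM INT

# Block here (as PID 1 in the container) until a signal arrives.
wait
```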

@croth1
Author

croth1 commented Jun 4, 2018

Thanks @georgepsarakis and @xirdneh for your helpful suggestions. I managed to get a working setup with supervisord! 🎉
Having one worker per container is a very good suggestion, too! Thanks a lot!
