
worker getting stuck #2606

Closed
girishbin opened this issue May 5, 2015 · 45 comments

@girishbin

Celery worker is getting stuck, consuming a lot of resident memory.

Version: celery 3.1.17

Strace


celery]# strace -p 8401
Process 8401 attached - interrupt to quit
read(10,


celery]# lsof -n -p 8401 | egrep -v '(DIR|REG)'
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
python 8401 dsl 0r FIFO 0,8 0t0 124716100 pipe
python 8401 dsl 1w FIFO 0,8 0t0 124716101 pipe
python 8401 dsl 2w FIFO 0,8 0t0 124716101 pipe
python 8401 dsl 6r FIFO 0,8 0t0 124716462 pipe
python 8401 dsl 7w FIFO 0,8 0t0 124716462 pipe
python 8401 dsl 8r FIFO 0,8 0t0 124716463 pipe
python 8401 dsl 9w FIFO 0,8 0t0 124716463 pipe
python 8401 dsl 10r FIFO 0,8 0t0 124716464 pipe
python 8401 dsl 13w FIFO 0,8 0t0 124716465 pipe
python 8401 dsl 14r FIFO 0,8 0t0 124716466 pipe
python 8401 dsl 15r CHR 1,3 0t0 3662 /dev/null
python 8401 dsl 16w FIFO 0,8 0t0 124716467 pipe

Pstack dump


celery]# pstack 8401
#0 0x0000003056c0e740 in __read_nocancel () from /lib64/libpthread.so.0
#1 0x00007fa96b97b4c6 in _Billiard_conn_recvall () from /home/apps/analy/app/venv/lib/python2.6/site-packages/_billiard.so
#2 0x00007fa96b97b552 in Billiard_conn_recv_string () from /home/apps/analy/app/venv/lib/python2.6/site-packages/_billiard.so
#3 0x00007fa96b97b668 in Billiard_connection_recv_payload () from /home/apps/analy/app/venv/lib/python2.6/site-packages/_billiard.so
#4 0x00000030574d5916 in PyEval_EvalFrameEx () from /usr/lib64/libpython2.6.so.1.0
#5 0x00000030574d7657 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.6.so.1.0
#6 0x00000030574d5aa4 in PyEval_EvalFrameEx () from /usr/lib64/libpython2.6.so.1.0
#7 0x00000030574d7657 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.6.so.1.0
#8 0x00000030574d5aa4 in PyEval_EvalFrameEx () from /usr/lib64/libpython2.6.so.1.0
#9 0x00000030574d7657 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.6.so.1.0
#10 0x00000030574d5aa4 in PyEval_EvalFrameEx () from /usr/lib64/libpython2.6.so.1.0
#11 0x00000030574d7657 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.6.so.1.0
#12 0x00000030574d5aa4 in PyEval_EvalFrameEx () from /usr/lib64/libpython2.6.so.1.0
#13 0x00000030574d7657 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.6.so.1.0
#14 0x00000030574d5aa4 in PyEval_EvalFrameEx () from /usr/lib64/libpython2.6.so.1.0
#15 0x00000030574d7657 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.6.so.1.0
#16 0x000000305746acb0 in ?? () from /usr/lib64/libpython2.6.so.1.0
#17 0x0000003057443c63 in PyObject_Call () from /usr/lib64/libpython2.6.so.1.0
#18 0x00000030574566af in ?? () from /usr/lib64/libpython2.6.so.1.0
#19 0x0000003057443c63 in PyObject_Call () from /usr/lib64/libpython2.6.so.1.0
#20 0x000000305749568e in ?? () from /usr/lib64/libpython2.6.so.1.0
#21 0x0000003057494298 in ?? () from /usr/lib64/libpython2.6.so.1.0
#22 0x0000003057443c63 in PyObject_Call () from /usr/lib64/libpython2.6.so.1.0
#23 0x00000030574d4f74 in PyEval_EvalFrameEx () from /usr/lib64/libpython2.6.so.1.0
#24 0x00000030574d7657 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.6.so.1.0
#25 0x00000030574d5aa4 in PyEval_EvalFrameEx () from /usr/lib64/libpython2.6.so.1.0
#26 0x00000030574d7657 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.6.so.1.0
#27 0x00000030574d5aa4 in PyEval_EvalFrameEx () from /usr/lib64/libpython2.6.so.1.0
#28 0x00000030574d7657 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.6.so.1.0
#29 0x000000305746adad in ?? () from /usr/lib64/libpython2.6.so.1.0
#30 0x0000003057443c63 in PyObject_Call () from /usr/lib64/libpython2.6.so.1.0
#31 0x00000030574d4470 in PyEval_EvalFrameEx () from /usr/lib64/libpython2.6.so.1.0
#32 0x00000030574d7657 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.6.so.1.0
#33 0x000000305746adad in ?? () from /usr/lib64/libpython2.6.so.1.0
#34 0x0000003057443c63 in PyObject_Call () from /usr/lib64/libpython2.6.so.1.0
#35 0x00000030574566af in ?? () from /usr/lib64/libpython2.6.so.1.0
#36 0x0000003057443c63 in PyObject_Call () from /usr/lib64/libpython2.6.so.1.0
#37 0x000000305749568e in ?? () from /usr/lib64/libpython2.6.so.1.0
#38 0x0000003057494298 in ?? () from /usr/lib64/libpython2.6.so.1.0
#39 0x0000003057443c63 in PyObject_Call () from /usr/lib64/libpython2.6.so.1.0
#40 0x00000030574d4470 in PyEval_EvalFrameEx () from /usr/lib64/libpython2.6.so.1.0
#41 0x00000030574d7657 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.6.so.1.0
#42 0x00000030574d5aa4 in PyEval_EvalFrameEx () from /usr/lib64/libpython2.6.so.1.0
#43 0x00000030574d7657 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.6.so.1.0
#44 0x00000030574d5aa4 in PyEval_EvalFrameEx () from /usr/lib64/libpython2.6.so.1.0
#45 0x00000030574d7657 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.6.so.1.0
#46 0x00000030574d5aa4 in PyEval_EvalFrameEx () from /usr/lib64/libpython2.6.so.1.0
#47 0x00000030574d7657 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.6.so.1.0
#48 0x00000030574d5aa4 in PyEval_EvalFrameEx () from /usr/lib64/libpython2.6.so.1.0
#49 0x00000030574d7657 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.6.so.1.0
#50 0x00000030574d5aa4 in PyEval_EvalFrameEx () from /usr/lib64/libpython2.6.so.1.0
#51 0x00000030574d7657 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.6.so.1.0
#52 0x000000305746adad in ?? () from /usr/lib64/libpython2.6.so.1.0
#53 0x0000003057443c63 in PyObject_Call () from /usr/lib64/libpython2.6.so.1.0
#54 0x00000030574d4470 in PyEval_EvalFrameEx () from /usr/lib64/libpython2.6.so.1.0
#55 0x00000030574d7657 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.6.so.1.0
#56 0x000000305746adad in ?? () from /usr/lib64/libpython2.6.so.1.0
#57 0x0000003057443c63 in PyObject_Call () from /usr/lib64/libpython2.6.so.1.0
#58 0x00000030574566af in ?? () from /usr/lib64/libpython2.6.so.1.0
#59 0x0000003057443c63 in PyObject_Call () from /usr/lib64/libpython2.6.so.1.0
#60 0x0000003057495a54 in ?? () from /usr/lib64/libpython2.6.so.1.0
#61 0x0000003057443c63 in PyObject_Call () from /usr/lib64/libpython2.6.so.1.0
#62 0x00000030574d4470 in PyEval_EvalFrameEx () from /usr/lib64/libpython2.6.so.1.0
#63 0x00000030574d7657 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.6.so.1.0
#64 0x00000030574d5aa4 in PyEval_EvalFrameEx () from /usr/lib64/libpython2.6.so.1.0
#65 0x00000030574d7657 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.6.so.1.0
#66 0x00000030574d5aa4 in PyEval_EvalFrameEx () from /usr/lib64/libpython2.6.so.1.0
#67 0x00000030574d7657 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.6.so.1.0
#68 0x00000030574d5aa4 in PyEval_EvalFrameEx () from /usr/lib64/libpython2.6.so.1.0
#69 0x00000030574d7657 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.6.so.1.0
#70 0x00000030574d5aa4 in PyEval_EvalFrameEx () from /usr/lib64/libpython2.6.so.1.0
#71 0x00000030574d7657 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.6.so.1.0
#72 0x00000030574d5aa4 in PyEval_EvalFrameEx () from /usr/lib64/libpython2.6.so.1.0
#73 0x00000030574d7657 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.6.so.1.0
#74 0x00000030574d5aa4 in PyEval_EvalFrameEx () from /usr/lib64/libpython2.6.so.1.0
#75 0x00000030574d6b8f in PyEval_EvalFrameEx () from /usr/lib64/libpython2.6.so.1.0
#76 0x00000030574d7657 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.6.so.1.0
#77 0x00000030574d5aa4 in PyEval_EvalFrameEx () from /usr/lib64/libpython2.6.so.1.0
#78 0x00000030574d7657 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.6.so.1.0
#79 0x00000030574d7732 in PyEval_EvalCode () from /usr/lib64/libpython2.6.so.1.0
#80 0x00000030574f1bac in ?? () from /usr/lib64/libpython2.6.so.1.0
#81 0x00000030574f1c80 in PyRun_FileExFlags () from /usr/lib64/libpython2.6.so.1.0
#82 0x00000030574f316c in PyRun_SimpleFileExFlags () from /usr/lib64/libpython2.6.so.1.0
#83 0x00000030574ff8a2 in Py_Main () from /usr/lib64/libpython2.6.so.1.0
#84 0x000000305681ed5d in __libc_start_main () from /lib64/libc.so.6
#85 0x0000000000400649 in _start ()

@girishbin
Author

It's stuck on a pipe read.

proc]# ls -l /proc/8401/fd
total 0
lr-x------ 1 dsl dsl 64 May 5 17:26 0 -> pipe:[124716100]
l-wx------ 1 dsl dsl 64 May 5 17:26 1 -> pipe:[124716101]
lr-x------ 1 dsl dsl 64 May 5 17:26 10 -> pipe:[124716464]
l-wx------ 1 dsl dsl 64 May 5 17:26 13 -> pipe:[124716465]

@joostdevries

@girishbin

  • When does this typically occur?
  • What does the worker log look like?
  • Is there also a lot of CPU usage or just memory consumption?

@idealopamp

Are you using Redis as the broker? We're seeing a similar symptom with the Redis broker on celery 3.1.8 and billiard 3.3.0.16. No high memory consumption, though.

@domenkozar

Same here. @joostdevries it happens quite often to us; it's hard to say under what conditions. We have 4 workers using the Redis backend.

workers log before they are stuck:

[2015-08-07 16:50:40,140: INFO/MainProcess] Task feeds.transformers.rss_atom.by[6dbd5f0d-222b-4c5c-bd22-5e05bb63447b] succeeded in 0.0153002970037s: {}
[2015-08-07 16:50:40,141: INFO/MainProcess] Received task: feeds.transformers.rss_atom.by[8a582425-456d-4e49-93c8-eb375967cac5]
[2015-08-07 16:50:40,155: INFO/MainProcess] Task feeds.transformers.rss_atom.by[3f60a721-2a6e-4494-bda7-b7b939efe66a] succeeded in 0.00693402900652s: {}
[2015-08-07 16:50:40,157: INFO/MainProcess] Received task: feeds.transformers.rss_atom.by[486cc62d-d330-467f-b30c-02005c2038b6]
[2015-08-07 16:50:40,171: INFO/MainProcess] Task feeds.transformers.rss_atom.by[8a582425-456d-4e49-93c8-eb375967cac5] succeeded in 0.0071912699932s: {}
[2015-08-07 16:50:40,173: INFO/MainProcess] Received task: feeds.transformers.rss_atom.by[538a71a6-0c33-494a-adfe-575f7465e9d4]
[2015-08-07 16:50:40,188: INFO/MainProcess] Task feeds.transformers.rss_atom.by[486cc62d-d330-467f-b30c-02005c2038b6] succeeded in 0.0155014329939s: {}
[2015-08-07 16:50:40,189: INFO/MainProcess] Received task: feeds.transformers.rss_atom.by[2ad9294a-7553-468a-a3dc-73a8f2cea188]
[2015-08-07 16:50:40,203: INFO/MainProcess] Task feeds.transformers.rss_atom.by[538a71a6-0c33-494a-adfe-575f7465e9d4] succeeded in 0.0153862849984s: {}
[2015-08-07 16:50:40,205: INFO/MainProcess] Received task: feeds.transformers.rss_atom.by[f7371090-aa05-4194-a327-cb41d1165b7e]
[2015-08-07 16:50:40,220: INFO/MainProcess] Task feeds.transformers.rss_atom.by[2ad9294a-7553-468a-a3dc-73a8f2cea188] succeeded in 0.0158518639946s: {}
[2015-08-07 16:50:40,222: INFO/MainProcess] Received task: feeds.transformers.rss_atom.by[6d35c0b2-9c5a-425f-9405-9f7e1fb3aa41]
[2015-08-07 16:50:40,236: INFO/MainProcess] Task feeds.transformers.rss_atom.by[f7371090-aa05-4194-a327-cb41d1165b7e] succeeded in 0.00751440098975s: {}
[2015-08-07 16:50:40,238: INFO/MainProcess] Received task: feeds.transformers.rss_atom.by[ed664157-a065-4b15-9bbe-633edf96d230]
[2015-08-07 16:50:40,252: INFO/MainProcess] Task feeds.transformers.rss_atom.by[6d35c0b2-9c5a-425f-9405-9f7e1fb3aa41] succeeded in 0.00709322700277s: {}
[2015-08-07 16:50:40,254: INFO/MainProcess] Received task: feeds.transformers.rss_atom.by[683bc593-44c1-4145-b698-1f2ba66a43bd]
[2015-08-07 16:50:40,260: INFO/MainProcess] Task feeds.transformers.rss_atom.by[ed664157-a065-4b15-9bbe-633edf96d230] succeeded in 0.015573162993s: {}
[2015-08-07 16:50:40,275: INFO/MainProcess] Received task: feeds.transformers.rss_atom.by[d8018f2b-a1f0-4112-97ba-335cd597be1c]
[2015-08-07 16:50:40,282: INFO/MainProcess] Task feeds.transformers.rss_atom.by[683bc593-44c1-4145-b698-1f2ba66a43bd] succeeded in 0.0205264859978s: {}
[2015-08-07 16:50:40,292: INFO/MainProcess] Received task: feeds.transformers.rss_atom.by[92f938ce-894a-49f4-8c89-a090c37b71c8]
[2015-08-07 16:50:40,299: INFO/MainProcess] Task feeds.transformers.rss_atom.by[d8018f2b-a1f0-4112-97ba-335cd597be1c] succeeded in 0.016343981988s: {}
[2015-08-07 16:50:40,302: INFO/MainProcess] Received task: feeds.transformers.rss_atom.by[ff12882d-61b8-403e-97ac-b096580de5f0]
[2015-08-07 16:50:40,318: INFO/MainProcess] Task feeds.transformers.rss_atom.by[92f938ce-894a-49f4-8c89-a090c37b71c8] succeeded in 0.0184662840038s: {}
[2015-08-07 16:50:40,330: INFO/MainProcess] Received task: feeds.transformers.rss_atom.by[3e0ff326-cd83-4d88-8dee-f9c167a44f6a]
[2015-08-07 16:50:40,338: INFO/MainProcess] Task feeds.transformers.rss_atom.by[ff12882d-61b8-403e-97ac-b096580de5f0] succeeded in 0.0187683639961s: {}
[2015-08-07 16:50:40,341: INFO/MainProcess] Received task: feeds.transformers.rss_atom.by[f12efd44-b2be-42fd-8872-5c979421bea3]
[2015-08-07 16:50:40,357: INFO/MainProcess] Task feeds.transformers.rss_atom.by[3e0ff326-cd83-4d88-8dee-f9c167a44f6a] succeeded in 0.0182138609962s: {}
[2015-08-07 16:50:40,359: INFO/MainProcess] Received task: feeds.transformers.rss_atom.by[57d8ba02-db1f-40d4-88f9-1d3d83939c2a]
[2015-08-07 16:50:40,374: INFO/MainProcess] Task feeds.transformers.rss_atom.by[f12efd44-b2be-42fd-8872-5c979421bea3] succeeded in 0.0165290409932s: {}
[2015-08-07 16:50:40,385: INFO/MainProcess] Received task: feeds.transformers.rss_atom.by[8ae76799-d19d-4acf-b2b4-cfe9d77e3000]
[2015-08-07 16:50:40,393: INFO/MainProcess] Task feeds.transformers.rss_atom.by[57d8ba02-db1f-40d4-88f9-1d3d83939c2a] succeeded in 0.0185826079978s: {}
[2015-08-07 16:50:40,405: INFO/MainProcess] Received task: feeds.transformers.rss_atom.by[ca5e9d23-f660-4f88-b7d1-d30b91750727]
[2015-08-07 16:50:40,414: INFO/MainProcess] Task feeds.transformers.rss_atom.by[8ae76799-d19d-4acf-b2b4-cfe9d77e3000] succeeded in 0.0205218030023s: {}
[2015-08-07 16:50:40,417: INFO/MainProcess] Received task: feeds.transformers.rss_atom.by[5cfdbbe8-3da1-45b5-b83e-599303ad02eb]
[2015-08-07 16:50:40,433: INFO/MainProcess] Task feeds.transformers.rss_atom.by[ca5e9d23-f660-4f88-b7d1-d30b91750727] succeeded in 0.0175381449953s: {}
[2015-08-07 16:50:40,436: INFO/MainProcess] Received task: feeds.transformers.rss_atom.by[cb864c28-b3c5-4def-a5de-8c15049c6d29]
[2015-08-07 16:50:40,453: INFO/MainProcess] Task feeds.transformers.rss_atom.by[5cfdbbe8-3da1-45b5-b83e-599303ad02eb] succeeded in 0.0190771400084s: {}
[2015-08-07 16:50:40,456: INFO/MainProcess] Received task: feeds.transformers.rss_atom.by[cc85e444-7969-45ed-98a4-f9ab074db260]
[2015-08-07 16:50:40,473: INFO/MainProcess] Task feeds.transformers.rss_atom.by[cb864c28-b3c5-4def-a5de-8c15049c6d29] succeeded in 0.0191195910011s: {}
[2015-08-07 16:50:40,476: INFO/MainProcess] Received task: feeds.transformers.rss_atom.by[4696c784-819d-4278-87fd-b74c2aab4c57]
[2015-08-07 16:50:40,491: INFO/MainProcess] Task feeds.transformers.rss_atom.by[cc85e444-7969-45ed-98a4-f9ab074db260] succeeded in 0.0098425810138s: {}
[2015-08-07 16:50:40,494: INFO/MainProcess] Received task: feeds.transformers.rss_atom.by[bce5ad3e-dd4a-4bbc-bb69-9585597ae010]
[2015-08-07 16:50:40,501: INFO/MainProcess] Task feeds.transformers.rss_atom.by[4696c784-819d-4278-87fd-b74c2aab4c57] succeeded in 0.0174191809929s: {}
[2015-08-07 16:50:40,512: INFO/MainProcess] Received task: feeds.transformers.rss_atom.by[cfefb0bb-399c-423e-9032-add27dabd6df]
[2015-08-07 16:50:40,526: INFO/MainProcess] Task feeds.transformers.rss_atom.by[bce5ad3e-dd4a-4bbc-bb69-9585597ae010] succeeded in 0.0162074130058s: {}
[2015-08-07 16:50:40,528: INFO/MainProcess] Received task: feeds.transformers.rss_atom.by[51f11154-3dbc-448a-a3a4-00c2fe7ab782]
[2015-08-07 16:50:40,543: INFO/MainProcess] Task feeds.transformers.rss_atom.by[cfefb0bb-399c-423e-9032-add27dabd6df] succeeded in 0.0163563999959s: {}
[2015-08-07 16:50:40,545: INFO/MainProcess] Received task: feeds.transformers.rss_atom.by[fb900a0c-403f-4941-aa6a-f242312e5d85]
[2015-08-07 16:50:40,560: INFO/MainProcess] Task feeds.transformers.rss_atom.by[51f11154-3dbc-448a-a3a4-00c2fe7ab782] succeeded in 0.0162344239943s: {}
[2015-08-07 16:50:40,563: INFO/MainProcess] Received task: feeds.transformers.rss_atom.by[330fff18-71e0-4d9e-9ad1-0111d44f7e03]

No CPU usage, memory usage (output from top):

31059 celery    20   0  517176  64388   3456 S  0.0  1.7   0:14.86 celery
31058 celery    20   0  517204  64308   3456 S  0.0  1.7   0:14.32 celery
31062 celery    20   0  517044  64308   3456 S  0.0  1.7   0:14.88 celery
31061 celery    20   0  516912  63508   3064 S  0.0  1.7   0:14.32 celery
31046 celery    20   0  369344  63396   6964 S  0.0  1.7   1:09.70 celery
16967 celery    20   0  366648  57396   7612 S  0.0  1.6   0:04.04 celery

@domenkozar

Will try setting BROKER_TRANSPORT_OPTIONS = {'socket_timeout': 10} and see if it helps.

@dtao

dtao commented Sep 2, 2015

@domenkozar Did it help?

@domenkozar

See redis/redis-py#306

@domenkozar

Maybe it's not:

[root@ip-172-30-0-183 ec2-user]# lsof -p 9253|grep 6379
celery  9253 celery   10u     IPv4            2738405       0t0      TCP ip-172-30-0-183.ec2.internal:37155->ip-172-30-3-169.ec2.internal:6379 (ESTABLISHED)
celery  9253 celery   23u     IPv4            2737905       0t0      TCP ip-172-30-0-183.ec2.internal:37090->ip-172-30-3-169.ec2.internal:6379 (ESTABLISHED)
celery  9253 celery   33u     IPv4            2737956       0t0      TCP ip-172-30-0-183.ec2.internal:37098->ip-172-30-3-169.ec2.internal:6379 (ESTABLISHED)
celery  9253 celery   36u     IPv4            2737990       0t0      TCP ip-172-30-0-183.ec2.internal:37105->ip-172-30-3-169.ec2.internal:6379 (ESTABLISHED)
celery  9253 celery   46u     IPv4            2739860       0t0      TCP ip-172-30-0-183.ec2.internal:37297->ip-172-30-3-169.ec2.internal:6379 (ESTABLISHED)
[root@ip-172-30-0-183 ec2-user]# lsof -p 9253|grep CLOSE_WAIT
celery  9253 celery   11u     IPv4            2739163       0t0      TCP ip-172-30-0-183.ec2.internal:45206->wordpress.com:http (CLOSE_WAIT)
celery  9253 celery   16u     IPv4            2737802       0t0      TCP ip-172-30-0-183.ec2.internal:53865->205.251.242.33:https (CLOSE_WAIT)
celery  9253 celery   32u     IPv4            2764510       0t0      TCP ip-172-30-0-183.ec2.internal:48242->ec2-52-2-102-195.compute-1.amazonaws.com:http (CLOSE_WAIT)
celery  9253 celery   37u     IPv4            2739073       0t0      TCP ip-172-30-0-183.ec2.internal:45198->wordpress.com:http (CLOSE_WAIT)
celery  9253 celery   38u     IPv4            2738052       0t0      TCP ip-172-30-0-183.ec2.internal:36962->230-156-220-74-available.ilandcloud.com:http (CLOSE_WAIT)
celery  9253 celery   39u     IPv4            2738313       0t0      TCP ip-172-30-0-183.ec2.internal:46067->qb-in-f118.1e100.net:http (CLOSE_WAIT)
celery  9253 celery   40u     IPv4            2739202       0t0      TCP ip-172-30-0-183.ec2.internal:43252->r2.ycpi.vip.dcb.yahoo.net:http (CLOSE_WAIT)
celery  9253 celery   42u     IPv4            2739382       0t0      TCP ip-172-30-0-183.ec2.internal:45228->wordpress.com:http (CLOSE_WAIT)
celery  9253 celery   43u     IPv4            2739488       0t0      TCP ip-172-30-0-183.ec2.internal:38920->wordpress.com:http (CLOSE_WAIT)
celery  9253 celery   45u     IPv4            2739667       0t0      TCP ip-172-30-0-183.ec2.internal:57721->ec2-54-165-198-100.compute-1.amazonaws.com:https (CLOSE_WAIT)

@domenkozar

Flower is showing all workers as offline, workers appear to be just waiting for a task.

@domenkozar

New hypothesis: I think the major culprit was that Redis ran out of memory (there are some OOM traces), and the master is unable to send new tasks to the workers.

@ask
Contributor

ask commented Sep 3, 2015

The strace you are seeing looks normal to me; is it not just waiting for more tasks? I guess it's possible that it was sent incomplete data (seeing as there is something in the buffer).
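A recvall-style loop blocks until every requested byte has arrived, so a peer that sends a partial payload (or nothing) leaves the reader parked in read() exactly as the strace above shows. A minimal stdlib sketch (a toy model, not billiard's actual code) of how guarding the read with a select timeout makes such a stall observable:

```python
import os
import select

def read_with_timeout(fd, nbytes, timeout):
    """Read up to nbytes from fd, but give up after `timeout` seconds
    instead of blocking forever the way a bare os.read() on a drained
    pipe would."""
    ready, _, _ = select.select([fd], [], [], timeout)
    if not ready:
        return None  # nothing arrived: the writer stalled or died
    return os.read(fd, nbytes)

r, w = os.pipe()
os.write(w, b"part")                      # an incomplete payload
first = read_with_timeout(r, 1024, 0.1)   # the bytes that did arrive
second = read_with_timeout(r, 1024, 0.1)  # the rest never comes: None
```

With a blocking recvall loop, the second read would simply hang; the timeout turns the hang into a detectable condition.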

@domenkozar

Yes, the workers appear to just be waiting for more tasks, so something is going on with the master celery process. The task queue has 100k items.

@domenkozar

So I've cleared the celery Redis queue, and celery beat on the master process is not scheduling new tasks. Looking at gdb, it's doing what it should:

(gdb) py-list
 115            buf.seek(self.bytes_written)
 116            marker = 0
 117
 118            try:
 119                while True:
>120                    data = self._sock.recv(socket_read_size)
 121                    # an empty string indicates the server shutdown the socket
 122                    if isinstance(data, bytes) and len(data) == 0:
 123                        raise socket.error(SERVER_CLOSED_CONNECTION_ERROR)
 124                    buf.write(data)
 125                    data_length = len(data)

@domenkozar

Aha, found which of the connections was the problem, from CLIENT LIST:

id=497 addr=172.30.0.183:45591 fd=20 name= age=1142 idle=1142 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=NULL

@domenkozar

It's interesting that it shows cmd=NULL, and the connection is coming from the master celery worker. @ask any ideas?

@domenkozar

So the way I see it, the Redis connection pool gets exhausted. If I kill some random connections on the Redis side, they go to the CLOSE_WAIT state, but the celery master process keeps them open.

We have many small tasks, so tasks are getting picked up really fast.

You can see there are 11 connections, and 2 of them were killed from the Redis side with the CLIENT KILL command:

[root@ip-172-30-0-183 ec2-user]# lsof -p 24747|grep TCP
celery  24747 celery   10u     IPv4            3619402       0t0      TCP ip-172-30-0-183.ec2.internal:45535->ip-172-30-3-169.ec2.internal:6379 (ESTABLISHED)
celery  24747 celery   11u     IPv4            3619425       0t0      TCP ip-172-30-0-183.ec2.internal:45544->ip-172-30-3-169.ec2.internal:6379 (ESTABLISHED)
celery  24747 celery   16u     sock                0,6       0t0  3113866 protocol: TCP
celery  24747 celery   34u     IPv4            3619406       0t0      TCP ip-172-30-0-183.ec2.internal:45537->ip-172-30-3-169.ec2.internal:6379 (ESTABLISHED)
celery  24747 celery   35u     IPv4            3619421       0t0      TCP ip-172-30-0-183.ec2.internal:45543->ip-172-30-3-169.ec2.internal:6379 (ESTABLISHED)
celery  24747 celery   36u     IPv4            3619417       0t0      TCP ip-172-30-0-183.ec2.internal:45542->ip-172-30-3-169.ec2.internal:6379 (CLOSE_WAIT)
celery  24747 celery   37u     IPv4            3619431       0t0      TCP ip-172-30-0-183.ec2.internal:45545->ip-172-30-3-169.ec2.internal:6379 (ESTABLISHED)
celery  24747 celery   38u     IPv4            3619435       0t0      TCP ip-172-30-0-183.ec2.internal:45546->ip-172-30-3-169.ec2.internal:6379 (CLOSE_WAIT)
celery  24747 celery   39u     IPv4            3619437       0t0      TCP ip-172-30-0-183.ec2.internal:45547->ip-172-30-3-169.ec2.internal:6379 (ESTABLISHED)
celery  24747 celery   40u     IPv4            3703870       0t0      TCP ip-172-30-0-183.ec2.internal:46884->ip-172-30-3-169.ec2.internal:6379 (ESTABLISHED)
celery  24747 celery   41u     IPv4            3619441       0t0      TCP ip-172-30-0-183.ec2.internal:45549->ip-172-30-3-169.ec2.internal:6379 (ESTABLISHED)
celery  24747 celery   42u     IPv4            3619443       0t0      TCP ip-172-30-0-183.ec2.internal:45550->ip-172-30-3-169.ec2.internal:6379 (ESTABLISHED)

@domenkozar

So it's also not the connection pool being exhausted. I've increased it to 20, and currently it got stuck with 8 connections.

@domenkozar

This is the traceback that happens when I kill cmd=NULL connection:

[2015-09-06 11:34:44,336: WARNING/MainProcess] consumer: Connection to broker lost. Trying to re-establish the connection...
Traceback (most recent call last):
  File "/opt/cmgd/venv/lib/python2.7/site-packages/celery/worker/consumer.py", line 278, in start
    blueprint.start(self)
  File "/opt/cmgd/venv/lib/python2.7/site-packages/celery/bootsteps.py", line 123, in start
    step.start(parent)
  File "/opt/cmgd/venv/lib/python2.7/site-packages/celery/worker/consumer.py", line 821, in start
    c.loop(*c.loop_args())
  File "/opt/cmgd/venv/lib/python2.7/site-packages/celery/worker/loops.py", line 70, in asynloop
    next(loop)
  File "/opt/cmgd/venv/lib/python2.7/site-packages/kombu/async/hub.py", line 267, in create_loop
    tick_callback()
  File "/opt/cmgd/venv/lib/python2.7/site-packages/kombu/transport/redis.py", line 942, in on_poll_start
    [add_reader(fd, on_readable, fd) for fd in cycle.fds]
  File "/opt/cmgd/venv/lib/python2.7/site-packages/kombu/async/hub.py", line 201, in add_reader
    return self.add(fds, callback, READ | ERR, args)
  File "/opt/cmgd/venv/lib/python2.7/site-packages/kombu/async/hub.py", line 152, in add
    self.poller.register(fd, flags)
  File "/opt/cmgd/venv/lib/python2.7/site-packages/kombu/utils/eventio.py", line 78, in register
    self._epoll.register(fd, events)
IOError: [Errno 9] Bad file descriptor
[2015-09-06 11:34:44,339: WARNING/MainProcess] Restoring 1 unacknowledged message(s).
[2015-09-06 11:34:44,398: INFO/MainProcess] Connected to redis://mas-re-13j3y6r7dd9ui.vfavzi.0001.use1.cache.am
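The IOError: [Errno 9] Bad file descriptor in that traceback comes from re-registering an already-closed fd with epoll (the hub re-adds reader fds on every poll tick, per the kombu frames above). A minimal Linux-only stdlib sketch (not kombu's code) reproducing that failure shape:

```python
import os
import select

ep = select.epoll()
r, w = os.pipe()
os.close(r)  # the fd the event loop still remembers, after its peer died

try:
    # registering a closed fd raises exactly the error in the traceback
    ep.register(r, select.EPOLLIN)
    errno_seen = None
except OSError as exc:
    errno_seen = exc.errno  # 9 == EBADF, "Bad file descriptor"
```

This is why killing the cmd=NULL connection on the Redis side surfaces the error: the socket's fd is gone, but the hub still tries to register it.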

@domenkozar

I think I found a way for everyone to reproduce this: schedule a few thousand dummy tasks that don't do anything.

4 workers process around 170 tasks/s, then get stuck.

@domenkozar

And flower graphs showing the hang http://i.imgur.com/1q0A8DN.png

@wavenator

We're hitting the same problem here.
RabbitMQ does the job just right at the moment.
We prefer Redis for performance reasons.

Our workers hang after a short period with Redis; no error messages or anything that looks suspicious.
Setting socket_timeout did not resolve the issue.

@ask
Contributor

ask commented Oct 27, 2015

@domenkozar What configuration are you using? I execute 100k tasks using redis often with the stress test suite.

If it's consuming a lot of memory, do you have the fanout_prefix and fanout_patterns transport options set? http://docs.celeryproject.org/en/latest/getting-started/brokers/redis.html#caveats

@domenkozar

@ask unfortunately I don't have access to that environment anymore, but we didn't change the fanout_* settings.

@domenkozar

@akuchling can you use redis-cli and post the output of CLIENT LIST?

@domenkozar

Also, I'd double check all servers are using kombu-3.0.35

@akuchling

The version of kombu is in fact 3.0.35. CLIENT LIST returns the following: http://dpaste.com/2GGCECS (IP addresses slightly redacted).

Interesting! In that dpaste there isn't a client listed where cmd=NULL, so maybe this is a new failure mode. It doesn't seem to me that we're filling up Redis's memory at the moment; our data is around 3 GB on a 6 GB AWS machine.

We do have socket_timeout set to 10, but that doesn't fix the problem. Instead, celery ends up doing a few tasks every 10 seconds and then freezes again. Note how the seconds figure is incrementing: 37, 47, 57 in this dpaste: http://dpaste.com/2AZYEH4

@akuchling

akuchling commented May 27, 2016

I found myself looking at on_stop_not_started() in celery/concurrency/asynpool.py because it showed up in a GDB stack trace, and was puzzled by the use of pending_remove_fd.

It's initially given the value of an empty set. Then the code does for fd in outqueues: self._flush_outqueue(fd, pending_remove_fd.discard, ...) and removes the contents of pending_remove_fd from outqueues.

Bug: the set is initially empty, and _flush_outqueue() is calling pending_remove_fd.discard(). So how does the pending_remove_fd set ever get filled in? Was it supposed to be initialized as = set(outqueues) to make a copy of the starting set of fds? Or have I misread something?
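The shape of the bug can be seen with a toy version (not celery's actual code): starting from an empty set, .discard() is a no-op, so nothing is ever collected for removal, while .add() accumulates the flushed fds:

```python
def flush_outqueue(fd, remove):
    # stand-in for self._flush_outqueue(fd, callback, ...): once the
    # queue behind fd is flushed, it reports the fd via the callback
    remove(fd)

outqueues = {6, 8, 10}

pending_remove_fd = set()
for fd in outqueues:
    flush_outqueue(fd, pending_remove_fd.discard)  # buggy: no-op on an empty set
after_buggy = outqueues - pending_remove_fd        # nothing gets removed

pending_remove_fd = set()
for fd in outqueues:
    flush_outqueue(fd, pending_remove_fd.add)      # fixed: flushed fds accumulate
after_fixed = outqueues - pending_remove_fd        # every flushed fd drops out
```

With the buggy callback the difference_update at the end of the loop removes nothing, so the shutdown path keeps waiting on queues it has already drained.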

@ask
Contributor

ask commented May 27, 2016

@akuchling good catch, I guess it should be pending_remove_fd.add instead of pending_remove_fd.discard there

@ask
Contributor

ask commented May 27, 2016

So at the end of that loop it will remove the flushed outqueues when it does the outqueues.difference_update(pending_remove_fd).

ask added a commit that referenced this issue May 27, 2016
akuchling pushed a commit to akuchling/celery that referenced this issue May 28, 2016
@akuchling

Thanks! Should this change also be committed to the 3.1 branch (which I've done in a fork so that I can test it)?

@akuchling

Unfortunately the pending_remove_fd patch doesn't fix all of the hangs for us, though it doesn't seem to cause any problems. With the patch, I've also gotten a worker/master hang: the master is doing a read() from the worker, and the worker is in a read() in Billiard_conn_recv(), called from pickle.load(). Any suggestions for what this might be?

@ask
Contributor

ask commented Jun 24, 2016

Closing this, as we don't have the resources to complete this task.

It may be fixed in master; let's see if it comes back after the 4.0 release.

@JonPeel

JonPeel commented Sep 15, 2017

@akuchling @domenkozar - was the discussion on this issue continued anywhere / did you find any resolution?

Trawling through the issues list, there's a lot of subjective "something is broken", but this appears to be the closest to the issues we're seeing.

@domenkozar

Sadly none of us work at the company anymore where this was happening in production.

@akuchling took over my previous efforts; maybe he can report whether this bug was really fixed or not.

@akuchling

@JonPeel: I don't remember if we ever found a resolution. To summarize the changes we applied:

  • in BROKER_TRANSPORT_OPTIONS in settings.py, we added:
 'socket_timeout': 10,                # timeout in case of network errors
 'fanout_prefix': True,
 'fanout_patterns': True,
  • we used a fork of Celery 3.1 with commit 3207776 (the on_stop_not_started() bugfix I found above).
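Put together as a settings.py fragment (a sketch of the options listed above; the keys are kombu Redis transport options):

```python
# settings.py (sketch): broker transport options as summarized above
BROKER_TRANSPORT_OPTIONS = {
    'socket_timeout': 10,     # timeout in case of network errors
    'fanout_prefix': True,    # see the Redis broker caveats in the celery docs
    'fanout_patterns': True,
}
```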

@mvaled
Contributor

mvaled commented Sep 15, 2017

@JonPeel Now I'm running 4.0.22 (and planning to update to 4.1). I haven't found this issue anymore.

I do find from time to time one worker with -c1 (for serialization) being unresponsive after reaching the maximum number of jobs (2000). It seems that when trying to kill and respawn the child worker, it gets stuck. But I haven't explored deeply; oddly enough, restarting the worker makes it work OK, even recycling the child worker after reaching another 2000 jobs. So I don't think it's related to Celery but to our app.

@JonPeel

JonPeel commented Sep 19, 2017

Thanks for the replies; I'll open a new ticket if our issue re-occurs (for reference: Celery 4.1, Redis broker and result backend, redis-py 2.10.5, Redis 3.2.0, SSL enabled).

From what I've seen, we're getting workers stuck listening on connections that are doing nothing (CLOSE_WAIT state), and a restart resumes business as usual.

From that, redis/redis-py#306 looks more relevant to us than it was above. If so, it's a) an issue in redis-py, not celery, and b) positively affected by the socket_timeout in BROKER_TRANSPORT_OPTIONS as suggested above. A restart was required every 24 hours or so; that no longer appears to be a problem.

@taylor-cedar

I am running into this issue on

Celery 4.1.0
redis-py 2.10.6

Celery hangs in redis-py's connection.py, on:

return sock.recv(*args, **kwargs)

Here is the stack

 [2017-11-24 04:11:08,423: WARNING/MainProcess] File "/usr/bin/celery", line 11, in <module>
     sys.exit(main())
 [2017-11-24 04:11:08,423: WARNING/MainProcess] File "/usr/lib/python3.6/site-packages/celery/__main__.py", line 14, in main
     _main()
 [2017-11-24 04:11:08,423: WARNING/MainProcess] File "/usr/lib/python3.6/site-packages/celery/bin/celery.py", line 326, in main
     cmd.execute_from_commandline(argv)
 [2017-11-24 04:11:08,424: WARNING/MainProcess] File "/usr/lib/python3.6/site-packages/celery/bin/celery.py", line 488, in execute_from_commandline
     super(CeleryCommand, self).execute_from_commandline(argv)))
 [2017-11-24 04:11:08,425: WARNING/MainProcess] File "/usr/lib/python3.6/site-packages/celery/bin/base.py", line 281, in execute_from_commandline
     return self.handle_argv(self.prog_name, argv[1:])
 [2017-11-24 04:11:08,425: WARNING/MainProcess] File "/usr/lib/python3.6/site-packages/celery/bin/celery.py", line 480, in handle_argv
     return self.execute(command, argv)
 [2017-11-24 04:11:08,425: WARNING/MainProcess] File "/usr/lib/python3.6/site-packages/celery/bin/celery.py", line 412, in execute
     ).run_from_argv(self.prog_name, argv[1:], command=argv[0])
 [2017-11-24 04:11:08,426: WARNING/MainProcess] File "/usr/lib/python3.6/site-packages/celery/bin/worker.py", line 221, in run_from_argv
     return self(*args, **options)
 [2017-11-24 04:11:08,426: WARNING/MainProcess] File "/usr/lib/python3.6/site-packages/celery/bin/base.py", line 244, in __call__
     ret = self.run(*args, **kwargs)
 [2017-11-24 04:11:08,427: WARNING/MainProcess] File "/usr/lib/python3.6/site-packages/celery/bin/worker.py", line 256, in run
     worker.start()
 [2017-11-24 04:11:08,427: WARNING/MainProcess] File "/usr/lib/python3.6/site-packages/celery/worker/worker.py", line 203, in start
     self.blueprint.start(self)
 [2017-11-24 04:11:08,428: WARNING/MainProcess] File "/usr/lib/python3.6/site-packages/celery/bootsteps.py", line 119, in start
     step.start(parent)
 [2017-11-24 04:11:08,428: WARNING/MainProcess] File "/usr/lib/python3.6/site-packages/celery/bootsteps.py", line 370, in start
     return self.obj.start()
 [2017-11-24 04:11:08,429: WARNING/MainProcess] File "/usr/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", line 320, in start
     blueprint.start(self)
 [2017-11-24 04:11:08,429: WARNING/MainProcess] File "/usr/lib/python3.6/site-packages/celery/bootsteps.py", line 119, in start
     step.start(parent)
 [2017-11-24 04:11:08,429: WARNING/MainProcess] File "/usr/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", line 596, in start
     c.loop(*c.loop_args())
 [2017-11-24 04:11:08,430: WARNING/MainProcess] File "/usr/lib/python3.6/site-packages/celery/worker/loops.py", line 88, in asynloop
     next(loop)
 [2017-11-24 04:11:08,430: WARNING/MainProcess] File "/usr/lib/python3.6/site-packages/kombu/async/hub.py", line 354, in create_loop
     cb(*cbargs)
 [2017-11-24 04:11:08,431: WARNING/MainProcess] File "/usr/lib/python3.6/site-packages/kombu/transport/redis.py", line 1040, in on_readable
     self.cycle.on_readable(fileno)
 [2017-11-24 04:11:08,431: WARNING/MainProcess] File "/usr/lib/python3.6/site-packages/kombu/transport/redis.py", line 337, in on_readable
     chan.handlers[type]()
 [2017-11-24 04:11:08,431: WARNING/MainProcess] File "/usr/lib/python3.6/site-packages/kombu/transport/redis.py", line 667, in _receive
     ret.append(self._receive_one(c))
 [2017-11-24 04:11:08,432: WARNING/MainProcess] File "/usr/lib/python3.6/site-packages/kombu/transport/redis.py", line 678, in _receive_one
     response = c.parse_response()
 [2017-11-24 04:11:08,432: WARNING/MainProcess] File "/usr/local/cedar/utils/redis/client.py", line 2448, in parse_response
     return self._execute(connection, connection.read_response)
 [2017-11-24 04:11:08,432: WARNING/MainProcess] File "/usr/local/cedar/utils/redis/client.py", line 2437, in _execute
     return command(*args)
 [2017-11-24 04:11:08,433: WARNING/MainProcess] File "/usr/local/cedar/utils/redis/connection.py", line 604, in read_response
     response = self._parser.read_response()
 [2017-11-24 04:11:08,433: WARNING/MainProcess] File "/usr/local/cedar/utils/redis/connection.py", line 259, in read_response
     response = self._buffer.readline()
 [2017-11-24 04:11:08,434: WARNING/MainProcess] File "/usr/local/cedar/utils/redis/connection.py", line 185, in readline
     traceback.print_stack()

This happens when a connection is reset by a timeout (a load balancer, in my case). An error event is sent on the socket, but then Celery tries to read from it. Since the socket is closed, no data is available and it hangs forever. It specifically happens on Celery's pub/sub Redis connection, since that is a long-running connection.

A socket_timeout param will fix the blocking and will throw an error after the timeout, but there is an underlying issue: Celery should not read from a closed socket.

The imperfect fix I made was to use this patch in redis-py.
redis/redis-py#886

as well as this custom fix for pubsub
cedar-team/redis-py@f934896...cedar-team:6aae12c96b733ad0b6e896001f0cf576fa26280a

@ericholscher

+1. Hitting this issue as well:

  • Celery is still running, but not processing tasks
  • strace shows it blocked on recvfrom
  • Seeing cmd=NULL in Redis for the connection

@mvaled
Contributor

mvaled commented Aug 15, 2018

@ericholscher If you're using hiredis, try without it. See #4321 and #3898.

@ericholscher

@mvaled We are -- I will try without it. Thanks.

ericholscher added a commit to readthedocs/readthedocs.org that referenced this issue Aug 20, 2018
This might be the cause of our random celery disconnects.

Refs celery/celery#2606 (comment)
@ericholscher

Reporting back, removing hiredis seems to have fixed the issue. Thanks! 👍

@auvipy
Member

auvipy commented Sep 12, 2018

thanks Eric

vgatica-eb referenced this issue in eventbrite/celery Jul 8, 2021
Changelog Details:
Change history
================
This document contains change notes for bugfix releases in the 3.1.x series
(Cipater), please see :ref:`whatsnew-3.1` for an overview of what's
new in Celery 3.1.
.. _version-3.1.26:
3.1.26
======
:release-date: 2018-23-03 16:00 PM IST
:release-by: Omer Katz
- Fixed a crash caused by tasks cycling between Celery 3 and Celery 4 workers.
.. _version-3.1.25:
3.1.25
======
:release-date: 2016-10-10 12:00 PM PDT
:release-by: Ask Solem
- **Requirements**
- Now depends on :ref:`Kombu 3.0.37 <kombu:version-3.0.37>`
- Fixed problem with chords in group introduced in 3.1.24 (Issue #3504).
.. _version-3.1.24:
3.1.24
======
:release-date: 2016-09-30 04:21 PM PDT
:release-by: Ask Solem
- **Requirements**
- Now depends on :ref:`Kombu 3.0.36 <kombu:version-3.0.36>`.
- Now supports Task protocol 2 from the future 4.0 release.
Workers running 3.1.24 are now able to process messages
sent using the `new task message protocol`_ to be introduced
in Celery 4.0.
Users upgrading to Celery 4.0 when this is released are encouraged
to upgrade to this version as an intermediate step, as this
means workers not yet upgraded will be able to process
messages from clients/workers running 4.0.
.. _`new task message protocol`:
http://docs.celeryproject.org/en/master/internals/protocol.html#version-2
- ``Task.send_events`` can now be set to disable sending of events
for that task only.
Example when defining the task:
.. code-block:: python
@app.task(send_events=False)
def add(x, y):
return x + y
- **Utils**: Fixed compatibility with recent :pypi:`psutil` versions
(Issue #3262).
- **Canvas**: Chord now forwards partial arguments to its subtasks.
Fix contributed by Tayfun Sen.
- **App**: Arguments to app such as ``backend``, ``broker``, etc
are now pickled and sent to the child processes on Windows.
Fix contributed by Jeremy Zafran.
- **Deployment**: Generic init scripts now supports being symlinked
in runlevel directories (Issue #3208).
- **Deployment**: Updated CentOS scripts to work with CentOS 7.
Contributed by Joe Sanford.
- **Events**: The curses monitor no longer crashes when the
result of a task is empty.
Fix contributed by Dongweiming.
- **Worker**: ``repr(worker)`` would crash when called early
in the startup process (Issue #2514).
- **Tasks**: GroupResult now defines __bool__ and __nonzero__.
This is to fix an issue where a ResultSet or GroupResult with an empty
result list are not properly tupled with the as_tuple() method when it is
a parent result. This is due to the as_tuple() method performing a logical
and operation on the ResultSet.
Fix contributed by Colin McIntosh.
- **Worker**: Fixed wrong values in autoscale related logging message.
Fix contributed by ``@raducc``.
- Documentation improvements by
* Alexandru Chirila
* Michael Aquilina
* Mikko Ekström
* Mitchel Humpherys
* Thomas A. Neil
* Tiago Moreira Vieira
* Yuriy Syrovetskiy
* ``@dessant``
.. _version-3.1.23:
3.1.23
======
:release-date: 2016-03-09 06:00 P.M PST
:release-by: Ask Solem
- **Programs**: Last release broke support for the ``--hostnmame`` argument
to :program:`celery multi` and :program:`celery worker --detach`
(Issue #3103).
- **Results**: MongoDB result backend could crash the worker at startup
if not configured using an URL.
.. _version-3.1.22:
3.1.22
======
:release-date: 2016-03-07 01:30 P.M PST
:release-by: Ask Solem
- **Programs**: The worker would crash immediately on startup on
``backend.as_uri()`` when using some result backends (Issue #3094).
- **Programs**: :program:`celery multi`/:program:`celery worker --detach`
would create an extraneous logfile including literal formats (e.g. ``%I``)
in the filename (Issue #3096).
.. _version-3.1.21:
3.1.21
======
:release-date: 2016-03-04 11:16 A.M PST
:release-by: Ask Solem
- **Requirements**
- Now depends on :ref:`Kombu 3.0.34 <kombu:version-3.0.34>`.
- Now depends on :mod:`billiard` 3.3.0.23.
- **Prefork pool**: Fixes 100% CPU loop on Linux epoll (Issue #1845).
Also potential fix for: Issue #2142, Issue #2606
- **Prefork pool**: Fixes memory leak related to processes exiting
(Issue #2927).
- **Worker**: Fixes crash at startup when trying to censor passwords
in MongoDB and Cache result backend URLs (Issue #3079, Issue #3045,
Issue #3049, Issue #3068, Issue #3073).
Fix contributed by Maxime Verger.
- **Task**: An exception is now raised if countdown/expires is less
than -2147483648 (Issue #3078).
- **Programs**: :program:`celery shell --ipython` now compatible with newer
IPython versions.
- **Programs**: The DuplicateNodeName warning emitted by inspect/control
now includes a list of the node names returned.
Contributed by Sebastian Kalinowski.
- **Utils**: The ``.discard(item)`` method of
:class:`~celery.datastructures.LimitedSet` did not actually remove the item
(Issue #3087).
Fix contributed by Dave Smith.
- **Worker**: Node name formatting now emits less confusing error message
for unmatched format keys (Issue #3016).
- **Results**: amqp/rpc backends: Fixed deserialization of JSON exceptions
(Issue #2518).
Fix contributed by Allard Hoeve.
- **Prefork pool**: The `process inqueue damaged` error message now includes
the original exception raised.
- **Documentation**: Includes improvements by:
- Jeff Widman.
.. _version-3.1.20:
3.1.20
======
:release-date: 2016-01-22 06:50 P.M UTC
:release-by: Ask Solem
- **Requirements**
- Now depends on :ref:`Kombu 3.0.33 <kombu:version-3.0.33>`.
- Now depends on :mod:`billiard` 3.3.0.22.
Includes binary wheels for Microsoft Windows x86 and x86_64!
- **Task**: Error emails now uses ``utf-8`` charset by default (Issue #2737).
- **Task**: Retry now forwards original message headers (Issue #3017).
- **Worker**: Bootsteps can now hook into ``on_node_join``/``leave``/``lost``.
See :ref:`extending-consumer-gossip` for an example.
- **Events**: Fixed handling of DST timezones (Issue #2983).
- **Results**: Redis backend stopped respecting certain settings.
Contributed by Jeremy Llewellyn.
- **Results**: Database backend now properly supports JSON exceptions
(Issue #2441).
- **Results**: Redis ``new_join`` did not properly call task errbacks on chord
error (Issue #2796).
- **Results**: Restores Redis compatibility with redis-py < 2.10.0
(Issue #2903).
- **Results**: Fixed rare issue with chord error handling (Issue #2409).
- **Tasks**: Using queue-name values in :setting:`CELERY_ROUTES` now works
again (Issue #2987).
- **General**: Result backend password now sanitized in report output
(Issue #2812, Issue #2004).
- **Configuration**: Now gives helpful error message when the result backend
configuration points to a module, and not a class (Issue #2945).
- **Results**: Exceptions sent by JSON serialized workers are now properly
handled by pickle configured workers.
- **Programs**: ``celery control autoscale`` now works (Issue #2950).
- **Programs**: ``celery beat --detached`` now runs after fork callbacks.
- **General**: Fix for LRU cache implementation on Python 3.5 (Issue #2897).
Contributed by Dennis Brakhane.
Python 3.5's ``OrderedDict`` does not allow mutation while it is being
iterated over. This breaks "update" if it is called with a dict
larger than the maximum size.
This commit changes the code to a version that does not iterate over
the dict, and should also be a little bit faster.
- **Init scripts**: The beat init script now properly reports service as down
when no pid file can be found.
Eric Zarowny
- **Beat**: Added cleaning of corrupted scheduler files for some storage
backend errors (Issue #2985).
Fix contributed by Aleksandr Kuznetsov.
- **Beat**: Now syncs the schedule even if the schedule is empty.
Fix contributed by Colin McIntosh.
- **Supervisord**: Set higher process priority in supervisord example.
Contributed by George Tantiras.
- **Documentation**: Includes improvements by:
- Bryson
- Caleb Mingle
- Christopher Martin
- Dieter Adriaenssens
- Jason Veatch
- Jeremy Cline
- Juan Rossi
- Kevin Harvey
- Kevin McCarthy
- Kirill Pavlov
- Marco Buttu
- Mayflower
- Mher Movsisyan
- Michael Floering
- michael-k
- Nathaniel Varona
- Rudy Attias
- Ryan Luckie
- Steven Parker
- squfrans
- Tadej Janež
- TakesxiSximada
- Tom S
.. _version-3.1.19:
3.1.19
======
:release-date: 2015-10-26 01:00 P.M UTC
:release-by: Ask Solem
- **Requirements**
- Now depends on :ref:`Kombu 3.0.29 <kombu:version-3.0.29>`.
- Now depends on :mod:`billiard` 3.3.0.21.
-  **Results**: Fixed MongoDB result backend URL parsing problem
(Issue celery/kombu#375).
- **Worker**: Task request now properly sets ``priority`` in delivery_info.
Fix contributed by Gerald Manipon.
- **Beat**: PyPy shelve may raise ``KeyError`` when setting keys
(Issue #2862).
- **Programs**: :program:`celery beat --deatched` now working on PyPy.
Fix contributed by Krzysztof Bujniewicz.
- **Results**: Redis result backend now ensures all pipelines are cleaned up.
Contributed by Justin Patrin.
- **Results**: Redis result backend now allows for timeout to be set in the
query portion of the result backend URL.
E.g. ``CELERY_RESULT_BACKEND = 'redis://?timeout=10'``
Contributed by Justin Patrin.
- **Results**: ``result.get`` now properly handles failures where the
exception value is set to :const:`None` (Issue #2560).
- **Prefork pool**: Fixed attribute error ``proc.dead``.
- **Worker**: Fixed worker hanging when gossip/heartbeat disabled
(Issue #1847).
Fix contributed by Aaron Webber and Bryan Helmig.
- **Results**: MongoDB result backend now supports pymongo 3.x
(Issue #2744).
Fix contributed by Sukrit Khera.
- **Results**: RPC/amqp backends did not deserialize exceptions properly
(Issue #2691).
Fix contributed by Sukrit Khera.
- **Programs**: Fixed problem with :program:`celery amqp`'s
``basic_publish`` (Issue #2013).
- **Worker**: Embedded beat now properly sets app for thread/process
(Issue #2594).
- **Documentation**: Many improvements and typos fixed.
Contributions by:
Carlos Garcia-Dubus
D. Yu
jerry
Jocelyn Delalande
Josh Kupershmidt
Juan Rossi
kanemra
Paul Pearce
Pavel Savchenko
Sean Wang
Seungha Kim
Zhaorong Ma
.. _version-3.1.18:
3.1.18
======
:release-date: 2015-04-22 05:30 P.M UTC
:release-by: Ask Solem
- **Requirements**
- Now depends on :ref:`Kombu 3.0.25 <kombu:version-3.0.25>`.
- Now depends on :mod:`billiard` 3.3.0.20.
- **Django**: Now supports Django 1.8 (Issue #2536).
Fix contributed by Bence Tamas and Mickaël Penhard.
- **Results**: MongoDB result backend now compatible with pymongo 3.0.
Fix contributed by Fatih Sucu.
- **Tasks**: Fixed bug only happening when a task has multiple callbacks
(Issue #2515).
Fix contributed by NotSqrt.
- **Commands**: Preload options now support ``--arg value`` syntax.
Fix contributed by John Anderson.
- **Compat**: A typo caused ``celery.log.setup_logging_subsystem`` to be
undefined.
Fix contributed by Gunnlaugur Thor Briem.
- **init scripts**: The celerybeat generic init script now uses
``/bin/sh`` instead of bash (Issue #2496).
Fix contributed by Jelle Verstraaten.
- **Django**: Fixed a :exc:`TypeError` sometimes occurring in logging
when validating models.
Fix contributed by Alexander.
- **Commands**: Worker now supports new ``--executable`` argument that can
be used with ``--detach``.
Contributed by Bert Vanderbauwhede.
- **Canvas**: Fixed crash in chord unlock fallback task (Issue #2404).
- **Worker**: Fixed rare crash occurring with ``--autoscale`` enabled
(Issue #2411).
- **Django**: Properly recycle worker Django database connections when the
Django ``CONN_MAX_AGE`` setting is enabled (Issue #2453).
Fix contributed by Luke Burden.
.. _version-3.1.17:
3.1.17
======
:release-date: 2014-11-19 03:30 P.M UTC
:release-by: Ask Solem
.. admonition:: Do not enable the :setting:`CELERYD_FORCE_EXECV` setting!
Please review your configuration and disable this option if you're using the
RabbitMQ or Redis transport.
Keeping this option enabled after 3.1 means the async based prefork pool will
be disabled, which can easily cause instability.
- **Requirements**
- Now depends on :ref:`Kombu 3.0.24 <kombu:version-3.0.24>`.
Includes the new Qpid transport coming in Celery 3.2, backported to
support those who may still require Python 2.6 compatibility.
- Now depends on :mod:`billiard` 3.3.0.19.
- ``celery[librabbitmq]`` now depends on librabbitmq 1.6.1.
- **Task**: The timing of ETA/countdown tasks were off after the example ``LocalTimezone``
implementation in the Python documentation no longer works in Python 3.4.
(Issue #2306).
- **Task**: Raising :exc:`~celery.exceptions.Ignore` no longer sends
``task-failed`` event (Issue #2365).
- **Redis result backend**: Fixed unbound local errors.
Fix contributed by Thomas French.
- **Task**: Callbacks was not called properly if ``link`` was a list of
signatures (Issuse #2350).
- **Canvas**: chain and group now handles json serialized signatures
(Issue #2076).
- **Results**: ``.join_native()`` would accidentally treat the ``STARTED``
state as being ready (Issue #2326).
This could lead to the chord callback being called with invalid arguments
when using chords with the :setting:`CELERY_TRACK_STARTED` setting
enabled.
- **Canvas**: The ``chord_size`` attribute is now set for all canvas primitives,
making sure more combinations will work with the ``new_join`` optimization
for Redis (Issue #2339).
- **Task**: Fixed problem with app not being properly propagated to
``trace_task`` in all cases.
Fix contributed by kristaps.
- **Worker**: Expires from task message now associated with a timezone.
Fix contributed by Albert Wang.
- **Cassandra result backend**: Fixed problems when using detailed mode.
When using the Cassandra backend in detailed mode, a regression
caused errors when attempting to retrieve results.
Fix contributed by Gino Ledesma.
- **Mongodb Result backend**: Pickling the backend instance will now include
the original url (Issue #2347).
Fix contributed by Sukrit Khera.
- **Task**: Exception info was not properly set for tasks raising
:exc:`~celery.exceptions.Reject` (Issue #2043).
- **Worker**: Duplicates are now removed when loading the set of revoked tasks
from the worker state database (Issue #2336).
- **celery.contrib.rdb**: Fixed problems with ``rdb.set_trace`` calling stop
from the wrong frame.
Fix contributed by llllllllll.
- **Canvas**: ``chain`` and ``chord`` can now be immutable.
- **Canvas**: ``chord.apply_async`` will now keep partial args set in
``self.args`` (Issue #2299).
- **Results**: Small refactoring so that results are decoded the same way in
all result backends.
- **Logging**: The ``processName`` format was introduced in Py2.6.2 so for
compatibility this format is now excluded when using earlier versions
(Issue #1644).
.. _version-3.1.16:
3.1.16
======
:release-date: 2014-10-03 06:00 P.M UTC
:release-by: Ask Solem
- **Worker**: 3.1.15 broke ``-Ofair`` behavior (Issue #2286).
This regression could result in all tasks executing
in a single child process if ``-Ofair`` was enabled.
- **Canvas**: ``celery.signature`` now properly forwards app argument
in all cases.
- **Task**: ``.retry()`` did not raise the exception correctly
when called without a current exception.
Fix contributed by Andrea Rabbaglietti.
- **Worker**: The ``enable_events`` remote control command
disabled worker-related events by mistake (Issue #2272).
Fix contributed by Konstantinos Koukopoulos.
- **Django**: Adds support for Django 1.7 class names in INSTALLED_APPS
when using ``app.autodiscover_tasks()``  (Issue #2248).
- **Sphinx**: ``celery.contrib.sphinx`` now uses ``getfullargspec``
on Python 3 (Issue #2302).
- **Redis/Cache Backends**: Chords will now run at most once if one or more tasks
in the chord are executed multiple times for some reason.
.. _version-3.1.15:
3.1.15
======
:release-date: 2014-09-14 11:00 P.M UTC
:release-by: Ask Solem
- **Django**: Now makes sure ``django.setup()`` is called
before importing any task modules (Django 1.7 compatibility, Issue #2227)
- **Results**: ``result.get()`` was misbehaving by calling
``backend.get_task_meta`` in a finally call leading to
AMQP result backend queues not being properly cleaned up (Issue #2245).
.. _version-3.1.14:
3.1.14
======
:release-date: 2014-09-08 03:00 P.M UTC
:release-by: Ask Solem
- **Requirements**
- Now depends on :ref:`Kombu 3.0.22 <kombu:version-3.0.22>`.
- **Init scripts**: The generic worker init scripts ``status`` command
now gets an accurate pidfile list (Issue #1942).
- **Init scripts**: The generic beat script now implements the ``status``
command.
Contributed by John Whitlock.
- **Commands**: Multi now writes informational output to stdout instead of stderr.
- **Worker**: Now ignores not implemented error for ``pool.restart``
(Issue #2153).
- **Task**: Retry no longer raises retry exception when executed in eager
mode (Issue #2164).
- **AMQP Result backend**: Now ensured ``on_interval`` is called at least
every second for blocking calls to properly propagate parent errors.
- **Django**: Compatibility with Django 1.7 on Windows (Issue #2126).
- **Programs**: `--umask` argument can be now specified in both octal (if starting
with 0) or decimal.
.. _version-3.1.13:
3.1.13
======
eventbritebuild referenced this issue in eventbrite/celery Jul 8, 2021
Changelog Details:
Change history
================
This document contains change notes for bugfix releases in the 3.1.x series
(Cipater), please see :ref:`whatsnew-3.1` for an overview of what's
new in Celery 3.1.
.. _version-3.1.26:
3.1.26
======
:release-date: 2018-23-03 16:00 PM IST
:release-by: Omer Katz
- Fixed a crash caused by tasks cycling between Celery 3 and Celery 4 workers.
.. _version-3.1.25:
3.1.25
======
:release-date: 2016-10-10 12:00 PM PDT
:release-by: Ask Solem
- **Requirements**
- Now depends on :ref:`Kombu 3.0.37 <kombu:version-3.0.37>`
- Fixed problem with chords in group introduced in 3.1.24 (Issue #3504).
.. _version-3.1.24:
3.1.24
======
:release-date: 2016-09-30 04:21 PM PDT
:release-by: Ask Solem
- **Requirements**
- Now depends on :ref:`Kombu 3.0.36 <kombu:version-3.0.36>`.
- Now supports Task protocol 2 from the future 4.0 release.
Workers running 3.1.24 are now able to process messages
sent using the `new task message protocol`_ to be introduced
in Celery 4.0.
Users upgrading to Celery 4.0 when this is released are encouraged
to upgrade to this version as an intermediate step, as this
means workers not yet upgraded will be able to process
messages from clients/workers running 4.0.
.. _`new task message protocol`:
http://docs.celeryproject.org/en/master/internals/protocol.html#version-2
- ``Task.send_events`` can now be set to disable sending of events
for that task only.
Example when defining the task:
.. code-block:: python
@app.task(send_events=False)
def add(x, y):
return x + y
- **Utils**: Fixed compatibility with recent :pypi:`psutil` versions
(Issue #3262).
- **Canvas**: Chord now forwards partial arguments to its subtasks.
Fix contributed by Tayfun Sen.
- **App**: Arguments to app such as ``backend``, ``broker``, etc
are now pickled and sent to the child processes on Windows.
Fix contributed by Jeremy Zafran.
- **Deployment**: Generic init scripts now supports being symlinked
in runlevel directories (Issue #3208).
- **Deployment**: Updated CentOS scripts to work with CentOS 7.
Contributed by Joe Sanford.
- **Events**: The curses monitor no longer crashes when the
result of a task is empty.
Fix contributed by Dongweiming.
- **Worker**: ``repr(worker)`` would crash when called early
in the startup process (Issue #2514).
- **Tasks**: GroupResult now defines __bool__ and __nonzero__.
This is to fix an issue where a ResultSet or GroupResult with an empty
result list are not properly tupled with the as_tuple() method when it is
a parent result. This is due to the as_tuple() method performing a logical
and operation on the ResultSet.
Fix contributed by Colin McIntosh.
- **Worker**: Fixed wrong values in autoscale related logging message.
Fix contributed by ``@raducc``.
- Documentation improvements by
* Alexandru Chirila
* Michael Aquilina
* Mikko Ekström
* Mitchel Humpherys
* Thomas A. Neil
* Tiago Moreira Vieira
* Yuriy Syrovetskiy
* ``@dessant``
.. _version-3.1.23:
3.1.23
======
:release-date: 2016-03-09 06:00 P.M PST
:release-by: Ask Solem
- **Programs**: Last release broke support for the ``--hostnmame`` argument
to :program:`celery multi` and :program:`celery worker --detach`
(Issue #3103).
- **Results**: MongoDB result backend could crash the worker at startup
if not configured using an URL.
.. _version-3.1.22:
3.1.22
======
:release-date: 2016-03-07 01:30 P.M PST
:release-by: Ask Solem
- **Programs**: The worker would crash immediately on startup on
``backend.as_uri()`` when using some result backends (Issue #3094).
- **Programs**: :program:`celery multi`/:program:`celery worker --detach`
would create an extraneous logfile including literal formats (e.g. ``%I``)
in the filename (Issue #3096).
.. _version-3.1.21:
3.1.21
======
:release-date: 2016-03-04 11:16 A.M PST
:release-by: Ask Solem
- **Requirements**
- Now depends on :ref:`Kombu 3.0.34 <kombu:version-3.0.34>`.
- Now depends on :mod:`billiard` 3.3.0.23.
- **Prefork pool**: Fixes 100% CPU loop on Linux epoll (Issue #1845).
Also potential fix for: Issue #2142, Issue #2606
- **Prefork pool**: Fixes memory leak related to processes exiting
(Issue #2927).
- **Worker**: Fixes crash at startup when trying to censor passwords
in MongoDB and Cache result backend URLs (Issue #3079, Issue #3045,
Issue #3049, Issue #3068, Issue #3073).
Fix contributed by Maxime Verger.
- **Task**: An exception is now raised if countdown/expires is less
than -2147483648 (Issue #3078).
- **Programs**: :program:`celery shell --ipython` now compatible with newer
IPython versions.
- **Programs**: The DuplicateNodeName warning emitted by inspect/control
now includes a list of the node names returned.
Contributed by Sebastian Kalinowski.
- **Utils**: The ``.discard(item)`` method of
:class:`~celery.datastructures.LimitedSet` did not actually remove the item
(Issue #3087).
Fix contributed by Dave Smith.
- **Worker**: Node name formatting now emits less confusing error message
for unmatched format keys (Issue #3016).
- **Results**: amqp/rpc backends: Fixed deserialization of JSON exceptions
(Issue #2518).
Fix contributed by Allard Hoeve.
- **Prefork pool**: The `process inqueue damaged` error message now includes
the original exception raised.
- **Documentation**: Includes improvements by:
- Jeff Widman.
.. _version-3.1.20:
3.1.20
======
:release-date: 2016-01-22 06:50 P.M UTC
:release-by: Ask Solem
- **Requirements**
- Now depends on :ref:`Kombu 3.0.33 <kombu:version-3.0.33>`.
- Now depends on :mod:`billiard` 3.3.0.22.
Includes binary wheels for Microsoft Windows x86 and x86_64!
- **Task**: Error emails now uses ``utf-8`` charset by default (Issue #2737).
- **Task**: Retry now forwards original message headers (Issue #3017).
- **Worker**: Bootsteps can now hook into ``on_node_join``/``leave``/``lost``.
See :ref:`extending-consumer-gossip` for an example.
- **Events**: Fixed handling of DST timezones (Issue #2983).
- **Results**: Redis backend stopped respecting certain settings.
Contributed by Jeremy Llewellyn.
- **Results**: Database backend now properly supports JSON exceptions
(Issue #2441).
- **Results**: Redis ``new_join`` did not properly call task errbacks on chord
error (Issue #2796).
- **Results**: Restores Redis compatibility with redis-py < 2.10.0
(Issue #2903).
- **Results**: Fixed rare issue with chord error handling (Issue #2409).
- **Tasks**: Using queue-name values in :setting:`CELERY_ROUTES` now works
again (Issue #2987).
- **General**: Result backend password now sanitized in report output
(Issue #2812, Issue #2004).
- **Configuration**: Now gives helpful error message when the result backend
configuration points to a module, and not a class (Issue #2945).
- **Results**: Exceptions sent by JSON serialized workers are now properly
handled by pickle configured workers.
- **Programs**: ``celery control autoscale`` now works (Issue #2950).
- **Programs**: ``celery beat --detached`` now runs after fork callbacks.
- **General**: Fix for LRU cache implementation on Python 3.5 (Issue #2897).
Contributed by Dennis Brakhane.
Python 3.5's ``OrderedDict`` does not allow mutation while it is being
iterated over. This breaks "update" if it is called with a dict
larger than the maximum size.
This commit changes the code to a version that does not iterate over
the dict, and should also be a little bit faster.
- **Init scripts**: The beat init script now properly reports service as down
when no pid file can be found.
Eric Zarowny
- **Beat**: Added cleaning of corrupted scheduler files for some storage
backend errors (Issue #2985).
Fix contributed by Aleksandr Kuznetsov.
- **Beat**: Now syncs the schedule even if the schedule is empty.
Fix contributed by Colin McIntosh.
- **Supervisord**: Set higher process priority in supervisord example.
Contributed by George Tantiras.
- **Documentation**: Includes improvements by:
- Bryson
- Caleb Mingle
- Christopher Martin
- Dieter Adriaenssens
- Jason Veatch
- Jeremy Cline
- Juan Rossi
- Kevin Harvey
- Kevin McCarthy
- Kirill Pavlov
- Marco Buttu
- Mayflower
- Mher Movsisyan
- Michael Floering
- michael-k
- Nathaniel Varona
- Rudy Attias
- Ryan Luckie
- Steven Parker
- squfrans
- Tadej Janež
- TakesxiSximada
- Tom S
.. _version-3.1.19:
3.1.19
======
:release-date: 2015-10-26 01:00 P.M UTC
:release-by: Ask Solem
- **Requirements**
- Now depends on :ref:`Kombu 3.0.29 <kombu:version-3.0.29>`.
- Now depends on :mod:`billiard` 3.3.0.21.
-  **Results**: Fixed MongoDB result backend URL parsing problem
(Issue celery/kombu#375).
- **Worker**: Task request now properly sets ``priority`` in delivery_info.
Fix contributed by Gerald Manipon.
- **Beat**: PyPy shelve may raise ``KeyError`` when setting keys
(Issue #2862).
- **Programs**: :program:`celery beat --detached` now working on PyPy.
Fix contributed by Krzysztof Bujniewicz.
- **Results**: Redis result backend now ensures all pipelines are cleaned up.
Contributed by Justin Patrin.
- **Results**: Redis result backend now allows for timeout to be set in the
query portion of the result backend URL.
E.g. ``CELERY_RESULT_BACKEND = 'redis://?timeout=10'``
Contributed by Justin Patrin.
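For illustration, a timeout in the query portion of such a URL can be extracted with the standard library (this sketches the URL shape only; it is not Celery's own parsing code):

```python
from urllib.parse import urlparse, parse_qs

url = 'redis://?timeout=10'

# parse_qs returns a dict of lists, e.g. {'timeout': ['10']}.
query = parse_qs(urlparse(url).query)
timeout = int(query['timeout'][0])
```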
- **Results**: ``result.get`` now properly handles failures where the
exception value is set to :const:`None` (Issue #2560).
- **Prefork pool**: Fixed attribute error ``proc.dead``.
- **Worker**: Fixed worker hanging when gossip/heartbeat disabled
(Issue #1847).
Fix contributed by Aaron Webber and Bryan Helmig.
- **Results**: MongoDB result backend now supports pymongo 3.x
(Issue #2744).
Fix contributed by Sukrit Khera.
- **Results**: RPC/amqp backends did not deserialize exceptions properly
(Issue #2691).
Fix contributed by Sukrit Khera.
- **Programs**: Fixed problem with :program:`celery amqp`'s
``basic_publish`` (Issue #2013).
- **Worker**: Embedded beat now properly sets app for thread/process
(Issue #2594).
- **Documentation**: Many improvements and typos fixed.
Contributions by:
Carlos Garcia-Dubus
D. Yu
jerry
Jocelyn Delalande
Josh Kupershmidt
Juan Rossi
kanemra
Paul Pearce
Pavel Savchenko
Sean Wang
Seungha Kim
Zhaorong Ma
.. _version-3.1.18:
3.1.18
======
:release-date: 2015-04-22 05:30 P.M UTC
:release-by: Ask Solem
- **Requirements**
- Now depends on :ref:`Kombu 3.0.25 <kombu:version-3.0.25>`.
- Now depends on :mod:`billiard` 3.3.0.20.
- **Django**: Now supports Django 1.8 (Issue #2536).
Fix contributed by Bence Tamas and Mickaël Penhard.
- **Results**: MongoDB result backend now compatible with pymongo 3.0.
Fix contributed by Fatih Sucu.
- **Tasks**: Fixed bug only happening when a task has multiple callbacks
(Issue #2515).
Fix contributed by NotSqrt.
- **Commands**: Preload options now support ``--arg value`` syntax.
Fix contributed by John Anderson.
- **Compat**: A typo caused ``celery.log.setup_logging_subsystem`` to be
undefined.
Fix contributed by Gunnlaugur Thor Briem.
- **Init scripts**: The celerybeat generic init script now uses
``/bin/sh`` instead of bash (Issue #2496).
Fix contributed by Jelle Verstraaten.
- **Django**: Fixed a :exc:`TypeError` sometimes occurring in logging
when validating models.
Fix contributed by Alexander.
- **Commands**: Worker now supports new ``--executable`` argument that can
be used with ``--detach``.
Contributed by Bert Vanderbauwhede.
- **Canvas**: Fixed crash in chord unlock fallback task (Issue #2404).
- **Worker**: Fixed rare crash occurring with ``--autoscale`` enabled
(Issue #2411).
- **Django**: Properly recycle worker Django database connections when the
Django ``CONN_MAX_AGE`` setting is enabled (Issue #2453).
Fix contributed by Luke Burden.
.. _version-3.1.17:
3.1.17
======
:release-date: 2014-11-19 03:30 P.M UTC
:release-by: Ask Solem
.. admonition:: Do not enable the :setting:`CELERYD_FORCE_EXECV` setting!
Please review your configuration and disable this option if you're using the
RabbitMQ or Redis transport.
Keeping this option enabled after 3.1 means the async based prefork pool will
be disabled, which can easily cause instability.
- **Requirements**
- Now depends on :ref:`Kombu 3.0.24 <kombu:version-3.0.24>`.
Includes the new Qpid transport coming in Celery 3.2, backported to
support those who may still require Python 2.6 compatibility.
- Now depends on :mod:`billiard` 3.3.0.19.
- ``celery[librabbitmq]`` now depends on librabbitmq 1.6.1.
- **Task**: The timing of ETA/countdown tasks was off after the example ``LocalTimezone``
  implementation in the Python documentation stopped working in Python 3.4
  (Issue #2306).
- **Task**: Raising :exc:`~celery.exceptions.Ignore` no longer sends
``task-failed`` event (Issue #2365).
- **Redis result backend**: Fixed unbound local errors.
Fix contributed by Thomas French.
- **Task**: Callbacks were not called properly if ``link`` was a list of
  signatures (Issue #2350).
- **Canvas**: chain and group now handles json serialized signatures
(Issue #2076).
- **Results**: ``.join_native()`` would accidentally treat the ``STARTED``
state as being ready (Issue #2326).
This could lead to the chord callback being called with invalid arguments
when using chords with the :setting:`CELERY_TRACK_STARTED` setting
enabled.
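A minimal sketch of the distinction (the state names match Celery's, but this helper is hypothetical, not the library's code): ``STARTED`` is an in-progress state and must not be treated as ready, while the terminal states are.

```python
# Only terminal states count as "ready"; STARTED is still running,
# so it must not trigger the chord callback.
READY_STATES = frozenset({'SUCCESS', 'FAILURE', 'REVOKED'})

def is_ready(state):
    return state in READY_STATES
```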
- **Canvas**: The ``chord_size`` attribute is now set for all canvas primitives,
making sure more combinations will work with the ``new_join`` optimization
for Redis (Issue #2339).
- **Task**: Fixed problem with app not being properly propagated to
``trace_task`` in all cases.
Fix contributed by kristaps.
- **Worker**: The ``expires`` value from the task message is now associated with a timezone.
Fix contributed by Albert Wang.
- **Cassandra result backend**: Fixed problems when using detailed mode.
When using the Cassandra backend in detailed mode, a regression
caused errors when attempting to retrieve results.
Fix contributed by Gino Ledesma.
- **Mongodb Result backend**: Pickling the backend instance will now include
the original url (Issue #2347).
Fix contributed by Sukrit Khera.
- **Task**: Exception info was not properly set for tasks raising
:exc:`~celery.exceptions.Reject` (Issue #2043).
- **Worker**: Duplicates are now removed when loading the set of revoked tasks
from the worker state database (Issue #2336).
- **celery.contrib.rdb**: Fixed problems with ``rdb.set_trace`` calling stop
from the wrong frame.
Fix contributed by llllllllll.
- **Canvas**: ``chain`` and ``chord`` can now be immutable.
- **Canvas**: ``chord.apply_async`` will now keep partial args set in
``self.args`` (Issue #2299).
- **Results**: Small refactoring so that results are decoded the same way in
all result backends.
- **Logging**: The ``processName`` format was introduced in Py2.6.2 so for
compatibility this format is now excluded when using earlier versions
(Issue #1644).
.. _version-3.1.16:
3.1.16
======
:release-date: 2014-10-03 06:00 P.M UTC
:release-by: Ask Solem
- **Worker**: 3.1.15 broke ``-Ofair`` behavior (Issue #2286).
This regression could result in all tasks executing
in a single child process if ``-Ofair`` was enabled.
- **Canvas**: ``celery.signature`` now properly forwards app argument
in all cases.
- **Task**: ``.retry()`` did not raise the exception correctly
when called without a current exception.
Fix contributed by Andrea Rabbaglietti.
- **Worker**: The ``enable_events`` remote control command
disabled worker-related events by mistake (Issue #2272).
Fix contributed by Konstantinos Koukopoulos.
- **Django**: Adds support for Django 1.7 class names in INSTALLED_APPS
when using ``app.autodiscover_tasks()``  (Issue #2248).
- **Sphinx**: ``celery.contrib.sphinx`` now uses ``getfullargspec``
on Python 3 (Issue #2302).
- **Redis/Cache Backends**: Chords will now run at most once if one or more tasks
in the chord are executed multiple times for some reason.
.. _version-3.1.15:
3.1.15
======
:release-date: 2014-09-14 11:00 P.M UTC
:release-by: Ask Solem
- **Django**: Now makes sure ``django.setup()`` is called
before importing any task modules (Django 1.7 compatibility, Issue #2227)
- **Results**: ``result.get()`` was misbehaving by calling
  ``backend.get_task_meta`` in a ``finally`` block, leading to
  AMQP result backend queues not being properly cleaned up (Issue #2245).
.. _version-3.1.14:
3.1.14
======
:release-date: 2014-09-08 03:00 P.M UTC
:release-by: Ask Solem
- **Requirements**
- Now depends on :ref:`Kombu 3.0.22 <kombu:version-3.0.22>`.
- **Init scripts**: The generic worker init scripts ``status`` command
now gets an accurate pidfile list (Issue #1942).
- **Init scripts**: The generic beat script now implements the ``status``
command.
Contributed by John Whitlock.
- **Commands**: Multi now writes informational output to stdout instead of stderr.
- **Worker**: Now ignores not implemented error for ``pool.restart``
(Issue #2153).
- **Task**: Retry no longer raises retry exception when executed in eager
mode (Issue #2164).
- **AMQP Result backend**: Now ensures ``on_interval`` is called at least
  once every second for blocking calls, so parent errors propagate properly.
- **Django**: Compatibility with Django 1.7 on Windows (Issue #2126).
- **Programs**: The ``--umask`` argument can now be specified in both octal (if starting
  with 0) or decimal.
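The documented parsing rule can be sketched as follows (``parse_umask`` is a hypothetical helper for illustration, not the actual option parser):

```python
def parse_umask(value):
    # Per the rule above: a leading '0' means the value is octal,
    # anything else is read as decimal.
    return int(value, 8) if value.startswith('0') else int(value, 10)
```

So ``parse_umask('022')`` yields ``0o22`` (18 decimal), while ``parse_umask('18')`` yields 18.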
.. _version-3.1.13:
3.1.13
======