#8796: Worker not consuming tasks after Redis broker restart, maybe not fixed in version 5.4.0 #8990

Open · zjqzero opened this issue Apr 29, 2024 · 4 comments

zjqzero commented Apr 29, 2024

[worker-0.log] after restarting Redis:

```
[2024-04-29 15:39:32,181: WARNING/MainProcess] /usr/local/lib/python3.12/site-packages/celery/worker/consumer/consumer.py:508: CPendingDeprecationWarning: The broker_connection_retry configuration setting will no longer determine
whether broker connection retries are made during startup in Celery 6.0 and above.
If you wish to retain the existing behavior for retrying connections on startup,
you should set broker_connection_retry_on_startup to True.
warnings.warn(

[2024-04-29 15:39:32,183: ERROR/MainProcess] consumer: Cannot connect to redis://192.168.6.21:6379/0: Error 111 connecting to 192.168.6.21:6379. Connection refused..
Trying again in 2.00 seconds... (1/100)

[2024-04-29 15:39:34,190: INFO/MainProcess] Connected to redis://192.168.6.21:6379/0
[2024-04-29 15:39:34,191: WARNING/MainProcess] /usr/local/lib/python3.12/site-packages/celery/worker/consumer/consumer.py:508: CPendingDeprecationWarning: The broker_connection_retry configuration setting will no longer determine
whether broker connection retries are made during startup in Celery 6.0 and above.
If you wish to retain the existing behavior for retrying connections on startup,
you should set broker_connection_retry_on_startup to True.
warnings.warn(

[2024-04-29 15:39:34,196: INFO/MainProcess] mingle: searching for neighbors
[2024-04-29 15:39:35,205: INFO/MainProcess] mingle: all alone
[2024-04-29 15:39:40,218: INFO/MainProcess] missed heartbeat from 55c7a36024c3
```
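
For reference, the deprecation warning above can be addressed with the setting it names. A minimal sketch of where that setting would go (the broker URL is taken from the log above; the app name is an assumption):

```python
from celery import Celery

# Broker URL from the log above; the app name "worker" is hypothetical.
app = Celery("worker", broker="redis://192.168.6.21:6379/0")

# Keep retrying the broker connection on startup under Celery 6.0+,
# as the CPendingDeprecationWarning recommends.
app.conf.broker_connection_retry_on_startup = True
```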

[version]

Python 3.12 + Redis 7.2.4 + Celery 5.4.0

```
Package Version
------- -------
amqp 5.2.0
APScheduler 3.10.4
beautifulsoup4 4.12.3
bidict 0.23.1
billiard 4.2.0
blinker 1.8.1
bs4 0.0.2
cachelib 0.9.0
celery 5.4.0
certifi 2024.2.2
charset-normalizer 3.3.2
click 8.1.7
click-didyoumean 0.3.1
click-plugins 1.1.1
click-repl 0.3.0
dnspython 2.6.1
et-xmlfile 1.1.0
Flask 3.0.3
Flask-APScheduler 1.13.1
Flask-Caching 2.2.0
Flask-Cors 4.0.0
h11 0.14.0
hiredis 2.3.2
idna 3.7
itsdangerous 2.2.0
Jinja2 3.1.3
kombu 5.3.7
lml 0.1.0
Logbook 1.7.0.post0
loguru 0.7.2
lxml 5.2.1
MarkupSafe 2.1.5
marshmallow 3.21.1
mongoengine 0.28.2
numpy 1.26.4
openpyxl 3.1.2
packaging 24.0
pandas 2.2.2
pip 24.0
prompt-toolkit 3.0.43
pyexcel-io 0.6.6
pyexcel-xlsx 0.6.0
pymongo 4.7.0
python-dateutil 2.9.0.post0
python-engineio 4.9.0
python-redis-lock 4.0.0
python-socketio 5.11.2
pytz 2024.1
raven 6.10.0
redis 5.0.4
requests 2.31.0
setuptools 69.2.0
simple-websocket 1.0.0
six 1.16.0
soupsieve 2.5
supervisor 4.2.5
tenacity 8.2.3
thulac 0.2.2
tzdata 2024.1
tzlocal 5.2
urllib3 2.2.1
uWSGI 2.0.25.1
vine 5.1.0
wcwidth 0.2.13
Werkzeug 3.0.2
wheel 0.43.0
wsproto 1.2.0
```

Therefore, I have reverted to Celery 5.2.2.
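
The working pins, as a requirements sketch (kombu 5.2.2 should satisfy Celery 5.2.2's declared dependency range):

```
celery==5.2.2
kombu==5.2.2
```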


zjqzero commented May 7, 2024

I have tested it many times on Celery 5.2.2 + Kombu 5.2.2 without any problems.
The production environment also uses this version.

However, upgrading Kombu to a version higher than 5.2.2, such as 5.3.7, frequently introduces issues.

I have reproduced this scenario manually. Here are some logs.

Based on my observations, Kombu seems to be the most likely source of this issue.
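
For context, here is a minimal sketch of the test module the logs below come from. It is a hypothetical reconstruction: the app name `hello`, the task `test.hello`, the worker command, and the broker URL are taken from the banner and log lines; everything else is assumed.

```python
# test.py -- hypothetical reconstruction of the reproduction module.
from celery import Celery

# Broker URL as shown in the worker banner below.
app = Celery("hello", broker="redis://192.168.6.21:6379/0")

@app.task
def hello():
    return "hello world"
```

The worker is started with `celery -A test.app worker -l INFO`, as in the session below.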

```
Package Version
------- -------
...
celery 5.2.2
...
kombu 5.3.7
...
```

```
user@gamma:~$ celery -A test.app worker -l INFO

 -------------- celery@gamma v5.2.2 (dawn-chorus)
--- ***** -----
-- ******* ---- Linux-5.15.0-101-generic-x86_64-with-glibc2.35 2024-05-07 10:34:32
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app:         hello:0x7f5caf7f23e0
- ** ---------- .> transport:   redis://192.168.6.21:6379/0
- ** ---------- .> results:     disabled://
- *** --- * --- .> concurrency: 20 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
 -------------- [queues]
                .> celery           exchange=celery(direct) key=celery

[tasks]
  . test.hello

[2024-05-07 10:34:32,591: INFO/MainProcess] Connected to redis://192.168.6.21:6379/0
[2024-05-07 10:34:32,597: INFO/MainProcess] mingle: searching for neighbors
[2024-05-07 10:34:33,609: INFO/MainProcess] mingle: sync with 2 nodes
[2024-05-07 10:34:33,609: INFO/MainProcess] mingle: sync complete
[2024-05-07 10:34:33,623: INFO/MainProcess] celery@gamma ready.
[2024-05-07 10:34:51,097: INFO/MainProcess] Task test.hello[8ead87e5-2c63-4449-86e7-a19f4a168dc4] received
[2024-05-07 10:34:51,099: INFO/ForkPoolWorker-16] Task test.hello[8ead87e5-2c63-4449-86e7-a19f4a168dc4] succeeded in 0.00034789182245731354s: 'hello world'
```

I started the worker and sent a test task, and it worked. Then I restarted Redis:

```
[2024-05-07 10:35:25,473: WARNING/MainProcess] consumer: Connection to broker lost. Trying to re-establish the connection...
Traceback (most recent call last):
...
redis.exceptions.ConnectionError: Connection closed by server.
...

[2024-05-07 10:35:25,480: ERROR/MainProcess] consumer: Cannot connect to redis://192.168.6.21:6379/0: Error 111 connecting to 192.168.6.21:6379. Connection refused..
Trying again in 2.00 seconds... (1/100)

[2024-05-07 10:35:27,488: INFO/MainProcess] Connected to redis://192.168.6.21:6379/0
[2024-05-07 10:35:27,493: INFO/MainProcess] mingle: searching for neighbors
[2024-05-07 10:35:28,498: INFO/MainProcess] mingle: all alone

[2024-05-07 10:35:38,419: INFO/MainProcess] Task test.hello[9739f7cd-5a4d-4cfa-9473-ebe131a10bce] received
[2024-05-07 10:35:38,420: INFO/ForkPoolWorker-16] Task test.hello[9739f7cd-5a4d-4cfa-9473-ebe131a10bce] succeeded in 8.22567380964756e-05s: 'hello world'
```

I tested again; it still worked. Then I restarted Redis a second time:

```
[2024-05-07 10:35:54,940: WARNING/MainProcess] consumer: Connection to broker lost. Trying to re-establish the connection...
Traceback (most recent call last):
...

[2024-05-07 10:35:54,946: ERROR/MainProcess] consumer: Cannot connect to redis://192.168.6.21:6379/0: Error 111 connecting to 192.168.6.21:6379. Connection refused..
Trying again in 2.00 seconds... (1/100)

[2024-05-07 10:35:56,952: INFO/MainProcess] Connected to redis://192.168.6.21:6379/0
[2024-05-07 10:35:56,957: INFO/MainProcess] mingle: searching for neighbors
[2024-05-07 10:35:57,963: INFO/MainProcess] mingle: all alone

[2024-05-07 10:36:04,115: INFO/MainProcess] Task test.hello[ea4fd100-6d29-47c3-82b0-41e03a635781] received
[2024-05-07 10:36:04,117: INFO/ForkPoolWorker-16] Task test.hello[ea4fd100-6d29-47c3-82b0-41e03a635781] succeeded in 8.298782631754875e-05s: 'hello world'
```

I tested again; it still worked. Then I restarted Redis a third time:

```
[2024-05-07 10:36:11,417: WARNING/MainProcess] consumer: Connection to broker lost. Trying to re-establish the connection...
Traceback (most recent call last):
...

[2024-05-07 10:36:11,422: ERROR/MainProcess] consumer: Cannot connect to redis://192.168.6.21:6379/0: Error 111 connecting to 192.168.6.21:6379. Connection refused..
Trying again in 2.00 seconds... (1/100)

[2024-05-07 10:36:13,431: INFO/MainProcess] Connected to redis://192.168.6.21:6379/0
[2024-05-07 10:36:13,436: INFO/MainProcess] mingle: searching for neighbors
[2024-05-07 10:36:14,445: INFO/MainProcess] mingle: all alone
[2024-05-07 10:36:19,456: INFO/MainProcess] missed heartbeat from 9d80da8705ff
[2024-05-07 10:36:19,456: INFO/MainProcess] missed heartbeat from 59dbc68b852c
```

At this point, the worker no longer consumes tasks.
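
A hypothetical way to confirm the stall with redis-py: tasks sent after the last restart should pile up in the default `celery` queue (a Redis list, as shown in the [queues] banner above) if the worker has stopped consuming.

```python
import redis

# Connect to the broker from the logs; db 0 as in the broker URL.
r = redis.Redis(host="192.168.6.21", port=6379, db=0)

# The default Celery queue is a Redis list named "celery".
# A steadily growing length means the worker is no longer consuming.
print(r.llen("celery"))
```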

@Nusnus


zjqzero commented May 7, 2024

I have run some tests:

- celery 5.4.0 + kombu 5.2.2: passed
- celery 5.3.6 + kombu 5.2.2: passed
- celery 5.3.6 + kombu 5.2.3: failed
- celery 5.2.2 + kombu 5.2.3: failed
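
For anyone repeating this matrix: mixed pins like celery 5.3.6 + kombu 5.2.2 may conflict with Celery's declared kombu requirement, so pip may need to be run in two steps. A hypothetical sequence (pip can print a dependency-conflict warning on the second step but still proceeds):

```
pip install celery==5.3.6
pip install kombu==5.2.2
```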

I hope this is helpful! @Nusnus


Nusnus commented May 7, 2024

> celery 5.4.0 + kombu 5.2.2: passed

Did this mix have Redis working after multiple restarts?? 🙏 👀
@zjqzero

Nusnus self-assigned this May 7, 2024.
Nusnus added this to the 5.4.x milestone May 7, 2024.

zjqzero commented May 7, 2024

> celery 5.4.0 + kombu 5.2.2: passed
>
> Did this mix have Redis working after multiple restarts?? 🙏 👀 @zjqzero

Yes, it works. I've tried it manually several times; however, since the testing was not automated, I'm not fully confident that I reliably reproduced the bug.
