AttributeError 'list' object has no attribute 'decode' with redis backend #4363

Closed
1 task done
Twista opened this issue Nov 3, 2017 · 60 comments

Comments

@Twista

Twista commented Nov 3, 2017

Checklist

  • I have included the output of celery -A proj report in the issue.
software -> celery:4.1.0 (latentcall) kombu:4.1.0 py:3.5.2
            billiard:3.5.0.3 redis:2.10.5
platform -> system:Linux arch:64bit, ELF imp:CPython
loader   -> celery.loaders.app.AppLoader
settings -> transport:redis results:redis://:**@****************

task_ignore_result: True
accept_content: {'pickle'}
result_serializer: 'pickle'
result_backend: 'redis://:********@************************'
task_serializer: 'pickle'
task_send_sent_event: True
broker_url: 'redis://:********@************************'

redis-server version: both 2.x and 3.x

Steps to reproduce

Hello, I'm not sure what causes the problem and I've already tried to find a similar solution, but no luck so far, so I'm opening an issue here; hopefully it helps.

The issue is also described there (not by me): redis/redis-py#612

So far, it happens in both cases, whether the result backend is involved or not (i.e. even when just calling apply_async(...)).

Exception when calling apply_async():
[screenshot: exception traceback]

Exception when calling .get() (this one has an int instead of a list):
[screenshot: exception traceback]

Hope it helps

Expected behavior

The error should not be thrown.

Actual behavior

AttributeError: 'list' object has no attribute 'decode'

Thanks!

@Twista
Author

Twista commented Nov 3, 2017

Also, here is the full stack trace, which includes all the parameters:

[screenshot: full exception traceback]

@georgepsarakis
Contributor

This seems like a race condition in the Redis connection pool, caused by concurrent worker operations. Which worker pool type are you using? I think that if you use the prefork pool you will not run into this issue. Let me know what the outcome is if you try it.
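
For reference, a minimal way to start a worker with the prefork pool selected explicitly (proj here is a placeholder app name): celery -A proj worker --pool=prefork --concurrency=4 --loglevel=INFO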

@Twista
Author

Twista commented Nov 4, 2017

Hey @georgepsarakis,

thanks for your response; it seems we are running the prefork pool.

When I checked the output from starting Celery (which runs under systemd), I got this:

 -------------- celery@autoscaled-dashboard-worker v4.1.0 (latentcall)
---- **** -----
--- * ***  * -- Linux-4.4.0-45-generic-x86_64-with-Ubuntu-16.04-xenial 2017-11-04 20:41:15
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app:         worker:0x7f984dbcce48
- ** ---------- .> transport:   redis://:**@**
- ** ---------- .> results:     redis://:**@**
- *** --- * --- .> concurrency: {min=3, max=12} (prefork)
-- ******* ---- .> task events: ON

some other info:

  • our workers are on different machines than the app (which dispatches the tasks)
  • we have 2+ workers running all the time (all with the same settings below)
  • all of them run under systemd (celery multi start -A ... -E --autoscale=12,3 -Ofair plus some other not-so-important arguments, all specified as a Type=forking service)

Let me know if I can add anything else; I'd like to help as much as possible to resolve this :)

thanks!

@Twista
Author

Twista commented Nov 8, 2017

Another thing (even though I'm not sure whether it's related): sometimes we get this exception (same setup as above):

File "/usr/local/lib/python3.5/dist-packages/celery/app/base.py", line 737, in send_task
    amqp.send_task_message(P, name, message, **options)
  File "/usr/lib/python3.5/contextlib.py", line 77, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.5/dist-packages/kombu/connection.py", line 419, in _reraise_as_library_errors
    sys.exc_info()[2])
  File "/usr/local/lib/python3.5/dist-packages/vine/five.py", line 178, in reraise
    raise value.with_traceback(tb)
File "/usr/local/lib/python3.5/dist-packages/kombu/connection.py", line 414, in _reraise_as_library_errors
    yield
  File "/usr/local/lib/python3.5/dist-packages/celery/app/base.py", line 736, in send_task
    self.backend.on_task_call(P, task_id)
  File "/usr/local/lib/python3.5/dist-packages/celery/backends/redis.py", line 189, in on_task_call
    self.result_consumer.consume_from(task_id)
  File "/usr/local/lib/python3.5/dist-packages/celery/backends/redis.py", line 76, in consume_from
    self._consume_from(task_id)
  File "/usr/local/lib/python3.5/dist-packages/celery/backends/redis.py", line 82, in _consume_from
    self._pubsub.subscribe(key)
  File "/usr/local/lib/python3.5/dist-packages/redis/client.py", line 2482, in subscribe
    ret_val = self.execute_command('SUBSCRIBE', *iterkeys(new_channels))
  File "/usr/local/lib/python3.5/dist-packages/redis/client.py", line 2404, in execute_command
    self._execute(connection, connection.send_command, *args)
  File "/usr/local/lib/python3.5/dist-packages/redis/client.py", line 2408, in _execute
    return command(*args)
  File "/usr/local/lib/python3.5/dist-packages/redis/connection.py", line 610, in send_command
    self.send_packed_command(self.pack_command(*args))
  File "/usr/local/lib/python3.5/dist-packages/redis/connection.py", line 585, in send_packed_command
    self.connect()
  File "/usr/local/lib/python3.5/dist-packages/redis/connection.py", line 493, in connect
    self.on_connect()
  File "/usr/local/lib/python3.5/dist-packages/redis/connection.py", line 567, in on_connect
    if nativestr(self.read_response()) != 'OK':
  File "/usr/local/lib/python3.5/dist-packages/redis/connection.py", line 629, in read_response
    raise response
kombu.exceptions.OperationalError: only (P)SUBSCRIBE / (P)UNSUBSCRIBE / PING / QUIT allowed in this context

We tried updating py-redis (we'll see, but it's only a minor version bump).

Any hint is very much appreciated :)

@georgepsarakis
Contributor

@Twista can you try a patch on the Redis backend?

If you can add the following code here:

def on_after_fork(self):
    logger.info('Resetting Redis client.')
    del self.backend.client

I hope that this will force the client cached property to generate a new Redis client after each worker fork.

Let me know if this has any result.
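
As a standalone illustration only (not the actual backend code), this is the cached-property behaviour the patch relies on; the sketch uses functools.cached_property (Python 3.8+) purely to show the idea:

from functools import cached_property

class FakeBackend:
    @cached_property
    def client(self):
        # In the real backend this would build a fresh Redis connection.
        return object()

backend = FakeBackend()
first = backend.client
del backend.client                   # what the patch does after a fork
assert backend.client is not first   # the next access rebuilds the client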

@Twista
Author

Twista commented Dec 5, 2017

Hey, sorry for the late response.

Just a follow-up: we applied the patch and will see if it helps. Hopefully soon :)

thanks for help :)

@auvipy
Member

auvipy commented Dec 21, 2017

Let us know your feedback.

@tomaszhlawiczka

Hi,

I'm experiencing pretty much the same issue (using Celery.send_task) and have a related question:
Why, when calling an async task (so the result is not expected), does Celery start listening on a Redis pub/sub channel (self._pubsub.subscribe(key), called by a signal handler in celery/backends/redis.py, line 189, in on_task_call)?

self.result_consumer.consume_from(task_id)

@wimby

wimby commented Mar 12, 2018

@georgepsarakis unfortunately the patch didn't help. :-(

Here is another stack trace:

[screenshots: four stack-trace captures, 2018-03-12]

Please let me know if you want some logs or other patches tested.

@georgepsarakis
Contributor

@wimby thanks a lot for the feedback. Can you please tell me what options you are using to start the worker?

@Twista
Author

Twista commented Apr 3, 2018

Hey @georgepsarakis
Will answer this one instead of @wimby

That's the whole command we use to start Celery:

celery multi start worker -A ... --pidfile=... --logfile=... --loglevel=... -E --time-limit=300 --autoscale=10,3 --queues=... -Ofair

@dmitry-kostin

Faced the same issue when we decided to try the Redis broker; rolled back to RabbitMQ for now.

@asgoel

asgoel commented May 11, 2018

Any update on this? We're running into it with a setup similar to @Twista / @wimby, where we have cron workers running on a separate machine from the process that schedules our tasks.

celery==4.0.2
redis==2.10.5.

To clarify, this happens on the machine that is sending the tasks, not the worker machine.

@auvipy auvipy added this to the v4.3 milestone May 27, 2018
@deterb

deterb commented Jun 13, 2018

I'm seeing this as well. Celery 4.2.0, kombu 4.2.1, redis 2.10.6. Using Redis for both the broker and the results. Redis traffic goes between servers through an encrypted stunnel port forward.

Running into it in a Django 1.11 app running on mod_wsgi (2 processes, 15 threads each), and it's not just limited to the above exception. I can't copy-paste full stack traces. The application loads a web page with a bunch of AJAX requests (90ish per page load), each of which runs a background task via Celery. The tasks complete successfully, but getting the results back is difficult.

When submitting the tasks, I've gotten ConnectionErrors and AttributeErrors (list has no attribute encode).

When getting the results back, I've gotten InvalidResponse, ResponseError, ValueError, AttributeError, TypeError, and IndexError, pretty much all from redis. I've also seen decode errors from kombu when trying to parse the JSON responses. I suspect traffic is getting mixed up between the requests; in some cases I've seen what look like partial responses.

I'm switching back to doing bulkier tasks, which should at least minimize how often this happens. However, I've seen cases where protocol issues still happen under minimal load. I wouldn't be surprised if this was being caused by underlying networking issues. I am also adding additional retry logic for submitting requests.

@asgoel

asgoel commented Jun 13, 2018

@deterb if you set task_ignore_result in your celery config, this should prevent this from happening. Unless you care about task results, in which case it obviously won't help.

The fix to respect ignore_result when scheduling tasks was made in 4.2.0, so you should be good.
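
For anyone looking for the concrete settings, here is a minimal sketch (app and task names are placeholders) of ignoring results globally or per task so that publishing a task does not touch the result backend:

from celery import Celery

app = Celery('proj',
             broker='redis://localhost:6379/0',
             backend='redis://localhost:6379/0')

# Globally: never store or wait for results.
app.conf.task_ignore_result = True

# Or per task, when only some results matter:
@app.task(ignore_result=True)
def fire_and_forget(x):
    return x * 2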

@deterb

deterb commented Jun 13, 2018

@asgoel The whole point is to get the results for the bulk of the requests (it's not so much about running them in the background as about running heavy calculations on a server built for it instead of the web server). I will add ignore_result for the others though and see if that helps.

@georgepsarakis
Contributor

@deterb this sounds like an issue with non-thread-safe operations. Is it possible to try not using multithreading?

@deterb

deterb commented Jun 15, 2018

@georgepsarakis I agree that it sounds like a multithreading issue. I'll try configuring mod_wsgi to run with 15 processes and 1 thread per process and see if I still see that behavior. I'm not doing any additional threading or multiprocessing outside of mod_wsgi. I saw similar behavior (though less frequently) running with Django's runserver. The only interfacing with Redis clients in the web application is through Celery, namely submitting tasks and retrieving their results. redis-py claims to be thread safe.
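
For reference, the change I have in mind is just the daemon-mode settings in the Apache config, something like WSGIDaemonProcess myapp processes=15 threads=1 (myapp and the exact directives are specific to our deployment).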

I'll try to do more testing tomorrow, and see if I can recreate outside of Django and without the stunnel proxy.

@deterb

deterb commented Jun 15, 2018

@georgepsarakis I didn't get a chance to try today, but I believe #4670 (namely the shared PubSub objects not being thread-safe) is related, along with #4480. It's likely that what I'm seeing is separate from the original ticket and the other issues that were mentioned. While it's probably related to the issues @asgoel brought up, I think a separate ticket for concurrency issues with the Redis result backend would be appropriate (keeping #4480 focused on the RPC result backend). I can start pulling out relevant parts of stack traces as well.

@christiansaiki

christiansaiki commented Jul 18, 2018

Hey guys, I was facing these errors a lot.
I was using AWS ElastiCache and my app was deployed on AWS Elastic Beanstalk.
We switched from AWS ElastiCache to Redis Labs and increased the timeout of the AWS Elastic Load Balancer to 180s, and of the Apache server to 180s too.
The errors decreased a lot.

Nevertheless, they were still occurring from time to time, so I decided to change the result backend to PostgreSQL, and then the errors disappeared completely.

@ewjoachim

ewjoachim commented Aug 9, 2018

An interesting thing: we started experiencing this issue right after deploying a new release that

  • switched from Python 2.7 to Python 3.6
    (and updated Django 1.11 incrementally, .13 to .14)

So it's interesting to know this may be something that depends on the Python version. Hoping it may help track the problem down.

(For the rest, we're using celery 4.2.1 and redis 2.10.6.)

Also, the problematic task is launched by Celery beat, and we've had the same problem in a task launched by another task.

@georgepsarakis
Contributor

Thanks for the feedback everyone. Just to clarify a few things:

  • ResultConsumer is the Redis Backend component that asynchronously retrieves the result
  • ResultConsumer initializes a PubSub instance
  • ResultConsumer instances can be created either on a worker (Canvas workflows) or when a Task is enqueued and results are not ignored
  • the same PubSub instance cannot be used simultaneously from multiple threads
  • as far as Workers are concerned, this change should cover the case of starting new forks

I am not aware if the corresponding operations can be performed upon Django startup, perhaps this callback could help; calling ResultConsumer.on_after_fork would then create new instances and the issue will most probably not occur.
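
A hedged sketch of what that could look like from application startup or a post-fork hook (the import path is a placeholder; where to call it depends on the deployment):

from proj.celery_app import app  # placeholder for your Celery app instance

def reset_result_consumer():
    backend = app.backend
    result_consumer = getattr(backend, 'result_consumer', None)
    if result_consumer is not None:
        # Recreate the pub/sub machinery so it is not shared across forked processes.
        result_consumer.on_after_fork()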

@auvipy
Member

auvipy commented Jul 11, 2019

Not sure, but it is planned for 4.5.

@nicklyra

nicklyra commented Dec 9, 2019

FWIW, even with the code snippet above, we still see periodic Protocol Errors, though less frequently:

  File "/opt/python/current/app/app/XXXX", line #, in _check_celery
    result = async_result.get(timeout=self.service_timeout)
  File "/opt/python/run/venv/local/lib/python3.6/site-packages/celery/result.py", line 226, in get
    on_message=on_message,
  File "/opt/python/run/venv/local/lib/python3.6/site-packages/celery/backends/asynchronous.py", line 188, in wait_for_pending
    for _ in self._wait_for_pending(result, **kwargs):
  File "/opt/python/run/venv/local/lib/python3.6/site-packages/celery/backends/asynchronous.py", line 255, in _wait_for_pending
    on_interval=on_interval):
  File "/opt/python/run/venv/local/lib/python3.6/site-packages/celery/backends/asynchronous.py", line 56, in drain_events_until
    yield self.wait_for(p, wait, timeout=1)
  File "/opt/python/run/venv/local/lib/python3.6/site-packages/celery/backends/asynchronous.py", line 65, in wait_for
    wait(timeout=timeout)
  File "/opt/python/run/venv/local/lib/python3.6/site-packages/celery/backends/redis.py", line 127, in drain_events
    message = self._pubsub.get_message(timeout=timeout)
  File "/opt/python/run/venv/local/lib/python3.6/site-packages/redis/client.py", line 3297, in get_message
    response = self.parse_response(block=False, timeout=timeout)
  File "/opt/python/run/venv/local/lib/python3.6/site-packages/redis/client.py", line 3185, in parse_response
    response = self._execute(conn, conn.read_response)
  File "/opt/python/run/venv/local/lib/python3.6/site-packages/redis/client.py", line 3159, in _execute
    return command(*args, **kwargs)
  File "/opt/python/run/venv/local/lib/python3.6/site-packages/redis/connection.py", line 700, in read_response
    response = self._parser.read_response()
  File "/opt/python/run/venv/local/lib/python3.6/site-packages/redis/connection.py", line 318, in read_response
    (str(byte), str(response)))
redis.exceptions.InvalidResponse: Protocol Error: , b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00*3'

The all-zeros response is the most common one, though I just saw a ProtocolError: 1, b'575932362]' go by.

I don't discount the possibility that this could also just be a glitch in Redis and nothing to do with Celery though. It's kind of hard to tell.

This is using Celery 4.3.0, Kombu 4.6.6, and Redis 3.3.11

@jmoz

jmoz commented Mar 13, 2020

I have started getting this error frequently after implementing an async call. A Flask app calls Celery. It works sometimes, and other times I get the \x00 result:

web_1     | Traceback (most recent call last):
web_1     |   File "/root/.local/share/virtualenvs/app-lp47FrbD/lib/python3.7/site-packages/flask/app.py", line 2463, in __call__
web_1     |     return self.wsgi_app(environ, start_response)
web_1     |   File "/root/.local/share/virtualenvs/app-lp47FrbD/lib/python3.7/site-packages/flask/app.py", line 2449, in wsgi_app
web_1     |     response = self.handle_exception(e)
web_1     |   File "/root/.local/share/virtualenvs/app-lp47FrbD/lib/python3.7/site-packages/flask/app.py", line 1866, in handle_exception
web_1     |     reraise(exc_type, exc_value, tb)
web_1     |   File "/root/.local/share/virtualenvs/app-lp47FrbD/lib/python3.7/site-packages/flask/_compat.py", line 39, in reraise
web_1     |     raise value
web_1     |   File "/root/.local/share/virtualenvs/app-lp47FrbD/lib/python3.7/site-packages/flask/app.py", line 2446, in wsgi_app
web_1     |     response = self.full_dispatch_request()
web_1     |   File "/root/.local/share/virtualenvs/app-lp47FrbD/lib/python3.7/site-packages/flask/app.py", line 1951, in full_dispatch_request
web_1     |     rv = self.handle_user_exception(e)
web_1     |   File "/root/.local/share/virtualenvs/app-lp47FrbD/lib/python3.7/site-packages/flask/app.py", line 1820, in handle_user_exception
web_1     |     reraise(exc_type, exc_value, tb)
web_1     |   File "/root/.local/share/virtualenvs/app-lp47FrbD/lib/python3.7/site-packages/flask/_compat.py", line 39, in reraise
web_1     |     raise value
web_1     |   File "/root/.local/share/virtualenvs/app-lp47FrbD/lib/python3.7/site-packages/flask/app.py", line 1949, in full_dispatch_request
web_1     |     rv = self.dispatch_request()
web_1     |   File "/root/.local/share/virtualenvs/app-lp47FrbD/lib/python3.7/site-packages/flask/app.py", line 1935, in dispatch_request
web_1     |     return self.view_functions[rule.endpoint](**req.view_args)
web_1     |   File "/usr/src/app/web.py", line 81, in balances
web_1     |     result = group(balance.s(name) for name in factories.LAZY_STRATEGY_MAP.keys())().get()
web_1     |   File "/root/.local/share/virtualenvs/app-lp47FrbD/lib/python3.7/site-packages/celery/result.py", line 703, in get
web_1     |     on_interval=on_interval,
web_1     |   File "/root/.local/share/virtualenvs/app-lp47FrbD/lib/python3.7/site-packages/celery/result.py", line 822, in join_native
web_1     |     on_message, on_interval):
web_1     |   File "/root/.local/share/virtualenvs/app-lp47FrbD/lib/python3.7/site-packages/celery/backends/asynchronous.py", line 151, in iter_native
web_1     |     for _ in self._wait_for_pending(result, no_ack=no_ack, **kwargs):
web_1     |   File "/root/.local/share/virtualenvs/app-lp47FrbD/lib/python3.7/site-packages/celery/backends/asynchronous.py", line 268, in _wait_for_pending
web_1     |     on_interval=on_interval):
web_1     |   File "/root/.local/share/virtualenvs/app-lp47FrbD/lib/python3.7/site-packages/celery/backends/asynchronous.py", line 55, in drain_events_until
web_1     |     yield self.wait_for(p, wait, timeout=interval)
web_1     |   File "/root/.local/share/virtualenvs/app-lp47FrbD/lib/python3.7/site-packages/celery/backends/asynchronous.py", line 64, in wait_for
web_1     |     wait(timeout=timeout)
web_1     |   File "/root/.local/share/virtualenvs/app-lp47FrbD/lib/python3.7/site-packages/celery/backends/redis.py", line 161, in drain_events
web_1     |     message = self._pubsub.get_message(timeout=timeout)
web_1     |   File "/root/.local/share/virtualenvs/app-lp47FrbD/lib/python3.7/site-packages/redis/client.py", line 3565, in get_message
web_1     |     response = self.parse_response(block=False, timeout=timeout)
web_1     |   File "/root/.local/share/virtualenvs/app-lp47FrbD/lib/python3.7/site-packages/redis/client.py", line 3453, in parse_response
web_1     |     response = self._execute(conn, conn.read_response)
web_1     |   File "/root/.local/share/virtualenvs/app-lp47FrbD/lib/python3.7/site-packages/redis/client.py", line 3427, in _execute
web_1     |     return command(*args, **kwargs)
web_1     |   File "/root/.local/share/virtualenvs/app-lp47FrbD/lib/python3.7/site-packages/redis/connection.py", line 734, in read_response
web_1     |     response = self._parser.read_response()
web_1     |   File "/root/.local/share/virtualenvs/app-lp47FrbD/lib/python3.7/site-packages/redis/connection.py", line 324, in read_response
web_1     |     (str(byte), str(response)))
web_1     | redis.exceptions.InvalidResponse: Protocol Error: , b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00*3'

More weird errors:

web_1 | redis.exceptions.InvalidResponse: Protocol Error: s, b'ubscribe'

And:

web_1     |   File "/root/.local/share/virtualenvs/app-lp47FrbD/lib/python3.7/site-packages/redis/connection.py", line 350, in read_response
web_1     |     response = self._buffer.read(length)
web_1     | AttributeError: 'NoneType' object has no attribute 'read'

Python 3.7, Celery 4.4.1, Redis 3.4.1.

@auvipy
Member

auvipy commented Mar 18, 2020

could any of you try this patch #5145?

@auvipy auvipy modified the milestones: 4.5, 4.4.x Mar 18, 2020
@nicklyra

could any of you try this patch #5145?

I wouldn't mind trying it, but I note there's a great deal of contention on the approach in that patch, builds with that patch are failing, and it's been sitting there for 17 months. It kind of looks like the idea has been abandoned.

@auvipy
Member

auvipy commented Mar 18, 2020

Give it a try first, and if you don't mind, review the PR and prepare a failing test for it.

@nicklyra

FWIW, confirmed today this is still an issue with Celery 4.4.2, Kombu 4.6.8, Redis 3.4.1.

@jheld
Contributor

jheld commented Apr 4, 2020

Considering how old this issue is, is there any hint of what causes it? I don't think my project has seen it [much] recently, but nonetheless at least a couple of projects continue to see this bug occur.

@deterb

deterb commented Apr 5, 2020

@jheld The comments from #4363 (comment) fit with what I saw last time I poked at it: namely, the result consumer initializes a PubSub instance which ends up getting shared across threads, and PubSub objects are not thread-safe. My workaround was ignoring the results and saving/watching them independently from my Django app.

Thanks for the feedback everyone. Just to clarify a few things:

  • ResultConsumer is the Redis Backend component that asynchronously retrieves the result
  • ResultConsumer initializes a PubSub instance
  • ResultConsumer instances can be created either on a worker (Canvas workflows) or when a Task is enqueued and results are not ignored
  • the same PubSub instance cannot be used simultaneously from multiple threads
  • as far as Workers are concerned, this change should cover the case of starting new forks

I am not aware if the corresponding operations can be performed upon Django startup, perhaps this callback could help; calling ResultConsumer.on_after_fork would then create new instances and the issue will most probably not occur.

@mlissner
Contributor

mlissner commented Mar 2, 2021

Earlier in the issue, somebody mentioned this started happening after they upgraded Django:

and updated Django 1.11 incrementally, .13 to .14

We just upgraded from 1.11 to 2.2, and we've started seeing it. I can't imagine why, but I thought I'd echo the above.

@auvipy auvipy modified the milestones: Future, 5.2 Mar 3, 2021
@auvipy
Member

auvipy commented Mar 3, 2021

I just checked redis/redis-py#612 (comment), but I'm not fully sure.

@Novarg

Novarg commented May 31, 2021

We just upgraded from Python 2 to Python 3, including upgrading Celery from v3 to v4, and I started to get this error in the workers of one particular queue.

Celery 4.4.6. We have 2 Celery workers started with these options: --concurrency=8 -P gevent -Q default,celery,ldapsync_tasks

Here is the stack trace:

AttributeError: 'list' object has no attribute 'decode'
  File "celery/worker/worker.py", line 208, in start
    self.blueprint.start(self)
  File "celery/bootsteps.py", line 119, in start
    step.start(parent)
  File "celery/bootsteps.py", line 369, in start
    return self.obj.start()
  File "celery/worker/consumer/consumer.py", line 318, in start
    blueprint.start(self)
  File "celery/bootsteps.py", line 119, in start
    step.start(parent)
  File "celery/worker/consumer/consumer.py", line 599, in start
    c.loop(*c.loop_args())
  File "celery/worker/loops.py", line 113, in synloop
    connection.drain_events(timeout=2.0)
  File "kombu/connection.py", line 324, in drain_events
    return self.transport.drain_events(self.connection, **kwargs)
  File "kombu/transport/virtual/base.py", line 963, in drain_events
    get(self._deliver, timeout=timeout)
  File "kombu/transport/redis.py", line 369, in get
    self._register_BRPOP(channel)
  File "kombu/transport/redis.py", line 310, in _register_BRPOP
    channel._brpop_start()
  File "kombu/transport/redis.py", line 727, in _brpop_start
    self.client.connection.send_command('BRPOP', *keys)
  File "redis/connection.py", line 726, in send_command
    check_health=kwargs.get('check_health', True))
  File "redis/connection.py", line 701, in send_packed_command
    self.check_health()
  File "redis/connection.py", line 685, in check_health
    if nativestr(self.read_response()) != 'PONG':
  File "redis/_compat.py", line 168, in nativestr
    return x if isinstance(x, str) else x.decode('utf-8', 'replace')


@auvipy
Member

auvipy commented Oct 30, 2021

This should have been fixed in the 5.1.x versions.

@auvipy auvipy closed this as completed Oct 30, 2021
@RasgoDerek

RasgoDerek commented Dec 13, 2021

I'm seeing this issue on Celery 5.2.1. Should I open a new issue, or can we discuss it here?

[2021-12-13 18:22:42,948: CRITICAL/MainProcess] Unrecoverable error: AttributeError("'list' object has no attribute 'decode'")
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/celery/worker/worker.py", line 203, in start
    self.blueprint.start(self)
  File "/usr/local/lib/python3.7/site-packages/celery/bootsteps.py", line 116, in start
    step.start(parent)
  File "/usr/local/lib/python3.7/site-packages/celery/bootsteps.py", line 365, in start
    return self.obj.start()
  File "/usr/local/lib/python3.7/site-packages/celery/worker/consumer/consumer.py", line 326, in start
    blueprint.start(self)
  File "/usr/local/lib/python3.7/site-packages/celery/bootsteps.py", line 116, in start
    step.start(parent)
  File "/usr/local/lib/python3.7/site-packages/celery/worker/consumer/consumer.py", line 618, in start
    c.loop(*c.loop_args())
  File "/usr/local/lib/python3.7/site-packages/celery/worker/loops.py", line 130, in synloop
    connection.drain_events(timeout=2.0)
  File "/usr/local/lib/python3.7/site-packages/kombu/connection.py", line 317, in drain_events
    return self.transport.drain_events(self.connection, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/kombu/transport/virtual/base.py", line 958, in drain_events
    get(self._deliver, timeout=timeout)
  File "/usr/local/lib/python3.7/site-packages/kombu/transport/redis.py", line 515, in get
    self._register_BRPOP(channel)
  File "/usr/local/lib/python3.7/site-packages/kombu/transport/redis.py", line 456, in _register_BRPOP
    channel._brpop_start()
  File "/usr/local/lib/python3.7/site-packages/kombu/transport/redis.py", line 885, in _brpop_start
    self.client.connection.send_command(*command_args)
  File "/usr/local/lib/python3.7/site-packages/redis/connection.py", line 726, in send_command
    check_health=kwargs.get('check_health', True))
  File "/usr/local/lib/python3.7/site-packages/redis/connection.py", line 701, in send_packed_command
    self.check_health()
  File "/usr/local/lib/python3.7/site-packages/redis/connection.py", line 685, in check_health
    if nativestr(self.read_response()) != 'PONG':
  File "/usr/local/lib/python3.7/site-packages/redis/_compat.py", line 168, in nativestr
    return x if isinstance(x, str) else x.decode('utf-8', 'replace')
AttributeError: 'list' object has no attribute 'decode'

@auvipy
Member

auvipy commented Dec 14, 2021

    self.check_health()
  File "/usr/local/lib/python3.7/site-packages/redis/connection.py", line 685, in check_health
    if nativestr(self.read_response()) != 'PONG':
  File "/usr/local/lib/python3.7/site-packages/redis/_compat.py", line 168, in nativestr
    return x if isinstance(x, str) else x.decode('utf-8', 'replace')
AttributeError: 'list' object has no attribute 'decode'

Not sure if it's a py-redis issue (redis/redis-py#612 (comment)) or a Celery one, but we have to investigate. I am busy with other backlogs now. Could you dig deeper and check the related merged PRs?

@RasgoDerek

From that link it does seem like this is a gevent issue and not a Celery one.
