RuntimeError: Acquire on closed pool #1839

Closed

ionelmc opened this issue Jan 31, 2014 · 36 comments

@ionelmc
Contributor

ionelmc commented Jan 31, 2014

[2014-01-31 23:15:13,294: WARNING/Worker-353] +++ Stresstest suite end (repetition 1) +++
[2014-01-31 23:19:53,971: WARNING/Worker-353] +++ Stresstest suite start (repetition 1) +++
[2014-01-31 23:19:53,973: WARNING/Worker-342] --- 1: manyshort(0/50) rep#1 runtime: 0 seconds/0 seconds  ---
[2014-01-31 23:19:53,974: WARNING/Worker-342] /home/ionel/projects/celery/celery/app/trace.py:343: RuntimeWarning: Exception raised outside body: RuntimeError('Acquire on closed pool',):
Traceback (most recent call last):
  File "/home/ionel/projects/celery/celery/app/trace.py", line 262, in trace_task
    uuid, retval, SUCCESS, request=task_request,
  File "/home/ionel/projects/celery/celery/backends/amqp.py", line 124, in store_result
    with self.app.amqp.producer_pool.acquire(block=True) as producer:
  File "/home/ionel/projects/kombu/kombu/connection.py", line 881, in acquire
    R = self.prepare(R)
  File "/home/ionel/projects/kombu/kombu/pools.py", line 63, in prepare
    conn = self._acquire_connection()
  File "/home/ionel/projects/kombu/kombu/pools.py", line 38, in _acquire_connection
    return self.connections.acquire(block=True)
  File "/home/ionel/projects/kombu/kombu/connection.py", line 872, in acquire
    raise RuntimeError('Acquire on closed pool')
RuntimeError: {'exc_message': 'Acquire on closed pool', 'exc_type': 'RuntimeError'}

  exc, exc_info.traceback)))
[2014-01-31 23:19:53,975: CRITICAL/MainProcess] Task stress.app._marker[b1838c4d-04f8-492c-b472-917e5db29fd8] INTERNAL ERROR: {'exc_message': 'Acquire on closed pool', 'exc_type': 'RuntimeError'}
Traceback (most recent call last):
  File "/home/ionel/projects/celery/celery/app/trace.py", line 262, in trace_task
    uuid, retval, SUCCESS, request=task_request,
  File "/home/ionel/projects/celery/celery/backends/amqp.py", line 124, in store_result
    with self.app.amqp.producer_pool.acquire(block=True) as producer:
  File "/home/ionel/projects/kombu/kombu/connection.py", line 881, in acquire
    R = self.prepare(R)
  File "/home/ionel/projects/kombu/kombu/pools.py", line 63, in prepare
    conn = self._acquire_connection()
  File "/home/ionel/projects/kombu/kombu/pools.py", line 38, in _acquire_connection
    return self.connections.acquire(block=True)
  File "/home/ionel/projects/kombu/kombu/connection.py", line 872, in acquire
    raise RuntimeError('Acquire on closed pool')
RuntimeError: {'exc_message': 'Acquire on closed pool', 'exc_type': 'RuntimeError'}
@ask
Contributor

ask commented Feb 3, 2014

I'm not seeing this; what are you testing this with?

@ionelmc
Contributor Author

ionelmc commented Feb 3, 2014

I had a bloated RabbitMQ (lots of stuff in /var/lib/rabbitmq...).

In the RabbitMQ logs I had lots of:

closing AMQP connection <0.470.0> (127.0.0.1:44024 -> 127.0.0.1:5672):
connection_closed_abruptly

The problem is that some time later I filled the whole disk and had to clean everything up. It didn't reproduce after that.

I'll try to reproduce this later; maybe there's a way to "drop" connections (tcpkill maybe?)...

@ask
Contributor

ask commented Feb 21, 2014

This should be fixed

@ask ask closed this as completed Feb 21, 2014
@alewisohn

I'm experiencing this error as well. What was the fix and what version was it in?

@ionelmc
Contributor Author

ionelmc commented Mar 24, 2014

@alewisohn what versions are you using?

@alewisohn

3.1.7

@ionelmc
Contributor Author

ionelmc commented Mar 24, 2014

And billiard/kombu?

You could try upgrading to the latest version.

@alewisohn

I'm on the latest of both of those, 3.3.0.16/3.0.4. I will try upgrading to the latest of Celery, which looks like 3.1.10.

@chenwardT

I just started getting this problem today.

[2015-06-29 18:02:58,103: ERROR/MainProcess] Task riot_api.wrapper.get_matches_from_ids[d01746a4-f039-4347-a498-df7653dd0689] raised unexpected: RuntimeError('Acquire on closed pool',)
Traceback (most recent call last):
  File "/home/chen/.virtualenvs/lol_stats2_dev/lib/python3.4/site-packages/celery/app/trace.py", line 240, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/home/chen/.virtualenvs/lol_stats2_dev/lib/python3.4/site-packages/celery/app/trace.py", line 438, in __protected_call__
    return self.run(*args, **kwargs)
  File "/home/chen/python-projects/lol_stats2/lol_stats2/riot_api/wrapper.py", line 141, in get_matches_from_ids
    RiotAPI.get_match(match['matchId'], region=region, include_timeline=False)
  File "/home/chen/python-projects/lol_stats2/lol_stats2/riot_api/wrapper.py", line 105, in get_match
link=store_get_match.s())
  File "/home/chen/.virtualenvs/lol_stats2_dev/lib/python3.4/site-packages/celery/app/task.py", line 559, in apply_async
    **dict(self._get_exec_options(), **options)
  File "/home/chen/.virtualenvs/lol_stats2_dev/lib/python3.4/site-packages/celery/app/base.py", line 347, in send_task
    with self.producer_or_acquire(producer) as P:
  File "/home/chen/.virtualenvs/lol_stats2_dev/lib/python3.4/site-packages/celery/utils/objects.py", line 79, in __enter__
    ).__enter__()
  File "/home/chen/.virtualenvs/lol_stats2_dev/lib/python3.4/site-packages/kombu/connection.py", line 868, in acquire
    R = self.prepare(R)
  File "/home/chen/.virtualenvs/lol_stats2_dev/lib/python3.4/site-packages/kombu/pools.py", line 61, in prepare
    p = p()
  File "/home/chen/.virtualenvs/lol_stats2_dev/lib/python3.4/site-packages/kombu/utils/functional.py", line 29, in __call__
    return self.evaluate()
  File "/home/chen/.virtualenvs/lol_stats2_dev/lib/python3.4/site-packages/kombu/utils/functional.py", line 32, in evaluate
    return self._fun(*self._args, **self._kwargs)
  File "/home/chen/.virtualenvs/lol_stats2_dev/lib/python3.4/site-packages/kombu/pools.py", line 41, in create_producer
    conn = self._acquire_connection()
  File "/home/chen/.virtualenvs/lol_stats2_dev/lib/python3.4/site-packages/kombu/pools.py", line 38, in _acquire_connection
    return self.connections.acquire(block=True)
  File "/home/chen/.virtualenvs/lol_stats2_dev/lib/python3.4/site-packages/kombu/connection.py", line 859, in acquire
    raise RuntimeError('Acquire on closed pool')
RuntimeError: Acquire on closed pool

My machine's log in /var/log/rabbitmq is showing several of the following entries:

=ERROR REPORT==== 29-Jun-2015::18:01:08 ===
AMQP connection <0.11051.0> (closing), channel 1 - error:
{amqp_error,channel_error,"expected 'channel.open'",'channel.close'}

and

=WARNING REPORT==== 29-Jun-2015::18:01:07 ===
closing AMQP connection <0.11033.0> (127.0.0.1:35672 -> 127.0.0.1:5672):
connection_closed_abruptly

The rest are the usual INFO level reports for accepting connections.

Using the following (relevant) packages:
amqp==1.4.6
billiard==3.3.0.20
celery==3.1.18
kombu==3.0.26

I recently started using RabbitMQ/Celery (and most of what it depends on), so please let me know what else would be useful to list; I'm just following the lead of others here.

Edit:
This occurs when workers are started via celery multi, specifically:
celery multi start 3 -A lol_stats2 -l info -Q:1 default -l info -Q:2 match_ids -Q:3 store -l info

When running what I believe is an effectively identical setup:
celery -A lol_stats2 worker -l info -Q default -n worker1.%h
celery -A lol_stats2 worker -l info -Q match_ids -n worker2.%h
celery -A lol_stats2 worker -l info -Q store -n worker3.%h
the problem does not occur.

@thedrow
Member

thedrow commented Jul 5, 2015

Looks like we need to reopen.

@ask
Contributor

ask commented Oct 22, 2015

What result backend are you using?

I'm not able to reproduce this here using the rpc result backend.

Please also try upgrading to the latest 3.1 version.

@chenwardT

It has been a bit since I touched the code, but based on my commit history, it looks like I had disabled the results backend at the time; no "backend" argument is passed to Celery():

chenwardT/lol_stats2@0dd4fa8

Will report back with the results of using the latest version when I get a chance.

@edanayal

I'm experiencing this issue too (Acquire on closed pool after a few hours of work) with the latest Celery, 3.1.19.
I'm trying to build and send simple code to reproduce the issue, but it works flawlessly...
Here's a description of my system:
I have a few queues and a few workers with different concurrency settings (2, 1, 5, ...). The backend is RabbitMQ too.
Special config I'm using:

-Ofair

Various tasks are injected by celerybeat every minute or so, sometimes a few tasks together.
Tasks are mostly long-running, but CPU usage is very low, as the tasks are waiting for external actions to complete (like DB queries...).
Some of the tasks wait for other task groups in other queues to complete (I know it's not good practice, but I inherited the code as is...).
Any ideas?

@ask
Contributor

ask commented Dec 29, 2015

Please include the traceback!

@edanayal

Sorry, here it is:

/usr/lib/python2.6/site-packages/celery/app/trace.py:365: RuntimeWarning: Exception raised outside body: RuntimeError('Acquire on closed pool',):
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/celery/app/trace.py", line 253, in trace_task
    I, R, state, retval = on_error(task_request, exc, uuid)
  File "/usr/lib/python2.6/site-packages/celery/app/trace.py", line 201, in on_error
    R = I.handle_error_state(task, eager=eager)
  File "/usr/lib/python2.6/site-packages/celery/app/trace.py", line 85, in handle_error_state
    }[self.state](task, store_errors=store_errors)
  File "/usr/lib/python2.6/site-packages/celery/app/trace.py", line 118, in handle_failure
    req.id, exc, einfo.traceback, request=req,
  File "/usr/lib/python2.6/site-packages/celery/backends/base.py", line 121, in mark_as_failure
    traceback=traceback, request=request)
  File "/usr/lib/python2.6/site-packages/celery/backends/amqp.py", line 124, in store_result
    with self.app.amqp.producer_pool.acquire(block=True) as producer:
  File "/usr/lib/python2.6/site-packages/kombu/connection.py", line 868, in acquire
    R = self.prepare(R)
  File "/usr/lib/python2.6/site-packages/kombu/pools.py", line 61, in prepare
    p = p()
  File "/usr/lib/python2.6/site-packages/kombu/utils/functional.py", line 29, in __call__
    return self.evaluate()
  File "/usr/lib/python2.6/site-packages/kombu/utils/functional.py", line 32, in evaluate
    return self._fun(*self._args, **self._kwargs)
  File "/usr/lib/python2.6/site-packages/kombu/pools.py", line 41, in create_producer
    conn = self._acquire_connection()
  File "/usr/lib/python2.6/site-packages/kombu/pools.py", line 38, in _acquire_connection
    return self.connections.acquire(block=True)
  File "/usr/lib/python2.6/site-packages/kombu/connection.py", line 859, in acquire
    raise RuntimeError('Acquire on closed pool')
RuntimeError: Acquire on closed pool

Thank you!

@edanayal

Still happens in the latest version, 3.1.20... :-(

Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/celery/app/trace.py", line 283, in trace_task
    uuid, retval, SUCCESS, request=task_request,
  File "/usr/lib/python2.6/site-packages/celery/backends/amqp.py", line 124, in store_result
    with self.app.amqp.producer_pool.acquire(block=True) as producer:
  File "/usr/lib/python2.6/site-packages/kombu/connection.py", line 868, in acquire
    R = self.prepare(R)
  File "/usr/lib/python2.6/site-packages/kombu/pools.py", line 61, in prepare
    p = p()
  File "/usr/lib/python2.6/site-packages/kombu/utils/functional.py", line 29, in __call__
    return self.evaluate()
  File "/usr/lib/python2.6/site-packages/kombu/utils/functional.py", line 32, in evaluate
    return self._fun(*self._args, **self._kwargs)
  File "/usr/lib/python2.6/site-packages/kombu/pools.py", line 41, in create_producer
    conn = self._acquire_connection()
  File "/usr/lib/python2.6/site-packages/kombu/pools.py", line 38, in _acquire_connection
    return self.connections.acquire(block=True)
  File "/usr/lib/python2.6/site-packages/kombu/connection.py", line 859, in acquire
    raise RuntimeError('Acquire on closed pool')

In other workers, after this initial exception, the traceback might be the same or slightly different:

Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/celery/app/trace.py", line 253, in trace_task
    I, R, state, retval = on_error(task_request, exc, uuid)
  File "/usr/lib/python2.6/site-packages/celery/app/trace.py", line 201, in on_error
    R = I.handle_error_state(task, eager=eager)
  File "/usr/lib/python2.6/site-packages/celery/app/trace.py", line 85, in handle_error_state
    }[self.state](task, store_errors=store_errors)
  File "/usr/lib/python2.6/site-packages/celery/app/trace.py", line 118, in handle_failure
    req.id, exc, einfo.traceback, request=req,
  File "/usr/lib/python2.6/site-packages/celery/backends/base.py", line 121, in mark_as_failure
    traceback=traceback, request=request)
  File "/usr/lib/python2.6/site-packages/celery/backends/amqp.py", line 124, in store_result
    with self.app.amqp.producer_pool.acquire(block=True) as producer:
  File "/usr/lib/python2.6/site-packages/kombu/connection.py", line 868, in acquire
    R = self.prepare(R)
  File "/usr/lib/python2.6/site-packages/kombu/pools.py", line 61, in prepare
    p = p()
  File "/usr/lib/python2.6/site-packages/kombu/utils/functional.py", line 29, in __call__
    return self.evaluate()
  File "/usr/lib/python2.6/site-packages/kombu/utils/functional.py", line 32, in evaluate
    return self._fun(*self._args, **self._kwargs)
  File "/usr/lib/python2.6/site-packages/kombu/pools.py", line 41, in create_producer
    conn = self._acquire_connection()
  File "/usr/lib/python2.6/site-packages/kombu/pools.py", line 38, in _acquire_connection
    return self.connections.acquire(block=True)
  File "/usr/lib/python2.6/site-packages/kombu/connection.py", line 859, in acquire
    raise RuntimeError('Acquire on closed pool')

Thank you

@pthornton

This issue is occurring daily for me now. Rebooting seems to fix it temporarily.
billiard 3.3.0.21
kombu 3.0.29

Traceback (most recent call last):
  File "/home/pthornton/ares/vem/com/circ/ares/vsphere/impl/tasks.py", line 166, in start_environment
    (mipk, step, None), countdown=10)
  File "/home/pthornton/ares/venv/lib/python2.7/site-packages/celery/app/task.py", line 560, in apply_async
    **dict(self._get_exec_options(), **options)
  File "/home/pthornton/ares/venv/lib/python2.7/site-packages/celery/app/base.py", line 348, in send_task
    with self.producer_or_acquire(producer) as P:
  File "/home/pthornton/ares/venv/lib/python2.7/site-packages/celery/utils/objects.py", line 78, in __enter__
    *self.fb_args, **self.fb_kwargs
  File "/home/pthornton/ares/venv/lib/python2.7/site-packages/kombu/connection.py", line 868, in acquire
    R = self.prepare(R)
  File "/home/pthornton/ares/venv/lib/python2.7/site-packages/kombu/pools.py", line 63, in prepare
    conn = self._acquire_connection()
  File "/home/pthornton/ares/venv/lib/python2.7/site-packages/kombu/pools.py", line 38, in _acquire_connection
    return self.connections.acquire(block=True)
  File "/home/pthornton/ares/venv/lib/python2.7/site-packages/kombu/connection.py", line 859, in acquire
    raise RuntimeError('Acquire on closed pool')
RuntimeError: Acquire on closed pool

@ask
Contributor

ask commented Mar 15, 2016

Please include some more information, e.g. the broker, the result backend, the command-line arguments used to start the worker, and any worker-related configuration.

Even better would be a set of steps required to reproduce, as I have yet to see the issue happen in stress testing.

@pthornton

It's a Django application.
Using the supervisor project to start the celery worker:
[program:celery-worker]
command={{ PYTHON }} {{ PROJECT_DIR }}/manage.py celery worker -linfo -E -A com.circ.vem --concurrency=20
amqp://guest:**@127.0.0.1:5672//
rabbitmq-server-3.3.5-4.el7.noarch

It seems to be related to volume. Starting from a reboot it can take hours to reproduce, and then it happens repeatedly. This is with the same set of steps that worked successfully earlier in the day.

@pthornton

app.conf.update(
    CELERY_RESULT_BACKEND='djcelery.backends.database:DatabaseBackend',
    CELERY_TIMEZONE='UTC',  # set timezone in here
)
CELERYBEAT_SCHEDULER = 'djcelery.schedulers.DatabaseScheduler'
The database is MariaDB.

@Rajlaxmi

Hi,

I am facing similar errors.

[2016-03-27 12:48:18,987: INFO/MainProcess] Received task: workerProg.exec_func[734a9822-8ca6-44f9-99fa-ab7d5b8526b4]
[2016-03-27 12:48:18,988: DEBUG/MainProcess] TaskPool: Apply <function _fast_trace_task at 0x1bb9578> (args:('workerProg.exec_func', '734a9822-8ca6-44f9-99fa-ab7d5b8526b4', .... ; kwargs:{})
[2016-03-27 12:48:18,989: DEBUG/MainProcess] Task accepted: workerProg.exec_func[734a9822-8ca6-44f9-99fa-ab7d5b8526b4] pid:31076
[2016-03-27 12:48:21,187: DEBUG/MainProcess] pidbox received method enable_events() [reply_to:None ticket:None]
[2016-03-27 12:48:25,277: CRITICAL/MainProcess] Task workerProg.exec_func[734a9822-8ca6-44f9-99fa-ab7d5b8526b4] INTERNAL ERROR: RuntimeError('Acquire on closed pool',)
Traceback (most recent call last):
  File "/usr/local/lib/python2.6/site-packages/celery/app/trace.py", line 283, in trace_task
    uuid, retval, SUCCESS, request=task_request,
  File "/usr/local/lib/python2.6/site-packages/celery/backends/amqp.py", line 124, in store_result
    with self.app.amqp.producer_pool.acquire(block=True) as producer:
  File "/usr/local/lib/python2.6/site-packages/kombu/connection.py", line 868, in acquire
    R = self.prepare(R)
  File "/usr/local/lib/python2.6/site-packages/kombu/pools.py", line 63, in prepare
    conn = self._acquire_connection()
  File "/usr/local/lib/python2.6/site-packages/kombu/pools.py", line 38, in _acquire_connection
    return self.connections.acquire(block=True)
  File "/usr/local/lib/python2.6/site-packages/kombu/connection.py", line 859, in acquire
    raise RuntimeError('Acquire on closed pool')
RuntimeError: Acquire on closed pool

This is my config file:

import sys
import os
sys.path.append(os.getcwd())

CELERYD_CONCURRENCY = 25

# Disable prefetching
CELERYD_PREFETCH_MULTIPLIER = 1

#Enable retrying of lost or failed tasks
CELERY_ACKS_LATE = True

# default RabbitMQ broker
BROKER_URL = 'amqp://...'

# default RabbitMQ backend
CELERY_RESULT_BACKEND = 'amqp'

# Delete a message in the queue after the given number of minutes
CELERY_EVENT_QUEUE_TTL = 20*60

#Delete temporary error queues
CELERY_STORE_ERRORS_EVEN_IF_IGNORED = False

# Expiry time in seconds of the Celery task result queue
CELERY_TASK_RESULT_EXPIRES=10*60

# Kill a worker executing a task for more than the given number of seconds and replace it with a new one
CELERYD_TASK_TIME_LIMIT=60*60

CELERY_QUEUES = {"autoscalegroup": {"exchange": "autoscalegroup", "routing_key": "autoscalegroup"}}

I run into this error every 5-6 days and I have to restart the processes again.

@pthornton

This issue is keeping me from going live. I can provide WebEx access to look at this issue, and I can repeat it regularly. I just saw it scroll past in the logs; it happens every 2-5 minutes in my current setup.

@edanayal

If it's an option, you could try changing the broker and backend from RabbitMQ to Redis.
So far it has worked fine for me (2 weeks already).
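
For reference, the old-style (Celery 3.1) settings for that switch would look something like this. The URLs are placeholders for your own Redis instance, and the redis client library needs to be installed:

# Hypothetical example: point both the broker and the result backend at Redis.
BROKER_URL = 'redis://localhost:6379/0'
CELERY_RESULT_BACKEND = 'redis://localhost:6379/1'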

@Rajlaxmi

Rajlaxmi commented Apr 4, 2016

After receiving this error, my workers get restarted. However, they do not accept any new tasks. The logs show that heartbeat messages are received, but tasks are not accepted.

Some part of log:

Restarting celery worker (/usr/local/bin/celery worker -A proj -l info --config=celeryconfig.py -Ofair)
[2016-04-02 03:19:07,788: ERROR/MainProcess] Process 'Worker-350' pid:32804 exited with 'exitcode 70'
..
..

 -------------- celery@.. v3.1.19 (Cipater)
---- **** -----
--- * ***  * -- Linux-...
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app:         __main__:0x16a9f50
- ** ---------- .> transport:   amqp://mainClient:**@... : .../vhostClient
- ** ---------- .> results:     amqp
- *** --- * --- .> concurrency: 25 (prefork)
-- ******* ----
--- ***** ----- [queues]
 -------------- .> autoscalegroup   exchange=autoscalegroup(direct) key=autoscalegroup


[tasks]
  . workerProg.exec_func

[2016-04-02 03:19:09,129: WARNING/MainProcess] /usr/local/lib/python2.6/site-packages/celery/apps/worker.py:161: CDeprecationWarning:

[2016-04-02 03:19:09,179: INFO/MainProcess] Connected to amqp://mainClient:**@52.91.139.132:5672/vhostClient
[2016-04-02 03:19:09,193: INFO/MainProcess] mingle: searching for neighbors
[2016-04-02 03:19:10,208: INFO/MainProcess] mingle: sync with 4 nodes
[2016-04-02 03:19:10,208: INFO/MainProcess] mingle: sync complete
[2016-04-02 03:19:10,248: WARNING/MainProcess] celery@ip-10-0-1-46 ready.
[2016-04-02 03:19:11,188: INFO/MainProcess] Events of group {task} enabled by remote.
[2016-04-02 04:03:36,346: INFO/MainProcess] sync with celery@ip-10-0-1-107
[2016-04-02 04:03:36,577: INFO/MainProcess] sync with celery@ip-10-0-1-87

@spicyramen

spicyramen commented Apr 21, 2016

I'm seeing a similar problem.
The machine is an Ubuntu 14.04 Digital Ocean droplet with 2 GB RAM.

amqp==1.4.6
flower==0.8.4
celery==3.1.18
kombu==3.0.26

Celery:
/usr/local/bin/celery worker -n celeryd@%h -f /usr/local/src/xxxxx/application/xxxxx/log/celeryd.log --loglevel=DEBUG --autoscale=50,10

This was working fine until today.

[2016-04-21 11:46:56,517: INFO/Worker-1171] deploy_uc() Valid Server. Discovery was successful
[2016-04-21 11:46:56,554: INFO/Worker-1171] deploy_uc() Job: XXXXXXX is in progress
[2016-04-21 11:46:56,587: CRITICAL/MainProcess] Task catalogue.cisco.uc.deploy_uc.imbue_uc[939B9C5407] INTERNAL ERROR: RuntimeError('Acquire on closed pool',)
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/celery/app/trace.py", line 253, in trace_task
    I, R, state, retval = on_error(task_request, exc, uuid)
  File "/usr/lib/python2.7/dist-packages/celery/app/trace.py", line 201, in on_error
    R = I.handle_error_state(task, eager=eager)
  File "/usr/lib/python2.7/dist-packages/celery/app/trace.py", line 85, in handle_error_state
    }[self.state](task, store_errors=store_errors)
  File "/usr/lib/python2.7/dist-packages/celery/app/trace.py", line 118, in handle_failure
    req.id, exc, einfo.traceback, request=req,
  File "/usr/lib/python2.7/dist-packages/celery/backends/base.py", line 121, in mark_as_failure
    traceback=traceback, request=request)
  File "/usr/lib/python2.7/dist-packages/celery/backends/amqp.py", line 124, in store_result
    with self.app.amqp.producer_pool.acquire(block=True) as producer:
  File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 868, in acquire
    R = self.prepare(R)
  File "/usr/lib/python2.7/dist-packages/kombu/pools.py", line 63, in prepare
    conn = self._acquire_connection()
  File "/usr/lib/python2.7/dist-packages/kombu/pools.py", line 38, in _acquire_connection
    return self.connections.acquire(block=True)
  File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 859, in acquire
    raise RuntimeError('Acquire on closed pool')
RuntimeError: Acquire on closed pool
[2016-04-21 12:37:46,338: DEBUG/MainProcess] | Worker: Closing Hub...

ask added a commit that referenced this issue Apr 22, 2016
@ask
Contributor

ask commented Apr 22, 2016

I'm working in the dark here, as I have yet to reproduce the issue. The patch above may or may not fix the issue, but this is the place in the code where the pool is closed after fork.

Either the problem is in that code, or something is keeping a reference to the pool and is therefore unaffected by the after-fork handling code.

Hopefully knowing where to start debugging can help you find the source of the problem.
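
Along those lines, here is a minimal debugging sketch, assuming force_close_all() is where a pool gets marked closed (the Resource class lives in kombu.connection on kombu 3.x and kombu.resource on 4.x, as the tracebacks in this thread show). It logs a stack trace every time a pool is force-closed, so a report can show which code closed the pool that later raises "Acquire on closed pool". Load it early in the worker, e.g. from the module that creates the Celery app:

# Debugging aid (sketch, not a fix): record where kombu pools get force-closed.
import logging
import traceback

try:
    from kombu.resource import Resource  # kombu 4.x
except ImportError:
    from kombu.connection import Resource  # kombu 3.x

logger = logging.getLogger('pool_debug')

_orig_force_close_all = Resource.force_close_all


def _logged_force_close_all(self):
    # Log the pool object and the stack of whoever is closing it.
    logger.warning('force_close_all() on %r called from:\n%s',
                   self, ''.join(traceback.format_stack()))
    return _orig_force_close_all(self)


Resource.force_close_all = _logged_force_close_all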

ask added a commit that referenced this issue Apr 22, 2016
@ask
Contributor

ask commented Jun 23, 2016

Closing this, as we don't have the resources to complete this task.

@chrisspen

I'm also experiencing this with Celery 4.1 with a RabbitMQ backend.

@thedrow
Member

thedrow commented Nov 14, 2017

@chrisspen Can you please open a new issue with the necessary details to debug this problem?

@PrasadSidda

Traceback (most recent call last):
  File "/home/ubuntu/django8/local/lib/python2.7/site-packages/celery/app/trace.py", line 367, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/home/ubuntu/django8/local/lib/python2.7/site-packages/celery/app/trace.py", line 622, in __protected_call__
    return self.run(*args, **kwargs)
  File "/home/ubuntu/integra/integra/sunrise/core/config/utils.py", line 59, in inner
    return func(*args, **kwargs)
  File "/home/ubuntu/integra/integra/sunrise/core/config/schedule.py", line 147, in inner
    return func(*args, **kwargs)
  File "/home/ubuntu/integra/integra/sunrise/core/config/schedule.py", line 301, in inner
    return func(*args, **kwargs)
  File "/home/ubuntu/integra/integra/sunrise/cashbook/bookvalues/fileimport.py", line 523, in importfiledata
    delete_temp_table(task_request_id, True)
  File "/home/ubuntu/integra/integra/sunrise/core/config/utils.py", line 199, in delete_temp_table
    running_tasks = i.active()
  File "/home/ubuntu/django8/local/lib/python2.7/site-packages/celery/app/control.py", line 94, in active
    return self._request('active')
  File "/home/ubuntu/django8/local/lib/python2.7/site-packages/celery/app/control.py", line 81, in _request
    timeout=self.timeout, reply=True,
  File "/home/ubuntu/django8/local/lib/python2.7/site-packages/celery/app/control.py", line 436, in broadcast
    limit, callback, channel=channel,
  File "/home/ubuntu/django8/local/lib/python2.7/site-packages/kombu/pidbox.py", line 315, in _broadcast
    serializer=serializer)
  File "/home/ubuntu/django8/local/lib/python2.7/site-packages/kombu/pidbox.py", line 285, in _publish
    with self.producer_or_acquire(producer, chan) as producer:
  File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/home/ubuntu/django8/local/lib/python2.7/site-packages/kombu/pidbox.py", line 247, in producer_or_acquire
    with self.producer_pool.acquire() as producer:
  File "/home/ubuntu/django8/local/lib/python2.7/site-packages/kombu/resource.py", line 74, in acquire
    raise RuntimeError('Acquire on closed pool')
RuntimeError: Acquire on closed pool

@PrasadSidda

def delete_temp_table(request_id, old=False):
    cursor = connection.cursor()
    cursor.execute("DROP TABLE IF EXISTS temp_worker_table_" + request_id)
    if old:
        cursor.execute("SELECT table_name FROM information_schema.tables WHERE table_schema='public' AND table_type='BASE TABLE' AND table_name LIKE 'temp_worker_table_%'")
        tables_list = [table[0] for table in cursor.fetchall()]
        if len(tables_list):
            i = inspect()
            running_tasks = i.active()
            if running_tasks is not None:
                active_tasks = ["temp_worker_table_" + item['id'] for cleary_task in running_tasks.values() for item in cleary_task]
                tables_list = set(tables_list) - set(active_tasks)
            if len(tables_list):
                sql_deltab = ';'.join(["DROP TABLE IF EXISTS " + table for table in tables_list]) + ';'
                cursor.execute(sql_deltab)
    cursor.close()
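
Not a fix for the pool itself, but one way to keep this cleanup from taking the whole task down is to treat the broadcast as best-effort. A sketch of guarding the running_tasks = i.active() call above (same i = inspect() object, indentation adjusted to where it sits in the function):

try:
    running_tasks = i.active()
except RuntimeError:
    # "Acquire on closed pool" comes from the pidbox producer pool; leave the
    # leftover tables for the next cleanup run instead of failing the task.
    return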

@chrisspen

chrisspen commented Nov 28, 2017

@thedrow I fixed this by implementing my own base task that calls Django's close_old_connections(), e.g.:

import celery
from celery.task import task

from django.db import close_old_connections

class BaseTask(celery.Task):

    def __call__(self, *args, **kwargs):
        # Necessary to prevent OperationalError: (2006, 'MySQL server has gone away') caused by old connections not being cleaned up.
        close_old_connections()
        return super(BaseTask, self).__call__(*args, **kwargs)

@task(base=BaseTask)
def mytask():
    pass  # task body
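
If you want the same behaviour applied to every task instead of per task, Celery 4 also accepts a task_cls argument when constructing the app. A minimal sketch, assuming the BaseTask from the snippet above lives in a (hypothetical) myproject/base.py:

import celery

from myproject.base import BaseTask  # hypothetical module holding the class above

app = celery.Celery('myproject', task_cls=BaseTask)

@app.task
def mytask():
    pass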

@jheld
Contributor

jheld commented Nov 28, 2017

I am also having this issue with Redis as both backend and broker, using JSON serialization.

File "/.../tasks.py", line 80, in workers_on_queue
    for k, v in six.viewitems(celery_app.control.inspect().active_queues()):
  File "/.../lib/python2.7/site-packages/celery/app/control.py", line 116, in active_queues
    return self._request('active_queues')
  File "/.../lib/python2.7/site-packages/celery/app/control.py", line 81, in _request
    timeout=self.timeout, reply=True,
  File "/.../lib/python2.7/site-packages/celery/app/control.py", line 436, in broadcast
    limit, callback, channel=channel,
  File "/.../lib/python2.7/site-packages/kombu/pidbox.py", line 315, in _broadcast
    serializer=serializer)
  File "/.../lib/python2.7/site-packages/kombu/pidbox.py", line 285, in _publish
    with self.producer_or_acquire(producer, chan) as producer:
  File "/usr/local/Cellar/python/2.7.13_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/.../lib/python2.7/site-packages/kombu/pidbox.py", line 247, in producer_or_acquire
    with self.producer_pool.acquire() as producer:
  File "/.../lib/python2.7/site-packages/kombu/resource.py", line 74, in acquire
    raise RuntimeError('Acquire on closed pool')

This only happens when we're using the control module. Sometimes it works okay.

This code path was even in a retry loop, so in the end it still failed to execute.

Python 2.7, Redis, Django 1.11, Celery 4.0.2.

@maznu

maznu commented Jul 2, 2018

We're seeing plenty of this on our systems running Celery==3.1.18, kombu==3.0.37, and amqp==1.4.9, with RabbitMQ 3.6.15 as broker and result backend. It doesn't happen every time: I just retried a job and it ran fine the second time, with exactly the same arguments. Sometimes restarting RabbitMQ fixes things, but it's hard to tell whether that actually fixes anything or whether it's just chance.

The traceback out of Sentry looks something like this:

kombu/connection.py in acquire at line 874
        :raises LimitExceeded: if block is false
          and the limit has been exceeded.
        """
        if self._closed:
            raise RuntimeError('Acquire on closed pool')
        if self.limit:
            while 1:
                try:
                    R = self._resource.get(block=block, timeout=timeout)
                except Empty:
kombu/pools.py in _acquire_connection at line 38
kombu/pools.py in prepare at line 63
kombu/connection.py in acquire at line 883
celery/backends/amqp.py in store_result at line 124
celery/app/trace.py in trace_task at line 283

Django 1.11 application with the following settings:

CELERY_TASK_SERIALIZER = "yaml"
CELERY_RESULT_SERIALIZER = "yaml"
CELERY_RESULT_BACKEND = 'celery.backends.amqp:AMQPBackend'
CELERY_TASK_RESULT_EXPIRES = 300
CELERY_DISABLE_RATE_LIMITS = True
CELERY_MAX_TASKS_PER_CHILD = 1000
CELERY_SEND_EVENTS = False
CELERY_EVENT_QUEUE_EXPIRES = 60

This is accompanied by entries like these in RabbitMQ's logs:

=WARNING REPORT==== 2-Jul-2018::22:09:21 ===
closing AMQP connection <0.22189.0> (169.254.201.28:54265 -> 169.254.201.38:5672, vhost: 'XXXXXXXX', user: 'XXXXXX'):
client unexpectedly closed TCP connection

What information or help do you need so that we can debug and fix this, @ask and @thedrow?

@auvipy
Member

auvipy commented Jul 3, 2018

That version is no longer supported. Please update to the 4.2 series.

@auvipy
Member

auvipy commented Jul 3, 2018

Continue the discussion in #4410.

@celery celery locked as off-topic and limited conversation to collaborators Jul 3, 2018