Replies: 25 comments
-
I'm not sure if it is related, but doing

```python
workflow = (
    noop.s() |
    chord([noop.s() for i in range(2)], noop.s())
)
workflow.apply_async(serializer='yaml')
```

causes the following. The data passed to the encoder looks as follows:

Calling the encoder on the data:
-
More digging...

```python
workflow = (
    noop.s() |
    chord([noop.s() for i in range(2)], noop.s()) |
    chord([noop.s() for i in range(2)], noop.s()) |
    noop.s()
)
```

What I've found out is that the nested chord seems to contain some options with another nested chain:

[
[ ],
{ },
{
"chord": null,
"callbacks": null,
"errbacks": null,
"chain": [
{
"chord_size": null,
"task": "celery.chord",
"subtask_type": "chord",
"kwargs": {
"body": {},
"header": {
"chord_size": null,
"task": "celery.group",
"subtask_type": "group",
"kwargs": {
"tasks": [
{
"chord_size": null,
"task": "tasks.noop",
"subtask_type": null,
"kwargs": { },
"args": [ ],
"immutable": false,
"options": {
"reply_to": "a7b2268e-e9f8-3d93-b1d2-28c7b934e727",
"chord": { # <---------------------- HERE
"chord_size": null,
"task": "celery.chain",
"subtask_type": "chain",
"kwargs": {
"tasks": [
{},
{}
]
},
"args": [ ],
"options": { },
"immutable": false
},
"task_id": "0b94ed8a-20d3-4e9c-9fd7-b4dea0306c4d"
}
},
{}
]
},
"args": [ ],
"options": {},
"immutable": false
},
"kwargs": {}
},
"args": [ ],
"options": {},
"immutable": false
}
]
}
]
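To make the blow-up above concrete, here is a toy model (my own sketch, not Celery's actual code) of what the dump suggests: every task in a chord's header carries a full copy of the remaining work under `options["chord"]`, so each level of nesting multiplies the serialized size.

```python
import json

def task(options=None):
    # A minimal stand-in for a serialized Celery signature.
    return {"task": "tasks.noop", "args": [], "kwargs": {}, "options": options or {}}

def chord(n_header, body):
    # Each header task gets a full copy of the chord body in its
    # options, mirroring the "chord" key visible in the dump above.
    return {
        "task": "celery.chord",
        "header": [task({"chord": body}) for _ in range(n_header)],
        "body": body,
    }

flat = chord(2, task())
nested = chord(2, chord(2, chord(2, task())))
print(len(json.dumps(flat)))    # size of a single chord
print(len(json.dumps(nested)))  # grows multiplicatively with nesting depth
```

In the real dump the duplicated value is the serialized remaining chain, which is why the messages explode with nesting depth.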
-
It seems that nested task structures are created somewhere around
Got something... The huge size of the messages actually comes from the 'chord' entry inside task options, produced by the following line:
But I have absolutely no idea how to fix it.
-
Does anybody know what the purpose of putting 'chord' inside task options is? I've commented out the line from https://github.com/celery/celery/blob/v4.2.1/celery/canvas.py#L278 and chords seem to work fine; messages are much smaller.
-
According to
-
@thedrow I've executed
It seems that the problem no longer occurs.
-
Ugh, CI fails after my patch, but...
According to the tests, the 'chord' inside options should be chord.id?
-
I believe the string is simply a sentinel for the actual value.
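For context on the sentinel idea: a sentinel is a placeholder stored now and swapped for the real value (here, presumably the chord id) later. A generic sketch of the pattern, not Celery's actual implementation:

```python
# Generic sentinel pattern; not Celery's actual code.
PENDING = "chord_pending"  # placeholder stored while the real id is unknown

def resolve(options, chord_id):
    # Swap the sentinel for the real value once it is known.
    if options.get("chord") == PENDING:
        options["chord"] = chord_id
    return options

options = {"chord": PENDING}
print(resolve(options, "0b94ed8a")["chord"])  # → 0b94ed8a
```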
-
@thedrow yep, you're right. No idea how to fix it; I don't have that much understanding of Celery internals 😞 and the different pieces of code aren't really easy to read.
-
So I've created a small extension-like project. It unifies the way all complex canvases are processed.
-
How about merging this back into Celery?
-
Well... I'd love to have it in Celery. Unfortunately, I don't have time at the moment to do it on my own. I have a feeling that Celery is getting too complex.
-
OK, I will try to handle this.
-
We're going to refactor Canvases in Celery 5 because of this issue.
-
Are you going to draft some design docs? I'd be happy to participate.
-
@thedrow is this still an issue in Celery 5?
-
Is there any workaround for this bug? I constructed a complex canvas and the message size exceeds the Redis limits, and I have no time to wait for 5.1.0. Thanks.
-
@yifanw80 it's hardly a bug; it's a design problem. What you can do at the moment is either use something else, like Kafka, or simply redesign your app (which might be pretty doable, or not).
-
@pySilver How is this not a bug? Is your position that all hierarchical workflows are design problems?
-
@fcollman Kind of, yes. As much as I'd love it to work the way you'd expect, there are limitations that I guess are very hard to overcome. A complex workflow has to keep state somehow, so I'm not surprised it uses a lot of memory. Changing the workflow to something stateless (or at least not holding state at runtime) is the answer, I guess.
-
@fcollman you might want to look at Airflow, which plays nicely with Celery (you'd be able to define complex workflows in Airflow and execute them using Celery workers).
-
Yes, unfortunately.
-
Try using compression.
-
@thedrow how do I enable compression?
-
@yifanw80 https://docs.celeryproject.org/en/master/userguide/configuration.html#std-setting-task_compression
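The `task_compression` setting linked above compresses message bodies before they are published. It helps here because the oversized messages are mostly repeated structure, which DEFLATE handles very well. A standard-library sketch of the effect (the payload shape is my assumption, modelled on the dumps earlier in the thread):

```python
import json
import zlib

# A highly repetitive payload, similar in shape to the duplicated
# chord bodies shown earlier (57 tasks, as in the original issue).
task = {"task": "tasks.noop", "args": [], "kwargs": {}, "options": {}}
payload = json.dumps({"tasks": [task] * 57}).encode()

# zlib implements DEFLATE, the same algorithm behind the 'gzip' option.
compressed = zlib.compress(payload)
print(len(payload), len(compressed))
```

Note this only shrinks the on-the-wire/broker size; workers still decompress and hold the full payload in memory.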
-
Checklist
- I have included the output of celery -A proj report in the issue (if you are not able to do this, then at least specify the Celery version affected).
- I have verified that the issue exists against the master branch of Celery.

Steps to reproduce
Tasks file
Example workflow
Patch kombu to see serialized message size

Expected behavior
The workflow contains roughly 57 tasks. They shouldn't be serialized into such big messages.

Actual behavior
apply_async causes serialization of the workflow, producing 2 messages with the following sizes. As a result, the celery worker on my machine eats all memory.