When a worker pod is killed, no mechanism for retrying task #765

Open
NikBisht opened this issue May 11, 2022 · 3 comments
Comments

@NikBisht

We're using the Redis broker with the DynamoDB backend, and we've noticed that when a worker pod is terminated ungracefully while a task is still running, the task stays in the STARTED state. Machinery doesn't seem to have a timeout after which tasks that have been in STARTED for a long time are re-queued. This seems like a critical feature for fault tolerance.
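
To make the request concrete, here is a minimal sketch of the kind of check a periodic "reaper" could run: find tasks that have been in STARTED longer than a timeout so they can be re-queued. This is not Machinery's API. `TaskRecord`, `StaleStarted`, and `staleInSample` are hypothetical names, and the sketch assumes the backend records when a task entered STARTED. How the stale tasks are actually read from the backend and pushed back onto the broker is left out.

```go
package main

import "time"

// TaskRecord is a hypothetical, simplified view of a task's state in the
// result backend. It is NOT a Machinery type; the field names are assumptions.
type TaskRecord struct {
	UUID      string
	State     string    // e.g. "PENDING", "STARTED", "SUCCESS"
	StartedAt time.Time // when a worker moved the task to STARTED (assumed to be stored)
}

// StaleStarted returns the UUIDs of tasks that have been in STARTED for
// longer than timeout as of now. The caller decides how to re-queue them.
func StaleStarted(tasks []TaskRecord, now time.Time, timeout time.Duration) []string {
	var stale []string
	for _, t := range tasks {
		if t.State == "STARTED" && now.Sub(t.StartedAt) > timeout {
			stale = append(stale, t.UUID)
		}
	}
	return stale
}

// staleInSample runs StaleStarted on fixed sample data with a 10-minute timeout.
func staleInSample() []string {
	now := time.Date(2022, time.May, 11, 12, 0, 0, 0, time.UTC)
	tasks := []TaskRecord{
		{UUID: "task-a", State: "STARTED", StartedAt: now.Add(-30 * time.Minute)}, // past the timeout
		{UUID: "task-b", State: "STARTED", StartedAt: now.Add(-2 * time.Minute)},  // still within the timeout
		{UUID: "task-c", State: "SUCCESS", StartedAt: now.Add(-1 * time.Hour)},    // already finished
	}
	return StaleStarted(tasks, now, 10*time.Minute)
}
```

A timeout alone cannot tell a dead worker from a slow one, so a re-queued task may end up running twice. Tasks handled this way would need to be idempotent, and the timeout should be comfortably longer than the longest expected run.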

@taylorzhangyx

I face the same issue here.

@zhouhui521

I face the same issue here.

@kushalhalder

Are there any updates or workarounds for this?

4 participants