Allow identifier with multiple workers #1191

Open
toncid opened this issue Jun 5, 2023 · 7 comments

Comments

@toncid

toncid commented Jun 5, 2023

Hello,

Using identifiers with multiple workers has been prevented for at least 14 years. Maybe it is time to reconsider that.

PR #1190 loosens the restriction and allows multiple identified job workers to be started, using the following process name format:

delayed_job(.<identifier>)(.<worker_index>)

Both fields are optional, so the default process name remains unchanged (delayed_job).
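To make the format concrete, here is a minimal sketch of how such a name could be composed. The `process_name` helper is hypothetical and written only for illustration; it is not part of delayed_job or of the PR.

```shell
# Hypothetical helper (not part of delayed_job): compose a process name
# from an optional identifier ($1) and an optional worker index ($2).
process_name() {
  name="delayed_job"
  # Append ".<identifier>" only when an identifier was given.
  if [ -n "${1:-}" ]; then name="$name.$1"; fi
  # Append ".<worker_index>" only when a worker index was given.
  if [ -n "${2:-}" ]; then name="$name.$2"; fi
  printf '%s\n' "$name"
}

process_name              # delayed_job
process_name 123456       # delayed_job.123456
process_name 123456 2     # delayed_job.123456.2
process_name "" 2         # delayed_job.2
```

With neither field set, the name stays `delayed_job`, which matches the unchanged default described above.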

@albus522
Member

albus522 commented Jun 5, 2023

Why are you trying to do this?

@toncid
Author

toncid commented Jun 5, 2023

> Why are you trying to do this?

In order to better utilize a worker node, we would like to spawn as many workers as possible (until a CPU or memory limit is hit).

Yes, I know that we can call the delayed_job -i <identifierX> start command N times, but this is cleaner.

@albus522
Member

albus522 commented Jun 5, 2023

Why are you specifying an identifier?

@toncid
Author

toncid commented Jun 5, 2023

> Why are you specifying an identifier?

Because we have multiple nodes running delayed_job workers, so we need job worker identifiers to avoid nodes stepping on each other's toes.

More context for using identifiers can be found e.g. in #866.

@albus522
Member

albus522 commented Jun 5, 2023

You are referencing a very old issue that had more than an identifier at play. Delayed Job automatically handles multiple nodes just fine except in rare non-standard scenarios. Are you not able to give each node a unique hostname?

@toncid
Author

toncid commented Jun 5, 2023

Thanks, that's a good question. I checked the DB and see that jobs are locked as expected (with the identifier plus the node's hostname and PID):

locked_by: "delayed_job.1685897776 host:ip-172-28-22-203.eu-west-1.compute.internal pid:2618",

Ought to be unique enough.

We implemented identifiers back in January 2018. One-off jobs were being fired multiple times back then, and adding identifiers helped. I do not have much information on why the locking mechanism did not work well at the time.

@toncid
Author

toncid commented Jun 7, 2023

Setting my particular use case aside, what about the PR? Looks good?

I don't see much harm in enabling it, especially since a Bash workaround is supported anyway:

delayed_job -i 123456.0 start
delayed_job -i 123456.1 start
delayed_job -i 123456.2 start
delayed_job -i 123456.3 start
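The same workaround can be written as a loop. This is only a sketch: `worker_commands` is a hypothetical helper, and it assumes the `<base>.<index>` identifier scheme used in the commands above. It prints the commands rather than running them, so the output can be reviewed first.

```shell
# Hypothetical helper: print the start command for each of N identified workers.
# $1 = base identifier, $2 = number of workers.
worker_commands() {
  base="$1"
  count="$2"
  i=0
  while [ "$i" -lt "$count" ]; do
    # One command per worker; identifiers follow the "<base>.<index>" pattern.
    printf 'delayed_job -i %s.%s start\n' "$base" "$i"
    i=$((i + 1))
  done
}

worker_commands 123456 4
```

Piping the output to a shell (`worker_commands 123456 4 | sh`) would then start the workers, assuming `delayed_job` resolves to the application's script in that environment.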
