Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: avoid collisions in auto-generated Job IDs causing dropped jobs #237

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

hughsw
Copy link
Collaborator

@hughsw hughsw commented Apr 18, 2020

Alternate fix for #78 and #189
Builds on #193 , returning to its original intent

Uses random hex string for Job.id, generated and registered in Queue prior to adding job to Redis.

@hughsw hughsw changed the title WIP/RFC -- Fix dropped jobs UUID WIP/RFC -- Fix dropped jobs, unique hex Job.id Apr 18, 2020
@hughsw
Copy link
Collaborator Author

hughsw commented Apr 18, 2020

[ This issue is resolved ]

This approach breaks the semantic that custom job id's don't affect checkHealth().newestJob, so the Queue Health Check should not report the latest job for custom job ids test fails.

The semantic could be implemented, but:

  • it's additional complexity to support an obscure semantic
  • I don't know how much the existing semantic is relied upon, it takes some hoop-jumping (as per the test) to observe -- what is the justification for custom ids being treated differently in terms of newestJob ?

@hughsw
Copy link
Collaborator Author

hughsw commented Apr 18, 2020

Actually, it's not hard to preserve the existing semantic, but I'd like some discussion about the necessity of this semantic and the test.

@hughsw
Copy link
Collaborator Author

hughsw commented Apr 19, 2020

[ This issue is resolved for Job.fromId]

Wrapping Job.fromId in order to test removal from 'active' prior to processing is problematic: This approach replaces a global resource with a one-shot modification. In a parallel testing framework, there's no guarantee that the test that replaces fromId is the one that uses the one-shot replacement... I'm not sure if this problem exists in the current master where it's a Queue instance that gets fiddled...

@hughsw
Copy link
Collaborator Author

hughsw commented Apr 20, 2020

So the one remaining test failure is baffling to me. The deepEqual of two Set object fails because the items in the sets are in different orders (even though the set of items is the same). It is reasonable that the code changes for uuid Job.id would cause the ordering to have changed, but my understanding is that Sets should be deepEqual regardless of the order. Any insight into the failure?

If I sort the error arrays (and deepEqual the arrays rather than going to the trouble of making Sets), the test passes, so that's likely what I'm going to add to this PR.

Here's a typical failure I get locally (similar to Travis failures):

  1 test failed

  queue-test › Queue Processing jobs should capture error data

  /bee-queue/test/queue-test.js:1238

   1237:                      
   1238:       t.deepEqual(   
   1239:         failedErrors,

  Difference:

    Set {
      `Error: has stack␊
          at Queue.queue.process.job (/bee-queue/test/queue-test.js:1212:23)␊
          at /bee-queue/node_modules/promise-callbacks/dist/index.js:300:18␊
          at new Promise (<anonymous>)␊
          at Queue.asyncWrapper [as handler] (/bee-queue/node_modules/promise-callbacks/dist/index.js:293:19)␊
          at Queue._runJob (/bee-queue/lib/queue.js:531:24)␊
          at finally_._getNextJob.then (/bee-queue/lib/queue.js:690:23)␊
          at process._tickCallback (internal/process/next_tick.js:68:7)`,
  +   'has message',
  +   'is string',
      true,
  -   'is string',
  -   'has message',
    }

@hughsw hughsw requested a review from skeggse April 20, 2020 01:33
@hughsw hughsw marked this pull request as draft April 20, 2020 01:34
@skeggse
Copy link
Member

skeggse commented Apr 20, 2020

I'll try and take a look later tomorrow. I'm also curious what you think of #232.

@coveralls
Copy link

Coverage Status

Coverage remained the same at 100.0% when pulling d018be3 on hughsw:fix-dropped-jobs-uuid into 3430506 on bee-queue:master.

@coveralls
Copy link

coveralls commented Apr 20, 2020

Coverage Status

Coverage remained the same at 100.0% when pulling f6a1d1a on hughsw:fix-dropped-jobs-uuid into 3430506 on bee-queue:master.

@hughsw hughsw marked this pull request as ready for review April 20, 2020 12:49
@hughsw hughsw changed the title WIP/RFC -- Fix dropped jobs, unique hex Job.id Fix dropped jobs with unique hex Job.id Apr 20, 2020
@hughsw hughsw requested a review from bradvogel April 20, 2020 12:50
lib/helpers.js Outdated Show resolved Hide resolved
lib/helpers.js Outdated Show resolved Hide resolved
@compwright compwright added the bug label Nov 22, 2022
@compwright compwright changed the title Fix dropped jobs with unique hex Job.id fix: avoid collisions in auto-generated Job IDs causing dropped jobs Nov 22, 2022
Copy link
Collaborator

@compwright compwright left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved pending rebase

@@ -1,5 +1,5 @@
--[[
key 1 -> bq:name:id (job ID counter)
key 1 -> bq:name:id (job counter)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is stored in bq:name:id? My understanding from this PR is that it's the number of jobs that don't have a custom id? What's that used for?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that is correct, it's poorly named, and appears to come from the fact that auto-generated IDs are currently sequential. So job ID 1000 would be the 1000th job. This PR of course changes that but does not rename to try to maximize compatibility, replacing the stat with the actual count instead of the actual last job ID from which you would infer the count.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is that information ("the number of jobs that don't have a custom id") used?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants