Unable to reach quorum #168

Open
paymog opened this issue Apr 26, 2022 · 30 comments · May be fixed by #180

Comments

@paymog

paymog commented Apr 26, 2022

I'm having some trouble using this library. I have an endpoint which gets called twice upon page load and I'm trying to use a lock to perform sensitive database operations safely.

I'm using this library and I keep getting the following error in unusual ways which I haven't been able to figure out (maybe I have the library misconfigured?). Here's the code I'm using to initialize redlock:

import { getRedisURI } from './settings/settings.js';
import Redis from 'ioredis';
import Redlock from 'redlock';

export const redis = new Redis(getRedisURI());
export const redlock = new Redlock([redis]);

here's the code which does the sensitive database work:

async function createNewScheduleCardsIfNecessary(tripId) {
  let lock;
  console.log('attempting to make new schedule card items for trip', tripId);

  try {
    lock = await redlock.acquire([`${tripId}`], 5000, {
      retryDelay: 1000,
      retryCount: 30,
      automaticExtensionThreshold: 100,
      retryJitter: 1000,
    });
    console.log('acquired lock', tripId);
    const expectedNumberOfCards = await getExpectedNumberOfScheduleCards(tripId);
    const currentCards = await ScheduleCards.find({
      tripId,
    }).toArray();

    await createMissingScheduleCards(currentCards, expectedNumberOfCards, tripId);
    console.log('finished making schedule cards', tripId);
  } finally {
    if (lock) {
      console.log('releasing lock', tripId);
      await lock.release();
    }
  }
  console.log('all redlock stuff finished for trip', tripId);
}

here's the full output I'm getting:

Server running at http://localhost:5001/graphql


schedule cards called for trip vMEKsgzFy9EGdQKbp <------------ this indicates a request came in from the web 
attempting to make new schedule card items for trip vMEKsgzFy9EGdQKbp
acquired lock vMEKsgzFy9EGdQKbp
schedule cards called for trip vMEKsgzFy9EGdQKbp <--------------- here's the second request from the web
attempting to make new schedule card items for trip vMEKsgzFy9EGdQKbp
need to create schedule card for trip { part: 'morning', i: 0 }
need to create schedule card for trip { part: 'morning', i: 1 }
need to create schedule card for trip { part: 'morning', i: 2 }
need to create schedule card for trip { part: 'morning', i: 3 }
need to create schedule card for trip { part: 'morning', i: 4 }
need to create schedule card for trip { part: 'afternoon', i: 0 }
need to create schedule card for trip { part: 'afternoon', i: 1 }
need to create schedule card for trip { part: 'afternoon', i: 2 }
need to create schedule card for trip { part: 'afternoon', i: 3 }
need to create schedule card for trip { part: 'afternoon', i: 4 }
need to create schedule card for trip { part: 'evening', i: 0 }
need to create schedule card for trip { part: 'evening', i: 1 }
need to create schedule card for trip { part: 'evening', i: 2 }
need to create schedule card for trip { part: 'evening', i: 3 }
need to create schedule card for trip { part: 'evening', i: 4 }
need to create schedule card for trip { part: 'lodging', i: 0 }
need to create schedule card for trip { part: 'lodging', i: 1 }
need to create schedule card for trip { part: 'lodging', i: 2 }
need to create schedule card for trip { part: 'lodging', i: 3 }
need to create schedule card for trip { part: 'lodging', i: 4 }
finished making schedule cards vMEKsgzFy9EGdQKbp
releasing lock vMEKsgzFy9EGdQKbp
ExecutionError: The operation was unable to achieve a quorum during its retry window.
    at Redlock._execute (file:///Users/paymahn/code/tripvector/tripvector-mono/backend/node_modules/redlock/dist/esm/index.js:290:23)
    at async createNewScheduleCardsIfNecessary (file:///Users/paymahn/code/tripvector/tripvector-mono/backend/api/scheduleCards/graphql/queries.js:143:7)
    at async Object.scheduleCards (file:///Users/paymahn/code/tripvector/tripvector-mono/backend/api/scheduleCards/graphql/queries.js:153:5) {
  attempts: [
    Promise { [Object] },
    Promise { [Object] },
    Promise { [Object] },
    Promise { [Object] },
    Promise { [Object] },
    Promise { [Object] },
    Promise { [Object] },
    Promise { [Object] },
    Promise { [Object] },
    Promise { [Object] },
    Promise { [Object] }
  ]
}
ExecutionError: The operation was unable to achieve a quorum during its retry window.
    at Redlock._execute (file:///Users/paymahn/code/tripvector/tripvector-mono/backend/node_modules/redlock/dist/esm/index.js:290:23)
    at async Redlock.acquire (file:///Users/paymahn/code/tripvector/tripvector-mono/backend/node_modules/redlock/dist/esm/index.js:207:34)
    at async createNewScheduleCardsIfNecessary (file:///Users/paymahn/code/tripvector/tripvector-mono/backend/api/scheduleCards/graphql/queries.js:126:12)
    at async Object.scheduleCards (file:///Users/paymahn/code/tripvector/tripvector-mono/backend/api/scheduleCards/graphql/queries.js:153:5) {
  attempts: [
    Promise { [Object] }, Promise { [Object] },
    Promise { [Object] }, Promise { [Object] },
    Promise { [Object] }, Promise { [Object] },
    Promise { [Object] }, Promise { [Object] },
    Promise { [Object] }, Promise { [Object] },
    Promise { [Object] }, Promise { [Object] },
    Promise { [Object] }, Promise { [Object] },
    Promise { [Object] }, Promise { [Object] },
    Promise { [Object] }, Promise { [Object] },
    Promise { [Object] }, Promise { [Object] },
    Promise { [Object] }, Promise { [Object] },
    Promise { [Object] }, Promise { [Object] },
    Promise { [Object] }, Promise { [Object] },
    Promise { [Object] }, Promise { [Object] },
    Promise { [Object] }, Promise { [Object] },
    Promise { [Object] }
  ]
}

The time between the acquired lock and releasing lock log lines is around 1 second but for some reason, even with the generous retry policy I'm seeing redlock fail consistently.

The top stack trace includes line 143, which is the line where I release the lock. The bottom stack trace includes line 126, which is where I acquire the lock.

I've tried using redlock.using as well but had the same results. I've downloaded a redis gui to inspect my local redis instance to see if there are any spurious keys but haven't been able to find any.

Here's my package.json:

    "@babel/runtime": "^7.16.3",
    "@graphql-tools/schema": "^8.3.1",
    "apollo-server-express": "^3.5.0",
    "axios": "^0.26.1",
    "bcryptjs": "^2.4.3",
    "body-parser": "^1.19.0",
    "chalk": "^4.1.2",
    "compression": "^1.7.4",
    "cookie-parser": "^1.4.6",
    "cors": "^2.8.5",
    "crypto-extra": "^1.0.1",
    "dayjs": "^1.10.7",
    "ejs": "^3.1.6",
    "express": "^4.17.1",
    "graphql": "^16.0.1",
    "graphql-redis-subscriptions": "^2.4.2",
    "graphql-subscriptions": "^2.0.0",
    "graphql-ws": "^5.7.0",
    "html-to-text": "^8.1.0",
    "immutability-helper": "^3.1.1",
    "ioredis": "^5.0.4",
    "jsonwebtoken": "^8.5.1",
    "juice": "^8.0.0",
    "lodash-es": "^4.17.21",
    "moment": "^2.29.3",
    "mongo-uri-tool": "^1.0.1",
    "mongodb": "^4.2.0",
    "nodemailer": "^6.7.2",
    "ps-node": "^0.1.6",
    "redlock": "^5.0.0-beta.2",
    "serve-favicon": "^2.5.0",
    "speakingurl": "^14.0.1",
    "ws": "^8.5.0"
@paymog
Author

paymog commented Apr 26, 2022

Downgrading to version 4 for redlock and ioredis solved my issue completely, I guess there's a bug with the v5 implementation.

@kylecannon

having this same issue too.

@jskorlol

jskorlol commented May 6, 2022

I have the same problem, is there any solution?

@paymog
Author

paymog commented May 7, 2022

@jskorlol try downgrading to version 4 of this package and ioredis

@kylecannon

yeah, the problem is i would really like the lock auto extend functionality.

@jjm340

jjm340 commented May 21, 2022

I'm having this problem too, we use redlock v4 to great success so I'm downgrading for now.

@hlongvu

hlongvu commented Jun 7, 2022

any update on this, I am having this problem too

@phil-r

phil-r commented Jun 24, 2022

Hey, we also encountered this issue on both v4 and v5.
In our case the problem was that our lock had already expired by the time we tried to release it.

Failing code:

// redis and redlock setup is omitted

const wait = async (ms: number) => {
  return new Promise((resolve) => setTimeout(() => resolve(ms), ms));
};

const main = async () => {
  try {
    const lock = await redlock.acquire(['a'], 1000);
    await wait(1500);
    await lock.release();
  } catch (e) {
    console.error(e);
    throw e;
  }
};

main();

Error:

ExecutionError: The operation was unable to achieve a quorum during its retry window.

working code:

// redis, redlock and wait setup is omitted

const main = async () => {
  try {
    const lock = await redlock.acquire(['a'], 1000);
    await lock.extend(1600);
    await wait(1500);
    await lock.release();
  } catch (e) {
    console.error(e);
    throw e;
  }
};

main();

for us switching to redlock.using, increasing lock duration or using lock.extend all worked.

I think the best fix for the library would be to make lock.release() a no-op if the lock has already expired.

@phil-r phil-r linked a pull request Jun 25, 2022 that will close this issue
@Dakuan

Dakuan commented Aug 16, 2022

We are still seeing this intermittently even with redlock.using

@jketcham

I was also running into this on version 5.0.0-beta.2 while using redlock.using, here's the relevant stack trace:

ExecutionError: The operation was unable to achieve a quorum during its retry window.
    at Redlock._execute (file:///workspace/node_modules/redlock/dist/esm/index.js:290:23)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at async Redlock.acquire (file:///workspace/node_modules/redlock/dist/esm/index.js:207:34)
    at async Redlock.using (file:///workspace/node_modules/redlock/dist/esm/index.js:448:20)
    ... my code ...

With the call to using looking like:

await redlock.using([lockId], 5000, async (signal) => { ... my code ... });

The problem seems to occur when several callers try to lock the same resource in close succession. Increasing retryCount and retryDelay (to 30 and 1000 respectively in my setup) stopped the quorum error from being thrown, but I then still got errors about not being able to extend an already expired lock.

@eth-limo

Got bitten by this as well

    "ioredis": "^5.2.3",
    "redlock": "^5.0.0-beta.2",

@paulohlips

paulohlips commented Nov 2, 2022

Hi folks, I ran into this problem too, but it turned out to be a mistake in my code.

I was acquiring the lock and calling set on the same key:

const main = async () => {
  try {
    const lock = await redlock.acquire(['myKey'], 1000);
    // Writes to the very key that holds the lock
    await redisClient.set('myKey', 'newValue');
    await lock.release();
  } catch (e) {
    console.error(e);
    throw e;
  }
};

This is wrong: we need to acquire a separate resource key such as "myLockedResource:myKey", make the changes to our own key ("myKey" in this example), and then release the lock on "myLockedResource:myKey". My code now acquires the resource key to find out whether it may change the data key:

const key = 'project';
const value = 'CONFLICT';
const resource = `locks:${key}`;
const lockTtl = 66000;

async function lockAndSet(resource, lockTtl) {
  const lock = await redlock.acquire([resource], lockTtl);

  try {
    await redisSet(key, value);
    console.log('Time finished, key unlocked!');
    await lock.release();
  } catch (e) {
    console.log(e);
  }
}

lockAndSet(resource, lockTtl);
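
One way to picture why the separate lock key matters is a small in-memory model. It assumes the lock is a key holding a random token, taken only when the key does not exist and released only when the token still matches. The `store`, `tryAcquire` and `release` names are illustrative stand-ins, not redlock's API, and the real scripts may differ in detail.

```javascript
// In-memory stand-in for Redis. This is NOT redlock's implementation.
const store = new Map();

// Take the lock only if the key does not exist yet.
function tryAcquire(key, token) {
  if (store.has(key)) return false;
  store.set(key, token);
  return true;
}

// Release only if the key still holds our token.
function release(key, token) {
  if (store.get(key) !== token) return false;
  store.delete(key);
  return true;
}

// Case 1: lock and data share "myKey".
const acquiredOnDataKey = tryAcquire("myKey", "token-1"); // true
store.set("myKey", "newValue"); // the data write replaces the lock token
const releasedDataKey = release("myKey", "token-1"); // false: token is gone

// Case 2: the lock lives under its own key.
const acquiredOnLockKey = tryAcquire("locks:myKey", "token-2"); // true
store.set("myKey", "newerValue"); // the data write leaves the lock untouched
const releasedLockKey = release("locks:myKey", "token-2"); // true
```

In this model the failed release in case 1 corresponds to the point where the thread's quorum error shows up.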

@shengjie9

I have this issue too. Reading the code, I found that the client.evalsha(args) call throws "Error: node_redis: The EVALSHA command contains a invalid argument type.", and I don't know where to go from there.

@reinoute

reinoute commented Jan 5, 2023

I believe this is a bug in the v5 implementation. When you attempt to set a lock using acquire() and the resource is locked, Redlock will throw an ExecutionError, but it should throw a ResourceLockedError (as it did in the previous version).

@vi93a

vi93a commented Mar 16, 2023

I was facing the same issue, but it was because I was passing the Redis client as undefined. You can debug this by logging errors in redlock's error event handler.
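
A minimal sketch of both checks, assuming the redlock instance exposes an `on("error", ...)` listener as the comment describes. `assertClients` and `logLockErrors` are hypothetical helper names, not part of the library.

```javascript
// Fail fast if any client handed to Redlock is missing.
function assertClients(clients) {
  clients.forEach((client, i) => {
    if (client == null) {
      throw new TypeError(`Redis client at index ${i} is ${client}`);
    }
  });
  return clients;
}

// Log every error the lock manager emits, so the cause hidden behind
// the generic quorum error becomes visible.
function logLockErrors(redlock, log = console.error) {
  redlock.on("error", (error) => log("redlock error:", error));
  return redlock;
}

// Intended use (needs a real client, so it is left commented out):
// const redlock = logLockErrors(new Redlock(assertClients([redis])));
```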

@gdelory

gdelory commented Apr 4, 2023

We are also hitting this issue as soon as two processes try to lock the same resource. This is 100% reproducible:

const { client } = require('./utils/RedisClient')
const { default: Redlock } = require('redlock')

const redlock = new Redlock(
  [client], {
    driftFactor: 0.01, // multiplied by lock ttl to determine drift time
    retryCount: 300,
    retryDelay: 1000, // time in ms
    retryJitter: 200, // time in ms
    automaticExtensionThreshold: 500, // time in ms
  }
)

const withLock = async (resourceToLock, doSomethingWithLock) => {
  let lock = await redlock.acquire([resourceToLock], 30000)
  let result
  try {
    result = await doSomethingWithLock()
  } finally {
    await lock.release()
  }
  return result
}

const wait = ms => new Promise((resolve) => setTimeout(resolve, ms))

const main = async () => {
  const name = process.argv[2]
  try {
    await withLock('superId', async () => {
      console.log('start wait', name)
      await wait(30000)
      console.log('done waiting', name)
    })
    console.log('all done', name)
    
  } catch (error) {
    console.error(error.message)
  }
  process.exit()
  
}

main()

Result when running this process twice, with the names p1 and p2:

start wait p2
done waiting p2
The operation was unable to achieve a quorum during its retry window.

Edit: This seems to happen 100% of the time if the lock duration passed to acquire is shorter than the time the function takes to complete, but it works when the duration is much longer. A long duration is not a problem as long as the function releases the lock properly; it might be one if the process crashes without releasing it.

wait(30000) with redlock.acquire([resourceToLock], 30000) => crashes
wait(30000) with redlock.acquire([resourceToLock], 300000) => works
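
The two cases above can be checked with a little arithmetic. The sketch assumes the lock stays usable for the requested duration minus a drift allowance of `driftFactor * duration + 2` ms; that formula is an assumption based on the `driftFactor` comment in the settings above, not a confirmed detail of the library. `usableLockMs` and `expiresBeforeRelease` are hypothetical helpers.

```javascript
// Approximate how long a lock can be relied on after acquire() returns.
// The drift formula is an assumption, see the note above.
function usableLockMs(durationMs, driftFactor = 0.01) {
  const drift = Math.round(driftFactor * durationMs) + 2;
  return durationMs - drift;
}

// True when the work takes at least as long as the lock stays usable,
// i.e. release() would run against an already expired lock.
function expiresBeforeRelease(durationMs, workMs, driftFactor = 0.01) {
  return workMs >= usableLockMs(durationMs, driftFactor);
}

const sameDuration = expiresBeforeRelease(30000, 30000); // true
const tenTimesLonger = expiresBeforeRelease(300000, 30000); // false
```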

@manomano

> Downgrading to version 4 for redlock and ioredis solved my issue completely, I guess there's a bug with the v5 implementation.

Can you specify the exact versions?

@aduca98

aduca98 commented May 10, 2023

Any status update on the fix? It looks like a few PRs were made (I saw another GitHub thread). Where does this stand? I am also having this issue: I do not know the TTL up front, so I need the auto-extension, and it is giving me problems.

@EmilSabri

EmilSabri commented Jul 1, 2023

> (Quoting @paulohlips's comment of Nov 2, 2022 above in full: acquire the lock on a separate resource key such as locks:${key} rather than on the key being written.)

This resolved my issues. Be sure to add a resource key (redlock ^5.0.0-beta.2).

@BoatsDawn

The operation was unable to achieve a quorum during its retry window.
Is there a way to fix this?

Code:

const sleep = (ms) => new Promise((res) => setTimeout(res, ms));

for (let i = 0; i < 30; i++) {
    redlock
        .using(['test'], 10_000, async () => {
            await sleep(200);
            console.log('OK');
        })
        .catch((e) => console.error(e.message));
}
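
A rough way to reason about the loop above: the 30 calls have to run one after another, so the last one waits for about 29 × 200 ms and only succeeds if that fits into its retry window. The settings below are hypothetical, since the comment does not show them. The sketch also assumes each retry waits roughly retryDelay ± retryJitter, so it is a back-of-the-envelope estimate only.

```javascript
// Estimate the shortest and longest total time spent retrying.
function retryWindowMs({ retryCount, retryDelay, retryJitter }) {
  return {
    min: retryCount * Math.max(0, retryDelay - retryJitter),
    max: retryCount * (retryDelay + retryJitter),
  };
}

const callers = 30;
const workMs = 200;

// The last caller has to wait for everyone queued ahead of it.
const waitForLastMs = (callers - 1) * workMs; // 5800

// Hypothetical settings, not taken from the comment above.
const budget = retryWindowMs({ retryCount: 10, retryDelay: 200, retryJitter: 100 });
// budget is { min: 1000, max: 3000 }

const lastCallerLikelyTimesOut = waitForLastMs > budget.max; // true
```

Under these assumptions the later callers exhaust their retries long before the lock frees up, which would explain the error without any bug being involved. Raising retryCount or retryDelay widens the window.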

@bukovacRobert

Hey everyone! 👋
Is there a fix for this issue yet, or has anyone found a workaround?
Are we all just waiting for the stable v5 release, or is there something I’ve missed? Any info or updates would be super helpful!

@ravi9989

I also faced the same problem, and the solution has to do with the lock duration you mentioned.
I mean: you configure 400 ms as the lock duration and then try to release the lock at the 600 ms mark, i.e. you try to release a lock that has already expired. That could cause this issue.

@ifeLight

So currently, with what is available, to prevent errors you have to set a lock duration longer than the run time of your service.

That is, if your service will run for 2 seconds, make the duration 5 s to be safe. As long as the service does not crash, it will tell Redis when it is done (by releasing the lock) so that another service can acquire the lock. Alternatively, you can use the extend method to extend the lock when your service is going to take longer.
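
A minimal sketch of the second option, extending the lock while the work runs. It assumes a lock object whose `extend(ms)` resolves to the lock to use from then on and whose `release()` frees it. `runWithExtension`, `ttlMs` and `everyMs` are hypothetical names; `everyMs` should sit comfortably below `ttlMs`.

```javascript
// Run `work` while re-extending `lock` every `everyMs` milliseconds.
async function runWithExtension(lock, work, { ttlMs, everyMs }) {
  let current = lock;
  let stopped = false;
  let failure = null;
  let pending = Promise.resolve();

  const timer = setInterval(() => {
    // Chain the extensions so two of them never overlap.
    pending = pending
      .then(async () => {
        if (!stopped && !failure) current = await current.extend(ttlMs);
      })
      .catch((err) => {
        failure = err; // remember the first failed extension
      });
  }, everyMs);

  let result;
  try {
    result = await work();
  } finally {
    stopped = true;
    clearInterval(timer);
    await pending; // let an in-flight extension settle first
    if (!failure) await current.release();
  }
  if (failure) throw failure; // the lock may have lapsed during the work
  return result;
}
```

When an extension fails, the helper skips the release and rethrows, because the work may have run without the lock for part of its duration and the caller should know.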

@jsnick

jsnick commented Oct 23, 2023

I'm writing this in case someone else has made the same mistake I did.

At first, I wrote my code like this:

await redlock.acquire([
  'StoreProduct', // table name (static), 
  storeProduct.id, // record id (dynamic),
], 60 * 1000);

but I got the error ...unable to achieve a quorum during its retry window.

After reading the library's code, I found the code responsible,

so now I know my case needs to be fixed like this:

await redlock.acquire([
  `StoreProduct:${storeProduct.id}`,
], 60 * 1000);
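
The difference between the two calls can be sketched without Redis. It assumes each array entry is locked as its own key, so the static 'StoreProduct' entry is shared by every record. `lockKeyFor` and `conflicts` are hypothetical helpers and the ids are made up.

```javascript
// Build one lock key per record instead of one key per array entry.
function lockKeyFor(table, id) {
  return `${table}:${id}`;
}

// Two lock requests collide if they share any resource key.
function conflicts(a, b) {
  return a.some((key) => b.includes(key));
}

// First version: both requests contain the shared "StoreProduct" key.
const separateEntries = conflicts(["StoreProduct", "42"], ["StoreProduct", "43"]); // true

// Fixed version: each request holds a single key unique to its record.
const combinedKeys = conflicts(
  [lockKeyFor("StoreProduct", "42")],
  [lockKeyFor("StoreProduct", "43")]
); // false
```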

@divmgl

divmgl commented Dec 5, 2023

Just stumbled on this now. Downgrading to 4.2.0 was the only solution. As far as I can tell it works with the latest version of ioredis but you'll need to add a @ts-expect-error because the two signatures between the expected version of ioredis and the latest are different.

@apolenkov

apolenkov commented Dec 14, 2023

I think it is a bug.

If any lock already exists, then client.evalsha returns 0 here:

const shaResult = (await client.evalsha(script.hash, keys.length, [

and then this check

if (result !== keys.length) {

reports an error, because no lock key was set (result === 0) while keys.length is at least 1.

BUT the error that is finally returned comes from here:

"The operation was unable to achieve a quorum during its retry window.",

@apolenkov

apolenkov commented Dec 14, 2023

For a lock error, vote === 'against' (screenshot omitted).

After that we always get the quorum error, because the branch behind this line is skipped:

if (vote === "for") {
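
To make the reasoning above concrete: with a single Redis client the quorum is one vote, so while another holder keeps the key, every attempt ends "against". The caller then only sees the generic quorum error once the retries run out. The model below is a simplification with hypothetical `tally` and `acquireWithRetries` helpers, not the library's code.

```javascript
// Simplified majority vote over one attempt's per-client results.
function tally(votes) {
  const quorum = Math.floor(votes.length / 2) + 1;
  const votesFor = votes.filter((v) => v === "for").length;
  return votesFor >= quorum ? "for" : "against";
}

// Walk through the attempts until one passes or none are left.
function acquireWithRetries(attempts) {
  for (const votes of attempts) {
    if (tally(votes) === "for") return { ok: true };
  }
  return {
    ok: false,
    error: "The operation was unable to achieve a quorum during its retry window.",
  };
}

// One client, key held by someone else for all three attempts.
const alwaysLocked = acquireWithRetries([["against"], ["against"], ["against"]]);
// Same client, but the key frees up before the second attempt.
const freedInTime = acquireWithRetries([["against"], ["for"]]);
```

In this model `alwaysLocked.ok` is false and carries the quorum message even though the real cause is simply "resource already locked", which matches the complaint that a more specific error would be more helpful.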

@UgurGumushan

> (Quoting @paulohlips's comment of Nov 2, 2022 above in full: acquire the lock on a separate resource key such as locks:${key} rather than on the key being written.)

This worked for me. Thank you.
So this project does not document how to use resource keys. (Please update the *.md if possible.)

Summary:
We should not use the key of the item being protected as the resource key. Instead, prepend something to it, as in "redlock:user1111"; that key refers to the lock, like a pointer, and not to the actual item.

@BernalCarlos

> (Quoting @paulohlips's comment of Nov 2, 2022 above in full: acquire the lock on a separate resource key such as locks:${key} rather than on the key being written.)

This was also the solution for my case. The DOCS need updating, I've done so on this PR: #291

@MrFabio

MrFabio commented May 8, 2024

Solved it using a check for the lock expiration:

const now = new Date().getTime();
if (lock && lock.expiration > now) {
  await lock.release();
}

If the lock is expired (TTL), it is no longer in the cache, so don't release it.
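
The same check can be wrapped in a small helper. This sketch relies only on the `expiration` timestamp and `release()` used in the snippet above. `releaseIfHeld` is a hypothetical name, and swallowing a failed release is a design choice that may not suit every caller.

```javascript
// Release `lock` only while it still reports itself as unexpired.
// Resolves to true if a release was attempted and succeeded.
async function releaseIfHeld(lock, now = Date.now()) {
  if (!lock || lock.expiration <= now) return false; // nothing left to release
  try {
    await lock.release();
    return true;
  } catch (err) {
    // The lock can still expire between the check and the release.
    console.warn("lock release failed:", err.message);
    return false;
  }
}
```

A call such as `await releaseIfHeld(lock)` in a finally block then no longer throws when the work outlives the lock. It also hides the fact that the critical section ran unprotected for a while, so logging that case is worth considering.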
