Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Chain bmutex locks prevent wallet from recovering from failed rescan #1006

Open
pinheadmz opened this issue Jan 7, 2021 · 1 comment
Open

Comments

@pinheadmz
Copy link
Member

pinheadmz commented Jan 7, 2021

If the wallet (as a plugin) suffers an incomplete rescan due to shut down or crash, the walletDB state height will be below the chain height. If a new block is then added to the chain, that new block gets passed to walletDB _addBlock() which will detect the discrepancy and initiate a rescan starting from the walletDB current state height (in hopes of catching up whatever blocks remain from there to the new chain tip): On line 1991:

bcoin/lib/wallet/walletdb.js

Lines 1982 to 1993 in f1abba5

if (tip.height === this.state.height) {
// We let blocks of the same height
// through specifically for rescans:
// we always want to rescan the last
// block since the state may have
// updated before the block was fully
// processed (in the case of a crash).
this.logger.warning('Already saw WalletDB block (%d).', tip.height);
} else if (tip.height !== this.state.height + 1) {
await this.scan(this.state.height);
return 0;
}

The problem now is that to initiate this rescan, the wallet sends a signal through it's node client to chain.scan(), which then attempts to lock chain for the rescan:

bcoin/lib/blockchain/chain.js

Lines 1297 to 1304 in f1abba5

async scan(start, filter, iter) {
const unlock = await this.locker.lock();
try {
return await this.db.scan(start, filter, iter);
} finally {
unlock();
}
}

The bug is that on line 1298 here, the await will never resolve - why? Because chain.locker is still locked from all the way back from when that new block was added to the chain:

bcoin/lib/blockchain/chain.js

Lines 1314 to 1316 in f1abba5

async add(block, flags, id) {
const hash = block.hash();
const unlock = await this.locker.lock(hash);

...and then subsequently sent out to the wallet from chain.setBestChain() while still locked:

return this.emitAsync('connect', entry, block, view);

setBestChain() won't return until the emitAsync resolves, which it never will -- because that emitAsync was sent to the wallet, which is waiting for chain to unlock so it can rescan!

Easy fix: change emitAsync to emit at the end of setBestChain()

I can observe this fixing my problem, but I don't cause any regressions. I can provide a test case soon but it is a lot easier to demonstrate with an abortRescan() function, which is what I'm working on at the moment.

I've had users complain about this issue though before, mysteriously after a failed rescan the wallet seems frozen and new rescan calls don't work and new blocks added to the chain don't fix anyhting until the node restarts.

@pinheadmz
Copy link
Member Author

Ok so the fix downstream in hsd is here: handshake-org/hsd#533 if merged I can backport to bcoin.

The solution wasn't changing the emitAsync in setBestChain because that screws up the indexers, which should probably remain synchronous with the chain process. Instead I changed the nodeclient proxy event 'block connect' to emit synchronously and that solves the problem. (This is more similar to how separate wallet node operates as well, it gets the 'block connect' event from a websocket which doesn't wait for resolution).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant