Full node unable to apply election block #1704
Comments
Here is another example with version 0.15.0. This log contains 2 epochs of data, covering the transitions from block 2980799 to 2980800 (election block) and from block 3002399 to 3002400 (election block). The error messages
As requested by @jsdanielh, an example in 0.18.0.
Then 15 minutes later
It is able to accept blocks up to
Full log of this example:
Per @viquezclaudio's reproduction, this slowness is caused by history pruning on election blocks.
We have pushed a temporary solution to this issue that makes the
I'm guessing you are talking about e4bb4a6. You have a typo in your comment: this is for full nodes, not history nodes. Also, does this commit mean that, with this "workaround", full nodes are now history nodes and need much more disk space?
For now, yes. Testnet full nodes will start storing everything from the next testnet restart.
That's a pretty significant change to full node behavior, even if just as a workaround. However, I guess it will not sync the whole history on startup? It just does not delete history? Which means that if, after a while, I stop the node, delete the database, and restart it, the node will not sync back the deleted history, but will start with a small database and only grow with new blocks?
That is correct. It will use the same syncing mechanism as before to get to the head of the chain. But from there on it won't remove old epochs as soon as we hit an election block.
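The behavior described in this exchange can be illustrated with a toy model. This is only a sketch and not the node's code: `ToyHistory`, `stored_epochs`, and the tiny `BLOCKS_PER_EPOCH` value are hypothetical names chosen for illustration, and the pruning policy (drop every epoch older than the one that just closed) is an assumption, since the thread does not spell out exactly what gets removed.

```python
from dataclasses import dataclass, field

BLOCKS_PER_EPOCH = 4  # hypothetical; tiny so the example is easy to trace


def epoch_of(block_number: int) -> int:
    """Epoch index of a block; the election block is the last block of its epoch."""
    return (block_number + BLOCKS_PER_EPOCH - 1) // BLOCKS_PER_EPOCH


def is_election_block(block_number: int) -> bool:
    return block_number > 0 and block_number % BLOCKS_PER_EPOCH == 0


@dataclass
class ToyHistory:
    prune_on_election: bool
    epochs: dict = field(default_factory=dict)  # epoch -> list of block numbers

    def push(self, block_number: int) -> None:
        epoch = epoch_of(block_number)
        self.epochs.setdefault(epoch, []).append(block_number)
        if self.prune_on_election and is_election_block(block_number):
            # Assumed policy: drop every epoch older than the one just closed.
            for old in [e for e in self.epochs if e < epoch]:
                del self.epochs[old]


def stored_epochs(prune: bool, first_block: int, last_block: int) -> list:
    """Epochs left in the store after following the chain from first_block."""
    history = ToyHistory(prune_on_election=prune)
    for n in range(first_block, last_block + 1):
        history.push(n)
    return sorted(history.epochs)
```

With pruning enabled, `stored_epochs(True, 1, 12)` leaves only the latest epoch. With the workaround, `stored_epochs(False, 1, 12)` keeps all three epochs, while a node that starts later, as in `stored_epochs(False, 9, 12)`, only holds what it has seen since then and grows from there.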
Correct, this is just a temporary workaround while we implement the right solution: a special lightweight version of the history store for full nodes that allows us to properly construct and verify the history root without having to store the full transactions.
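To make the idea of a lightweight history store concrete, here is a minimal sketch of one way a root can be maintained without keeping the underlying items. It assumes the history root is a Merkle-style accumulator over the historic transactions; the thread does not describe the actual design, so `LightAccumulator`, `root_of`, the domain-separation bytes, and the way peaks are combined are all hypothetical choices made for this example.

```python
import hashlib


def h(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()


class LightAccumulator:
    """Append-only Merkle-style accumulator that keeps only subtree peaks."""

    def __init__(self) -> None:
        self.peaks = []  # (height, hash) pairs, heights strictly decreasing
        self.count = 0   # number of items appended so far

    def append(self, item: bytes) -> None:
        node = (0, h(b"\x00", item))
        # Merge equal-height peaks, like carrying in binary addition.
        while self.peaks and self.peaks[-1][0] == node[0]:
            height, left = self.peaks.pop()
            node = (height + 1, h(b"\x01", left, node[1]))
        self.peaks.append(node)
        self.count += 1

    def root(self) -> bytes:
        # Fold the peaks right to left, then bind the item count into the root.
        acc = b""
        for _, peak in reversed(self.peaks):
            acc = h(b"\x02", peak, acc)
        return h(self.count.to_bytes(8, "big"), acc)


def root_of(items) -> bytes:
    """Root computed by someone who still has every item (e.g. a history node)."""
    acc = LightAccumulator()
    for item in items:
        acc.append(item)
    return acc.root()
```

In this sketch the store never retains the items themselves: after appending ten items it holds two peak hashes and a counter, yet its `root()` matches `root_of(...)` over the full list, and changing any item changes the root. The number of peaks equals the number of set bits in the item count, so storage grows logarithmically rather than linearly.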
Use the light history store for full node clients
Necessary changes to use the history interface
Several changes to usage of the history store
Fixes #1704
Currently, fully synced full nodes have an issue where, after some event, they lose the ability to store new incoming blocks and therefore can't keep up.
Multiple incidents have been observed, and so far they all show the same pattern just before the node stops storing blocks:
After this, only three log messages are displayed and, interestingly, receiving blocks via gossipsub is one of them. This gives the impression that the connections with other nodes aren't actually dropped but that the node is unable to process the incoming blocks. Deadlock?
Note that the consensus head hangs at #233038 but new gossipsub messages are coming in. This looks similar to #1692.
At some point the node falls so far behind that verification for mempool txs starts to fail
full-node-cant-keep-up.log