New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
blockstorage: Separate reindexing from saving new blocks #29975
base: master
Are you sure you want to change the base?
blockstorage: Separate reindexing from saving new blocks #29975
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code CoverageFor detailed information about the code coverage, see the test coverage report. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
761ed1a
to
39ad8d8
Compare
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the Possibly this is due to a silent merge conflict (the changes in this pull request being Leave a comment here, if you need help tracking down a confusing failure. |
Concept ACK |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recreated the change to understand it better, commented on what I've noticed.
src/test/blockmanager_tests.cpp
Outdated
@@ -189,7 +189,7 @@ BOOST_AUTO_TEST_CASE(blockmanager_flush_block_file) | |||
// to block 2 location. | |||
CBlockFileInfo* block_data = blockman.GetBlockFileInfo(0); | |||
BOOST_CHECK_EQUAL(block_data->nBlocks, 2); | |||
BOOST_CHECK(blockman.SaveBlockToDisk(block3, /*nHeight=*/3, /*dbp=*/&pos2) == pos2); | |||
blockman.AddToBlockFileInfo(block3, /*nHeight=*/3, /*pos=*/pos2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems to me the related comments needs to be updated in this file (e.g line 184 and 199)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated some comments. The entire test setup probably makes a bit less sense after the refactor, users unfamiliar
with the history might ask themselves why someone could think that Reindex
/ AddToBlockFileInfo
would change the block files so that we'd require a test making sure it doesn't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK
39ad8d8
to
194e84a
Compare
39ad8d8 to 194e84a: Addressed review feedback by @paplorinc, thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review ACK 194e84a. I left a lot of comments, but everything looks right here and the code is a lot nicer than before.
return false; | ||
FlatFilePos blockPos{}; | ||
if (dbp) { | ||
blockPos = *dbp; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "validation, blockstorage: Separate code paths for reindex and saving new blocks" (a17eaca)
It looks like previously there would have been an error here if dbp->IsNull()
was true, and now there will not be an error. This is probably a good change, since AcceptBlock should not be looking at block positions, just passing them on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, in the reindex case; In this case the passed dbp isn't changed (it's now a const arg to AddToBlockFileInfo
). If a dpb was passed to AcceptBlock for which dbp->IsNull()
, the error message ("Failed to find position to write new block to disk") would have been very confusing anyway, because we don't write a block to disk during reindex anyway.
src/node/blockstorage.h
Outdated
* @param[in] pos the position of the block on disk. This must point *after* the | ||
* 8 byte serialization header, at the beginning of the actual block data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "blockstorage: split up FindBlockPos function" (a2ae0a3)
I think this description is technically accurate, but I got confused by it and thought it was wrong because "the 8 byte serialization header" sounds like something that is part of CBlock
serialization, when actually it is referring to separator fields written by WriteBlockToDisk
before the serialized CBlock
.
Would suggest changing comment to "pos: the position of the serialized CBlock on disk. This is the position returned by WriteBlockToDisk pointing at the CBlock, not the separator fields before it."
I would also suggesting adding two more comments to this commit to make it clear what it happening at this stage of the PR.
In WriteBlockToDisk
documentation, "// The pos argument passed to this function is modified by this call. Before this call, it should point to an unused file location where separator fields will be written followed by the serialized CBlock data. After this call, it will point to the beginning of the serialized CBlock data, after the separator fields"
In FindBlockPos
documentation, "// The nAddSize argument passed to this function should include not just the size of the serialized CBlock, but the also size of separator fields which are written before it by WriteBlockToDisk (BLOCK_SERIALIZATION_HEADER_SIZE).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have done that, but bundled most of the doc changes to a doc-only commit at the beginning in order to not have unrelated things in the "blockstorage: split up FindBlockPos function" commit.
With 194e84a, since reindexing regenerates undo data, and undo data shouldn't be added until all existing blocks are, it seems like there is no reason for the --- a/src/node/blockstorage.cpp
+++ b/src/node/blockstorage.cpp
@@ -941,22 +941,11 @@ bool BlockManager::FindBlockPos(FlatFilePos& pos, unsigned int nAddSize, unsigne
void BlockManager::AddToBlockFileInfo(const CBlock& block, unsigned int nHeight, const FlatFilePos& pos)
{
LOCK(cs_LastBlockFile);
-
const unsigned int added_size = ::GetSerializeSize(TX_WITH_WITNESS(block));
- const BlockfileType chain_type = BlockfileTypeForHeight(nHeight);
- // -reindex and -reindex-chainstate delete any snapshot chainstate during init
- Assume(chain_type == BlockfileType::NORMAL);
-
const int nFile = pos.nFile;
if (static_cast<int>(m_blockfile_info.size()) <= nFile) {
m_blockfile_info.resize(nFile + 1);
}
-
- const int last_blockfile = m_blockfile_cursors[chain_type]->file_num;
- if (nFile != last_blockfile) {
- // No undo data yet in the new file, so reset our undo-height tracking.
- m_blockfile_cursors[chain_type] = BlockfileCursor{nFile};
- }
m_blockfile_info[nFile].AddBlock(nHeight, block.GetBlockTime());
m_blockfile_info[nFile].nSize = std::max(pos.nPos + added_size, m_blockfile_info[nFile].nSize);
m_dirty_fileinfo.insert(nFile); |
fKnown is true during reindex (and only then), which deletes any existing snapshot chainstate. As a result, this function can never be called wth fKnown set and a snapshot chainstate. Add an Assume for this, and make the code initializing a blockfile cursor for the snapshot conditional on !fKnown. This is a preparation for splitting the reindex logic out of FindBlockPos in the following commits.
194e84a
to
6a22eed
Compare
Thanks for the detailed review @ryanofsky! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review ACK 6a22eed. Just some suggested changes were made since the last review: adding more Assume checks, renaming a function, and moving a declaration. Everything still looks good.
src/node/blockstorage.cpp
Outdated
|
||
const int last_blockfile = m_blockfile_cursors[chain_type]->file_num; | ||
if (nFile != last_blockfile) { | ||
// No undo data yet in the new file, so reset our undo-height tracking. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "blockstorage: split up FindBlockPos function" (2b6d274)
I think this comment is not really accurate in this context. There should be no need to reset BlockfileCursor::undo_height
here, only to set a new BlockfileCursor::file_num
value so MaxBlockfileNum()
returns the right thing. So the comment could be changed to something like "update the cursor so it points to the last file".
Maybe it would also make sense to make this a little more robust so doesn't move the cursor backwards if UpdateBlockInfo
calls are made out of order. We are already doing this for blocks within the file by using std::max to set the file size below. Generalizing this would also allow dropping the Assume()
checks:
--- a/src/node/blockstorage.cpp
+++ b/src/node/blockstorage.cpp
@@ -942,24 +942,19 @@ void BlockManager::UpdateBlockInfo(const CBlock& block, unsigned int nHeight, co
{
LOCK(cs_LastBlockFile);
+ // Update the cursor so it points to the last file.
+ const BlockfileType chain_type{BlockfileTypeForHeight(nHeight)};
+ auto& cursor{m_blockfile_cursors[chain_type]};
+ if (!cursor || cursor->file_num < pos.nFile) {
+ m_blockfile_cursors[chain_type] = BlockfileCursor{pos.nFile};
+ }
+
+ // Update the file information so it points to the last block.
const unsigned int added_size = ::GetSerializeSize(TX_WITH_WITNESS(block));
- const BlockfileType chain_type = BlockfileTypeForHeight(nHeight);
- // Check that chain type is NORMAL, because this function is only
- // called during reindexing, and reindexing deletes snapshot chainstates, so
- // chain_type will not be SNAPSHOT. Also check that cursor exists, because
- // the normal cursor should never be null.
- Assume(chain_type == BlockfileType::NORMAL);
- Assume(m_blockfile_cursors[chain_type]);
const int nFile = pos.nFile;
if (static_cast<int>(m_blockfile_info.size()) <= nFile) {
m_blockfile_info.resize(nFile + 1);
}
-
- const int last_blockfile = m_blockfile_cursors[chain_type]->file_num;
- if (nFile != last_blockfile) {
- // No undo data yet in the new file, so reset our undo-height tracking.
- m_blockfile_cursors[chain_type] = BlockfileCursor{nFile};
- }
m_blockfile_info[nFile].AddBlock(nHeight, block.GetBlockTime());
m_blockfile_info[nFile].nSize = std::max(pos.nPos + added_size, m_blockfile_info[nFile].nSize);
m_dirty_fileinfo.insert(nFile);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current logic seems really brittle to me, I suspected that there might be a bug hiding somewhere, but it barely works out in all scenarios I could think of.
With blocks being save on disk out of order, it is possible that at the end of a reindex, the cursor points to an older block file. Yet, it appears that nothing really bad happens as a result: When a new blocks arrives, FindBlockPos
will still find the correct position in this while loop, skipping ahead.
In a similar way, MaxBlockfileNum()
being incorrect can result in an incorrect DB_LAST_BLOCK
being written to disk on shutdown. However, we also recover from that because at next startup, LoadBlockIndexDB
uses DB_LAST_BLOCK
basically only for logging but doesn't trust it's actually pointing to the last block, searching for more db entries here, so only the log messages on startup will be wrong.
This seems really fragile to me, so I think that your suggestion is a good idea, and I'll add a commit for it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added this with the latest push. Changed the // Update the file information so it points to the last block.
comment slightly because the file information doesn't really point to anything.
In particular, document the flat file positions expected and returned by functions better. Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
FindBlockPos does different things depending on whether the block is known or not, as shown by the fact that much of the existing code is conditional on fKnown set or not. If the block position is known (during reindex) the function only updates the block info statistics. It doesn't actually find a block position in this case. This commit removes fKnown and splits up these two code paths by introducing a separate function for the reindex case when the block position is known. It doesn't change behavior.
…new blocks By calling SaveBlockToDisk only when we actually want to save a new block to disk. In the reindex case, we now call UpdateBlockInfo directly from validation. This commit doesn't change behavior.
6a22eed
to
ff2ef7d
Compare
The new name reflects that it is no longer called with existing blocks for which the position is already known. Returning a FlatFilePos directly simplifies the interface.
Previously, it was possible to move the cursor back to an older file during reindex if blocks are enocuntered out of order during reindex. This would means that MaxBlockfileNum() would be incorrect, and a wrong DB_LAST_BLOCK could be written to disk. This improves the logic by only ever moving the cursor forward (if possible) but not backwards. Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
ff2ef7d
to
9cf475f
Compare
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the Possibly this is due to a silent merge conflict (the changes in this pull request being Leave a comment here, if you need help tracking down a confusing failure. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 9cf475f
* block file depending on nAddSize. May flush the previous blockfile to disk if full, updates | ||
* blockfile info, and checks if there is enough disk space to save the block. | ||
* | ||
* The nAddSize argument passed to this function should include not just the size of the serialized CBlock, but the also size of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: s/the also/also the/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, only left a few nit comments
* @returns in case of success, the position to which the block was written to | ||
* in case of an error, an empty FlatFilePos | ||
*/ | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: extra line
@@ -164,6 +168,12 @@ class BlockManager | |||
|
|||
AutoFile OpenUndoFile(const FlatFilePos& pos, bool fReadOnly = false) const; | |||
|
|||
/** | |||
* Write a block to disk. The pos argument passed to this function is modified by this call. Before this call, it should | |||
* point to an unused file location where separator fields will be written followed by the serialized CBlock data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
* point to an unused file location where separator fields will be written followed by the serialized CBlock data. | |
* point to an unused file location where separator fields will be written, followed by the serialized CBlock data. |
src/node/blockstorage.h
Outdated
@@ -312,7 +322,17 @@ class BlockManager | |||
bool WriteUndoDataForBlock(const CBlockUndo& blockundo, BlockValidationState& state, CBlockIndex& block) | |||
EXCLUSIVE_LOCKS_REQUIRED(::cs_main); | |||
|
|||
/** Store block on disk. If dbp is not nullptr, then it provides the known position of the block within a block file on disk. */ | |||
/** Store block on disk and update block file statistics. | |||
* During reindex, only the block file statistics is updated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
* During reindex, only the block file statistics is updated. | |
* During reindex, only the block file statistics are updated. |
return FlatFilePos(); | ||
} | ||
// Must also account for the serialization header | ||
// (the 4 magic message start bytes + the 4 length bytes = 8 bytes = BLOCK_SERIALIZATION_HEADER_SIZE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
// (the 4 magic message start bytes + the 4 length bytes = 8 bytes = BLOCK_SERIALIZATION_HEADER_SIZE) | |
// Account for the 4 magic message start bytes + the 4 length bytes (8 bytes total, defined as BLOCK_SERIALIZATION_HEADER_SIZE). | |
* The nAddSize argument passed to this function should include not just the size of the serialized CBlock, but the also size of | ||
* separator fields which are written before it by WriteBlockToDisk (BLOCK_SERIALIZATION_HEADER_SIZE). | ||
*/ | ||
[[nodiscard]] bool FindBlockPos(FlatFilePos& pos, unsigned int nAddSize, unsigned int nHeight, uint64_t nTime); | ||
[[nodiscard]] FlatFilePos FindNextBlockPos(unsigned int nAddSize, unsigned int nHeight, uint64_t nTime); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does [[nodiscard]]
still make sense now?
@@ -848,7 +848,7 @@ fs::path BlockManager::GetBlockPosFilename(const FlatFilePos& pos) const | |||
return BlockFileSeq().FileName(pos); | |||
} | |||
|
|||
bool BlockManager::FindBlockPos(FlatFilePos& pos, unsigned int nAddSize, unsigned int nHeight, uint64_t nTime) | |||
FlatFilePos BlockManager::FindNextBlockPos(unsigned int nAddSize, unsigned int nHeight, uint64_t nTime) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
@@ -943,24 +943,19 @@ void BlockManager::UpdateBlockInfo(const CBlock& block, unsigned int nHeight, co | |||
{ | |||
LOCK(cs_LastBlockFile); | |||
|
|||
// Update the cursor so it points to the last file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit (in commit message):
This would means that MaxBlockfileNum
vs
This would mean that MaxBlockfileNum
SaveBlockToDisk
/FindBlockPos
are used for two purposes, depending on whether they are called during reindexing (dbp
set,fKnown = true
) or in the "normal" case when adding new blocks (dbp == nullptr
,fKnown = false
).The actual tasks are quite different
Using one function with many conditional statements for this leads to code that is hard to read / understand and bug-prone:
FindBlockPos
are conditional onfKnown
or!fKnown
#24858 and #27039 were recent bugs directly related to the differences between reindexing and normal mode, and in both cases the simple fix took a long time to be reviewed and merged.
This PR proposes to clean this code up by splitting out the reindex logic into a separate function (
AddToBlockFileInfo
) which will be called directly from validation. As a result,SaveBlockToDisk
andFindBlockPos
only need to cover the non-reindex logic.