
test: improve BDB parser (handle internal/overflow pages, support all page sizes) #30125

Open · wants to merge 2 commits into master
Conversation

theStack (Contributor) commented May 16, 2024

This PR adds missing features to our test framework's BDB parser, with the goal of being able to read all legacy wallets created with current and past versions of Bitcoin Core. This could be useful both for making review of #26606 easier and for improving our functional tests for the wallet BDB-ro parser by additionally validating it against an alternative implementation. The second commit introduces a test that creates a legacy wallet with huge label strings (in order to create overflow pages, i.e. pages needed for key/value data that is larger than the page size) and compares the dump outputs of the wallet tool and the extended test framework BDB parser.
It can be exercised via $ ./test/functional/tool_wallet.py --legacy. BDB support has to be compiled in (obviously).

For some manual tests regarding different page sizes, the following patch can be used:

diff --git a/src/wallet/bdb.cpp b/src/wallet/bdb.cpp
index 38cca32f80..1bf39323d3 100644
--- a/src/wallet/bdb.cpp
+++ b/src/wallet/bdb.cpp
@@ -395,6 +395,7 @@ void BerkeleyDatabase::Open()
                             DB_BTREE,                                 // Database type
                             nFlags,                                   // Flags
                             0);
+            pdb_temp->set_pagesize(1<<9); /* valid BDB pagesizes are from 1<<9 (=512) to 1<<16 (=65536) */
 
             if (ret != 0) {
                 throw std::runtime_error(strprintf("BerkeleyDatabase: Error %d, can't open database %s", ret, strFile));

I verified that the newly introduced test passes with all valid page sizes between 512 and 65536.
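To make the overflow-page mechanism concrete, here is an illustrative sketch (not code from the PR): it estimates how many overflow pages a value of a given length occupies, under the assumption that each overflow page carries `pagesize - PAGE_HEADER_SIZE` bytes of payload. The 26-byte page header matches the `PAGE_HEADER_SIZE` used in the parser snippets below; the exact per-page payload capacity is an assumption for illustration.

```python
# Rough estimate only: how many BDB overflow pages a key/value payload of
# value_len bytes needs, assuming each overflow page can hold
# (pagesize - PAGE_HEADER_SIZE) bytes of data.
PAGE_HEADER_SIZE = 26  # BDB pages start with a 26-byte header

def overflow_pages_needed(value_len: int, pagesize: int) -> int:
    payload_per_page = pagesize - PAGE_HEADER_SIZE
    return -(-value_len // payload_per_page)  # ceiling division

# A 100 kB label needs far more pages at the minimum page size (512)
# than at the maximum (65536), which is why the test sweeps page sizes.
for pagesize in (512, 65536):
    print(pagesize, overflow_pages_needed(100_000, pagesize))
```

This also shows why testing only the default page size would leave the overflow code paths mostly unexercised at large page sizes.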

DrahtBot (Contributor) commented May 16, 2024

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type: Concept ACK · Reviewers: laanwj, BrandonOdiwuor

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

No conflicts as of last run.

@DrahtBot DrahtBot added the Tests label May 16, 2024
laanwj (Member) commented May 16, 2024

Code review ACK; from what I learned about how BDB databases work internally, the implementation looks correct at first glance.

test: improve BDB parser (handle internal/overflow pages, support all page sizes)

This aims to complete our test framework BDB parser to reflect
our read-only BDB parser in the wallet codebase. This could be
useful both for making review of bitcoin#26606 easier and to also possibly
improve our functional tests for the BDB parser by comparing with
an alternative implementation.
@theStack theStack force-pushed the complete_bdb-ro_python_parser2 branch from d166e88 to 361c3b0 Compare May 21, 2024 17:17
theStack (Contributor, Author) commented:

Rebased on master (necessary since #26606 was merged and also touched the same functional test file).

@theStack theStack force-pushed the complete_bdb-ro_python_parser2 branch from 361c3b0 to ad0a8ff Compare May 22, 2024 09:58
BrandonOdiwuor (Contributor) left a comment:

Concept ACK

rkrux left a comment:

Incomplete review, untested as yet; I left a couple of comments off the top of my head. Will do another thorough round in the next few days.

entry['data'] = data[offset + 3:offset + 3 + e_len]
record_header = data[offset:offset + 3]
offset += 3
e_len, record_type = struct.unpack('HB', record_header)
There's a separate effort from @maflcko to remove the usage of struct.pack from all places in tests. Here it's struct.unpack rather than struct.pack, but I guess the reasoning for removing struct.pack is still applicable:
#29401

A Member replied:

There is a small benefit to struct.pack: it is less code when more than one item is packed or unpacked, so either option seems fine in this case, up to the author. In most other cases, however, where only a single item is (un)packed, the modern alternatives are less code and clearer.
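The two styles under discussion can be sketched side by side on a sample 3-byte record header (a 2-byte length followed by a 1-byte record type). Note the sketch pins little-endian layout with the `'<'` prefix for determinism, whereas the snippet under review uses native byte order.

```python
import struct

record_header = bytes([0x34, 0x12, 0x01])  # sample: length 0x1234, type 0x01

# one struct.unpack call decodes both fields at once ('<HB': LE u16 + u8)
e_len, record_type = struct.unpack('<HB', record_header)

# the "modern alternative": int.from_bytes / direct indexing, one per field
e_len_alt = int.from_bytes(record_header[0:2], 'little')
record_type_alt = record_header[2]

assert (e_len, record_type) == (e_len_alt, record_type_alt) == (0x1234, 0x01)
```

With two fields the single unpack call is about as short as the alternative; with one field, `int.from_bytes` avoids the format-string mini-language entirely.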

# Determine pagesize first
data = f.read(PAGE_HEADER_SIZE)
pagesize = struct.unpack('I', data[20:24])[0]
assert pagesize in (512, 1024, 2048, 4096, 8192, 16384, 32768, 65536)
WDYT about creating a constant tuple for these values? They seem generic enough.
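The suggestion sketched out (the constant name is hypothetical, not from the PR): the valid BDB page sizes are exactly the powers of two from 1<<9 to 1<<16, so the tuple can be generated rather than spelled out.

```python
# Hypothetical module-level constant for valid BDB page sizes,
# i.e. the powers of two from 1 << 9 (512) up to 1 << 16 (65536).
VALID_PAGESIZES = tuple(1 << exp for exp in range(9, 17))

# equivalent to the literal tuple in the assert under review
assert VALID_PAGESIZES == (512, 1024, 2048, 4096, 8192, 16384, 32768, 65536)
```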


6 participants