Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: TxDownloadManager #30110

Draft
wants to merge 29 commits into
base: master
Choose a base branch
from
Draft

Conversation

glozow
Copy link
Member

@glozow glozow commented May 15, 2024

Part of #27463.

Draft while #30000 is not merged, and because I am still smoothing out the interface, making sure the commits are clean, and writing more tests.

This modularizes transaction download logic into a TxDownloadManager. Transaction download logic refers to the process of deciding what transactions to request, download, and validate.[1]
All of the changes here should result in no behavior change.

There are several benefits to this interface, such as:

  • Unit test coverage and F U Z Z I N G for logic that currently isn't feasible to test as thoroughly (due to the code paths not being reachable through PeerManager without a ton of overhead) and/or currently only lightly tested through assert_debug_log ( 👎 ) in functional tests.
    • See for example the MempoolRejectedTx test, where we can define a matrix of expected outcomes for each TxValidationResult and test them very easily.
    • For fuzzing, see the txdownloadman and txdownload_impl targets.
  • When we add more functionality (e.g. package relay messages, more robust orphan handling), the vast majority of it will be within TxDownloadManager instead of PeerManager, making it easier to review and test.
  • A thinner and easier-to-use interface for PeerManager, which will no longer know anything about / have access to TxOrphanage or rejection caches. Its primary interface with TxDownloadManager would be:
    • Notify txdownloadman of tip changes (i.e. passing on some ValidationInterface callbacks)
    • Tell txdownloadmanwhen a {transaction, package} is {accepted, rejected} from mempool, and receive a few instructions
    • Each time a peer connects or disconnects + sanity check functions during cleanup
    • Pass transaction invs, notfounds to txdownloadman and receive instructions on what to validate
    • Get the getdata requests to send
    • Get the transactions to reconsider in ProcessOrphanTx
    • Get whether a peer HaveMoreWork for the ProessMessages loop
    • ^note that there are some cleanups I still have to do, so it's not this clean yet, but this is the vision. For example, while I'm writing this, I plan to make Find1P1CPackage private to TxDownloadImpl
    • ^in the future, it would also pass package-related messages on, and we would reuse the PackageToValidate struct to return the things peerman needs for validation.
  • (todo) with some more interface changes, we can make this manager internally thread-safe instead of having PeerManager handle it (e.g. currently it locks cs_main to synchronize accesses to m_txrequest!)

[1]: This does not include transaction announcements, i.e. what we send to our peers. All of that happens after we download/accept a tx. Txreconciliation (erlay) is excluded from this module, as it is part of announcements and/or part of helping the other peer decide which invs to send.

@DrahtBot
Copy link
Contributor

DrahtBot commented May 15, 2024

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.
A summary of reviews will appear here.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #30214 (refactor: Improve assumeutxo state representation by ryanofsky)
  • #30186 (fuzz: increase txorphan harness stability by marcofleon)
  • #30116 (p2p: Fill reconciliation sets (Erlay) attempt 2 by sr-gi)
  • #30111 (locks: introduce mutex for tx download, flush rejection filters on UpdatedBlockTip by glozow)
  • #30058 (Encapsulate warnings in generalized node::Warnings and remove globals by stickies-v)
  • #29641 (scripted-diff: Use LogInfo/LogDebug over LogPrintf/LogPrint by maflcko)
  • #29492 (refactor: Remove redundant definitions by Empact)
  • #29415 (Broadcast own transactions only via short-lived Tor or I2P connections by vasild)
  • #27052 (test: rpc: add last block announcement time to getpeerinfo result by LarryRuane)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@DrahtBot
Copy link
Contributor

🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.

Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.

Leave a comment here, if you need help tracking down a confusing failure.

Debug: https://github.com/bitcoin/bitcoin/runs/24992798939

We need to synchronize between various tx download structures.
TxRequest does not inherently need cs_main for synchronization, and it's
not appropriate to lock all of the tx download logic under cs_main.
This is a synchronous version of UpdatedBlockTip.

It allows clients to respond to a new block immediately after it is
connected. The synchronicity is important for things like
m_recent_rejects, in which a transaction's validity can change (rejected
vs accepted) when this event is processed (e.g. it has a timelock
condition that has just been met). This is distinct from something like
m_recent_confirmed_transactions in which the validation outcome is the
same (valid vs already-have).
Resetting m_recent_rejects once per block is more efficient than
comparing hashRecentRejectsChainTip with the chain tip every time we
call AlreadyHaveTx. We keep hashRecentRejectsChainTip for now to assert
that updates happen correctly; it is removed in the next commit.
This also means AlreadyHaveTx no longer needs cs_main held.
The lock doesn't exist anymore.

-BEGIN VERIFY SCRIPT-
sed -i 's/EraseTxNoLock/EraseTxInternal/g' src/txorphanage.*
-END VERIFY SCRIPT-))'
@glozow
Copy link
Member Author

glozow commented May 20, 2024

Rebased for #29817 and added a "ensure we can always download a tx as long as we have 1 good outbound peer" fuzz test

static constexpr auto OVERLOADED_PEER_TX_DELAY{2s};
/** How long to wait before downloading a transaction from an additional peer */
static constexpr auto GETDATA_TX_INTERVAL{60s};
struct TxDownloadOptions {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Structs that are part of the public interface should probably go into txdownloadman.h. Otherwise one needs to include the impl header, e.g. like txdownloadman.h does right now, which defeats the purpose of pimpling to some extend.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, in the previous setup, we can change any of txdownload_impl.cpp without net_processing.cpp noticing, but yeah maybe less clean to depend on txdownload_impl.h.

I've updated with a new setup where I've added a txdownloadman.cpp depending on txdownload_impl.h, and txdownloadman.h is included by txdownload_impl.h instead of the other way around. And so net_processing.cpp no longer depends on txdownload_impl.h, which makes sense to me. Had to update lint-circular-dependencies.py but I do like that we can change txdownload_impl.h without recompiling everything. lmk if this seems better?


bool TxDownloadImpl::AlreadyHaveTx(const GenTxid& gtxid, bool include_reconsiderable)
{
const uint256& hash = gtxid.GetHash();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Iiuc correctly, clearing the filters here has been removed as we don't want TxDownloadManager to depend on ChainstateManager.

An alternative to the new synchronous validation interface callback could be to preserve the previous behavior (clearing the filters in AlreadyHaveTx on a tip change) by introducing an interface for TxDownloadMan to fetch the latest tip hash. For example, TxDownloadOptions could be extended to hold a lambda that returns the latest tip hash, which in production simply fetches the hash from the ChainstateManager (and for tests it can be mocked). This might be easier to review since it preserves the previous mechanism? Although perhaps also more difficult w.r.t to lock inversion of cs_main and m_tx_download_mutex.

Another alternative could be to directly pass the current block hash as a param into the relevant TxDownloadManager methods.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we'd ever need to make TxDownloadManager depend on ChainstateManager as we could just pass in the block hash, i.e.

Another alternative could be to directly pass the current block hash as a param into the relevant TxDownloadManager methods.

This would avoid the addition of UpdateBlockTipSync etc, as it's basically what we do now. I find this way of keeping synchronized with the chain tip pretty ugly - we'd need to hold cs_main and pass the blockhash to the TxDownloadManager all the time, bloating the interface. Subscribing to the validation interface and updating on every chain tip update seems much more sensible.

…er, and bloom filters

This module is going to be responsible for managing everything related
to transaction download, including txrequest, orphan transactions and
package relay. It will be responsible for managing usage of the
TxOrphanage and instructing PeerManager:
- what tx or package-related messages to send to which peer
- whether a tx or package-related message is allowed or useful
- what transactions are available to try accepting to mempool

Future commits will consolidate the interface and re-delegate
interactions from PeerManager to TxDownloadManager.
glozow added 19 commits May 21, 2024 13:39
The information stored in TxDownloadConnectionInfo isn't used until the
next commit.
The txdownload_impl is similar but allows us to check specific
invariants within its implementation. It will also change a lot more
than the external interface (txdownloadman) will, so we will add more to
this target later.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants