Improve SPV Syncer startup loop #2289

Open
matheusd opened this issue Oct 19, 2023 · 0 comments

matheusd commented Oct 19, 2023

During IBD of a new SPV wallet, the same set of headers and GCS filters (cfilters) is fetched from every connected peer. This is less than ideal from a performance standpoint: it wastes bandwidth, memory, and CPU, and triggers a large number of concurrent goroutines (2000+ per peer).

Here are two sample CPU profiles from a mainnet wallet: a regular one with no peer restriction (between 6 and 8 peers being used to perform IBD), and one using SPV connect to restrict the wallet to a single peer:

$ pprof -top regular/cpu.pprof | head
File: dcrwallet
Type: cpu
Time: Oct 19, 2023 at 11:06am (-03)
Duration: 14.79mins, Total samples = 737.47s (83.12%)
Showing nodes accounting for 600.61s, 81.44% of 737.47s total
Dropped 1028 nodes (cum <= 3.69s)
      flat  flat%   sum%        cum   cum%
    57.23s  7.76%  7.76%     58.27s  7.90%  github.com/decred/dcrd/gcs/v4.(*bitReader).readNBits
    45.90s  6.22% 13.98%     45.90s  6.22%  runtime/internal/syscall.Syscall6
    40.02s  5.43% 19.41%     40.20s  5.45%  github.com/decred/dcrd/crypto/blake256.block


$ pprof -top spvconnect/cpu.pprof | head
File: dcrwallet
Type: cpu
Time: Oct 19, 2023 at 11:29am (-03)
Duration: 6.34mins, Total samples = 445.08s (117.06%)
Showing nodes accounting for 381.27s, 85.66% of 445.08s total
Dropped 938 nodes (cum <= 2.23s)
      flat  flat%   sum%        cum   cum%
    56.36s 12.66% 12.66%     57.15s 12.84%  github.com/decred/dcrd/gcs/v4.(*bitReader).readNBits
    34.36s  7.72% 20.38%     34.36s  7.72%  github.com/dchest/siphash.Hash
    30.01s  6.74% 27.13%     30.01s  6.74%  runtime/internal/syscall.Syscall6

Going by the total Duration of each profile (14.79 minutes with multiple peers vs. 6.34 minutes with a single peer), reducing the duplication during the initial sync can potentially halve the time IBD takes.

My strategy for improving this is to:

1- Refactor the FetchMissingCfilters logic to make it syncer-based instead of peer-based #2291

  • This will allow the SPV syncer to request cfilters from multiple peers without duplication during startup (see the sketch below).
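
To illustrate the idea, here is a minimal Go sketch of a syncer-driven cfilter fetch that splits the missing range into chunks and assigns each chunk to a different peer, so nothing is downloaded twice. The Peer type, the fetchCFiltersRange method, and the chunk size are hypothetical stand-ins, not dcrwallet's actual API:

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// Peer is a hypothetical stand-in for an SPV remote peer.
type Peer struct{ addr string }

// fetchCFiltersRange stands in for a call that requests the cfilters
// for heights [start, end] from a single peer.
func (p *Peer) fetchCFiltersRange(ctx context.Context, start, end int32) error {
	fmt.Printf("peer %s: fetching cfilters %d-%d\n", p.addr, start, end)
	return nil
}

// fetchMissingCFilters splits the missing range into chunks and hands each
// chunk to a different peer, so no cfilter is downloaded more than once.
func fetchMissingCFilters(ctx context.Context, peers []*Peer, start, end int32) error {
	const chunk = 2000 // cfilters per batch; illustrative, not the real value
	var wg sync.WaitGroup
	errc := make(chan error, int((end-start)/chunk)+1)
	for i, next := 0, start; next <= end; i++ {
		p := peers[i%len(peers)]
		s, e := next, next+chunk-1
		if e > end {
			e = end
		}
		next = e + 1
		wg.Add(1)
		go func() {
			defer wg.Done()
			if err := p.fetchCFiltersRange(ctx, s, e); err != nil {
				errc <- err
			}
		}()
	}
	wg.Wait()
	close(errc)
	return <-errc // nil if no goroutine reported an error
}

func main() {
	peers := []*Peer{{addr: "a"}, {addr: "b"}, {addr: "c"}}
	if err := fetchMissingCFilters(context.Background(), peers, 0, 9999); err != nil {
		fmt.Println("sync error:", err)
	}
}
```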

2- Dedupe header logic

  • getHeaders and handleBlockAnnouncements share a lot of logic, so it may be useful to dedupe/refactor them into a common helper (a sketch follows this list).
  • The minimum peer protocol version already implies sendHeaders support, so we can drop block inv announcements and rely only on the headers logic (we should never receive block announcements via inv messages).
  • Peer init in connectToCandidates and connectToPersistent may also be somewhat deduplicated.
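
As a rough illustration of the dedupe, both the initial getheaders path and the steady-state announcement path could funnel into one header-processing helper. Everything below (the Header type, the processHeaders func) is hypothetical, just to show the shape of the shared logic:

```go
package headers

import "errors"

// Header is a simplified stand-in for a block header.
type Header struct {
	Height   int32
	PrevHash [32]byte
	Hash     [32]byte
}

var errDisconnected = errors.New("headers do not connect to known chain")

// processHeaders validates that a batch of headers connects to and extends
// the current tip. Both the getHeaders sync path and the block announcement
// path would call this single helper instead of carrying duplicate logic.
func processHeaders(tip *Header, batch []Header) (*Header, error) {
	for i := range batch {
		h := &batch[i]
		if h.PrevHash != tip.Hash {
			return tip, errDisconnected
		}
		tip = h
	}
	return tip, nil
}
```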

3- Decouple header and cfilter fetching

  • The previously mentioned FetchMissingCfilters refactor already gives the wallet the ability to track headers and cfilters at different sync heights.
  • This means we can decouple syncing, so that headers and cfilters may be synced at different speeds and from different peers (see the sketch below).
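
A minimal sketch of what the decoupling could look like: two independent loops tracking separate heights, where the cfilter loop never runs ahead of the header tip. All names are illustrative; the sleeps simulate network latency and the plain assignments stand in for the actual fetch calls:

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

func main() {
	var headerHeight atomic.Int32
	const target = 10000

	// Header loop: advances the header tip as fast as its peer allows.
	go func() {
		for h := int32(0); h <= target; h += 2000 {
			headerHeight.Store(h) // stand-in for fetching+committing headers
			time.Sleep(10 * time.Millisecond)
		}
	}()

	// CFilter loop: independently catches up to the header tip, potentially
	// at a different speed and from different peers.
	cfHeight := int32(0)
	for cfHeight < target {
		if tip := headerHeight.Load(); cfHeight < tip {
			cfHeight = tip // stand-in for fetching cfilters (cfHeight, tip]
			fmt.Printf("cfilters synced to %d\n", cfHeight)
		}
		time.Sleep(5 * time.Millisecond)
	}
}
```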

4- Add an async GetCFilterV2 version

  • One of the limitations of the P2P protocol is that only a single cfilter may be requested per message.
  • This leads to the current design, where up to 2k goroutines are spun up for each IBD peer and header range.
  • Adding an async version of GetCFilterV2 to the peering package would remove the need for so many goroutines.
  • The tradeoff is that syncing becomes more complex, because the syncer has to handle out-of-order and missing cfilters (see the sketch below).
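
Here is a minimal sketch of what the reordering could look like on the caller side: requests for a height range are pipelined, and responses are buffered in a map until they can be delivered in order. The cfilterResult type and the response channel stand in for the hypothetical async peering API; handling of missing filters (timeouts and re-requests) is omitted:

```go
package main

import "fmt"

// cfilterResult is a hypothetical response from an async GetCFilterV2 call.
type cfilterResult struct {
	height int32
	filter []byte
}

// orderCFilters consumes responses for heights [start, end] and returns them
// in height order, even if the peer answers out of order.
func orderCFilters(start, end int32, responses <-chan cfilterResult) []cfilterResult {
	pending := make(map[int32]cfilterResult)
	ordered := make([]cfilterResult, 0, int(end-start+1))
	next := start
	for r := range responses {
		pending[r.height] = r
		// Drain everything that is now deliverable in order.
		for {
			got, ok := pending[next]
			if !ok {
				break
			}
			delete(pending, next)
			ordered = append(ordered, got)
			next++
		}
		if next > end {
			break
		}
	}
	return ordered
}

func main() {
	responses := make(chan cfilterResult, 4)
	// Simulate out-of-order delivery from the peer.
	for _, h := range []int32{2, 0, 3, 1} {
		responses <- cfilterResult{height: h}
	}
	for _, r := range orderCFilters(0, 3, responses) {
		fmt.Println("cfilter for height", r.height)
	}
}
```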

5- Abstract syncing from peers to the syncer #2291

  • Instead of having each peer perform header sync during its startup procedure, abstract the logic into the syncer.
  • This means a central syncer func will fetch headers in sequence, as needed, from a selected peer instead of from all peers (see the sketch below).
  • Rescanning and address discovery logic should likewise be abstracted away from individual peers and support fetching blocks from multiple peers.
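
A minimal sketch of the central, syncer-owned header loop, assuming a hypothetical Peer interface; the real implementation would also validate and commit the received headers:

```go
package syncer

import (
	"context"
	"errors"
)

// Peer is a hypothetical interface over a remote peer.
type Peer interface {
	// GetHeaders stands in for a getheaders round trip; it returns the
	// number of headers received past the current tip.
	GetHeaders(ctx context.Context, tipHeight int32) (int, error)
}

// syncHeaders drives the initial header sync from a single selected peer,
// falling back to the next candidate if the current one fails.
func syncHeaders(ctx context.Context, peers []Peer, tipHeight int32) (int32, error) {
	for _, p := range peers {
		for {
			n, err := p.GetHeaders(ctx, tipHeight)
			if err != nil {
				break // try the next peer
			}
			if n == 0 {
				return tipHeight, nil // caught up to the peer's tip
			}
			tipHeight += int32(n)
		}
	}
	return tipHeight, errors.New("no peer could complete header sync")
}
```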

6- Add perf tracking and peer rotation

  • One of the challenges of syncing from a single peer is that it may be slow compared to others.
  • The syncer logic should periodically rotate peers, dropping peers that are behind the best known chain and checking whether other peers have better performance statistics.
  • The syncer should be biased toward the best-performing peers, with occasional peer rotation for exploration (see the sketch below).
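
One simple way to implement this bias is an epsilon-greedy selection over a throughput score, as in the sketch below; the peerStats type and its fields are illustrative assumptions, not existing dcrwallet code:

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// peerStats is a hypothetical per-peer performance record.
type peerStats struct {
	addr       string
	bytesRecv  int64
	activeTime time.Duration
}

// throughput is the scoring metric: bytes per second over the peer's
// measured active time.
func (p *peerStats) throughput() float64 {
	secs := p.activeTime.Seconds()
	if secs == 0 {
		return 0
	}
	return float64(p.bytesRecv) / secs
}

// pickPeer usually returns the best-performing peer, but with probability
// epsilon picks a random one, so slow-looking peers get re-measured and
// newly connected peers get a chance.
func pickPeer(peers []*peerStats, epsilon float64) *peerStats {
	if rand.Float64() < epsilon {
		return peers[rand.Intn(len(peers))]
	}
	best := peers[0]
	for _, p := range peers[1:] {
		if p.throughput() > best.throughput() {
			best = p
		}
	}
	return best
}

func main() {
	peers := []*peerStats{
		{addr: "a", bytesRecv: 1 << 20, activeTime: time.Second},
		{addr: "b", bytesRecv: 4 << 20, activeTime: time.Second},
	}
	fmt.Println("selected:", pickPeer(peers, 0.1).addr)
}
```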

The above is the rough plan, but it is subject to change depending on how easy it is to keep each step's refactoring reasonably small.
