Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimizations related to inial scan of balances #4999

Open
rasom opened this issue Mar 29, 2024 · 0 comments
Open

Optimizations related to inial scan of balances #4999

rasom opened this issue Mar 29, 2024 · 0 comments

Comments

@rasom
Copy link
Member

rasom commented Mar 29, 2024

That might be the option if we go with a full scan, otherwise we can do it differently but use some of those ideas (like doing full nonce scan before balance scan, or doing getLogs before nonces and balances)

The full scan algorithm which in my head (before i will get enough of sleep) looks like an optimal one for mainnet at least:

  1. Find first block with nonce=1 (first signed/outgoing tx, FOT) using nonce bisection
  2. bisect balances from zero block to FOT
  3. bisect ERC20 by balances from zero block to FOT
  4. go with eth_getLogs from FOT to latest block
  5. check nonce change in all blocks with found ERC20/721/1155 transfers (at least outgoing), there will be a high chance to find blocks with nonce change, thus lowering the number of nonce bisection requests
  6. Finish nonce bisection
  7. Finish balance bisection between outgoing txs found on step 6

That's for the case when we do not load entire history right away, we need to define some metric (or set of metrics) to figure out how deep we want to scan history initially. I t might be N last months, or less than that if account is very active, or it might be nonce based as proposed below

with what we have in develop, i tested acc with 562 transaction, 421 outgoing and that required 12674 requests, 23.1 per transaction. Let's say we want to use no more than 1000 requests per initial scan, we can then figure out that it will be around 1000/23.1=43.3 transactions, of which 43/562*421=32.2 are expected to be outgoing. Then the approach would be to find the range of last 30 or so nonces and scan it.

To get better numbers we can get a sample of "fat" accounts (that might be accounts that per our rough estimation require more than 1000 requests to be fully scanned), then we will get better number for average (for instance) expected requests per detected transaction, and average nonce/txs ratio.

With those numbers and defined limit of requests per initial scan we can come up with magic number of last nonces.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant