Bug in getState and getStateV2 #5440

Closed
matthewkeil opened this issue Apr 30, 2023 · 12 comments


@matthewkeil
Member

Describe the bug

I was attempting to resolve ticket #5409 and the epoch state transitions but cannot pull the historical state. I believe the system is timing out when attempting to recreate the state through state transition.

I attempted to pull the state by slot and by 0x-prefixed hex stateRoot but was unable to retrieve the data. The named states (head, finalized, etc.) work fine, but older states do not get returned.

Expected behavior

curl -H "Accept: application/octet-stream" https://lodestar-mainnet.chainsafe.io/eth/v2/debug/beacon/states/6123583 -o state_mainnet_6123583.ssz should output the state to disk in SSZ format.

Steps to Reproduce

run: curl -H "Accept: application/octet-stream" https://lodestar-mainnet.chainsafe.io/eth/v2/debug/beacon/states/6123583 -o state_mainnet_6123583.ssz
returns: 504

run: curl -H "Accept: application/octet-stream" https://lodestar-mainnet.chainsafe.io/eth/v2/debug/beacon/states/0xadd30839a0a9bb6705edf4b0ee08b160af85ecc4e7ffc7ec0198575e339397f0 -o state_mainnet_6123583.ssz
returns: 404 or 504 depending on the root used
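
For anyone reproducing this from a script, here is a minimal sketch of how the request URLs above are formed. It assumes the standard Beacon API path `/eth/v2/debug/beacon/states/{state_id}` and the usual `state_id` forms (a named state, a decimal slot, or a 0x-prefixed 32-byte root). The helper names `classifyStateId` and `buildDebugStateUrl` are hypothetical and are not part of Lodestar.

```typescript
// Hypothetical helpers for illustration only; not taken from the Lodestar codebase.
const NAMED_STATE_IDS = ["head", "genesis", "finalized", "justified"];

type StateIdKind = "named" | "slot" | "root";

// Decide which form of state_id was supplied, or throw if it matches none.
function classifyStateId(stateId: string): StateIdKind {
  if (NAMED_STATE_IDS.includes(stateId)) return "named";
  if (/^\d+$/.test(stateId)) return "slot";
  if (/^0x[0-9a-fA-F]{64}$/.test(stateId)) return "root";
  throw new Error(`Invalid state id: ${stateId}`);
}

// Build the debug endpoint URL, tolerating trailing slashes on the base URL.
function buildDebugStateUrl(baseUrl: string, stateId: string): string {
  classifyStateId(stateId); // validate before building the URL
  return `${baseUrl.replace(/\/+$/, "")}/eth/v2/debug/beacon/states/${stateId}`;
}

console.log(buildDebugStateUrl("https://lodestar-mainnet.chainsafe.io", "6123583"));
```

The resulting URL can be passed to curl or `fetch` with the `Accept: application/octet-stream` header, as in the commands above.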

Screenshots

Screen Shot 2023-04-30 at 2 32 27 PM

Desktop (please complete the following information):

  • OS: osx and linux
  • Version: current
  • Branch: several, found on stable and unstable. Not sure which others
  • Commit hash: HEAD
@matthewkeil
Member Author

Found the error that is actually getting thrown:

throw Error("Validator indices must not be empty");

error: Req req-7 getState error Validator indices must not be empty

@dapplion
Contributor

dapplion commented May 2, 2023

It's really weird to see this error, "error: Req req-7 getState error Validator indices must not be empty", since all states have some active validators.

@matthewkeil
Member Author

matthewkeil commented May 2, 2023

Here is a copy of the full stack trace and a screenshot. The request ran for almost 13 minutes and then threw the error. CPU usage was very high until it crashed, so I am wondering whether a single bad state transition (in EpochContext.afterProcessEpoch, per the stack trace) caused it. I have been researching and debugging that one to see why it happened.

May-01 22:03:31.340[rest]            error: Req req-8 getState error  Validator indices must not be empty
Error: Validator indices must not be empty
    at computeProposerIndex (file:///Users/matthewkeil/Documents/dev/chainsafe/lodestar/packages/state-transition/src/util/seed.ts:52:11)
    at computeProposers (file:///Users/matthewkeil/Documents/dev/chainsafe/lodestar/packages/state-transition/src/util/seed.ts:31:7)
    at EpochContext.afterProcessEpoch (file:///Users/matthewkeil/Documents/dev/chainsafe/lodestar/packages/state-transition/src/cache/epochContext.ts:468:22)
    at processSlotsWithTransientCache (file:///Users/matthewkeil/Documents/dev/chainsafe/lodestar/packages/state-transition/src/stateTransition.ts:174:26)
    at processSlots (file:///Users/matthewkeil/Documents/dev/chainsafe/lodestar/packages/state-transition/src/stateTransition.ts:136:15)
    at getFinalizedState (file:///Users/matthewkeil/Documents/dev/chainsafe/lodestar/packages/beacon-node/src/api/impl/beacon/state/utils.ts:267:13)
    at stateBySlot (file:///Users/matthewkeil/Documents/dev/chainsafe/lodestar/packages/beacon-node/src/api/impl/beacon/state/utils.ts:199:20)
    at resolveStateId (file:///Users/matthewkeil/Documents/dev/chainsafe/lodestar/packages/beacon-node/src/api/impl/beacon/state/utils.ts:42:40)
    at Object.getState (file:///Users/matthewkeil/Documents/dev/chainsafe/lodestar/packages/beacon-node/src/api/impl/debug/index.ts:44:23)
    at Object.handler (file:///Users/matthewkeil/Documents/dev/chainsafe/lodestar/packages/api/src/beacon/server/debug.ts:23:26)

Screen Shot 2023-05-01 at 10 09 22 PM

Screen Shot 2023-05-01 at 10 09 29 PM

@dapplion
Contributor

dapplion commented May 2, 2023

You should dump the state that this code path claims has no active validators to disk, so you can take a deeper look after the fact.

@matthewkeil
Member Author

matthewkeil commented May 4, 2023

slot that threw in processSlots: 17056
state buffer length passed in: 6729301

Captured via:

 curl http://127.0.0.1:9596/eth/v2/debug/beacon/states/6123583  -H "Accept: application/octet-stream" -o state_mainnet_6123583.ssz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    84    0    84    0     0      0      0 --:--:--  0:12:58 --:--:--    21

packages/beacon-node/src/api/impl/beacon/state/utils.ts in function getFinalizedState:

  let stateBeforeError!: CachedBeaconStateAllForks;
  let slotBeforeError = slot;
  try {
    // due to skip slots, may need to process empty slots to reach the requested slot
    if (state.slot < slot) {
      stateBeforeError = state.clone();
      slotBeforeError = state.slot;
      state = processSlots(state, slot);
    }
  } catch (err) {
    const filepath = path.resolve(path.dirname(fileURLToPath(import.meta.url)), `state_mainnet_${slot}.ssz`);
    const serialized = stateBeforeError.serialize();
    console.log(`state buffer length: ${serialized.length}\nslot passed into processSlots: ${slotBeforeError}\n`);
  }
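
The snippet above computes a file path and serializes the state but does not show the write itself. A minimal sketch of that remaining step, using only Node's built-in `fs`, `os` and `path` modules, might look like the following. The helper name `dumpStateBytes` and the placeholder bytes are assumptions for illustration; in the real code path the bytes would come from `stateBeforeError.serialize()`.

```typescript
import {mkdtempSync, readFileSync, writeFileSync} from "node:fs";
import {tmpdir} from "node:os";
import {join} from "node:path";

// Hypothetical helper: write raw bytes to <dir>/state_mainnet_<slot>.ssz and return the path.
function dumpStateBytes(dir: string, slot: number, serialized: Uint8Array): string {
  const filepath = join(dir, `state_mainnet_${slot}.ssz`);
  writeFileSync(filepath, serialized);
  return filepath;
}

// Demo with placeholder bytes written to a fresh temporary directory.
const demoDir = mkdtempSync(join(tmpdir(), "state-dump-"));
const demoPath = dumpStateBytes(demoDir, 17056, new Uint8Array([1, 2, 3, 4]));
console.log(`wrote ${readFileSync(demoPath).length} bytes to ${demoPath}`);
```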

@matthewkeil
Member Author

Serialized state that was saved in ssz format.

state_mainnet_before_error.ssz.zip

@twoeths
Contributor

twoeths commented May 4, 2023

@matthewkeil I'm not able to deserialize the downloaded state, could you give it a try?

Error: Offset out of bounds 1690959926 > 6729301
      at readVariableOffsets (/Users/tuyennguyen/Projects/lodestar/node_modules/@chainsafe/ssz/src/type/container.ts:451:13)

Maybe you could try calling stateBeforeError.commit() before stateBeforeError.serialize().

@matthewkeil
Member Author

I successfully pulled the historical state from a full-sync Lodestar node. Perhaps we can throw an error when the node is not in the correct mode: if we can determine up front that the call will fail, it can fail quickly and with a better error. @dapplion what are your thoughts? Would it be better to look at the state or just the config?
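
To make the fail-fast idea concrete, here is a rough sketch of a guard that could run before any regen is attempted. Everything in it is hypothetical: the `HistoricalStateInfo` shape, the error class, and the function name are not from the Lodestar codebase, and it assumes the node can cheaply report its finalized slot and the slot of the earliest state it has available to regen from. The slot numbers in the usage lines are made up, apart from the requested slot 6123583.

```typescript
// Hypothetical types and names, for illustration only.
interface HistoricalStateInfo {
  finalizedSlot: number;
  // Slot of the earliest state the node could regen from, or null if it has none.
  earliestAvailableStateSlot: number | null;
}

class HistoricalStateUnavailableError extends Error {
  slot: number;

  constructor(slot: number, earliest: number | null) {
    super(
      earliest === null
        ? `No historical state available to regen slot ${slot}`
        : `Slot ${slot} is older than the earliest available state at slot ${earliest}`
    );
    this.name = "HistoricalStateUnavailableError";
    this.slot = slot;
  }
}

// Throw immediately if a finalized slot cannot be regenerated from available states.
function assertHistoricalStateAvailable(slot: number, info: HistoricalStateInfo): void {
  if (slot >= info.finalizedSlot) return; // not a historical request
  const earliest = info.earliestAvailableStateSlot;
  if (earliest === null || slot < earliest) {
    throw new HistoricalStateUnavailableError(slot, earliest);
  }
}

// Usage with made-up numbers: the earliest available state is newer than the requested slot.
try {
  assertHistoricalStateAvailable(6123583, {finalizedSlot: 6300000, earliestAvailableStateSlot: 6200000});
} catch (e) {
  console.log((e as Error).message);
}
```

A guard like this looks at what the node actually has stored rather than at the config, which is one of the two options raised above; a config-based check would be simpler but could miss nodes whose stored data does not match their configured mode.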

@dapplion
Contributor

dapplion commented May 4, 2023

Let's archive this issue for now and revisit it later.

@matthewkeil
Member Author

Leaving a breadcrumb here. When the db attempts to pull the nearest state to regen from, the repo eventually returns null because it does not find the state cached.

@matthewkeil
Member Author

I think #6033 will tackle this

@philknows
Member
