feat: async shuffling refactor #6521

Draft
wants to merge 11 commits into
base: unstable

Member

@matthewkeil matthewkeil commented Mar 8, 2024

** NOTE: Not ready for review, but want to trigger CI **

Motivation

Move the calculation of the next shuffling to an async process to get it off the critical path during epoch transition. Calculating epochCtx.nextShuffling currently takes a full second during epoch transition, and that work can be done asynchronously. I refactored a few pieces of the EpochCache to make this work and will continue by moving the calculation to a worker thread. With a worker thread that is tuned down with NICE, the long calculation can be interleaved into thread idle time, which is ideal. To be continued...

Description

  • Change how shufflings are built and cached. The original flow built them on the epochCtx and then relied on processState to move them into the ShufflingCache. This PR cleans up that flow so the shufflings are built and stored directly in the ShufflingCache
  • Moved full shufflings off the ShufflingCache and stored only the pieces we were using directly (length of activeValidators and the epoch numbers)
  • Move ShufflingCache from beacon-node to state-transition
  • Pass the logger into EpochCache so it's available for debugging issues with shuffling builds
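
To make the flow in the description concrete, here is a minimal TypeScript sketch of a cache that builds shufflings itself and stores either a pending promise or the resolved value. It is not the code in this PR: `ShufflingCacheSketch`, `CacheItem`, the simplified `EpochShuffling` shape, the `epoch:decisionRoot` key, and the use of `setTimeout` as a stand-in for a worker thread are all assumptions made for illustration.

```typescript
// Hypothetical, simplified types. The real EpochShuffling and ShufflingCache differ.
type Epoch = number;
type RootHex = string;

interface EpochShuffling {
  activeIndices: number[];
}

type CacheItem =
  | {kind: "resolved"; shuffling: EpochShuffling}
  | {kind: "pending"; promise: Promise<EpochShuffling>};

class ShufflingCacheSketch {
  private readonly items = new Map<string, CacheItem>();

  private key(epoch: Epoch, decisionRoot: RootHex): string {
    return `${epoch}:${decisionRoot}`;
  }

  /** Schedule a build outside the caller's call stack and remember the pending promise. */
  build(epoch: Epoch, decisionRoot: RootHex, compute: () => EpochShuffling): void {
    const key = this.key(epoch, decisionRoot);
    if (this.items.has(key)) return; // already built or already in flight

    const promise = new Promise<EpochShuffling>((resolve, reject) => {
      // Assumption: setTimeout stands in for handing the work to a worker thread.
      setTimeout(() => {
        try {
          const shuffling = compute();
          this.items.set(key, {kind: "resolved", shuffling});
          resolve(shuffling);
        } catch (e) {
          this.items.delete(key); // allow a later retry
          reject(e);
        }
      }, 0);
    });
    // Mark the promise as handled so a failed build with no waiter does not crash the process.
    promise.catch(() => {});
    this.items.set(key, {kind: "pending", promise});
  }

  /** Synchronous read: null while the build is pending or was never requested. */
  getSync(epoch: Epoch, decisionRoot: RootHex): EpochShuffling | null {
    const item = this.items.get(this.key(epoch, decisionRoot));
    return item !== undefined && item.kind === "resolved" ? item.shuffling : null;
  }

  /** Asynchronous read: waits for a pending build, null if none was requested. */
  async get(epoch: Epoch, decisionRoot: RootHex): Promise<EpochShuffling | null> {
    const item = this.items.get(this.key(epoch, decisionRoot));
    if (item === undefined) return null;
    return item.kind === "resolved" ? item.shuffling : item.promise;
  }
}
```

A caller on the hot path could use `getSync` and fall back to computing inline on `null`, while callers that can afford to wait would use `get`.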

@matthewkeil matthewkeil changed the title from "Mkeil/shuffle refactor 3" to "feat: async shuffling refactor" Mar 8, 2024
let cachedState: CachedBeaconStateAllForks;
if (isCachedBeaconState(anchorState) && opts.skipCreateStateCacheIfAvailable) {
  cachedState = anchorState;
  cachedState.epochCtx.shufflingCache.addMetrics(metrics);
Member Author

This is a bit of a funky edge case. I do not like adding the metrics here, but there is one instance (during genesis) where the ShufflingCache needs to be created before the metrics object, and refactoring for that condition was not ideal. In that circumstance the CachedBeaconState is built with a ShufflingCache, and the metrics are added to the class later, when the seedState is passed into the chain. This will be extended to also add the Logger once the worker thread for building is added.
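
The pattern described in this comment, where a cache is constructed before the metrics object exists and receives it later, could look roughly like the sketch below. `LateMetricsCache` and the `hit`/`miss` counters are hypothetical; only the `addMetrics` method name comes from the diff above.

```typescript
// Hypothetical metrics shape, used only for illustration.
interface CacheMetrics {
  hit: {inc(): void};
  miss: {inc(): void};
}

class LateMetricsCache<V> {
  // Starts as null because the cache may be created before metrics exist (e.g. during genesis).
  private metrics: CacheMetrics | null = null;
  private readonly items = new Map<string, V>();

  /** Attach metrics after construction. Later calls are ignored so counters are not swapped mid-run. */
  addMetrics(metrics: CacheMetrics | null): void {
    if (this.metrics === null && metrics !== null) {
      this.metrics = metrics;
    }
  }

  set(key: string, value: V): void {
    this.items.set(key, value);
  }

  get(key: string): V | undefined {
    const value = this.items.get(key);
    // Optional chaining makes every metrics call a no-op until addMetrics has run.
    if (value === undefined) {
      this.metrics?.miss.inc();
    } else {
      this.metrics?.hit.inc();
    }
    return value;
  }
}
```

The cost of this approach is that lookups made before `addMetrics` is called are never counted, which seems acceptable for the genesis case described above.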

@@ -45,7 +45,7 @@ describe("getShufflingForAttestationVerification", () => {
throw new Error("Unexpected input");
}
});
- const expectedShuffling = {epoch: attEpoch} as EpochShuffling;
+ const expectedShuffling = {epoch: attEpoch} as unknown as EpochShuffling;
Member Author

epoch is no longer on EpochShuffling, but I am using it here so there is something on the object to assert that it's the same object
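
For readers unfamiliar with the double cast: TypeScript only allows `x as T` when the two types sufficiently overlap, so once `epoch` is removed from the type, the stub has to go through `unknown` first. A small sketch, using a hypothetical two-field `EpochShuffling` and a hypothetical `getShuffling` helper rather than the real Lodestar definitions:

```typescript
// Hypothetical stand-in for the refactored type, which no longer has an `epoch` field.
interface EpochShuffling {
  activeIndices: Uint32Array;
  committeesPerSlot: number;
}

const attEpoch = 10;

// `{epoch: attEpoch} as EpochShuffling` would be rejected by the compiler here,
// because {epoch: number} and EpochShuffling share no properties.
// Casting through `unknown` opts out of that check; this is only reasonable for test stubs.
const expectedShuffling = {epoch: attEpoch} as unknown as EpochShuffling;

// Hypothetical function under test: it hands back whatever object the cache holds.
function getShuffling(cache: Map<number, EpochShuffling>, epoch: number): EpochShuffling | undefined {
  return cache.get(epoch);
}

const stubCache = new Map<number, EpochShuffling>([[attEpoch, expectedShuffling]]);
```

The test then only needs a reference-equality check (`getShuffling(stubCache, attEpoch) === expectedShuffling`), so the stub never has to be a valid shuffling.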

Contributor

github-actions bot commented Mar 8, 2024

Performance Report

✔️ no performance regression detected

🚀🚀 Significant benchmark improvement detected

Benchmark suite Current: b613961 Previous: adc0534 Ratio
phase0 afterProcessEpoch - 250000 vs - 7PWei 9.5258 ms/op 112.14 ms/op 0.08
Full benchmark results
Benchmark suite Current: b613961 Previous: adc0534 Ratio
getPubkeys - index2pubkey - req 1000 vs - 250000 vc 808.69 us/op 792.50 us/op 1.02
getPubkeys - validatorsArr - req 1000 vs - 250000 vc 166.51 us/op 83.873 us/op 1.99
BLS verify - blst-native 1.4338 ms/op 1.3337 ms/op 1.08
BLS verifyMultipleSignatures 3 - blst-native 3.4845 ms/op 2.7201 ms/op 1.28
BLS verifyMultipleSignatures 8 - blst-native 6.7507 ms/op 6.0052 ms/op 1.12
BLS verifyMultipleSignatures 32 - blst-native 28.130 ms/op 21.969 ms/op 1.28
BLS verifyMultipleSignatures 64 - blst-native 62.987 ms/op 43.166 ms/op 1.46
BLS verifyMultipleSignatures 128 - blst-native 108.76 ms/op 86.357 ms/op 1.26
BLS deserializing 10000 signatures 998.97 ms/op 924.28 ms/op 1.08
BLS deserializing 100000 signatures 10.418 s/op 9.4577 s/op 1.10
BLS verifyMultipleSignatures - same message - 3 - blst-native 1.4400 ms/op 1.3320 ms/op 1.08
BLS verifyMultipleSignatures - same message - 8 - blst-native 1.6733 ms/op 1.6456 ms/op 1.02
BLS verifyMultipleSignatures - same message - 32 - blst-native 2.4824 ms/op 2.9228 ms/op 0.85
BLS verifyMultipleSignatures - same message - 64 - blst-native 3.7327 ms/op 4.4102 ms/op 0.85
BLS verifyMultipleSignatures - same message - 128 - blst-native 6.1825 ms/op 7.9726 ms/op 0.78
BLS aggregatePubkeys 32 - blst-native 28.600 us/op 25.918 us/op 1.10
BLS aggregatePubkeys 128 - blst-native 109.62 us/op 100.81 us/op 1.09
notSeenSlots=1 numMissedVotes=1 numBadVotes=10 111.68 ms/op 67.440 ms/op 1.66
notSeenSlots=1 numMissedVotes=0 numBadVotes=4 110.87 ms/op 63.667 ms/op 1.74
notSeenSlots=2 numMissedVotes=1 numBadVotes=10 61.930 ms/op 36.440 ms/op 1.70
getSlashingsAndExits - default max 235.53 us/op 203.78 us/op 1.16
getSlashingsAndExits - 2k 643.29 us/op 651.26 us/op 0.99
proposeBlockBody type=full, size=empty 5.8215 ms/op 5.3843 ms/op 1.08
isKnown best case - 1 super set check 404.00 ns/op 379.00 ns/op 1.07
isKnown normal case - 2 super set checks 330.00 ns/op 532.00 ns/op 0.62
isKnown worse case - 16 super set checks 326.00 ns/op 599.00 ns/op 0.54
CheckpointStateCache - add get delete 6.8540 us/op 7.6150 us/op 0.90
validate api signedAggregateAndProof - struct 2.8864 ms/op 3.0116 ms/op 0.96
validate gossip signedAggregateAndProof - struct 2.8918 ms/op 2.8203 ms/op 1.03
validate gossip attestation - vc 640000 1.3730 ms/op 1.3874 ms/op 0.99
batch validate gossip attestation - vc 640000 - chunk 32 159.38 us/op 168.47 us/op 0.95
batch validate gossip attestation - vc 640000 - chunk 64 140.72 us/op 146.72 us/op 0.96
batch validate gossip attestation - vc 640000 - chunk 128 143.82 us/op 141.39 us/op 1.02
batch validate gossip attestation - vc 640000 - chunk 256 148.50 us/op 130.31 us/op 1.14
pickEth1Vote - no votes 1.4879 ms/op 1.1663 ms/op 1.28
pickEth1Vote - max votes 14.622 ms/op 9.8931 ms/op 1.48
pickEth1Vote - Eth1Data hashTreeRoot value x2048 22.139 ms/op 16.455 ms/op 1.35
pickEth1Vote - Eth1Data hashTreeRoot tree x2048 27.896 ms/op 23.089 ms/op 1.21
pickEth1Vote - Eth1Data fastSerialize value x2048 759.74 us/op 620.45 us/op 1.22
pickEth1Vote - Eth1Data fastSerialize tree x2048 5.8967 ms/op 4.3950 ms/op 1.34
bytes32 toHexString 785.00 ns/op 532.00 ns/op 1.48
bytes32 Buffer.toString(hex) 349.00 ns/op 295.00 ns/op 1.18
bytes32 Buffer.toString(hex) from Uint8Array 568.00 ns/op 428.00 ns/op 1.33
bytes32 Buffer.toString(hex) + 0x 330.00 ns/op 292.00 ns/op 1.13
Object access 1 prop 0.21300 ns/op 0.16800 ns/op 1.27
Map access 1 prop 0.15800 ns/op 0.15400 ns/op 1.03
Object get x1000 8.1140 ns/op 7.3500 ns/op 1.10
Map get x1000 0.86400 ns/op 0.76700 ns/op 1.13
Object set x1000 64.244 ns/op 52.256 ns/op 1.23
Map set x1000 45.434 ns/op 41.214 ns/op 1.10
Return object 10000 times 0.25190 ns/op 0.24490 ns/op 1.03
Throw Error 10000 times 4.0930 us/op 3.9163 us/op 1.05
fastMsgIdFn sha256 / 200 bytes 3.4930 us/op 3.3970 us/op 1.03
fastMsgIdFn h32 xxhash / 200 bytes 379.00 ns/op 317.00 ns/op 1.20
fastMsgIdFn h64 xxhash / 200 bytes 379.00 ns/op 348.00 ns/op 1.09
fastMsgIdFn sha256 / 1000 bytes 11.863 us/op 11.370 us/op 1.04
fastMsgIdFn h32 xxhash / 1000 bytes 497.00 ns/op 417.00 ns/op 1.19
fastMsgIdFn h64 xxhash / 1000 bytes 483.00 ns/op 458.00 ns/op 1.05
fastMsgIdFn sha256 / 10000 bytes 107.76 us/op 104.97 us/op 1.03
fastMsgIdFn h32 xxhash / 10000 bytes 2.1350 us/op 1.9730 us/op 1.08
fastMsgIdFn h64 xxhash / 10000 bytes 1.4660 us/op 1.3830 us/op 1.06
send data - 1000 256B messages 22.046 ms/op 19.901 ms/op 1.11
send data - 1000 512B messages 33.079 ms/op 28.055 ms/op 1.18
send data - 1000 1024B messages 43.995 ms/op 41.059 ms/op 1.07
send data - 1000 1200B messages 43.890 ms/op 37.236 ms/op 1.18
send data - 1000 2048B messages 55.579 ms/op 48.863 ms/op 1.14
send data - 1000 4096B messages 44.914 ms/op 44.281 ms/op 1.01
send data - 1000 16384B messages 133.09 ms/op 117.00 ms/op 1.14
send data - 1000 65536B messages 531.63 ms/op 471.40 ms/op 1.13
enrSubnets - fastDeserialize 64 bits 1.5610 us/op 1.3310 us/op 1.17
enrSubnets - ssz BitVector 64 bits 665.00 ns/op 445.00 ns/op 1.49
enrSubnets - fastDeserialize 4 bits 263.00 ns/op 196.00 ns/op 1.34
enrSubnets - ssz BitVector 4 bits 656.00 ns/op 466.00 ns/op 1.41
prioritizePeers score -10:0 att 32-0.1 sync 2-0 123.50 us/op 104.86 us/op 1.18
prioritizePeers score 0:0 att 32-0.25 sync 2-0.25 157.90 us/op 132.87 us/op 1.19
prioritizePeers score 0:0 att 32-0.5 sync 2-0.5 218.62 us/op 175.69 us/op 1.24
prioritizePeers score 0:0 att 64-0.75 sync 4-0.75 354.18 us/op 297.62 us/op 1.19
prioritizePeers score 0:0 att 64-1 sync 4-1 401.29 us/op 368.83 us/op 1.09
array of 16000 items push then shift 1.8536 us/op 1.6256 us/op 1.14
LinkedList of 16000 items push then shift 10.465 ns/op 9.0490 ns/op 1.16
array of 16000 items push then pop 118.69 ns/op 59.223 ns/op 2.00
LinkedList of 16000 items push then pop 9.5370 ns/op 8.8570 ns/op 1.08
array of 24000 items push then shift 2.6889 us/op 2.4041 us/op 1.12
LinkedList of 24000 items push then shift 10.393 ns/op 8.8960 ns/op 1.17
array of 24000 items push then pop 165.55 ns/op 114.00 ns/op 1.45
LinkedList of 24000 items push then pop 9.9040 ns/op 8.7010 ns/op 1.14
intersect bitArray bitLen 8 6.4680 ns/op 5.7850 ns/op 1.12
intersect array and set length 8 83.520 ns/op 64.743 ns/op 1.29
intersect bitArray bitLen 128 38.499 ns/op 35.272 ns/op 1.09
intersect array and set length 128 1.1950 us/op 948.41 ns/op 1.26
bitArray.getTrueBitIndexes() bitLen 128 1.8030 us/op 1.5620 us/op 1.15
bitArray.getTrueBitIndexes() bitLen 248 3.2730 us/op 2.8920 us/op 1.13
bitArray.getTrueBitIndexes() bitLen 512 6.5630 us/op 5.2420 us/op 1.25
Buffer.concat 32 items 1.0480 us/op 1.0970 us/op 0.96
Uint8Array.set 32 items 2.4320 us/op 2.6580 us/op 0.91
Set add up to 64 items then delete first 5.4080 us/op 4.3039 us/op 1.26
OrderedSet add up to 64 items then delete first 7.1814 us/op 5.3629 us/op 1.34
Set add up to 64 items then delete last 5.7434 us/op 4.5334 us/op 1.27
OrderedSet add up to 64 items then delete last 7.5478 us/op 5.6186 us/op 1.34
Set add up to 64 items then delete middle 5.6896 us/op 4.4868 us/op 1.27
OrderedSet add up to 64 items then delete middle 9.1165 us/op 6.9857 us/op 1.31
Set add up to 128 items then delete first 11.788 us/op 9.4838 us/op 1.24
OrderedSet add up to 128 items then delete first 16.257 us/op 12.161 us/op 1.34
Set add up to 128 items then delete last 11.709 us/op 9.1715 us/op 1.28
OrderedSet add up to 128 items then delete last 15.139 us/op 11.275 us/op 1.34
Set add up to 128 items then delete middle 11.587 us/op 9.0236 us/op 1.28
OrderedSet add up to 128 items then delete middle 21.629 us/op 16.624 us/op 1.30
Set add up to 256 items then delete first 23.837 us/op 18.709 us/op 1.27
OrderedSet add up to 256 items then delete first 32.559 us/op 24.900 us/op 1.31
Set add up to 256 items then delete last 22.664 us/op 17.899 us/op 1.27
OrderedSet add up to 256 items then delete last 31.402 us/op 22.785 us/op 1.38
Set add up to 256 items then delete middle 22.998 us/op 18.059 us/op 1.27
OrderedSet add up to 256 items then delete middle 55.379 us/op 44.531 us/op 1.24
transfer serialized Status (84 B) 2.2230 us/op 1.6270 us/op 1.37
copy serialized Status (84 B) 1.5360 us/op 1.1970 us/op 1.28
transfer serialized SignedVoluntaryExit (112 B) 2.3030 us/op 1.8140 us/op 1.27
copy serialized SignedVoluntaryExit (112 B) 1.5360 us/op 1.2770 us/op 1.20
transfer serialized ProposerSlashing (416 B) 2.4800 us/op 2.8560 us/op 0.87
copy serialized ProposerSlashing (416 B) 2.4210 us/op 2.6770 us/op 0.90
transfer serialized Attestation (485 B) 3.1420 us/op 2.6360 us/op 1.19
copy serialized Attestation (485 B) 2.5720 us/op 2.4990 us/op 1.03
transfer serialized AttesterSlashing (33232 B) 3.7200 us/op 2.4860 us/op 1.50
copy serialized AttesterSlashing (33232 B) 10.398 us/op 6.3920 us/op 1.63
transfer serialized Small SignedBeaconBlock (128000 B) 4.8740 us/op 2.8310 us/op 1.72
copy serialized Small SignedBeaconBlock (128000 B) 28.665 us/op 15.038 us/op 1.91
transfer serialized Avg SignedBeaconBlock (200000 B) 5.1470 us/op 3.3940 us/op 1.52
copy serialized Avg SignedBeaconBlock (200000 B) 43.891 us/op 20.602 us/op 2.13
transfer serialized BlobsSidecar (524380 B) 5.2130 us/op 3.4700 us/op 1.50
copy serialized BlobsSidecar (524380 B) 114.01 us/op 120.55 us/op 0.95
transfer serialized Big SignedBeaconBlock (1000000 B) 5.4320 us/op 3.0400 us/op 1.79
copy serialized Big SignedBeaconBlock (1000000 B) 236.26 us/op 380.64 us/op 0.62
pass gossip attestations to forkchoice per slot 7.1293 ms/op 3.7688 ms/op 1.89
forkChoice updateHead vc 100000 bc 64 eq 0 763.00 us/op 672.05 us/op 1.14
forkChoice updateHead vc 600000 bc 64 eq 0 6.2779 ms/op 4.0664 ms/op 1.54
forkChoice updateHead vc 1000000 bc 64 eq 0 8.6348 ms/op 6.8946 ms/op 1.25
forkChoice updateHead vc 600000 bc 320 eq 0 4.8957 ms/op 4.1540 ms/op 1.18
forkChoice updateHead vc 600000 bc 1200 eq 0 5.1534 ms/op 4.2837 ms/op 1.20
forkChoice updateHead vc 600000 bc 7200 eq 0 6.1694 ms/op 5.3699 ms/op 1.15
forkChoice updateHead vc 600000 bc 64 eq 1000 12.005 ms/op 10.915 ms/op 1.10
forkChoice updateHead vc 600000 bc 64 eq 10000 13.295 ms/op 11.636 ms/op 1.14
forkChoice updateHead vc 600000 bc 64 eq 300000 22.280 ms/op 15.467 ms/op 1.44
computeDeltas 500000 validators 300 proto nodes 6.8713 ms/op 6.6073 ms/op 1.04
computeDeltas 500000 validators 1200 proto nodes 6.5748 ms/op 6.3694 ms/op 1.03
computeDeltas 500000 validators 7200 proto nodes 6.5105 ms/op 6.4834 ms/op 1.00
computeDeltas 750000 validators 300 proto nodes 10.230 ms/op 9.7722 ms/op 1.05
computeDeltas 750000 validators 1200 proto nodes 9.7887 ms/op 9.7933 ms/op 1.00
computeDeltas 750000 validators 7200 proto nodes 9.9933 ms/op 9.7276 ms/op 1.03
computeDeltas 1400000 validators 300 proto nodes 19.397 ms/op 17.968 ms/op 1.08
computeDeltas 1400000 validators 1200 proto nodes 19.377 ms/op 17.844 ms/op 1.09
computeDeltas 1400000 validators 7200 proto nodes 19.628 ms/op 17.858 ms/op 1.10
computeDeltas 2100000 validators 300 proto nodes 29.726 ms/op 26.869 ms/op 1.11
computeDeltas 2100000 validators 1200 proto nodes 28.949 ms/op 27.255 ms/op 1.06
computeDeltas 2100000 validators 7200 proto nodes 29.698 ms/op 26.350 ms/op 1.13
altair processAttestation - 250000 vs - 7PWei normalcase 2.7614 ms/op 2.9283 ms/op 0.94
altair processAttestation - 250000 vs - 7PWei worstcase 3.7242 ms/op 4.0120 ms/op 0.93
altair processAttestation - setStatus - 1/6 committees join 160.96 us/op 213.67 us/op 0.75
altair processAttestation - setStatus - 1/3 committees join 304.91 us/op 429.12 us/op 0.71
altair processAttestation - setStatus - 1/2 committees join 411.92 us/op 581.93 us/op 0.71
altair processAttestation - setStatus - 2/3 committees join 512.90 us/op 652.26 us/op 0.79
altair processAttestation - setStatus - 4/5 committees join 723.44 us/op 995.01 us/op 0.73
altair processAttestation - setStatus - 100% committees join 859.50 us/op 1.1058 ms/op 0.78
altair processBlock - 250000 vs - 7PWei normalcase 10.692 ms/op 7.9389 ms/op 1.35
altair processBlock - 250000 vs - 7PWei normalcase hashState 36.413 ms/op 34.314 ms/op 1.06
altair processBlock - 250000 vs - 7PWei worstcase 41.037 ms/op 38.807 ms/op 1.06
altair processBlock - 250000 vs - 7PWei worstcase hashState 119.33 ms/op 90.515 ms/op 1.32
phase0 processBlock - 250000 vs - 7PWei normalcase 3.2677 ms/op 2.8609 ms/op 1.14
phase0 processBlock - 250000 vs - 7PWei worstcase 34.232 ms/op 28.893 ms/op 1.18
altair processEth1Data - 250000 vs - 7PWei normalcase 709.33 us/op 476.37 us/op 1.49
getExpectedWithdrawals 250000 eb:1,eth1:1,we:0,wn:0,smpl:15 16.786 us/op 7.4280 us/op 2.26
getExpectedWithdrawals 250000 eb:0.95,eth1:0.1,we:0.05,wn:0,smpl:219 66.303 us/op 32.848 us/op 2.02
getExpectedWithdrawals 250000 eb:0.95,eth1:0.3,we:0.05,wn:0,smpl:42 28.461 us/op 10.765 us/op 2.64
getExpectedWithdrawals 250000 eb:0.95,eth1:0.7,we:0.05,wn:0,smpl:18 20.178 us/op 10.203 us/op 1.98
getExpectedWithdrawals 250000 eb:0.1,eth1:0.1,we:0,wn:0,smpl:1020 213.33 us/op 119.64 us/op 1.78
getExpectedWithdrawals 250000 eb:0.03,eth1:0.03,we:0,wn:0,smpl:11777 1.6168 ms/op 1.0326 ms/op 1.57
getExpectedWithdrawals 250000 eb:0.01,eth1:0.01,we:0,wn:0,smpl:16384 2.3482 ms/op 1.4912 ms/op 1.57
getExpectedWithdrawals 250000 eb:0,eth1:0,we:0,wn:0,smpl:16384 1.9789 ms/op 1.5262 ms/op 1.30
getExpectedWithdrawals 250000 eb:0,eth1:0,we:0,wn:0,nocache,smpl:16384 4.4760 ms/op 3.3999 ms/op 1.32
getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,smpl:16384 3.1252 ms/op 2.3292 ms/op 1.34
getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,nocache,smpl:16384 7.6084 ms/op 5.2056 ms/op 1.46
Tree 40 250000 create 406.37 ms/op 343.02 ms/op 1.18
Tree 40 250000 get(125000) 219.00 ns/op 193.49 ns/op 1.13
Tree 40 250000 set(125000) 1.0892 us/op 1.0295 us/op 1.06
Tree 40 250000 toArray() 23.353 ms/op 20.186 ms/op 1.16
Tree 40 250000 iterate all - toArray() + loop 25.100 ms/op 17.659 ms/op 1.42
Tree 40 250000 iterate all - get(i) 77.552 ms/op 64.553 ms/op 1.20
MutableVector 250000 create 17.905 ms/op 12.070 ms/op 1.48
MutableVector 250000 get(125000) 6.9500 ns/op 6.3850 ns/op 1.09
MutableVector 250000 set(125000) 320.65 ns/op 250.73 ns/op 1.28
MutableVector 250000 toArray() 4.0278 ms/op 2.7717 ms/op 1.45
MutableVector 250000 iterate all - toArray() + loop 4.1412 ms/op 2.8871 ms/op 1.43
MutableVector 250000 iterate all - get(i) 1.6019 ms/op 1.5245 ms/op 1.05
Array 250000 create 3.6988 ms/op 2.5386 ms/op 1.46
Array 250000 clone - spread 1.4953 ms/op 1.1837 ms/op 1.26
Array 250000 get(125000) 1.2350 ns/op 1.0230 ns/op 1.21
Array 250000 set(125000) 5.2550 ns/op 4.0410 ns/op 1.30
Array 250000 iterate all - loop 173.50 us/op 165.44 us/op 1.05
effectiveBalanceIncrements clone Uint8Array 300000 42.482 us/op 28.045 us/op 1.51
effectiveBalanceIncrements clone MutableVector 300000 455.00 ns/op 360.00 ns/op 1.26
effectiveBalanceIncrements rw all Uint8Array 300000 208.67 us/op 199.10 us/op 1.05
effectiveBalanceIncrements rw all MutableVector 300000 101.14 ms/op 81.252 ms/op 1.24
phase0 afterProcessEpoch - 250000 vs - 7PWei 9.5258 ms/op 112.14 ms/op 0.08
phase0 beforeProcessEpoch - 250000 vs - 7PWei 42.113 ms/op 50.768 ms/op 0.83
altair processEpoch - mainnet_e81889 394.36 ms/op 484.06 ms/op 0.81
mainnet_e81889 - altair beforeProcessEpoch 82.212 ms/op 81.149 ms/op 1.01
mainnet_e81889 - altair processJustificationAndFinalization 23.770 us/op 15.167 us/op 1.57
mainnet_e81889 - altair processInactivityUpdates 5.8206 ms/op 5.6592 ms/op 1.03
mainnet_e81889 - altair processRewardsAndPenalties 63.721 ms/op 39.039 ms/op 1.63
mainnet_e81889 - altair processRegistryUpdates 2.6900 us/op 2.3670 us/op 1.14
mainnet_e81889 - altair processSlashings 505.00 ns/op 490.00 ns/op 1.03
mainnet_e81889 - altair processEth1DataReset 603.00 ns/op 467.00 ns/op 1.29
mainnet_e81889 - altair processEffectiveBalanceUpdates 2.0540 ms/op 1.4377 ms/op 1.43
mainnet_e81889 - altair processSlashingsReset 7.1900 us/op 3.3510 us/op 2.15
mainnet_e81889 - altair processRandaoMixesReset 7.3850 us/op 4.6060 us/op 1.60
mainnet_e81889 - altair processHistoricalRootsUpdate 1.7070 us/op 675.00 ns/op 2.53
mainnet_e81889 - altair processParticipationFlagUpdates 2.7490 us/op 3.2150 us/op 0.86
mainnet_e81889 - altair processSyncCommitteeUpdates 594.00 ns/op 668.00 ns/op 0.89
mainnet_e81889 - altair afterProcessEpoch 9.1597 ms/op 115.69 ms/op 0.08
capella processEpoch - mainnet_e217614 1.8865 s/op 1.7714 s/op 1.06
mainnet_e217614 - capella beforeProcessEpoch 480.08 ms/op 452.12 ms/op 1.06
mainnet_e217614 - capella processJustificationAndFinalization 19.675 us/op 17.060 us/op 1.15
mainnet_e217614 - capella processInactivityUpdates 22.381 ms/op 22.958 ms/op 0.97
mainnet_e217614 - capella processRewardsAndPenalties 467.38 ms/op 476.85 ms/op 0.98
mainnet_e217614 - capella processRegistryUpdates 42.298 us/op 22.317 us/op 1.90
mainnet_e217614 - capella processSlashings 1.0850 us/op 451.00 ns/op 2.41
mainnet_e217614 - capella processEth1DataReset 775.00 ns/op 537.00 ns/op 1.44
mainnet_e217614 - capella processEffectiveBalanceUpdates 4.7199 ms/op 5.4559 ms/op 0.87
mainnet_e217614 - capella processSlashingsReset 6.5550 us/op 3.2850 us/op 2.00
mainnet_e217614 - capella processRandaoMixesReset 7.9070 us/op 5.2710 us/op 1.50
mainnet_e217614 - capella processHistoricalRootsUpdate 1.2360 us/op 602.00 ns/op 2.05
mainnet_e217614 - capella processParticipationFlagUpdates 2.2410 us/op 4.5950 us/op 0.49
mainnet_e217614 - capella afterProcessEpoch 8.5972 ms/op 307.65 ms/op 0.03
phase0 processEpoch - mainnet_e58758 444.59 ms/op 516.91 ms/op 0.86
mainnet_e58758 - phase0 beforeProcessEpoch 137.45 ms/op 144.26 ms/op 0.95
mainnet_e58758 - phase0 processJustificationAndFinalization 25.371 us/op 16.297 us/op 1.56
mainnet_e58758 - phase0 processRewardsAndPenalties 64.070 ms/op 53.819 ms/op 1.19
mainnet_e58758 - phase0 processRegistryUpdates 15.836 us/op 9.7940 us/op 1.62
mainnet_e58758 - phase0 processSlashings 654.00 ns/op 635.00 ns/op 1.03
mainnet_e58758 - phase0 processEth1DataReset 691.00 ns/op 816.00 ns/op 0.85
mainnet_e58758 - phase0 processEffectiveBalanceUpdates 2.1403 ms/op 1.1929 ms/op 1.79
mainnet_e58758 - phase0 processSlashingsReset 4.1980 us/op 2.4360 us/op 1.72
mainnet_e58758 - phase0 processRandaoMixesReset 6.2250 us/op 4.2400 us/op 1.47
mainnet_e58758 - phase0 processHistoricalRootsUpdate 659.00 ns/op 606.00 ns/op 1.09
mainnet_e58758 - phase0 processParticipationRecordUpdates 6.0560 us/op 4.9480 us/op 1.22
mainnet_e58758 - phase0 afterProcessEpoch 8.3624 ms/op 101.74 ms/op 0.08
phase0 processEffectiveBalanceUpdates - 250000 normalcase 2.5963 ms/op 1.3767 ms/op 1.89
phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5 1.4664 ms/op 1.5362 ms/op 0.95
altair processInactivityUpdates - 250000 normalcase 34.293 ms/op 32.523 ms/op 1.05
altair processInactivityUpdates - 250000 worstcase 34.463 ms/op 24.252 ms/op 1.42
phase0 processRegistryUpdates - 250000 normalcase 12.870 us/op 14.537 us/op 0.89
phase0 processRegistryUpdates - 250000 badcase_full_deposits 630.07 us/op 462.05 us/op 1.36
phase0 processRegistryUpdates - 250000 worstcase 0.5 144.28 ms/op 142.86 ms/op 1.01
altair processRewardsAndPenalties - 250000 normalcase 67.482 ms/op 65.509 ms/op 1.03
altair processRewardsAndPenalties - 250000 worstcase 67.273 ms/op 61.436 ms/op 1.10
phase0 getAttestationDeltas - 250000 normalcase 9.0912 ms/op 10.792 ms/op 0.84
phase0 getAttestationDeltas - 250000 worstcase 8.8824 ms/op 9.9359 ms/op 0.89
phase0 processSlashings - 250000 worstcase 131.90 us/op 97.945 us/op 1.35
altair processSyncCommitteeUpdates - 250000 149.24 ms/op 161.07 ms/op 0.93
BeaconState.hashTreeRoot - No change 247.00 ns/op 371.00 ns/op 0.67
BeaconState.hashTreeRoot - 1 full validator 147.89 us/op 122.53 us/op 1.21
BeaconState.hashTreeRoot - 32 full validator 1.6378 ms/op 1.1514 ms/op 1.42
BeaconState.hashTreeRoot - 512 full validator 16.635 ms/op 14.558 ms/op 1.14
BeaconState.hashTreeRoot - 1 validator.effectiveBalance 153.47 us/op 183.32 us/op 0.84
BeaconState.hashTreeRoot - 32 validator.effectiveBalance 2.2762 ms/op 2.1488 ms/op 1.06
BeaconState.hashTreeRoot - 512 validator.effectiveBalance 35.254 ms/op 33.190 ms/op 1.06
BeaconState.hashTreeRoot - 1 balances 146.39 us/op 135.09 us/op 1.08
BeaconState.hashTreeRoot - 32 balances 1.2142 ms/op 1.2129 ms/op 1.00
BeaconState.hashTreeRoot - 512 balances 13.828 ms/op 13.793 ms/op 1.00
BeaconState.hashTreeRoot - 250000 balances 227.79 ms/op 225.18 ms/op 1.01
aggregationBits - 2048 els - zipIndexesInBitList 25.700 us/op 70.920 us/op 0.36
byteArrayEquals 32 74.486 ns/op 75.090 ns/op 0.99
Buffer.compare 32 55.504 ns/op 55.817 ns/op 0.99
byteArrayEquals 1024 2.0428 us/op 2.0457 us/op 1.00
Buffer.compare 1024 72.694 ns/op 70.502 ns/op 1.03
byteArrayEquals 16384 32.563 us/op 32.557 us/op 1.00
Buffer.compare 16384 252.88 ns/op 270.28 ns/op 0.94
byteArrayEquals 123687377 242.78 ms/op 252.72 ms/op 0.96
Buffer.compare 123687377 6.3092 ms/op 8.5285 ms/op 0.74
byteArrayEquals 32 - diff last byte 72.437 ns/op 74.156 ns/op 0.98
Buffer.compare 32 - diff last byte 56.371 ns/op 57.229 ns/op 0.99
byteArrayEquals 1024 - diff last byte 2.0634 us/op 2.6518 us/op 0.78
Buffer.compare 1024 - diff last byte 73.310 ns/op 81.031 ns/op 0.90
byteArrayEquals 16384 - diff last byte 33.345 us/op 33.825 us/op 0.99
Buffer.compare 16384 - diff last byte 281.38 ns/op 254.75 ns/op 1.10
byteArrayEquals 123687377 - diff last byte 249.42 ms/op 257.30 ms/op 0.97
Buffer.compare 123687377 - diff last byte 6.8335 ms/op 6.9196 ms/op 0.99
byteArrayEquals 32 - random bytes 5.5360 ns/op 5.3910 ns/op 1.03
Buffer.compare 32 - random bytes 62.638 ns/op 62.462 ns/op 1.00
byteArrayEquals 1024 - random bytes 5.2600 ns/op 5.2140 ns/op 1.01
Buffer.compare 1024 - random bytes 61.120 ns/op 60.655 ns/op 1.01
byteArrayEquals 16384 - random bytes 5.2440 ns/op 5.1710 ns/op 1.01
Buffer.compare 16384 - random bytes 62.760 ns/op 60.324 ns/op 1.04
byteArrayEquals 123687377 - random bytes 8.6000 ns/op 8.4300 ns/op 1.02
Buffer.compare 123687377 - random bytes 67.050 ns/op 63.500 ns/op 1.06
regular array get 100000 times 45.530 us/op 43.936 us/op 1.04
wrappedArray get 100000 times 45.099 us/op 44.778 us/op 1.01
arrayWithProxy get 100000 times 15.670 ms/op 14.936 ms/op 1.05
ssz.Root.equals 55.199 ns/op 54.392 ns/op 1.01
byteArrayEquals 54.358 ns/op 54.348 ns/op 1.00
Buffer.compare 11.047 ns/op 11.401 ns/op 0.97
shuffle list - 16384 els 8.6635 ms/op 8.6133 ms/op 1.01
shuffle list - 250000 els 130.68 ms/op 124.99 ms/op 1.05
processSlot - 1 slots 16.520 us/op 17.420 us/op 0.95
processSlot - 32 slots 4.1946 ms/op 3.3153 ms/op 1.27
getEffectiveBalanceIncrementsZeroInactive - 250000 vs - 7PWei 64.553 ms/op 58.732 ms/op 1.10
getCommitteeAssignments - req 1 vs - 250000 vc 2.7080 ms/op 2.6537 ms/op 1.02
getCommitteeAssignments - req 100 vs - 250000 vc 3.9087 ms/op 3.8348 ms/op 1.02
getCommitteeAssignments - req 1000 vs - 250000 vc 4.2717 ms/op 4.1876 ms/op 1.02
findModifiedValidators - 10000 modified validators 520.55 ms/op 556.14 ms/op 0.94
findModifiedValidators - 1000 modified validators 426.48 ms/op 385.66 ms/op 1.11
findModifiedValidators - 100 modified validators 395.75 ms/op 415.96 ms/op 0.95
findModifiedValidators - 10 modified validators 395.42 ms/op 394.53 ms/op 1.00
findModifiedValidators - 1 modified validators 415.29 ms/op 399.70 ms/op 1.04
findModifiedValidators - no difference 412.30 ms/op 410.56 ms/op 1.00
compare ViewDUs 4.9412 s/op 4.2832 s/op 1.15
compare each validator Uint8Array 1.7897 s/op 1.5276 s/op 1.17
compare ViewDU to Uint8Array 1.4221 s/op 1.0780 s/op 1.32
migrate state 1000000 validators, 24 modified, 0 new 883.00 ms/op 787.74 ms/op 1.12
migrate state 1000000 validators, 1700 modified, 1000 new 1.1843 s/op 1.0623 s/op 1.11
migrate state 1000000 validators, 3400 modified, 2000 new 1.5765 s/op 1.2952 s/op 1.22
migrate state 1500000 validators, 24 modified, 0 new 1.0120 s/op 776.07 ms/op 1.30
migrate state 1500000 validators, 1700 modified, 1000 new 1.2797 s/op 1.0834 s/op 1.18
migrate state 1500000 validators, 3400 modified, 2000 new 1.6850 s/op 1.3105 s/op 1.29
RootCache.getBlockRootAtSlot - 250000 vs - 7PWei 5.5100 ns/op 4.2100 ns/op 1.31
state getBlockRootAtSlot - 250000 vs - 7PWei 791.59 ns/op 615.59 ns/op 1.29
computeProposers - vc 250000 11.234 ms/op 8.6634 ms/op 1.30
computeEpochShuffling - vc 250000 141.61 ms/op 122.81 ms/op 1.15
getNextSyncCommittee - vc 250000 178.07 ms/op 159.66 ms/op 1.12
computeSigningRoot for AttestationData 31.590 us/op 28.031 us/op 1.13
hash AttestationData serialized data then Buffer.toString(base64) 2.5710 us/op 2.2450 us/op 1.15
toHexString serialized data 1.6991 us/op 1.0674 us/op 1.59
Buffer.toString(base64) 289.51 ns/op 212.93 ns/op 1.36

by benchmarkbot/action

@philknows philknows added this to the v1.19.0 milestone Mar 19, 2024
Contributor

twoeths commented Mar 27, 2024

This PR is not aligned with the high-level design stated in #6386, which recommends moving shuffling from state-transition to beacon-node. Some benefits of that approach:

  • beacon-node is the consumer of shuffling, so it should just use the current ShufflingCache there and enhance it if needed
  • we want to keep state-transition simple with no async/await
  • it's also more convenient to implement offloading of the next shuffling computation in beacon-node, since there are already a couple of worker implementations there
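
One possible reading of this suggestion, sketched in TypeScript under loud assumptions: state-transition depends only on a synchronous lookup interface, while beacon-node owns the cache and decides how and when to fill it. `ShufflingGetter`, `BeaconNodeShufflingCache`, `nextShuffling`, and the simplified `EpochShuffling` are hypothetical names, not taken from #6386 or from the Lodestar codebase.

```typescript
// All names below are hypothetical and exist only to illustrate the layering.
type Epoch = number;
type RootHex = string;

interface EpochShuffling {
  activeIndices: number[];
}

/** The only thing state-transition would see: a synchronous lookup, no promises. */
interface ShufflingGetter {
  getSync(epoch: Epoch, decisionRoot: RootHex): EpochShuffling | null;
}

/** Stand-in for a state-transition step: stays synchronous and computes inline on a miss. */
function nextShuffling(
  getter: ShufflingGetter,
  epoch: Epoch,
  decisionRoot: RootHex,
  computeInline: () => EpochShuffling
): EpochShuffling {
  return getter.getSync(epoch, decisionRoot) ?? computeInline();
}

/** Stand-in for the beacon-node side, which owns the cache and any worker that fills it. */
class BeaconNodeShufflingCache implements ShufflingGetter {
  private readonly items = new Map<string, EpochShuffling>();

  /** Would be called by beacon-node once a precomputed shuffling is ready. */
  insert(epoch: Epoch, decisionRoot: RootHex, shuffling: EpochShuffling): void {
    this.items.set(`${epoch}:${decisionRoot}`, shuffling);
  }

  getSync(epoch: Epoch, decisionRoot: RootHex): EpochShuffling | null {
    return this.items.get(`${epoch}:${decisionRoot}`) ?? null;
  }
}
```

In this layering the async machinery lives entirely behind `insert`, so state-transition keeps its synchronous signature whether or not the precomputation finished in time.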
