feat: de-duplicate payloads from persisted beacon blocks #6029

matthewkeil · 2023-10-09T22:08:14Z

NOTE: The Sim Merge Test is not going to pass. The container that one of its tests runs in needs to be updated. @g11tech is going to look for the Dockerfile and I will help get it updated and published so that the test passes. The image is based on a pre-Shanghai image that does not have engine_getPayloadBodiesByHashV1 available. This is the image:
https://hub.docker.com/r/g11tech/mergemock

The following items still need to be double checked before moving to ready:

- double check that getBlock works as expected
- get valid deneb block (ask @g11tech how to generate with valid data. perhaps can pull from devnet 9??)
- turn on deneb block unit tests. need to add a value for the fork epoch and spoof valid slots in the mocks
- convert the fixtures to .ssz format to reduce the diff
- prove with a benchmark the need for serialized conversion in packages/beacon-node/src/util/fullOrBlindedBlock.ts
- convert generator serialized conversion to promise and re-test perf as promise
- remove excess codepath from results above
- Fix sim-test eth1 engine mock to support engine_getPayloadBodiesByHashV1
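
For reference, the mock would need to answer a JSON-RPC call of roughly the following shape. This is a minimal sketch, not Lodestar's client code: `buildGetPayloadBodiesByHashRequest`, `missingBodies`, and the type names are hypothetical, and the request and response layout is assumed from the Engine API (Shanghai) specification.

```typescript
// Hypothetical helpers, for illustration only.
interface JsonRpcRequest<P> {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: P;
}

// Response item shape assumed from the Engine API spec: one entry per requested
// hash, or null when the execution client does not know the block.
interface WithdrawalV1 {
  index: string;
  validatorIndex: string;
  address: string;
  amount: string;
}
type ExecutionPayloadBodyV1 = {
  transactions: string[]; // hex-encoded transactions
  withdrawals: WithdrawalV1[] | null; // null for pre-Shanghai blocks
} | null;

function buildGetPayloadBodiesByHashRequest(blockHashes: string[], id = 1): JsonRpcRequest<[string[]]> {
  for (const hash of blockHashes) {
    // Block hashes are 32 bytes: "0x" followed by 64 hex characters.
    if (!/^0x[0-9a-fA-F]{64}$/.test(hash)) {
      throw new Error(`invalid block hash: ${hash}`);
    }
  }
  // The method takes a single parameter, which is itself an array of hashes.
  return {jsonrpc: "2.0", id, method: "engine_getPayloadBodiesByHashV1", params: [blockHashes]};
}

// Returns the requested hashes for which the execution client had no body.
function missingBodies(blockHashes: string[], bodies: ExecutionPayloadBodyV1[]): string[] {
  return blockHashes.filter((_, i) => bodies[i] === null || bodies[i] === undefined);
}
```

A pre-Shanghai execution client does not expose this method at all, which is why the mergemock image has to be rebuilt rather than reconfigured.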

Motivation

Lodestar is saving data that is also saved in the execution client database. In particular we are persisting transactions and withdrawals in the block and blockArchive databases.

Description

Stores blinded blocks in both the hot and cold db. Modifies calls for data retrieval that require the full block, ReqResp and API, to splice in the missing transactions and withdrawals.
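
The splice described above can be sketched as a plain struct merge. The shapes below are hypothetical and heavily simplified (the real types live in @lodestar/types), and `unblindPayload` is an illustrative name, not a function from this PR.

```typescript
// Hypothetical, simplified shapes; only the parts relevant to blinding are shown.
interface Withdrawal {
  index: number;
  validatorIndex: number;
  address: string;
  amount: string;
}

interface PayloadCommon {
  blockHash: string;
  blockNumber: number;
}

// Stored form: the bulky lists are replaced by their 32-byte roots.
interface BlindedPayload extends PayloadCommon {
  transactionsRoot: string;
  withdrawalsRoot: string;
}

// Served form: what ReqResp and the API have to return.
interface FullPayload extends PayloadCommon {
  transactions: string[];
  withdrawals: Withdrawal[];
}

// What the execution client hands back for a given block hash.
interface PayloadBody {
  transactions: string[];
  withdrawals: Withdrawal[];
}

// Rebuild the full payload from the stored blinded payload plus the fetched body.
function unblindPayload(blinded: BlindedPayload, body: PayloadBody): FullPayload {
  return {
    blockHash: blinded.blockHash,
    blockNumber: blinded.blockNumber,
    transactions: body.transactions,
    withdrawals: body.withdrawals,
  };
}
```

The shared fields are copied over unchanged, so only the two lists need to come from the execution client.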

Closes #5671

How to test

Extensive unit and perf tests were run to verify that this works correctly. To reproduce them:

yarn test:unit
yarn benchmark:files packages/beacon-node/test/perf/util/fullOrBlindedBlock.test.ts

github-actions · 2023-10-09T22:36:35Z

Performance Report

✔️ no performance regression detected

Full benchmark results

Benchmark suite	Current: `3d24afb`	Previous: `2b5935a`	Ratio
getPubkeys - index2pubkey - req 1000 vs - 250000 vc	779.92 us/op	796.24 us/op	0.98
getPubkeys - validatorsArr - req 1000 vs - 250000 vc	56.687 us/op	47.995 us/op	1.18
BLS verify - blst-native	1.1833 ms/op	1.0721 ms/op	1.10
BLS verifyMultipleSignatures 3 - blst-native	2.5468 ms/op	2.2878 ms/op	1.11
BLS verifyMultipleSignatures 8 - blst-native	5.4344 ms/op	5.0705 ms/op	1.07
BLS verifyMultipleSignatures 32 - blst-native	20.657 ms/op	18.628 ms/op	1.11
BLS verifyMultipleSignatures 64 - blst-native	38.845 ms/op	36.661 ms/op	1.06
BLS verifyMultipleSignatures 128 - blst-native	77.583 ms/op	73.533 ms/op	1.06
BLS deserializing 10000 signatures	798.46 ms/op	761.13 ms/op	1.05
BLS deserializing 100000 signatures	8.2787 s/op	7.6443 s/op	1.08
BLS verifyMultipleSignatures - same message - 3 - blst-native	1.1668 ms/op	1.1701 ms/op	1.00
BLS verifyMultipleSignatures - same message - 8 - blst-native	1.3228 ms/op	1.2694 ms/op	1.04
BLS verifyMultipleSignatures - same message - 32 - blst-native	2.3607 ms/op	1.9999 ms/op	1.18
BLS verifyMultipleSignatures - same message - 64 - blst-native	3.1314 ms/op	2.9730 ms/op	1.05
BLS verifyMultipleSignatures - same message - 128 - blst-native	5.9647 ms/op	4.9050 ms/op	1.22
BLS aggregatePubkeys 32 - blst-native	24.113 us/op	22.258 us/op	1.08
BLS aggregatePubkeys 128 - blst-native	89.529 us/op	87.332 us/op	1.03
getAttestationsForBlock	39.245 ms/op	27.307 ms/op	1.44
isKnown best case - 1 super set check	361.00 ns/op	299.00 ns/op	1.21
isKnown normal case - 2 super set checks	324.00 ns/op	302.00 ns/op	1.07
isKnown worse case - 16 super set checks	579.00 ns/op	301.00 ns/op	1.92
CheckpointStateCache - add get delete	4.3360 us/op	3.4550 us/op	1.25
validate api signedAggregateAndProof - struct	2.4828 ms/op	2.4074 ms/op	1.03
validate gossip signedAggregateAndProof - struct	2.4176 ms/op	2.3522 ms/op	1.03
validate gossip attestation - vc 640000	1.2036 ms/op	1.1280 ms/op	1.07
batch validate gossip attestation - vc 640000 - chunk 32	147.16 us/op	135.47 us/op	1.09
batch validate gossip attestation - vc 640000 - chunk 64	127.96 us/op	120.70 us/op	1.06
batch validate gossip attestation - vc 640000 - chunk 128	118.47 us/op	109.81 us/op	1.08
batch validate gossip attestation - vc 640000 - chunk 256	110.73 us/op	106.85 us/op	1.04
pickEth1Vote - no votes	886.40 us/op	871.51 us/op	1.02
pickEth1Vote - max votes	10.008 ms/op	10.367 ms/op	0.97
pickEth1Vote - Eth1Data hashTreeRoot value x2048	18.552 ms/op	18.709 ms/op	0.99
pickEth1Vote - Eth1Data hashTreeRoot tree x2048	19.039 ms/op	24.615 ms/op	0.77
pickEth1Vote - Eth1Data fastSerialize value x2048	427.00 us/op	366.67 us/op	1.16
pickEth1Vote - Eth1Data fastSerialize tree x2048	8.8883 ms/op	5.1396 ms/op	1.73
bytes32 toHexString	422.00 ns/op	391.00 ns/op	1.08
bytes32 Buffer.toString(hex)	288.00 ns/op	275.00 ns/op	1.05
bytes32 Buffer.toString(hex) from Uint8Array	426.00 ns/op	378.00 ns/op	1.13
bytes32 Buffer.toString(hex) + 0x	301.00 ns/op	277.00 ns/op	1.09
Object access 1 prop	0.19400 ns/op	0.18100 ns/op	1.07
Map access 1 prop	0.18300 ns/op	0.17800 ns/op	1.03
Object get x1000	5.6220 ns/op	4.8040 ns/op	1.17
Map get x1000	0.49900 ns/op	0.48800 ns/op	1.02
Object set x1000	23.448 ns/op	22.926 ns/op	1.02
Map set x1000	15.988 ns/op	16.235 ns/op	0.98
Return object 10000 times	0.22510 ns/op	0.21350 ns/op	1.05
Throw Error 10000 times	2.8632 us/op	2.6166 us/op	1.09
fastMsgIdFn sha256 / 200 bytes	1.9710 us/op	1.8840 us/op	1.05
fastMsgIdFn h32 xxhash / 200 bytes	317.00 ns/op	281.00 ns/op	1.13
fastMsgIdFn h64 xxhash / 200 bytes	352.00 ns/op	331.00 ns/op	1.06
fastMsgIdFn sha256 / 1000 bytes	6.0650 us/op	5.8950 us/op	1.03
fastMsgIdFn h32 xxhash / 1000 bytes	432.00 ns/op	391.00 ns/op	1.10
fastMsgIdFn h64 xxhash / 1000 bytes	411.00 ns/op	387.00 ns/op	1.06
fastMsgIdFn sha256 / 10000 bytes	52.260 us/op	51.040 us/op	1.02
fastMsgIdFn h32 xxhash / 10000 bytes	1.7790 us/op	1.7250 us/op	1.03
fastMsgIdFn h64 xxhash / 10000 bytes	1.2170 us/op	1.1830 us/op	1.03
send data - 1000 256B messages	12.615 ms/op	11.017 ms/op	1.15
send data - 1000 512B messages	16.711 ms/op	14.457 ms/op	1.16
send data - 1000 1024B messages	23.331 ms/op	21.755 ms/op	1.07
send data - 1000 1200B messages	21.759 ms/op	20.456 ms/op	1.06
send data - 1000 2048B messages	22.679 ms/op	22.924 ms/op	0.99
send data - 1000 4096B messages	22.146 ms/op	23.586 ms/op	0.94
send data - 1000 16384B messages	71.023 ms/op	56.992 ms/op	1.25
send data - 1000 65536B messages	390.46 ms/op	228.16 ms/op	1.71
enrSubnets - fastDeserialize 64 bits	1.2430 us/op	859.00 ns/op	1.45
enrSubnets - ssz BitVector 64 bits	608.00 ns/op	397.00 ns/op	1.53
enrSubnets - fastDeserialize 4 bits	295.00 ns/op	185.00 ns/op	1.59
enrSubnets - ssz BitVector 4 bits	577.00 ns/op	391.00 ns/op	1.48
prioritizePeers score -10:0 att 32-0.1 sync 2-0	120.09 us/op	62.735 us/op	1.91
prioritizePeers score 0:0 att 32-0.25 sync 2-0.25	125.45 us/op	74.079 us/op	1.69
prioritizePeers score 0:0 att 32-0.5 sync 2-0.5	219.87 us/op	105.66 us/op	2.08
prioritizePeers score 0:0 att 64-0.75 sync 4-0.75	379.69 us/op	177.99 us/op	2.13
prioritizePeers score 0:0 att 64-1 sync 4-1	251.26 us/op	197.47 us/op	1.27
array of 16000 items push then shift	1.2690 us/op	1.1970 us/op	1.06
LinkedList of 16000 items push then shift	7.3980 ns/op	6.5390 ns/op	1.13
array of 16000 items push then pop	89.461 ns/op	59.456 ns/op	1.50
LinkedList of 16000 items push then pop	5.9140 ns/op	5.5730 ns/op	1.06
array of 24000 items push then shift	1.9780 us/op	1.7747 us/op	1.11
LinkedList of 24000 items push then shift	6.5740 ns/op	6.2550 ns/op	1.05
array of 24000 items push then pop	116.54 ns/op	77.599 ns/op	1.50
LinkedList of 24000 items push then pop	6.2000 ns/op	5.5830 ns/op	1.11
intersect bitArray bitLen 8	5.4140 ns/op	5.2120 ns/op	1.04
intersect array and set length 8	79.687 ns/op	39.104 ns/op	2.04
intersect bitArray bitLen 128	25.650 ns/op	24.742 ns/op	1.04
intersect array and set length 128	590.12 ns/op	548.66 ns/op	1.08
bitArray.getTrueBitIndexes() bitLen 128	1.3730 us/op	1.1640 us/op	1.18
bitArray.getTrueBitIndexes() bitLen 248	2.1430 us/op	1.8790 us/op	1.14
bitArray.getTrueBitIndexes() bitLen 512	4.8170 us/op	3.3860 us/op	1.42
Buffer.concat 32 items	969.00 ns/op	841.00 ns/op	1.15
Uint8Array.set 32 items	1.9200 us/op	1.7780 us/op	1.08
Set add up to 64 items then delete first	1.6977 us/op	1.6944 us/op	1.00
OrderedSet add up to 64 items then delete first	2.8673 us/op	2.5875 us/op	1.11
Set add up to 64 items then delete last	1.9820 us/op	1.9366 us/op	1.02
OrderedSet add up to 64 items then delete last	2.8114 us/op	2.8678 us/op	0.98
Set add up to 64 items then delete middle	1.8927 us/op	1.9352 us/op	0.98
OrderedSet add up to 64 items then delete middle	3.9762 us/op	4.1304 us/op	0.96
Set add up to 128 items then delete first	3.7339 us/op	3.8383 us/op	0.97
OrderedSet add up to 128 items then delete first	5.8489 us/op	5.9986 us/op	0.98
Set add up to 128 items then delete last	3.6044 us/op	3.6751 us/op	0.98
OrderedSet add up to 128 items then delete last	5.3784 us/op	5.6744 us/op	0.95
Set add up to 128 items then delete middle	3.6255 us/op	3.8124 us/op	0.95
OrderedSet add up to 128 items then delete middle	12.196 us/op	10.538 us/op	1.16
Set add up to 256 items then delete first	10.159 us/op	7.4845 us/op	1.36
OrderedSet add up to 256 items then delete first	20.723 us/op	11.857 us/op	1.75
Set add up to 256 items then delete last	14.394 us/op	7.2076 us/op	2.00
OrderedSet add up to 256 items then delete last	14.941 us/op	10.956 us/op	1.36
Set add up to 256 items then delete middle	9.2424 us/op	7.1477 us/op	1.29
OrderedSet add up to 256 items then delete middle	35.033 us/op	29.953 us/op	1.17
transfer serialized Status (84 B)	1.9670 us/op	1.3980 us/op	1.41
copy serialized Status (84 B)	2.2160 us/op	1.2440 us/op	1.78
transfer serialized SignedVoluntaryExit (112 B)	2.7790 us/op	1.4640 us/op	1.90
copy serialized SignedVoluntaryExit (112 B)	1.7380 us/op	1.3250 us/op	1.31
transfer serialized ProposerSlashing (416 B)	3.1800 us/op	2.3130 us/op	1.37
copy serialized ProposerSlashing (416 B)	3.1890 us/op	2.3160 us/op	1.38
transfer serialized Attestation (485 B)	2.5820 us/op	2.4300 us/op	1.06
copy serialized Attestation (485 B)	2.4820 us/op	2.2810 us/op	1.09
transfer serialized AttesterSlashing (33232 B)	2.4490 us/op	2.3780 us/op	1.03
copy serialized AttesterSlashing (33232 B)	8.0500 us/op	5.0060 us/op	1.61
transfer serialized Small SignedBeaconBlock (128000 B)	2.6790 us/op	2.3340 us/op	1.15
copy serialized Small SignedBeaconBlock (128000 B)	10.634 us/op	11.074 us/op	0.96
transfer serialized Avg SignedBeaconBlock (200000 B)	3.1330 us/op	2.4080 us/op	1.30
copy serialized Avg SignedBeaconBlock (200000 B)	14.004 us/op	18.528 us/op	0.76
transfer serialized BlobsSidecar (524380 B)	4.0240 us/op	2.5510 us/op	1.58
copy serialized BlobsSidecar (524380 B)	141.36 us/op	71.011 us/op	1.99
transfer serialized Big SignedBeaconBlock (1000000 B)	4.5650 us/op	2.5540 us/op	1.79
copy serialized Big SignedBeaconBlock (1000000 B)	256.63 us/op	139.87 us/op	1.83
pass gossip attestations to forkchoice per slot	2.7367 ms/op	2.6015 ms/op	1.05
forkChoice updateHead vc 100000 bc 64 eq 0	616.65 us/op	431.50 us/op	1.43
forkChoice updateHead vc 600000 bc 64 eq 0	2.9726 ms/op	2.9483 ms/op	1.01
forkChoice updateHead vc 1000000 bc 64 eq 0	5.1292 ms/op	4.5480 ms/op	1.13
forkChoice updateHead vc 600000 bc 320 eq 0	3.0412 ms/op	2.6198 ms/op	1.16
forkChoice updateHead vc 600000 bc 1200 eq 0	3.2473 ms/op	2.9296 ms/op	1.11
forkChoice updateHead vc 600000 bc 7200 eq 0	3.7134 ms/op	3.4147 ms/op	1.09
forkChoice updateHead vc 600000 bc 64 eq 1000	10.321 ms/op	9.8509 ms/op	1.05
forkChoice updateHead vc 600000 bc 64 eq 10000	10.271 ms/op	9.9270 ms/op	1.03
forkChoice updateHead vc 600000 bc 64 eq 300000	16.185 ms/op	12.361 ms/op	1.31
computeDeltas 500000 validators 300 proto nodes	4.0232 ms/op	2.8485 ms/op	1.41
computeDeltas 500000 validators 1200 proto nodes	3.8653 ms/op	2.8871 ms/op	1.34
computeDeltas 500000 validators 7200 proto nodes	4.0815 ms/op	2.8293 ms/op	1.44
computeDeltas 750000 validators 300 proto nodes	5.3841 ms/op	4.3903 ms/op	1.23
computeDeltas 750000 validators 1200 proto nodes	6.5005 ms/op	4.2857 ms/op	1.52
computeDeltas 750000 validators 7200 proto nodes	7.0571 ms/op	4.2634 ms/op	1.66
computeDeltas 1400000 validators 300 proto nodes	13.189 ms/op	8.2065 ms/op	1.61
computeDeltas 1400000 validators 1200 proto nodes	12.866 ms/op	8.2389 ms/op	1.56
computeDeltas 1400000 validators 7200 proto nodes	13.438 ms/op	8.2451 ms/op	1.63
computeDeltas 2100000 validators 300 proto nodes	19.904 ms/op	12.618 ms/op	1.58
computeDeltas 2100000 validators 1200 proto nodes	18.771 ms/op	12.617 ms/op	1.49
computeDeltas 2100000 validators 7200 proto nodes	14.523 ms/op	12.940 ms/op	1.12
computeProposerBoostScoreFromBalances 500000 validators	2.9497 ms/op	2.7660 ms/op	1.07
computeProposerBoostScoreFromBalances 750000 validators	3.0279 ms/op	2.7606 ms/op	1.10
computeProposerBoostScoreFromBalances 1400000 validators	2.9826 ms/op	2.8303 ms/op	1.05
computeProposerBoostScoreFromBalances 2100000 validators	2.9215 ms/op	2.8487 ms/op	1.03
altair processAttestation - 250000 vs - 7PWei normalcase	1.8450 ms/op	1.5454 ms/op	1.19
altair processAttestation - 250000 vs - 7PWei worstcase	2.3005 ms/op	2.8214 ms/op	0.82
altair processAttestation - setStatus - 1/6 committees join	117.31 us/op	121.41 us/op	0.97
altair processAttestation - setStatus - 1/3 committees join	246.18 us/op	210.46 us/op	1.17
altair processAttestation - setStatus - 1/2 committees join	332.01 us/op	311.58 us/op	1.07
altair processAttestation - setStatus - 2/3 committees join	423.60 us/op	419.26 us/op	1.01
altair processAttestation - setStatus - 4/5 committees join	538.62 us/op	528.25 us/op	1.02
altair processAttestation - setStatus - 100% committees join	638.22 us/op	673.94 us/op	0.95
altair processBlock - 250000 vs - 7PWei normalcase	7.5212 ms/op	8.9014 ms/op	0.84
altair processBlock - 250000 vs - 7PWei normalcase hashState	29.014 ms/op	25.617 ms/op	1.13
altair processBlock - 250000 vs - 7PWei worstcase	34.934 ms/op	30.510 ms/op	1.14
altair processBlock - 250000 vs - 7PWei worstcase hashState	86.718 ms/op	76.640 ms/op	1.13
phase0 processBlock - 250000 vs - 7PWei normalcase	2.4427 ms/op	2.4757 ms/op	0.99
phase0 processBlock - 250000 vs - 7PWei worstcase	26.857 ms/op	28.806 ms/op	0.93
altair processEth1Data - 250000 vs - 7PWei normalcase	314.66 us/op	307.60 us/op	1.02
getExpectedWithdrawals 250000 eb:1,eth1:1,we:0,wn:0,smpl:15	6.6980 us/op	7.7620 us/op	0.86
getExpectedWithdrawals 250000 eb:0.95,eth1:0.1,we:0.05,wn:0,smpl:219	52.951 us/op	41.841 us/op	1.27
getExpectedWithdrawals 250000 eb:0.95,eth1:0.3,we:0.05,wn:0,smpl:42	13.939 us/op	8.2060 us/op	1.70
getExpectedWithdrawals 250000 eb:0.95,eth1:0.7,we:0.05,wn:0,smpl:18	10.530 us/op	11.508 us/op	0.92
getExpectedWithdrawals 250000 eb:0.1,eth1:0.1,we:0,wn:0,smpl:1020	153.86 us/op	129.05 us/op	1.19
getExpectedWithdrawals 250000 eb:0.03,eth1:0.03,we:0,wn:0,smpl:11777	1.1582 ms/op	678.69 us/op	1.71
getExpectedWithdrawals 250000 eb:0.01,eth1:0.01,we:0,wn:0,smpl:16384	1.0762 ms/op	913.40 us/op	1.18
getExpectedWithdrawals 250000 eb:0,eth1:0,we:0,wn:0,smpl:16384	947.83 us/op	1.0932 ms/op	0.87
getExpectedWithdrawals 250000 eb:0,eth1:0,we:0,wn:0,nocache,smpl:16384	1.9349 ms/op	2.7804 ms/op	0.70
getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,smpl:16384	1.7822 ms/op	1.8019 ms/op	0.99
getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,nocache,smpl:16384	3.3222 ms/op	4.8357 ms/op	0.69
Tree 40 250000 create	230.64 ms/op	252.45 ms/op	0.91
Tree 40 250000 get(125000)	111.00 ns/op	117.30 ns/op	0.95
Tree 40 250000 set(125000)	678.72 ns/op	796.94 ns/op	0.85
Tree 40 250000 toArray()	9.7575 ms/op	21.021 ms/op	0.46
Tree 40 250000 iterate all - toArray() + loop	9.9683 ms/op	21.574 ms/op	0.46
Tree 40 250000 iterate all - get(i)	43.422 ms/op	52.900 ms/op	0.82
MutableVector 250000 create	10.069 ms/op	10.444 ms/op	0.96
MutableVector 250000 get(125000)	5.5920 ns/op	5.9100 ns/op	0.95
MutableVector 250000 set(125000)	202.68 ns/op	216.86 ns/op	0.93
MutableVector 250000 toArray()	2.1597 ms/op	3.1824 ms/op	0.68
MutableVector 250000 iterate all - toArray() + loop	2.5351 ms/op	3.3455 ms/op	0.76
MutableVector 250000 iterate all - get(i)	1.3410 ms/op	1.3468 ms/op	1.00
Array 250000 create	1.9415 ms/op	2.7194 ms/op	0.71
Array 250000 clone - spread	1.0207 ms/op	987.27 us/op	1.03
Array 250000 get(125000)	0.57900 ns/op	0.58100 ns/op	1.00
Array 250000 set(125000)	0.65000 ns/op	0.61600 ns/op	1.06
Array 250000 iterate all - loop	77.744 us/op	78.219 us/op	0.99
effectiveBalanceIncrements clone Uint8Array 300000	12.567 us/op	11.511 us/op	1.09
effectiveBalanceIncrements clone MutableVector 300000	364.00 ns/op	318.00 ns/op	1.14
effectiveBalanceIncrements rw all Uint8Array 300000	169.12 us/op	173.07 us/op	0.98
effectiveBalanceIncrements rw all MutableVector 300000	64.566 ms/op	61.339 ms/op	1.05
phase0 afterProcessEpoch - 250000 vs - 7PWei	80.320 ms/op	78.789 ms/op	1.02
phase0 beforeProcessEpoch - 250000 vs - 7PWei	29.465 ms/op	32.440 ms/op	0.91
altair processEpoch - mainnet_e81889	360.36 ms/op	363.30 ms/op	0.99
mainnet_e81889 - altair beforeProcessEpoch	49.044 ms/op	46.725 ms/op	1.05
mainnet_e81889 - altair processJustificationAndFinalization	16.426 us/op	8.7960 us/op	1.87
mainnet_e81889 - altair processInactivityUpdates	5.1969 ms/op	5.0674 ms/op	1.03
mainnet_e81889 - altair processRewardsAndPenalties	58.014 ms/op	48.836 ms/op	1.19
mainnet_e81889 - altair processRegistryUpdates	3.0420 us/op	1.2150 us/op	2.50
mainnet_e81889 - altair processSlashings	821.00 ns/op	313.00 ns/op	2.62
mainnet_e81889 - altair processEth1DataReset	1.0980 us/op	310.00 ns/op	3.54
mainnet_e81889 - altair processEffectiveBalanceUpdates	924.90 us/op	902.29 us/op	1.03
mainnet_e81889 - altair processSlashingsReset	3.9540 us/op	2.0690 us/op	1.91
mainnet_e81889 - altair processRandaoMixesReset	5.2210 us/op	3.1120 us/op	1.68
mainnet_e81889 - altair processHistoricalRootsUpdate	1.0130 us/op	451.00 ns/op	2.25
mainnet_e81889 - altair processParticipationFlagUpdates	2.1860 us/op	1.2020 us/op	1.82
mainnet_e81889 - altair processSyncCommitteeUpdates	925.00 ns/op	394.00 ns/op	2.35
mainnet_e81889 - altair afterProcessEpoch	84.434 ms/op	85.684 ms/op	0.99
capella processEpoch - mainnet_e217614	1.1862 s/op	1.2378 s/op	0.96
mainnet_e217614 - capella beforeProcessEpoch	215.48 ms/op	198.75 ms/op	1.08
mainnet_e217614 - capella processJustificationAndFinalization	17.229 us/op	7.1530 us/op	2.41
mainnet_e217614 - capella processInactivityUpdates	15.022 ms/op	13.391 ms/op	1.12
mainnet_e217614 - capella processRewardsAndPenalties	269.93 ms/op	238.55 ms/op	1.13
mainnet_e217614 - capella processRegistryUpdates	20.150 us/op	12.678 us/op	1.59
mainnet_e217614 - capella processSlashings	546.00 ns/op	342.00 ns/op	1.60
mainnet_e217614 - capella processEth1DataReset	481.00 ns/op	505.00 ns/op	0.95
mainnet_e217614 - capella processEffectiveBalanceUpdates	3.3342 ms/op	3.1561 ms/op	1.06
mainnet_e217614 - capella processSlashingsReset	1.8430 us/op	1.4570 us/op	1.26
mainnet_e217614 - capella processRandaoMixesReset	3.1780 us/op	2.8830 us/op	1.10
mainnet_e217614 - capella processHistoricalRootsUpdate	565.00 ns/op	444.00 ns/op	1.27
mainnet_e217614 - capella processParticipationFlagUpdates	2.0710 us/op	883.00 ns/op	2.35
mainnet_e217614 - capella afterProcessEpoch	205.91 ms/op	198.49 ms/op	1.04
phase0 processEpoch - mainnet_e58758	339.09 ms/op	346.99 ms/op	0.98
mainnet_e58758 - phase0 beforeProcessEpoch	102.02 ms/op	95.395 ms/op	1.07
mainnet_e58758 - phase0 processJustificationAndFinalization	11.276 us/op	9.6680 us/op	1.17
mainnet_e58758 - phase0 processRewardsAndPenalties	51.597 ms/op	47.363 ms/op	1.09
mainnet_e58758 - phase0 processRegistryUpdates	9.5470 us/op	4.8730 us/op	1.96
mainnet_e58758 - phase0 processSlashings	535.00 ns/op	292.00 ns/op	1.83
mainnet_e58758 - phase0 processEth1DataReset	750.00 ns/op	289.00 ns/op	2.60
mainnet_e58758 - phase0 processEffectiveBalanceUpdates	822.89 us/op	725.29 us/op	1.13
mainnet_e58758 - phase0 processSlashingsReset	1.7730 us/op	1.4060 us/op	1.26
mainnet_e58758 - phase0 processRandaoMixesReset	3.5710 us/op	1.6370 us/op	2.18
mainnet_e58758 - phase0 processHistoricalRootsUpdate	536.00 ns/op	284.00 ns/op	1.89
mainnet_e58758 - phase0 processParticipationRecordUpdates	3.0330 us/op	2.4030 us/op	1.26
mainnet_e58758 - phase0 afterProcessEpoch	66.681 ms/op	67.389 ms/op	0.99
phase0 processEffectiveBalanceUpdates - 250000 normalcase	946.39 us/op	887.51 us/op	1.07
phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5	1.8728 ms/op	1.1074 ms/op	1.69
altair processInactivityUpdates - 250000 normalcase	16.533 ms/op	15.463 ms/op	1.07
altair processInactivityUpdates - 250000 worstcase	20.013 ms/op	14.796 ms/op	1.35
phase0 processRegistryUpdates - 250000 normalcase	7.4040 us/op	3.4560 us/op	2.14
phase0 processRegistryUpdates - 250000 badcase_full_deposits	370.23 us/op	246.78 us/op	1.50
phase0 processRegistryUpdates - 250000 worstcase 0.5	109.62 ms/op	104.27 ms/op	1.05
altair processRewardsAndPenalties - 250000 normalcase	59.657 ms/op	50.910 ms/op	1.17
altair processRewardsAndPenalties - 250000 worstcase	49.411 ms/op	53.187 ms/op	0.93
phase0 getAttestationDeltas - 250000 normalcase	5.8217 ms/op	5.0912 ms/op	1.14
phase0 getAttestationDeltas - 250000 worstcase	5.3283 ms/op	4.9587 ms/op	1.07
phase0 processSlashings - 250000 worstcase	1.4411 ms/op	1.5576 ms/op	0.93
altair processSyncCommitteeUpdates - 250000	104.70 ms/op	104.16 ms/op	1.01
BeaconState.hashTreeRoot - No change	319.00 ns/op	288.00 ns/op	1.11
BeaconState.hashTreeRoot - 1 full validator	117.50 us/op	104.33 us/op	1.13
BeaconState.hashTreeRoot - 32 full validator	1.6118 ms/op	1.3449 ms/op	1.20
BeaconState.hashTreeRoot - 512 full validator	18.321 ms/op	13.384 ms/op	1.37
BeaconState.hashTreeRoot - 1 validator.effectiveBalance	131.66 us/op	143.56 us/op	0.92
BeaconState.hashTreeRoot - 32 validator.effectiveBalance	2.1911 ms/op	1.9368 ms/op	1.13
BeaconState.hashTreeRoot - 512 validator.effectiveBalance	22.247 ms/op	25.543 ms/op	0.87
BeaconState.hashTreeRoot - 1 balances	115.05 us/op	129.02 us/op	0.89
BeaconState.hashTreeRoot - 32 balances	1.0578 ms/op	867.73 us/op	1.22
BeaconState.hashTreeRoot - 512 balances	9.1723 ms/op	10.921 ms/op	0.84
BeaconState.hashTreeRoot - 250000 balances	172.43 ms/op	168.14 ms/op	1.03
aggregationBits - 2048 els - zipIndexesInBitList	11.615 us/op	9.3830 us/op	1.24
regular array get 100000 times	30.129 us/op	30.568 us/op	0.99
wrappedArray get 100000 times	29.528 us/op	30.557 us/op	0.97
arrayWithProxy get 100000 times	9.3758 ms/op	10.008 ms/op	0.94
ssz.Root.equals	244.00 ns/op	235.00 ns/op	1.04
byteArrayEquals	237.00 ns/op	225.00 ns/op	1.05
shuffle list - 16384 els	4.5248 ms/op	4.3707 ms/op	1.04
shuffle list - 250000 els	67.088 ms/op	64.246 ms/op	1.04
processSlot - 1 slots	17.512 us/op	13.379 us/op	1.31
processSlot - 32 slots	2.3465 ms/op	2.5688 ms/op	0.91
getEffectiveBalanceIncrementsZeroInactive - 250000 vs - 7PWei	40.174 ms/op	40.429 ms/op	0.99
getCommitteeAssignments - req 1 vs - 250000 vc	2.2714 ms/op	2.3571 ms/op	0.96
getCommitteeAssignments - req 100 vs - 250000 vc	3.4412 ms/op	3.7899 ms/op	0.91
getCommitteeAssignments - req 1000 vs - 250000 vc	3.7478 ms/op	4.2685 ms/op	0.88
RootCache.getBlockRootAtSlot - 250000 vs - 7PWei	5.7200 ns/op	6.5600 ns/op	0.87
state getBlockRootAtSlot - 250000 vs - 7PWei	595.89 ns/op	1.2790 us/op	0.47
computeProposers - vc 250000	6.5371 ms/op	8.1641 ms/op	0.80
computeEpochShuffling - vc 250000	69.796 ms/op	75.813 ms/op	0.92
getNextSyncCommittee - vc 250000	114.74 ms/op	137.36 ms/op	0.84
computeSigningRoot for AttestationData	21.256 us/op	24.369 us/op	0.87
hash AttestationData serialized data then Buffer.toString(base64)	1.2717 us/op	1.2953 us/op	0.98
toHexString serialized data	799.09 ns/op	834.04 ns/op	0.96
Buffer.toString(base64)	172.89 ns/op	158.43 ns/op	1.09

by benchmarkbot/action

dapplion · 2023-10-11T18:47:45Z

Some todos:

- prove with a benchmark the need for packages/beacon-node/src/util/fullOrBlindedBlock.ts. Compare the difference between the two points below, and unless there's a massive difference, just do the simpler strategy of merging structs. After doing the benchmarks, persist the results in code, add a new comment to this PR with the results, and delete the losing codepath.
  - deserialize, merge structs, serialize
  - serialize exec payload, merge as bytes
- convert the fixtures to .ssz format to reduce the diff
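
To make the second strategy concrete, here is a minimal sketch of a byte-level merge: it swaps one region of a serialized buffer for new bytes without decoding anything else. `spliceBytes` and the offsets passed to it are hypothetical, not code from this PR. Real SSZ containers also store 4-byte offsets for variable-size fields, so any offsets pointing past the spliced region would need rewriting as well.

```typescript
/**
 * Hypothetical helper: returns a copy of `source` in which the byte range
 * [start, end) is replaced by `replacement`. Nothing outside that range is decoded.
 */
function spliceBytes(source: Uint8Array, start: number, end: number, replacement: Uint8Array): Uint8Array {
  if (start < 0 || end < start || end > source.length) {
    throw new RangeError("invalid splice range");
  }
  const out = new Uint8Array(source.length - (end - start) + replacement.length);
  out.set(source.subarray(0, start), 0); // bytes before the region
  out.set(replacement, start); // the new region
  out.set(source.subarray(end), start + replacement.length); // bytes after the region
  return out;
}
```

The struct-merging strategy avoids this offset bookkeeping at the cost of a full deserialize and serialize round trip.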

matthewkeil · 2023-10-11T22:31:12Z

convert the fixtures to .ssz format to reduce the diff

@dapplion I am working on that conversion now. When I did it this evening the unit tests for capella broke. I had logic in the mock loading file to convert the mainnet mocks to work with the minimal testing preset. Tonight I manually converted them to the minimal config before saving them serialized, but something needs debugging. I was modifying the raw JSON before using @lodestar/types because of how LODESTAR_PRESET flows into the ssz types, but when I converted by hand something was not converted correctly. I'll find it in the AM and push the changes.

prove with a benchmark the need for packages/beacon-node/src/util/fullOrBlindedBlock.ts. Compare the difference between the two points below, and unless there's a massive difference, just do the simpler strategy of merging structs. After doing the benchmarks, persist the results in code, add a new comment to this PR with the results, and delete the losing codepath.

deserialize, merge structs, serialize

serialize exec payload, merge as bytes

I remembered chatting with you about this a couple weeks ago and got it ready for you :) Apologies, I should have brought this up when we spoke before standup. I forgot with all the other stuff we chatted about.

I posted those results from the perf test on the issue:
#5671 (comment)

I copied the results in that comment below so they are part of this PR too for visibility.

The test file is in this commit of this PR:
4112724

The results suggest that serialized conversion is the way to go, since it is a couple of orders of magnitude faster, but I would love to get your opinion before I delete the losing codepath. The perf test is in the commit linked above so you can check the methodology. I will leave the perf test as part of this PR if serialized conversion is how you want to go.

I was thinking about removing the generator function and just returning a promise after our discussion before standup. I will rerun the perf tests like that to compare and post them in a comment below tomorrow once I sort out the mock serialization bug.

  fullOrBlindedBlock
    BlindedOrFull to full
      phase0
        ✔ phase0 to full - deserialize first                                  9646.737 ops/s    103.6620 us/op        -       4856 runs  0.617 s
        ✔ phase0 to full - convert serialized                                  2865330 ops/s    349.0000 ns/op        -    1740003 runs  0.909 s
      altair
        ✔ altair to full - deserialize first                                  5352.431 ops/s    186.8310 us/op        -       2699 runs  0.697 s
        ✔ altair to full - convert serialized                                  2967359 ops/s    337.0000 ns/op        -    1598138 runs  0.808 s
      bellatrix
        ✔ bellatrix to full - deserialize first                               3991.474 ops/s    250.5340 us/op        -       1208 runs  0.553 s
        ✔ bellatrix to full - convert serialized                               2463054 ops/s    406.0000 ns/op        -     879455 runs  0.505 s
      capella
        ✔ capella to full - deserialize first                                 3660.175 ops/s    273.2110 us/op        -       1846 runs  0.783 s
        ✔ capella to full - convert serialized                                 2364066 ops/s    423.0000 ns/op        -    2012155 runs   1.21 s
      deneb
        ✔ deneb to full - deserialize first                                   3621.915 ops/s    276.0970 us/op        -       1827 runs  0.806 s
        ✔ deneb to full - convert serialized                                   2398082 ops/s    417.0000 ns/op        -     506726 runs  0.303 s
    BlindedOrFull to blinded
      phase0
        ✔ phase0 to blinded - deserialize first                               12937.95 ops/s    77.29200 us/op        -       4230 runs  0.404 s
        ✔ phase0 to blinded - convert serialized                           1.000000e+7 ops/s    100.0000 ns/op        -    3120420 runs  0.606 s
      altair
        ✔ altair to blinded - deserialize first                               7185.198 ops/s    139.1750 us/op        -       2170 runs  0.439 s
        ✔ altair to blinded - convert serialized                               9900990 ops/s    101.0000 ns/op        -    2588758 runs  0.505 s
      bellatrix
        ✔ bellatrix to blinded - deserialize first                            100.1679 ops/s    9.983241 ms/op        -         76 runs   1.26 s
        ✔ bellatrix to blinded - convert serialized                           92.22430 ops/s    10.84313 ms/op        -        117 runs   1.77 s
      capella
        ✔ capella to blinded - deserialize first                              45.29530 ops/s    22.07735 ms/op        -         48 runs   1.58 s
        ✔ capella to blinded - convert serialized                             43.09465 ops/s    23.20474 ms/op        -         50 runs   1.66 s
      deneb
        ✔ deneb to blinded - deserialize first                                45.42834 ops/s    22.01269 ms/op        -         51 runs   1.63 s
        ✔ deneb to blinded - convert serialized                               46.20545 ops/s    21.64247 ms/op        -         46 runs   1.50 s
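
The ratios behind these numbers can be worked out directly from the per-op times. The snippet below is a small sketch that copies the figures from the output above (converted to nanoseconds); `speedup` and the `rows` table are names introduced here for illustration.

```typescript
interface Row {
  name: string;
  deserializeFirstNs: number;
  convertSerializedNs: number;
}

// Time per operation, copied from the benchmark output and converted to ns.
const rows: Row[] = [
  {name: "phase0 to full", deserializeFirstNs: 103662, convertSerializedNs: 349},
  {name: "altair to full", deserializeFirstNs: 186831, convertSerializedNs: 337},
  {name: "bellatrix to full", deserializeFirstNs: 250534, convertSerializedNs: 406},
  {name: "capella to full", deserializeFirstNs: 273211, convertSerializedNs: 423},
  {name: "deneb to full", deserializeFirstNs: 276097, convertSerializedNs: 417},
  {name: "phase0 to blinded", deserializeFirstNs: 77292, convertSerializedNs: 100},
  {name: "altair to blinded", deserializeFirstNs: 139175, convertSerializedNs: 101},
  {name: "bellatrix to blinded", deserializeFirstNs: 9983241, convertSerializedNs: 10843130},
  {name: "capella to blinded", deserializeFirstNs: 22077350, convertSerializedNs: 23204740},
  {name: "deneb to blinded", deserializeFirstNs: 22012690, convertSerializedNs: 21642470},
];

// A value above 1 means the serialized conversion was faster.
function speedup(deserializeFirstNs: number, convertSerializedNs: number): number {
  return deserializeFirstNs / convertSerializedNs;
}

for (const row of rows) {
  const ratio = speedup(row.deserializeFirstNs, row.convertSerializedNs);
  console.log(`${row.name}: ${ratio.toFixed(2)}x`);
}
```

On these figures the to-full conversions and the pre-merge to-blinded conversions come out hundreds of times faster when working on serialized bytes, while the to-blinded conversions from bellatrix onward are roughly even between the two approaches.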

matthewkeil · 2023-10-11T22:49:22Z

Some todos:

prove with a benchmark the need for packages/beacon-node/src/util/fullOrBlindedBlock.ts. Compare the difference between the two points below, and unless there's a massive difference, just do the simpler strategy of merging structs. After doing the benchmarks, persist the results in code, add a new comment to this PR with the results, and delete the losing codepath.

deserialize, merge structs, serialize

serialize exec payload, merge as bytes

convert the fixtures to .ssz format to reduce the diff

btw @dapplion I added these, and one for converting from generator and retesting perf, to the checklist above

dapplion · 2023-10-12T06:29:29Z

@matthewkeil thanks! the differences in performance do not justify doing the complex byte manipulation IMO. Just merge structs.

g11tech

just blocking right now for a deeper review as it might affect some of the critical paths i want to double check + with the produceblockv3 PR types and helpers...

will also dig into the mergemock requirements

matthewkeil · 2023-10-17T12:51:03Z

⚠️ Performance Alert ⚠️

Possible performance regression was detected for some benchmarks. Benchmark result of this commit is worse than the previous benchmark result exceeding threshold.

Benchmark suite Current: 49ab90f Previous: 3a6702e Ratio
forkChoice updateHead vc 600000 bc 64 eq 300000 72.026 ms/op 18.857 ms/op 3.82
Full benchmark results

by benchmarkbot/action

@dapplion there is a benchmark regression after removing the serialized blinding/unblinding. There is not a big difference in time for the blinding process and the increase in the updateHead test seems higher than it should be.

g11tech · 2023-10-30T07:54:08Z

packages/beacon-node/src/util/fullOrBlindedBlock.ts

+  return firstByte - readExtraDataOffsetAt > 92;
+}
+
+// same as isBlindedSignedBeaconBlock but without type narrowing


what is the issue with type narrowing?
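
For context on the distinction the code comment draws, here is a minimal sketch of a TypeScript type predicate next to a plain boolean check. The `FullBody` and `BlindedBody` shapes and all function names are hypothetical stand-ins, not the helpers in this PR, and the sketch does not claim to show why the non-narrowing variant was added.

```typescript
interface FullBody {
  transactions: string[];
}
interface BlindedBody {
  transactionsRoot: string;
}
type FullOrBlindedBody = FullBody | BlindedBody;

// Type predicate: a true result narrows `body` to BlindedBody in the caller,
// and a false result narrows it to FullBody.
function isBlindedBody(body: FullOrBlindedBody): body is BlindedBody {
  return "transactionsRoot" in body;
}

// Same runtime check, but the compiler learns nothing from the result.
function isBlindedBodyNoNarrowing(body: FullOrBlindedBody): boolean {
  return "transactionsRoot" in body;
}

function describeBody(body: FullOrBlindedBody): string {
  if (isBlindedBody(body)) {
    return `blinded, root ${body.transactionsRoot}`; // body: BlindedBody
  }
  return `full, ${body.transactions.length} transactions`; // body: FullBody
}

function describeWithoutNarrowing(body: FullOrBlindedBody): string {
  // `body` stays FullOrBlindedBody in both branches here, so reading
  // body.transactionsRoot directly would not type-check.
  return isBlindedBodyNoNarrowing(body) ? "blinded" : "full";
}
```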

g11tech · 2023-10-30T07:59:30Z

packages/beacon-node/src/api/impl/beacon/blocks/utils.ts

    canonical,
    header: {
-      message: blockToHeader(config, block.message),


it's cleaner to extend blockToHeader to accept a full or blinded block,

also, the root above can then be calculated from the returned header, as the hash tree root of the block header ... it should be more efficient since the body won't be merkleized twice
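
A minimal sketch of that suggestion, under stated assumptions: the types are simplified stand-ins, the hash functions are injected rather than taken from the ssz types, and `headerAndRoot` is a hypothetical name. It relies on the fact that a blinded body and its full counterpart share the same hash tree root, because the execution payload header commits to the transactions and withdrawals by root.

```typescript
type Root = string; // hex stand-in for a 32-byte root

interface BeaconBlockHeader {
  slot: number;
  proposerIndex: number;
  parentRoot: Root;
  stateRoot: Root;
  bodyRoot: Root;
}

// Generic over the body type, so full and blinded blocks are both accepted.
interface FullOrBlindedBlock<Body> {
  slot: number;
  proposerIndex: number;
  parentRoot: Root;
  stateRoot: Root;
  body: Body;
}

type HashFn<T> = (value: T) => Root;

function blockToHeader<Body>(block: FullOrBlindedBlock<Body>, hashBody: HashFn<Body>): BeaconBlockHeader {
  return {
    slot: block.slot,
    proposerIndex: block.proposerIndex,
    parentRoot: block.parentRoot,
    stateRoot: block.stateRoot,
    bodyRoot: hashBody(block.body), // the only place the body is merkleized
  };
}

// The block root equals the hash tree root of its header, so the body root
// computed above is reused instead of hashing the whole block again.
function headerAndRoot<Body>(
  block: FullOrBlindedBlock<Body>,
  hashBody: HashFn<Body>,
  hashHeader: HashFn<BeaconBlockHeader>
): {header: BeaconBlockHeader; root: Root} {
  const header = blockToHeader(block, hashBody);
  return {header, root: hashHeader(header)};
}
```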

codecov · 2023-10-30T13:56:55Z

We're currently processing your upload. This comment will be updated when the results are available.

g11tech requested changes Oct 16, 2023

matthewkeil force-pushed the mkeil/dedup-beacon-block-2 branch from 697870f to 158cb40 Compare October 22, 2023 22:13

g11tech force-pushed the mkeil/dedup-beacon-block-2 branch from 8d804ea to 2fb8f48 Compare October 30, 2023 07:49

g11tech changed the base branch from unstable to deneb-builder October 30, 2023 07:50

g11tech reviewed Oct 30, 2023

Base automatically changed from deneb-builder to unstable October 30, 2023 13:42

dapplion and others added 17 commits October 30, 2023 19:14

WIP with FullOrBlindedSignedBeaconBlock

7fe92f5

fix(repositories): block type only change to FullOrBlinded

1d46517

Draft consumer points

c99030e

fix: update IBeaconChain and clear error in beaconBlocksByRoot

e119753

wip: TODOs. remove commit

9d438a4

fix: rough out to get building

5f32489

feat: move utils to fullOrBlindedBlock.ts

758005e

feat: build out fullOrBlindedBlock.ts

afd645c

feat: implement consumers of fullOrBlindedBlock.ts

375f002

test: fullOrBlindedBlock and add mocks

6793707

feat: block reassembly as generator

821ea79

feat: make getBlockHeaders blinded safe

bee326c

test: debugging fullOrBlindedBlock

617dc22

fix: bellatrix reassembleBlindedOrFullToFullBytes

f9ed0b0

fix: capella reassembleBlindedOrFullToFullBytes

0d27985

test: debug fullOrBlindedBlock.ts

4f1ad7b

test: fix mocks and get mocks/block modifying for minimal and mainnet

358c49a

matthewkeil added 28 commits October 30, 2023 19:14

chore: fix check-types error

b834932

fix: use toHexString for executionEngine call

08f7e9a

fix: optional chain to fix .timestamp access error

9754368

refactor: ensure "blindedOrFullBlock" to function names

ac5e40a

test: add byteArrayEqualsThrowBadIndexes

a208b24

refactor: standardize function names

d96a4e4

test: perf of not converting serialized

4e69106

refactor: update comment to be more clear

af94fcb

refactor(beacon-node): make fullOrBlinded function names all consistent

6aaa250

fix: add any in beacon-node/test/unit/fullOrBlindedBlock.test for che…

ee6953b

…ck-types

refactor: remove commented mocks that were updated

89a1e48

chore: fix lint error

909b7b6

refactor: simplify TransactionAndWithdrawals type

d17136b

fix: remove comments making sure there are tests for fullOrBlinded in…

03fa617

… reqresp

refactor: export chainConfig for use in tests that consume the mockBl…

0aa9e14

…ocks

test: update workflow with new containers

37147ad

test: revert workflow to original container versions

844e6cb

feat: throw Eth1Error for invalid payload body

077bcca

fix: convert all json to ssz fixtures

9e89430

fix: remove serialized blind/unblind code paths

bc8e6b7

chore: fix lint error

22d7ae6

refactor: clean up fullOrBlindedBlock

0d08933

fix: debugged test:sim:multifork

a07b3ab

fix: debug broken test:sim:deneb

62619a9

fix(workflows): comment out broken sim tests for now

9489edd

fix: name conflict from rebase to unstable

554c33d

test: update fullOrBlinded.test to vitest imports

9a60b32

fix: edge case for GENESIS_SLOT as post-altair block

adfa2f5

g11tech force-pushed the mkeil/dedup-beacon-block-2 branch from 2fb8f48 to adfa2f5 Compare October 30, 2023 13:45
