This repository has been archived by the owner on May 24, 2022. It is now read-only.

Upgrade : kvdb-*, trie-db, memory-db, hash-db, parking_lot #332

Draft · wants to merge 12 commits into base: dev

Conversation

rakita
Contributor

@rakita rakita commented Mar 22, 2021

No description provided.

@rakita
Contributor Author

rakita commented Mar 23, 2021

First test, using the new rocksdb without setting a limit on the WAL, and producing a graph.
I started syncing from a pre-upgraded db at block #12004675; it was 399GB. I am unsure about the pruning setting, but I think it was the default.
I set pruning to 1024 to take more memory and used this pull request to sync; I got a peak of 588GB.
Values in the graph are in GB with 450 subtracted. It does not start from the beginning, but it covers a good amount of passive and active syncing. What is interesting is that memory fluctuated but didn't go wild:

The graph covers 2021-03-22_15-55-52 to 2021-03-23_08-00-52. The first fifth of the graph is active syncing (catching up to the chain) and the rest is passive syncing.
This fluctuation matches what Etherscan measured when they used v3.0.1, which had the same db version: https://etherscan.io/chartsync/chaindefault

I will set the max WALs option and make another graph.
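
For reference, the "max wal" knob discussed here appears to correspond to RocksDB's `max_total_wal_size` option (an assumption, since this PR wraps it behind kvdb-rocksdb). A minimal sketch with the Rust `rocksdb` crate, with the path and the 4 GiB figure as placeholders:

```rust
use rocksdb::{Options, DB};

fn main() -> Result<(), rocksdb::Error> {
    let mut opts = Options::default();
    opts.create_if_missing(true);

    // Cap the total size of the write-ahead logs. Once the limit is hit,
    // RocksDB flushes the memtables pinning the oldest WAL files so they can
    // be deleted, which should bound the fluctuation seen in the graphs above.
    opts.set_max_total_wal_size(4u64 * 1024 * 1024 * 1024); // 4 GiB, placeholder value

    let _db = DB::open(&opts, "/tmp/wal-limit-demo")?; // placeholder path
    Ok(())
}
```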

@mdben1247
Contributor

Just to muddy the waters a bit... :) I've come to suspect lately that the performance gains seen with this upgrade may not be solely due to rocksdb. Looking at the openethereum code, caching and memory overlays are used extensively, so it seems unlikely that db lookups would be responsible for such a speed gain during normal block syncing, when most of the relevant state will be cached anyway. I suspect some combination of the memory-db, hash-db and trie-db upgrades may be responsible for a large part of the increased performance. I admit this is speculation and quite untested. I haven't had much luck in untangling those libraries, since they are tied together via their dependency on parity-util-mem.

@rakita
Contributor Author

rakita commented Mar 24, 2021

All tests start syncing from the old (current) version of the db, synced with v3.2.0. All graph values are in GB with 350 subtracted to show more focused data.
The first graph shows how new_db behaves without the max_wal option. I left it in active sync for about ~3h; I wanted better data on the fluctuation:

In the next graph, I restarted the db and started syncing with v3.2.0 to get comparison data. After that I switched to new_db with max_wal set to 4GB; the switch between the two can be seen clearly in the graph. The fluctuation disappears because of the max_wal option. The whole graph covers about ~16h of active syncing.

My current db size is 661GB.
I see a few candidates that could cause this:
1. We are missing some rocksdb option.
2. The rocksdb transition from the old db to the new db has a price and will stabilize after some time (I will check the newest rocksdb just to be sure it is not some kind of bug).
3. There is a bug in the pull request or the upgraded libraries that prevents cleanup of data.
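
To help separate candidates 2 and 3, one could compare RocksDB's own size estimates against the on-disk footprint. A minimal sketch with the Rust `rocksdb` crate, assuming the node is stopped (or the check is run against a copy of the db); the path is a placeholder:

```rust
use rocksdb::{Options, DB};

fn main() -> Result<(), rocksdb::Error> {
    let path = "/path/to/chains/ethereum/db"; // placeholder path
    let opts = Options::default();

    // OpenEthereum keeps its data in several column families, so list them
    // first and open everything read-only so nothing is modified.
    let cfs = DB::list_cf(&opts, path)?;
    let db = DB::open_cf_for_read_only(&opts, path, &cfs, false)?;

    // "estimate-live-data-size" approximates the bytes of live (reachable) data,
    // while "total-sst-files-size" is the actual on-disk SST footprint.
    // A large gap between the two points at space not yet reclaimed by compaction
    // (candidate 2) rather than data that is never cleaned up (candidate 3).
    for name in &cfs {
        let cf = db.cf_handle(name).expect("column family was just opened");
        for prop in ["rocksdb.estimate-live-data-size", "rocksdb.total-sst-files-size"] {
            if let Some(value) = db.property_value_cf(cf, prop)? {
                println!("{} {}: {} bytes", name, prop, value);
            }
        }
    }
    Ok(())
}
```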

@rakita rakita mentioned this pull request Mar 24, 2021
@rakita
Contributor Author

rakita commented Mar 25, 2021

I upgraded the rocksdb lib to the newest v6.17.3 just to test whether this is some kind of bug on rocksdb's side. I got the same behavior.
The graph shows around ~18h of running with max_wal set to 4GB; it stops rising because we reached the highest block and stopped active syncing. Values have 350 subtracted to keep them in focus. My db size was 750GB.

As for the slowdown in the middle, I did some testing in parallel on the same machine, and that is probably the reason it slowed down.

It constantly rises; it seems that we are not pruning old state, or something like that. At least we can deduce that this is related to state, since block/header information is not that big.

I didn't try syncing from the beginning with the new db to see how it behaves, which would check whether this is related to the transition from the old db to the new one.
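
One way to test the "transition has a price and will stabilize" theory without a full resync would be to force a full compaction on a stopped node and see whether the on-disk size drops. A rough sketch with the Rust `rocksdb` crate; the path is a placeholder and this assumes exclusive access to the db directory:

```rust
use rocksdb::{Options, DB};

fn main() -> Result<(), rocksdb::Error> {
    let path = "/path/to/chains/ethereum/db"; // placeholder path
    let opts = Options::default();

    // The database uses several column families; open them all.
    let cfs = DB::list_cf(&opts, path)?;
    let db = DB::open_cf(&opts, path, &cfs)?;

    // Compact the full key range of every column family. If the directory
    // shrinks substantially afterwards, the growth was space waiting to be
    // reclaimed by compaction rather than state that is never pruned.
    let full_range: Option<&[u8]> = None;
    for name in &cfs {
        if let Some(cf) = db.cf_handle(name) {
            db.compact_range_cf(cf, full_range, full_range);
        }
    }
    Ok(())
}
```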

@ilia-falconix

I don't see the environment specs or how the node was executed. Is it in a container? Could you please share the specs? Thanks!

@rakita
Contributor Author

rakita commented Mar 25, 2021

@ilia-falconix, it is a bare-metal machine, but in the end the specs do not affect db insertion/deletion and should not influence how the db gets to 750GB. In my opinion something else is in question.

@mdben1247
Contributor

Here's my experience: I started syncing with the rocksdb-upgraded client at block #11764407; the db was very close to 400GB. It's a fatdb, but otherwise pretty vanilla. It's now fully synced at 436GB, not counting the WAL.

@mdben1247
Contributor

I used 40GB for max_wal.

@ytrezq

ytrezq commented Mar 25, 2021

related? #335

@ytrezq

ytrezq commented Mar 29, 2021

I'm sure there could be a performance increase if the underlying RocksDB version is 6.5.2 (facebook/rocksdb@e3a82bb), which was pushed one year ago.

It requires a very recent kernel, has to be enabled at build time (it is disabled by default), and has to be enabled at runtime through the ROCKSDB_USE_IO_URING environment variable.
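
For illustration, the runtime part could look like the sketch below, assuming a RocksDB build compiled with io_uring support and that the ROCKSDB_USE_IO_URING variable works as described above (it can equally be exported in the shell before starting the node); the path is a placeholder:

```rust
use rocksdb::{Options, DB};

fn main() -> Result<(), rocksdb::Error> {
    // Opt in to io_uring before the database is opened. This only has an effect
    // if the linked RocksDB was built with io_uring enabled and the kernel is
    // recent enough; otherwise it should be ignored (assumption based on the
    // comment above).
    std::env::set_var("ROCKSDB_USE_IO_URING", "1");

    let mut opts = Options::default();
    opts.create_if_missing(true);
    let _db = DB::open(&opts, "/tmp/io-uring-demo")?; // placeholder path
    Ok(())
}
```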

@ytrezq

ytrezq commented Mar 29, 2021

I would not be that enthusiastic about increasing the cache size. Increasing the cache size is only profitable if per-thread performance is faster than the disk cache.

Larger caches take more CPU time to search. I actually saw increased block-import performance by decreasing the database cache size on an Intel Celeron and an aarch64 CPU, despite the cache still leaving free memory.

@mdben1247
Contributor

I've been running a version with the upgraded kvdb, memory-db... libs for the past couple of months, and I have to say I am getting a lot of OOM kills, at least one every few days. Somehow the memory limits are not respected. So this should definitely not yet be considered for inclusion in the main tree.

I think it would be interesting to try upgrading only the parity libs while keeping the current rocksdb version. I am trying to do that right now, but it's a mess, since all these libs are very much entangled with one another.

@ytrezq

ytrezq commented May 26, 2021

> I've been running a version with the upgraded kvdb, memory-db... libs for the past couple of months, and I have to say I am getting a lot of OOM kills, at least one every few days. Somehow the memory limits are not respected. So this should definitely not yet be considered for inclusion in the main tree.
>
> I think it would be interesting to try upgrading only the parity libs while keeping the current rocksdb version. I am trying to do that right now, but it's a mess, since all these libs are very much entangled with one another.

I disagree. In a couple of months (especially with the planned higher block gas limit), the syncing speed of OpenEthereum full archive nodes with a 6GHz CPU will fall behind the blockchain's growth speed, even when storing the whole database on tmpfs. It has already been cut in half since last December on my 4.5GHz CPU, going down to 1.20 blocks or 15Mgas per second.
I think it's therefore important to benefit from RocksDB's big improvements like io_uring, even if it requires using higher amounts of RAM. At that point, it's about keeping the software working at all.

@mdben1247 how much memory do you have? On my side I didn't notice unusual rises beyond a certain multiple of the memory limit. Did you try setting even lower memory limits?

@mdben1247
Contributor

It's not a problem that it uses a lot of RAM. The problem is that the limits set in the configuration were not honored. The client is configured to use 25GB; it gets killed when its actual usage exceeds 60GB.

Other than that, I agree it would be great to have all the new fancy features of rocksdb. There are also optimizations to be had outside of the db, such as caching and a flat database layout (as in turbogeth), but that's another story and a lot of work to get there.

Anyhow, this particular PR does not seem production-ready to me.

@ytrezq

ytrezq commented May 26, 2021

@mdben1247 on the other hand, with the current version the database often ends up corrupted (invalid checksum) just from letting OpenEthereum run. This isn't happening in the new version.

On the other hand, the limit does end up being honoured, just at several times the configured value. Did you try with 1GiB?

@yorickdowne

This has been running well for about a day, merged with 3.2.6. There's a PR for that merge for rakita's repo.
I have not tried this on mainnet, just on Kovan so far. RAM use has been very moderate for that first day.

@ytrezq

ytrezq commented May 28, 2021

> This has been running well for about a day, merged with 3.2.6. There's a PR for that merge for rakita's repo.
> I have not tried this on mainnet, just on Kovan so far. RAM use has been very moderate for that first day.

And anyway, even if it did happen, I find it better to have out-of-memory errors with the new version than databases corrupted by invalid writes with the old version.

@ytrezq

ytrezq commented May 29, 2021

@mdben1247 see also #416

@bryanapperson

Are any of you running this on mainnet yet?

@yorickdowne

At this point I am unlikely to attempt it on mainnet.

@mdben1247
Contributor

I have been running a slightly modified version, rebased onto 3.2, on mainnet in a heavy-load environment for the past two months. It runs well and fast, but there are the memory issues I reported: peak memory usage is much higher (i.e. at least 2-3x more) than the limits specified in the config.

https://github.com/mdben1247/openethereum/tree/mdb_kvdb_32
