This repository has been archived by the owner on May 24, 2022. It is now read-only.

Upgrade : kvdb-*, trie-db, memory-db, hash-db, parking_lot #332

Draft · wants to merge 12 commits into base: dev

Conversation

rakita
Contributor

@rakita rakita commented Mar 22, 2021

No description provided.

@rakita
Contributor Author

rakita commented Mar 23, 2021

First test, using the new rocksdb without setting a limit on the WAL, and producing a graph.
I started syncing from a pre-upgraded db at block #12004675; it was 399GB. I am unsure about the pruning setting, but I think it was the default.
I set pruning to 1024 to take more memory and used this pull request to sync; I got a peak of 588GB.
Values in the graph are in GB with 450 subtracted. It does not start from the beginning, but it covers a good amount of passive and active syncing. What is interesting is that memory fluctuated but didn't go wild:

The graph covers 2021-03-22_15-55-52 to 2021-03-23_08-00-52. The first fifth of the graph is active syncing (catching up to the chain) and the rest is passive syncing.
This fluctuation matches what Etherscan measured when they used v3.0.1, which had the same db version: https://etherscan.io/chartsync/chaindefault

I will set the max WALs option and make another graph.
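
For reference, the "max wal" knob discussed here appears to correspond to RocksDB's `max_total_wal_size` option (an assumption, since this PR wraps it behind kvdb-rocksdb). A minimal sketch with the Rust `rocksdb` crate, with the path and the 4 GiB figure as placeholders:

```rust
use rocksdb::{Options, DB};

fn main() -> Result<(), rocksdb::Error> {
    let mut opts = Options::default();
    opts.create_if_missing(true);

    // Cap the total size of the write-ahead logs. Once the limit is hit,
    // RocksDB flushes the memtables pinning the oldest WAL files so they can
    // be deleted, which should bound the fluctuation seen in the graphs above.
    opts.set_max_total_wal_size(4u64 * 1024 * 1024 * 1024); // 4 GiB, placeholder value

    let _db = DB::open(&opts, "/tmp/wal-limit-demo")?; // placeholder path
    Ok(())
}
```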

@mdben1247
Contributor

Just to muddy the waters a bit... :) I've come to suspect lately that the performance gains seen with this upgrade may not be solely due to rocksdb. Looking at the openethereum code, caching and memory overlays are used extensively, so it seems unlikely that db lookups would be responsible for such a speed gain during normal block syncing, when most of the relevant state will be cached anyway. I suspect some combination of the memory-db, hash-db and trie-db upgrades may be responsible for a large part of the increased performance. I admit this is speculation and quite untested. I haven't had much luck in untangling those libraries, since they are tied together via their dependency on parity-util-mem.

@rakita
Contributor Author

rakita commented Mar 24, 2021

All tests start syncing from the old (current) version of the db, synced with v3.2.0. All graph values are in GB with 350 subtracted to show more focused data.
The first graph shows how new_db behaves without the max_wal option. I left it in active sync for about ~3h; I wanted better data on the fluctuation:

In the next graph, I restarted the db and started syncing with v3.2.0 to get comparison data. After that I switched to new_db with max_wal set to 4GB; the switch between the two can be seen clearly in the graph. The fluctuation disappears because of the max_wal option. The whole graph covers about ~16h of active syncing.

My current db size is 661GB.
I see a few candidates that could cause this:
1. We are missing some rocksdb option.
2. The rocksdb transition from the old db to the new db has a price and will stabilize after some time (I will check the newest rocksdb just to be sure it is not some kind of bug).
3. There is a bug in the pull request or the upgraded libraries that prevents cleanup of data.
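
To help separate candidates 2 and 3, one could compare RocksDB's own size estimates against the on-disk footprint. A minimal sketch with the Rust `rocksdb` crate, assuming the node is stopped (or the check is run against a copy of the db); the path is a placeholder:

```rust
use rocksdb::{Options, DB};

fn main() -> Result<(), rocksdb::Error> {
    let path = "/path/to/chains/ethereum/db"; // placeholder path
    let opts = Options::default();

    // OpenEthereum keeps its data in several column families, so list them
    // first and open everything read-only so nothing is modified.
    let cfs = DB::list_cf(&opts, path)?;
    let db = DB::open_cf_for_read_only(&opts, path, &cfs, false)?;

    // "estimate-live-data-size" approximates the bytes of live (reachable) data,
    // while "total-sst-files-size" is the actual on-disk SST footprint.
    // A large gap between the two points at space not yet reclaimed by compaction
    // (candidate 2) rather than data that is never cleaned up (candidate 3).
    for name in &cfs {
        let cf = db.cf_handle(name).expect("column family was just opened");
        for prop in ["rocksdb.estimate-live-data-size", "rocksdb.total-sst-files-size"] {
            if let Some(value) = db.property_value_cf(cf, prop)? {
                println!("{} {}: {} bytes", name, prop, value);
            }
        }
    }
    Ok(())
}
```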

@rakita rakita mentioned this pull request Mar 24, 2021
@rakita
Contributor Author

rakita commented Mar 25, 2021

I upgraded the rocksdb lib to the newest v6.17.3 just to test whether this is some kind of bug on rocksdb's side. I got the same behavior.
The graph shows around ~18h of running with max_wal set to 4GB; it stops rising because we reached the highest block and stopped active syncing. Values have 350 subtracted to keep them in focus. My db size was 750GB.

As for the slowdown in the middle, I did some testing in parallel on the same machine, and that is probably the reason it slowed down.

It constantly rises; it seems that we are not pruning old state, or something like that. At least we can deduce that this is related to state, since block/header information is not that big.

I didn't try syncing from the beginning with the new db to see how it behaves, which would check whether this is related to the transition from the old db to the new one.
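
One way to test the "transition has a price and will stabilize" theory without a full resync would be to force a full compaction on a stopped node and see whether the on-disk size drops. A rough sketch with the Rust `rocksdb` crate; the path is a placeholder and this assumes exclusive access to the db directory:

```rust
use rocksdb::{Options, DB};

fn main() -> Result<(), rocksdb::Error> {
    let path = "/path/to/chains/ethereum/db"; // placeholder path
    let opts = Options::default();

    // The database uses several column families; open them all.
    let cfs = DB::list_cf(&opts, path)?;
    let db = DB::open_cf(&opts, path, &cfs)?;

    // Compact the full key range of every column family. If the directory
    // shrinks substantially afterwards, the growth was space waiting to be
    // reclaimed by compaction rather than state that is never pruned.
    let full_range: Option<&[u8]> = None;
    for name in &cfs {
        if let Some(cf) = db.cf_handle(name) {
            db.compact_range_cf(cf, full_range, full_range);
        }
    }
    Ok(())
}
```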

@ilia-falconix

I don't see the environment specs or how the node was executed. Is it in a container? Could you please share the specs? Thanks!

@rakita
Contributor Author

rakita commented Mar 25, 2021

@ilia-falconix, it is a bare-metal machine, but in the end the specs do not affect db insertion/deletion and should not influence how the db gets to 750GB. In my opinion something else is in question.

@mdben1247
Contributor

Here's my experience: I started syncing with the rocksdb-upgraded client at block #11764407; the db was very close to 400GB. It's a fatdb, but otherwise pretty vanilla. It's now fully synced at 436GB, not counting the WAL.

@mdben1247
Contributor

I used 40GB for max_wal.

@ytrezq

ytrezq commented Mar 25, 2021

related? #335

@ytrezq

ytrezq commented Mar 29, 2021

I'm sure there could be a performance increase if the underlying RocksDB version is 6.5.2 (facebook/rocksdb@e3a82bb), which was pushed one year ago.

It requires a very recent kernel, has to be enabled at build time (it is disabled by default), and has to be enabled at runtime through the ROCKSDB_USE_IO_URING environment variable.
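
For illustration, the runtime part could look like the sketch below, assuming a RocksDB build compiled with io_uring support and that the ROCKSDB_USE_IO_URING variable works as described above (it can equally be exported in the shell before starting the node); the path is a placeholder:

```rust
use rocksdb::{Options, DB};

fn main() -> Result<(), rocksdb::Error> {
    // Opt in to io_uring before the database is opened. This only has an effect
    // if the linked RocksDB was built with io_uring enabled and the kernel is
    // recent enough; otherwise it should be ignored (assumption based on the
    // comment above).
    std::env::set_var("ROCKSDB_USE_IO_URING", "1");

    let mut opts = Options::default();
    opts.create_if_missing(true);
    let _db = DB::open(&opts, "/tmp/io-uring-demo")?; // placeholder path
    Ok(())
}
```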

@ytrezq

ytrezq commented Mar 29, 2021

I would not be that enthusiastic about increasing the cache size. Increasing the cache size is only profitable if per-thread performance is faster than the disk cache.

Larger caches take more CPU time to search. I actually saw increased block-import performance by decreasing the database cache size on an Intel Celeron and an aarch64 CPU, despite the cache still leaving free memory.

@mdben1247
Contributor

I've been running a version with the upgraded kvdb, memory-db... libs for the past couple of months, and I have to say I am getting a lot of OOM kills, at least one every few days. Somehow the memory limits are not respected. So this should definitely not yet be considered for inclusion in the main tree.

I think it would be interesting to try upgrading only the parity libs while keeping the current rocksdb version. I am trying to do that right now, but it's a mess, since all these libs are very much entangled with one another.

@ytrezq

ytrezq commented May 26, 2021

> I've been running a version with the upgraded kvdb, memory-db... libs for the past couple of months, and I have to say I am getting a lot of OOM kills, at least one every few days. Somehow the memory limits are not respected. So this should definitely not yet be considered for inclusion in the main tree.
>
> I think it would be interesting to try upgrading only the parity libs while keeping the current rocksdb version. I am trying to do that right now, but it's a mess, since all these libs are very much entangled with one another.

I disagree. In a couple of months (especially with the planned higher block gas limit), the syncing speed of OpenEthereum full archive nodes with a 6GHz CPU will fall behind the blockchain's growth speed, even when storing the whole database on tmpfs. It has already been cut in half since last December on my 4.5GHz CPU, going down to 1.20 blocks or 15Mgas per second.
I think it's therefore important to benefit from RocksDB's big improvements like io_uring, even if it requires using higher amounts of RAM. At that point, it's about keeping the software working at all.

@mdben1247 how much memory do you have? On my side I didn't notice unusual rises beyond a certain multiple of the memory limit. Did you try setting even lower memory limits?

@mdben1247
Contributor

It's not a problem that it uses a lot of RAM. The problem is that the limits set in the configuration were not honored. The client is configured to use 25GB; it gets killed when its actual usage exceeds 60GB.

Other than that, I agree it would be great to have all the new fancy features of rocksdb. There are also optimizations to be had outside of the db, such as caching and a flat database layout (as in turbogeth), but that's another story and a lot of work to get there.

Anyhow, this particular PR does not seem production-ready to me.

@ytrezq

ytrezq commented May 26, 2021

@mdben1247 on the other hand, with the current version the database often ends up corrupted (invalid checksum) just from letting OpenEthereum run. This isn't happening in the new version.

On the other hand, the limit does end up being honoured, just at several times the configured value. Did you try with 1GiB?

@yorickdowne

This has been running well for about a day, merged with 3.2.6. There's a PR for that merge for rakita's repo.
I have not tried this on mainnet, just on Kovan so far. RAM use has been very moderate for that first day.

@ytrezq

ytrezq commented May 28, 2021

> This has been running well for about a day, merged with 3.2.6. There's a PR for that merge for rakita's repo.
> I have not tried this on mainnet, just on Kovan so far. RAM use has been very moderate for that first day.

And anyway, even if it did happen, I find it better to have out-of-memory errors with the new version than databases corrupted by invalid writes with the old version.

@ytrezq

ytrezq commented May 29, 2021

@mdben1247 see also #416

@bryanapperson

Are any of you running this on mainnet yet?

@yorickdowne

At this point I am unlikely to attempt it on mainnet.

@mdben1247
Contributor

I have been running a slightly modified version, rebased onto 3.2, on mainnet in a heavy-load environment for the past two months. It runs well and fast, but there are the memory issues I reported: peak memory usage is much higher (i.e. at least 2-3x more) than the limits specified in the config.

https://github.com/mdben1247/openethereum/tree/mdb_kvdb_32
