New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Splitstore running with LOTUS_CHAINSTORE_SPLITSTORE_COLDSTORETYPE=discard
does a lot of unnecessary work.
#11778
Comments
This is a good point. I've made a simple change that should make using non-Delete supporting blockstores work for this configuration: #11880. Let me know if there are any performance benefits noticed and I can open this up for real. It's possible to spend ~5-10x the work on figuring out exactly how much of the critical section code we can skip but I doubt there will be enough general motivation for me to go that route. If anyone figured that out wrote it up and PRed it I'd definitely review. Also note I'm not deploying the optimization in the case of moving gc's triggered by oversized stores for simplicity and because the tradeoff should fall more towards redundant deleting in that case. |
I believe that in discard mode things can be rearranged so there is no critical section at all, so the above estimate is kinda inaccurate 😅 What I am suggesting, and consider to be safe is:
The advantage here is that at no point do you have a state where some blocks will be lost, you also don't spend ~7GiB maintaining marksets, and you don't walk the sets multiple times. Not saying this has to be done right then and there, just pushing back that the complex safety dance is warranted at all in light of MovingGC. |
Nice that sounds like a well thought out and reasonable change. It's probably not something I'm going to get to for a while but I'd review anything you or anyone else sent my way in the meantime. If you've got any preliminary estimates of the wasted work please put them here to help in motivating the change. |
@ZenGround0 this is taken from a live box, my notes prefixed with
|
Very useful information, thank you. Is this with or without the patch in #11880? Im curious to see how much if at all that helped. |
@ZenGround0 you missed my writeup there: #11880 (comment) Yes, the patch is removed above. |
Checklist
Latest release
, the most recent RC(release canadiate) for the upcoming release or the dev branch(master), or have an issue updating to any of these.Lotus component
Lotus Version
Repro Steps
A typical state-only node (i.e. what an SP would run) is usually configured as:
The expectation here is that once the GC loop awakens, it:
However a typical log from such a setup looks like the one below. Note various calculations of coldsets culimnating with
purged cold objects
lines. This is quite unexpected as the original blockstore is deleted right after this anyway 🤔Describe the Bug
I ran into this while experimenting with different blockstores that do not implement the
Delete*
blockstore methods. Started digging because in the describeddiscard
configuration there should have been no way to enter this codepath: https://github.com/filecoin-project/lotus/blob/v1.26.0/blockstore/splitstore/splitstore_compact.go#L1438-L1449, yet it clearly does get hit.I suspect there is a missing noop somewhere, as the original focus was on making the
active coldstore
case work. Given thediscard
setup is way more prevalent, ask is to review this when time permits ( wink-wink when @ZenGround0 is back 😻)Not porposing a patch as I can't readily follow the code as written...
Logging Information
The text was updated successfully, but these errors were encountered: