irmin-pack.unix: idea to improve GC crash consistency on startup #2082

Open
metanivek opened this issue Sep 9, 2022 · 2 comments

Comments

@metanivek
Member

From #1971 (comment)

Assuming the prefix contains the commit:

  • at startup, if the files are not in a consistent state:
    • check for the previous generation's files;
    • ignore the suffix;
    • reconstruct the control file from the mapping and an fstat of the dict;
    • restart the node from the commit in the prefix.
  • the guarantee is that, once the first GC has succeeded, we always have a set of files (prefix, mapping and dict) in a consistent state from which we can build the control file and start with a fresh suffix.
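
One way to read the startup procedure above, as a minimal sketch. The types `generation_files` and `action` and the function `decide` are hypothetical names made up for illustration; they are not part of irmin-pack.

```ocaml
(* Hypothetical model of the files found for one GC generation. *)
type generation_files = {
  generation : int;
  has_prefix : bool;
  has_mapping : bool;
  has_dict : bool;
}

(* Hypothetical outcomes of the startup check. *)
type action =
  | Open_normally
  | Rebuild_control_with_fresh_suffix of int (* generation to restart from *)
  | Cannot_recover

(* A generation is usable when prefix, mapping and dict are all present. *)
let complete g = g.has_prefix && g.has_mapping && g.has_dict

(* If the store is inconsistent, fall back to the previous generation,
   ignoring the suffix entirely. *)
let decide ~consistent ~previous =
  if consistent then Open_normally
  else
    match previous with
    | Some g when complete g -> Rebuild_control_with_fresh_suffix g.generation
    | _ -> Cannot_recover
```

The `Cannot_recover` case corresponds to a crash before any GC has succeeded, which the guarantee above does not cover.
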
@metanivek
Member Author

From a recent discussion with NL, this would likely work for them, but we would need a way to communicate the latest good commit hash to them so that they can roll back lib_store as well.

@Ngoguey42
Contributor

I have another design to propose for solving crash (in)consistency problems.

Step 1. Change the "index" from "Index" to an append-only file

Since irmin-pack supports minimal indexing, the index grows by 41 bytes per block, which is 43 MB per year. None of the Index machinery is useful anymore when using the minimal indexing mode.
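
The 43 MB figure can be checked with a short calculation. The 30-second block time below is an assumption that is not stated in this comment; it is the value that reproduces the quoted number.

```ocaml
(* Index growth per block, as quoted above. *)
let bytes_per_block = 41

(* Assumption: one block every 30 seconds. *)
let block_time_s = 30

let blocks_per_year = 365 * 24 * 3600 / block_time_s
let bytes_per_year = bytes_per_block * blocks_per_year

let () =
  Printf.printf "%d blocks/year, %.1f MB/year\n" blocks_per_year
    (float_of_int bytes_per_year /. 1e6)
(* prints "1051200 blocks/year, 43.1 MB/year" *)
```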

Up to now we wanted to keep Index in case minimal indexing failed, so that Tezos could fall back on the non-minimal indexing strategy. Minimal indexing has proven itself; there is no need to keep that failsafe.

During our initial discussions on implementing irmin-pack's lower layer, it seemed to us that dropping support for non-minimal indexing would greatly simplify the implementation (we would still support stores that used non-minimal indexing in the past).

At open time, the control file now lets us detect the case where the suffix is ahead of the dict. However, we still cannot detect the cases where the index is ahead of the suffix (we either raise Pack_store.Invalid_read or worse).

All in all, we can now consider migrating away from Index.

For the index, we could use a storage scheme similar to the dict's. It would be an append-only file that is fully loaded in memory when opening the store, that could be garbage collected, and whose end offset could be remembered by the control file.
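
A minimal sketch of such a file format, kept in memory as a string so it runs on its own. The entry layout (32-byte key, 8-byte offset, 1-byte kind) is an assumption chosen only so that an entry totals 41 bytes; it is not taken from irmin-pack, and `encode` and `load` are hypothetical names.

```ocaml
(* Assumed fixed-size entry: 32-byte key, 8-byte offset, 1-byte kind. *)
let key_len = 32
let entry_len = key_len + 8 + 1

(* Serialise one entry; appending it to the file is the only write needed. *)
let encode ~key ~offset ~kind =
  assert (String.length key = key_len);
  let b = Bytes.create entry_len in
  Bytes.blit_string key 0 b 0 key_len;
  Bytes.set_int64_be b key_len offset;
  Bytes.set b (key_len + 8) kind;
  Bytes.to_string b

(* Load every complete entry that ends at or before [end_offset], the value
   remembered by the control file. Bytes past it are ignored. *)
let load ~end_offset data =
  let bytes = Bytes.of_string data in
  let tbl = Hashtbl.create 16 in
  let limit = min end_offset (Bytes.length bytes) in
  let rec go pos =
    if pos + entry_len <= limit then begin
      let key = Bytes.sub_string bytes pos key_len in
      let offset = Bytes.get_int64_be bytes (pos + key_len) in
      let kind = Bytes.get bytes (pos + key_len + 8) in
      Hashtbl.replace tbl key (offset, kind);
      go (pos + entry_len)
    end
  in
  go 0;
  tbl
```

Because the loader stops at the recorded end offset, a partial entry left at the tail by a crash is never read.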

For the GC, we would include a "generation" integer in the index filename. We would GC using the "surgery" technique. We would have to handle newies the same way as for the Irmin 3.4 suffix.
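
As an illustration of the generation-in-filename idea, here is a small sketch. The `store.<generation>.index` naming pattern and all three function names are assumptions made for this example.

```ocaml
(* Hypothetical naming scheme: store.<generation>.index *)
let index_filename ~generation = Printf.sprintf "store.%d.index" generation

(* Recover the generation from a filename, or None if it is not an index file. *)
let generation_of_filename name =
  try Scanf.sscanf name "store.%d.index%!" (fun g -> Some g)
  with Scanf.Scan_failure _ | Failure _ | End_of_file -> None

(* Highest generation among the files in a directory listing. *)
let latest_generation names =
  match List.filter_map generation_of_filename names with
  | [] -> None
  | g :: gs -> Some (List.fold_left max g gs)
```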

Step 2. Raise Recovery_needed when opening a deeply corrupted store

Currently, with irmin 3.4, when opening a store where the control file is ahead of the dict or the suffix, we raise Inconsistent_store. We currently provide no way to recover these stores.

Following step 1, we would also be able to detect these cases for the index.

For the dict, index and suffix alike, we could then raise Recovery_needed and implement a recovery method. See the next step.
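
A minimal sketch of that open-time check, assuming the control file records one end offset per file. The record types, the field names and the `string` payload of the exception are hypothetical.

```ocaml
(* Hypothetical exception; the payload names the file that is behind. *)
exception Recovery_needed of string

(* End offsets recorded in the control file (assumed fields). *)
type control = { dict_end : int; index_end : int; suffix_end : int }

(* Actual file sizes, e.g. obtained with fstat. *)
type sizes = { dict_size : int; index_size : int; suffix_size : int }

(* The control file is "ahead" of a file when it records an end offset
   beyond that file's size. A file longer than recorded is fine: the
   extra bytes are simply ignored. *)
let check_consistency control sizes =
  let check name ~recorded ~actual =
    if recorded > actual then raise (Recovery_needed name)
  in
  check "dict" ~recorded:control.dict_end ~actual:sizes.dict_size;
  check "index" ~recorded:control.index_end ~actual:sizes.index_size;
  check "suffix" ~recorded:control.suffix_end ~actual:sizes.suffix_size
```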

Step 3. A new recovery method

Following the two previous steps, we could implement a recovery method that:

  • Decides a new end offset for the index file
  • Decides a new end offset for the suffix file
  • Decides a new end offset for the dict file
  • Overwrites the old control file

The algorithm would search the index for the valid entry with the highest offset. An entry in the index is valid if:

  1. it points to a valid offset in the suffix/prefix/lower,
  2. all the objects preceding that valid object have valid pointers in the dict.
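
The search can be sketched as follows. The two validity conditions are passed in as predicates because checking them depends on store internals; `entry`, `last_valid_entry` and the predicate names are hypothetical.

```ocaml
(* Hypothetical in-memory view of an index entry: the object offset it
   points to, and the position in the index file where the entry ends. *)
type entry = { offset : int; index_end : int }

(* Return the valid entry with the highest offset, if any.
   [points_to_valid_object] stands for condition 1 and
   [dict_ok_before] for condition 2. *)
let last_valid_entry ~points_to_valid_object ~dict_ok_before entries =
  List.fold_left
    (fun best e ->
      if points_to_valid_object e.offset && dict_ok_before e.offset then
        begin match best with
        | Some b when b.offset >= e.offset -> best
        | _ -> Some e
        end
      else best)
    None entries
```

The `index_end` of the returned entry would be a natural candidate for the new end offset of the index file; how the suffix and dict offsets follow from it is left open here.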

We would also be able to drop the existing "reconstruct index" recovery method.

Step 0. a. Migrating stores that have only ever used minimal indexing

The simplest solution would be a migration that happens at open_rw time of the file manager. It would traverse Index and convert it to the very first index file. A crash during that migration would not be destructive.

A second solution would be to make the existing Index read-only and use the new index file scheme for the new index entries. The migrated irmin-pack stores would keep the Index directory forever. GC would work normally for the new entries.

A third solution, on top of the second solution, would be to migrate the data out of Index during the first GC. We could then discard Index after the finalise of that first GC.

Step 0. b. Migrating stores that have used non-minimal indexing

We would stick to the "second solution" of the previous section.

Discriminating between cases a. and b. would be possible by looking at the existing control file. We already store this information in its current form.
