2i repair #24

Open
martinsumner opened this issue Aug 9, 2018 · 2 comments

Comments

@martinsumner
Owner

martinsumner commented Aug 9, 2018

The current kv_index_hashtree has a facility to keep an AAE hashtree of 2i terms. It keeps a single tree of all 2i terms in the vnode - using the special IndexN = {0, 0}.

This hashtree cannot be compared with any other vnode, as the terms are not put into separate hashtrees based on the IndexN. No two vnodes have exactly the same 2i terms.

This 2i terms tree is used only by the admin function to repair 2i. That function folds over all the objects in the vnode to produce an AAE tree of 2i terms, so that this tree can be compared with the special {0, 0} tree kept for that vnode by kv_index_hashtree. The comparison is used to discover index postings in need of repair in the backend (and to make the backend changes that repair them).

This has a number of issues:

  • There's work to maintain the {0, 0} AAE tree always, even though it is only used in repair.
  • There's no way of proactively discovering the need to repair. Discovering that indexes are misaligned, and on which nodes they need to be repaired, is left to the developer/operator.

In the NHS, 2i repair has never been run in production.

In a leveled backend world, there may be better ways of doing this:

  • It should be possible to fold over all index entries in the Ledger on a vnode, building a TicTac tree dynamically (maintaining one makes no sense if we need one rarely).
  • It should be possible to fold over the objects in the Journal on the vnode building a TicTac tree for all indexes as represented by the object state.
  • If an index is broken, then wiping the whole keystore and restarting the store will rebuild the index along with the rest of the keystore.
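
The first two points can be sketched in a few lines of Python. This is only an illustration, not leveled's implementation: the segment count, the hash function, the per-segment XOR merge, and every name here (`build_tree`, `postings_from_objects`, `dirty_segments`, the toy data) are assumptions made for the example. One tree is built by folding over the index postings as the Ledger holds them, the other by folding over the postings implied by object state in the Journal, and the two are compared segment by segment.

```python
import hashlib

SEGMENTS = 1024  # hypothetical tree size, chosen only for this sketch


def posting_hash(bucket, key, field, term):
    """Hash one index posting to a 64-bit integer."""
    data = repr((bucket, key, field, term)).encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def build_tree(postings):
    """Fold over postings, XOR-ing each hash into its segment.

    XOR makes the result independent of fold order, so the tree can be
    built on demand from whatever order the store yields entries in.
    """
    tree = [0] * SEGMENTS
    for posting in postings:
        h = posting_hash(*posting)
        tree[h % SEGMENTS] ^= h
    return tree


def postings_from_objects(objects):
    """Derive the postings that *should* exist from object state."""
    for (bucket, key), index_specs in objects.items():
        for field, term in index_specs:
            yield (bucket, key, field, term)


def dirty_segments(tree_a, tree_b):
    """Return the segment ids where the two trees disagree."""
    return [i for i, (a, b) in enumerate(zip(tree_a, tree_b)) if a != b]


# Toy data: object state (Journal side) and the postings actually
# present in the keystore (Ledger side), with one posting lost.
journal_objects = {
    ("b", "k1"): [("age_int", 30), ("name_bin", "smith")],
    ("b", "k2"): [("age_int", 41)],
}
ledger_postings = [
    ("b", "k1", "age_int", 30),
    ("b", "k1", "name_bin", "smith"),
]

journal_tree = build_tree(postings_from_objects(journal_objects))
ledger_tree = build_tree(ledger_postings)
dirty = dirty_segments(journal_tree, ledger_tree)
```

In this toy case `dirty` holds the single segment that the lost posting for `k2` hashes to. A second fold restricted to the dirty segments would then be needed to identify the actual postings to repair.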

On this basis, it would be better to ignore the {0,0} special 2i AAE index requirement in the future. If any user wants to persist with leveldb, and wants to be able to run repair_2i, they may stick with legacy aae. However, we need a function to allow a vnode/index combination to verify consistency between ledger and journal within leveled - and a friendly way to rebuild a keystore.

Validating consistency between indexes on a vnode could be scheduled through a vnode poke, so the operator could ask through configuration for all vnode/indexes to be checked at a regular interval.

@martinsumner
Owner Author

martinsumner/leveled#160

@martinsumner
Owner Author

martinsumner commented Aug 9, 2018

Just as a note, it should be remembered what the purpose of AAE for 2i is: to resolve issues of disk corruption. In any 2i-supporting backend, the index entries will ultimately end up being stored in different parts of the disk system to the actual objects - so potentially indexes may get corrupted (i.e. lose entries) while the objects that are the source of the index entries are unaffected. So AAE of objects, even with tree rebuilding, may not detect inconsistency in index results.

In leveled, if a portion of a file (that contains index entries) fails CRC checking, the database will continue as if the entries weren't there (rather than crash). Similar situations can arise with leveldb. Hence a need to separately resolve anti-entropy in index entries.
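
A toy illustration of that failure mode follows. It is a sketch under stated assumptions, not a model of leveled or leveldb: the two dictionaries standing in for object storage and index storage, and all names in it, are hypothetical. It shows why hashes computed over objects are unchanged when an index entry is silently dropped, even though a 2i query now returns an incomplete result.

```python
import hashlib


def object_hash(value: bytes) -> str:
    """Hash of an object's value, as an object-level AAE tree might hold."""
    return hashlib.sha256(value).hexdigest()


# Hypothetical stand-ins: objects and index entries stored separately.
objects = {"k1": b"alice", "k2": b"bob"}
index = {
    ("name_bin", "alice"): {"k1"},
    ("name_bin", "bob"): {"k2"},
}

object_hashes_before = {k: object_hash(v) for k, v in objects.items()}

# Simulate a block of index entries failing its CRC check and being
# skipped on read: the entry simply disappears, nothing crashes.
del index[("name_bin", "bob")]

object_hashes_after = {k: object_hash(v) for k, v in objects.items()}

# Object-level anti-entropy sees no difference...
object_aae_sees_change = object_hashes_before != object_hashes_after
# ...but the index query for "bob" no longer finds k2.
query_result = index.get(("name_bin", "bob"), set())
```

Here `object_aae_sees_change` is `False` while `query_result` is empty, which is the gap a separate check of index entries is meant to close.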
