add binary format that supports partial reading and self-verification as a storage format #8

Open
darkskygit opened this issue Aug 23, 2023 · 2 comments

Comments

@darkskygit
Collaborator

darkskygit commented Aug 23, 2023

ybinary v1 is a binary format optimized for one-time network transmission.

It can only be read as a whole, and there is no way to tell whether the binary is damaged until the decoding process fails.

For a detailed analysis, please refer to this review:

toeverything/OctoBase#383 (comment)

We need to design a binary format that supports partial reading and self-verification, so that CRDT state can be stored permanently and robustly.

@Brooooooklyn
Collaborator

Following @dmonad's advice, we can store the checksum information in the y-binary itself.

@dmonad
Collaborator

dmonad commented Aug 30, 2023

You can create a new (custom) binary "v1-with-checksum" by concatenating the checksum and the binary update. E.g.

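import * as encoding from 'lib0/encoding'

// ChecksumType and checksum() are placeholders for whichever checksum scheme is chosen
doc.on('update', update => {
  const v1UpdateWithChecksum = encoding.encode(encoder => {
    encoding.writeUint8(encoder, ChecksumType) // one-byte tag identifying the checksum algorithm
    encoding.writeVarUint8Array(encoder, checksum(update)) // length-prefixed checksum bytes
    encoding.writeVarUint8Array(encoder, update) // length-prefixed v1 update
  })
})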

I imagine that most users don't want to verify every single update and re-request the data from another source if the update has been manipulated. So maybe you store an error-correcting CRC checksum instead of something like SHA or Rabin.
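
To make the read side of this idea concrete, here is a minimal, dependency-free sketch of wrapping an update with a CRC-32 and verifying it before use. Everything named here is an assumption for illustration and not part of Yjs, lib0, or this project: the `CHECKSUM_TYPE_CRC32` tag value, the `wrapUpdate` / `unwrapUpdate` helpers, and the fixed layout (a 1-byte type, a 4-byte big-endian CRC, then the update bytes), which stands in for the variable-length encoding used in the snippet above. The sketch only detects corruption; it does not attempt to repair it.

```javascript
// CRC-32 (reflected polynomial 0xEDB88320), table-driven.
const CRC_TABLE = new Uint32Array(256)
for (let n = 0; n < 256; n++) {
  let c = n
  for (let k = 0; k < 8; k++) {
    c = (c & 1) ? (0xEDB88320 ^ (c >>> 1)) : (c >>> 1)
  }
  CRC_TABLE[n] = c >>> 0
}

function crc32 (bytes) {
  let crc = 0xFFFFFFFF
  for (let i = 0; i < bytes.length; i++) {
    crc = CRC_TABLE[(crc ^ bytes[i]) & 0xFF] ^ (crc >>> 8)
  }
  return (crc ^ 0xFFFFFFFF) >>> 0
}

// Hypothetical type tag; the thread does not define concrete values.
const CHECKSUM_TYPE_CRC32 = 1

// Assumed layout: [type: 1 byte][crc32: 4 bytes, big-endian][update bytes]
function wrapUpdate (update) {
  const out = new Uint8Array(5 + update.length)
  out[0] = CHECKSUM_TYPE_CRC32
  new DataView(out.buffer).setUint32(1, crc32(update))
  out.set(update, 5)
  return out
}

// Returns the update bytes, or throws if the frame is unknown or damaged.
function unwrapUpdate (framed) {
  if (framed.length < 5 || framed[0] !== CHECKSUM_TYPE_CRC32) {
    throw new Error('unsupported or truncated frame')
  }
  const view = new DataView(framed.buffer, framed.byteOffset, framed.byteLength)
  const expected = view.getUint32(1)
  const update = framed.subarray(5)
  if (crc32(update) !== expected) {
    throw new Error('checksum mismatch: stored update is damaged')
  }
  return update
}
```

With helpers like these, storage code could call `wrapUpdate(update)` before writing and `unwrapUpdate(bytes)` after reading, so a damaged record is rejected before it reaches the update decoder. Keeping the type byte first leaves room to swap in a different checksum later without changing the rest of the layout.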
