Opaque payload support #357

Open
kjnilsson opened this issue Mar 2, 2023 · 1 comment

kjnilsson commented Mar 2, 2023

Machines such as RabbitMQ quorum queues are used to store potentially large binary data packets that are never evaluated or used in the state machine logic. Currently such data needs to be embedded in the command terms that are written to the raft log using term_to_iovec/1 in the WAL and segment writer. When a command is applied to the state machine (including during recovery), the command is read from the log in full, but the payload is never used during the apply operation. This is clearly redundant.

It would make sense to support such opaque binary payloads in a more efficient way, such that they are only read when needed. Additionally, it may be possible to maintain their on-disk representation separately from the raft log itself, allowing for more efficient snapshots that don't need to include the payload itself. Segment compaction has dependency issues and thus severe limitations.

Combining a standard raft log (snapshotting + truncation) with payloads that are deleted / compacted based on liveness information provided by the state machine itself (rather than after a snapshot) could yield a "best of both worlds" approach that would allow Ra to efficiently host, e.g., kv stores with large payload data.
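To make the idea concrete, here is a minimal sketch in Python (purely illustrative, not Ra's Erlang API). `PayloadStore`, `KvMachine` and `live_indexes` are hypothetical names. It assumes payloads are keyed by raft index, that the state machine only ever sees the payload size, and that compaction is driven by the set of indexes the machine reports as live.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PayloadStore:
    """Hypothetical payload store keyed by raft index, kept apart from the log."""

    payloads: dict[int, bytes] = field(default_factory=dict)

    def write(self, index: int, payload: bytes) -> None:
        self.payloads[index] = payload

    def read(self, index: int) -> bytes:
        return self.payloads[index]

    def compact(self, live_indexes: set[int]) -> int:
        """Drop every payload the state machine no longer reports as live."""
        dead = [i for i in self.payloads if i not in live_indexes]
        for i in dead:
            del self.payloads[i]
        return len(dead)


@dataclass
class KvMachine:
    """Toy kv state machine that stores only key -> (raft index, payload size)."""

    keys: dict[str, tuple[int, int]] = field(default_factory=dict)

    def apply(self, index: int, key: str, payload_size: int) -> None:
        # apply sees payload metadata only, never the payload bytes
        self.keys[key] = (index, payload_size)

    def live_indexes(self) -> set[int]:
        return {index for index, _size in self.keys.values()}


store = PayloadStore()
machine = KvMachine()

# (raft index, key, payload); index 3 overwrites key "a" written at index 1
commands = [(1, "a", b"x" * 10), (2, "b", b"y" * 20), (3, "a", b"z" * 30)]
for index, key, payload in commands:
    store.write(index, payload)
    machine.apply(index, key, len(payload))

# compaction is driven by liveness, not by where the snapshot index is
removed = store.compact(machine.live_indexes())
```

In this model the payload for index 1 becomes garbage as soon as key "a" is overwritten, independently of when the log is next snapshotted or truncated.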

RabbitMQ use cases include MQTT retained message storage and delayed / scheduled message delivery; quorum queues themselves may also benefit from this.

@kjnilsson kjnilsson added this to the 3.0 milestone Mar 2, 2023

kjnilsson commented Mar 2, 2023

Quick thoughts:

Receive / write to cache, written to WAL with entry

Apply (no payload, only payload metadata, e.g. size)

Read for replication needs to include payload

Write to payload / segment store (who does this? segment writer?)

  • how to ensure we skip data that isn't needed but still include payloads that are behind the snapshot index.
  • need to know all live payload indexes when writing to segments
  • need to be coordinated with compaction
  • ra server companion process? (also to do snapshotting, snapshot send)
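One possible reading of the first two bullets, as a small Python sketch (illustrative only, not Ra code). `plan_segment_write` is a hypothetical helper. It assumes that log entries at or below the snapshot index can be skipped, that every payload above the snapshot index is kept because it may still be needed for replication, and that payloads at or below the snapshot index are kept only if the state machine reports them as live.

```python
from __future__ import annotations


def plan_segment_write(
    wal_indexes: list[int],
    snapshot_index: int,
    live_payload_indexes: set[int],
) -> tuple[list[int], list[int]]:
    """Return (entry indexes to write, payload indexes to write)."""
    # entries covered by the snapshot do not need to reach a segment
    entries = [i for i in wal_indexes if i > snapshot_index]
    # payloads survive if still needed by the log or reported live by the machine
    payloads = [
        i
        for i in wal_indexes
        if i > snapshot_index or i in live_payload_indexes
    ]
    return entries, payloads


entries, payloads = plan_segment_write(
    wal_indexes=list(range(1, 9)),
    snapshot_index=5,
    live_payload_indexes={2, 4, 7},
)
```

Here entries 1 to 5 are skipped, but the payloads for indexes 2 and 4 are still written because they are live even though they sit behind the snapshot index.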

Payload reading (log effect)

Payload compaction (who, safety, indexing)

Snapshot replication (negotiate which payloads to replicate)

Snapshot format (include list of all live payload idx (raft idx))
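A rough sketch of the last two points in Python (illustrative only; `Snapshot` and `payloads_to_send` are hypothetical names, not Ra's snapshot format). It assumes the snapshot carries the raft indexes of all live payloads and that the receiver can report which payload indexes it already holds, so only the missing ones are transferred.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical snapshot: machine state plus live payload raft indexes."""

    index: int
    machine_state: dict
    live_payload_indexes: frozenset[int]


def payloads_to_send(snapshot: Snapshot, receiver_has: set[int]) -> list[int]:
    """Payload indexes the sender must ship alongside the snapshot."""
    return sorted(snapshot.live_payload_indexes - receiver_has)


snapshot = Snapshot(
    index=100,
    machine_state={"a": 97, "b": 42},
    live_payload_indexes=frozenset({12, 42, 97}),
)

# the receiver already holds payloads 12 and 40 (40 is dead and can be dropped)
missing = payloads_to_send(snapshot, receiver_has={12, 40})
```

The snapshot itself stays small because it references payloads by index instead of embedding them.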
