Implement client de-duplication sessions #145

kjnilsson · 2019-11-14T17:43:52Z

Clients retrying the processing of commands can result in duplicate entries in the log. Avoiding this currently requires the state machine itself to maintain de-duplication state (sequence numbers). It may be possible to implement de-duplication inside Ra, rejecting duplicates before an entry is appended to the log.

Every writer wanting to use sessions first needs to establish a session with the leader (through a special internal command type). If there is already an active session for this id at the time the command is applied, the request will fail. If there is session state (the last committed sequence number) but no active process registered, it will succeed and the sequence number will be returned to the client.
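A minimal model of the establishment rule above, written in Python purely for illustration (Ra itself is Erlang and none of these names exist in Ra). `SessionTable`, `establish` and the return tuples are hypothetical, and starting a brand-new session at sequence number 0 is an assumption the issue does not specify.

```python
from dataclasses import dataclass, field


@dataclass
class SessionTable:
    # session_id -> owning process; transient state held only by the leader
    active: dict = field(default_factory=dict)
    # session_id -> last committed sequence number; replicated state
    committed: dict = field(default_factory=dict)

    def establish(self, session_id, owner):
        """Apply a hypothetical 'establish session' command."""
        if session_id in self.active:
            # An active session already exists for this id: fail.
            return ("error", "session_active")
        # Assumption: a session with no prior state starts at 0.
        last_seq = self.committed.get(session_id, 0)
        self.active[session_id] = owner
        # The client resumes from the last committed sequence number.
        return ("ok", last_seq)


table = SessionTable(committed={"s1": 7})
print(table.establish("s1", "pid_a"))  # succeeds, returns the committed seq
print(table.establish("s1", "pid_b"))  # fails, session is already active
```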

Every command the client subsequently sends that includes a client-provided session-id and session-seq-num will be checked to ensure the sequence number is contiguous. The session id and seq num will be added to the command's metadata. The Ra leader will maintain both an active session map `#{SessionId := {Pid, Seq, LastSeenMs}}`, which is updated as commands are written to the log, and a "consistent" session id map of `#{SessionId => Seq}`, which is derived from committed log entries (and thus replicated across the cluster). If the leader receives a command with an out-of-sequence session-seq-num it will respond with a rejection message; otherwise it will append the command to the log and update the active session state.
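The two maps and the contiguity check could be modelled roughly as below. This is a sketch in Python, not Ra code: `Leader`, `handle_command`, `commit_up_to` and the reply tuples are hypothetical names, and the log is a plain in-memory list with 1-based indexes.

```python
import time


class Leader:
    def __init__(self):
        self.log = []        # appended entries, index = position + 1
        self.active = {}     # session_id -> (pid, seq, last_seen_ms), leader-local
        self.committed = {}  # session_id -> seq, derived from committed entries

    def handle_command(self, session_id, pid, seq, payload, now_ms=None):
        if now_ms is None:
            now_ms = int(time.time() * 1000)
        if session_id not in self.active:
            return ("rejected", "no_session")
        _owner, last_seq, _seen = self.active[session_id]
        if seq != last_seq + 1:
            # Out of sequence: reject without touching the log.
            return ("rejected", ("expected", last_seq + 1))
        # Session id and seq travel in the command's metadata.
        meta = {"session_id": session_id, "seq": seq}
        self.log.append({"meta": meta, "payload": payload})
        self.active[session_id] = (pid, seq, now_ms)
        return ("appended", len(self.log))

    def commit_up_to(self, index):
        # Re-derive the consistent map from entries up to the commit index.
        for entry in self.log[:index]:
            meta = entry["meta"]
            self.committed[meta["session_id"]] = meta["seq"]


leader = Leader()
leader.active["s1"] = ("pid_a", 0, 0)  # as if a session had been established
print(leader.handle_command("s1", "pid_a", 1, "put x"))  # appended
print(leader.handle_command("s1", "pid_a", 1, "put x"))  # retry is rejected
```

Note that the retry is rejected before anything is appended, which is the point of the proposal: the duplicate never reaches the log.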

When the leader changes, all active sessions are discarded (as they are transient state on the leader) and all clients are required to re-establish their sessions and resend commands from the last committed sequence number.

The committed session state will need to be made part of each snapshot and recovered on restart, as well as sent as part of snapshot installation. However, as snapshots are triggered by the state machine and can be taken for an index lower than the last applied index, we have to somehow derive the entire session state for a past index.
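One possible way to derive the session state for a past index, shown only to make the problem concrete: start from the session map stored in the previous snapshot and fold over the log entries up to the requested index. This is a Python sketch under assumptions, not a proposal for how Ra should do it. `session_state_at` is a hypothetical helper, entries are assumed to be `(index, session_id, seq)` tuples sorted by index, and the entries after the previous snapshot are assumed to still be available.

```python
def session_state_at(base_state, entries, snapshot_index):
    """Return the committed session map as of snapshot_index.

    base_state: session map from the previous snapshot
    entries: (index, session_id, seq) tuples after that snapshot, ascending
    """
    state = dict(base_state)  # copy, the previous snapshot is not mutated
    for index, session_id, seq in entries:
        if index > snapshot_index:
            break  # later entries are not part of this snapshot
        state[session_id] = seq
    return state


base = {"s1": 4}
entries = [(11, "s1", 5), (12, "s2", 1), (13, "s1", 6), (14, "s2", 2)]
# Last applied is 14 but the state machine asks for a snapshot at 12.
print(session_state_at(base, entries, 12))
```

Replaying entries on every snapshot has a cost that grows with the distance to the previous snapshot, so this is a baseline to compare alternatives against rather than a recommendation.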

This requires some degree of cooperation from client processes. E.g. they should not redirect commands that are part of a message to a new leader without establishing a session first.

This approach will need careful modelling and testing before being accepted.


kjnilsson · 2021-06-09T07:03:26Z

Rather than returning an error for duplicate commands, the server could simply return an applied notification (assuming the corresponding command's index has been committed).

Also, for simplicity, we could accept any sequence number higher than the last one rather than requiring exactly seq+1, which may be too restrictive for some use cases.
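The two relaxations above could combine into a check like the following. Again a Python sketch with hypothetical names (`check_seq` and its string results). The `"pending"` case, a duplicate whose original has been appended but not yet committed, is not covered by the comment and is an assumption here.

```python
def check_seq(last_seq, committed_seq, seq):
    """Classify an incoming session-seq-num.

    last_seq: highest sequence number appended for this session
    committed_seq: highest sequence number known to be committed
    """
    if seq > last_seq:
        # Any higher sequence number is accepted, gaps included.
        return "append"
    if seq <= committed_seq:
        # Duplicate of a committed command: reply as if applied.
        return "applied"
    # Assumption: duplicate of an appended but uncommitted command.
    return "pending"


print(check_seq(last_seq=5, committed_seq=3, seq=9))  # append
print(check_seq(last_seq=5, committed_seq=3, seq=2))  # applied
print(check_seq(last_seq=5, committed_seq=3, seq=4))  # pending
```

One consequence of allowing gaps: a sequence number at or below the committed one may never have been sent at all, so the "applied" reply only means that nothing at or below that number will be appended later.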

kjnilsson added effort-high enhancement labels Nov 14, 2019
