Local effects (send_msg, log etc) can be duplicated when adding a new member #387

Open
kjnilsson opened this issue Jul 3, 2023 · 1 comment
@kjnilsson
Contributor

Describe the bug

Local effects are executed by the leader if, at the time the effect is processed, the target node has no current member on it. If a new member is then added to that node while the log still contains commands that generate local effects for it, the new member will re-execute all of these effects locally.

E.g. consider a RabbitMQ cluster of A, B, C with a quorum queue that has its leader on A and a follower on B. Suppose a consumer on C has received some messages (sent by the leader) and these have not yet been truncated from the log. If a member is added on C whilst the consuming channel is still alive, the new member on C will send all the already-sent messages in the log to the consumer. (In RabbitMQ's case the consuming code will just filter these messages out.)
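
To make the sequence concrete, here is a toy model of the scenario in Python. It is only a sketch of the rule as described in this issue, not Ra code: `Member`, `Consumer`, `executes_local_effect` and `apply_log` are hypothetical names invented for the illustration, and log truncation, elections and real message passing are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Consumer:
    node: str
    received: list = field(default_factory=list)


@dataclass
class Member:
    node: str
    applied: int = 0  # number of log entries this member has applied


def executes_local_effect(member, leader, members, target_node):
    """Rule from the issue: the member on the target node runs the effect;
    if no member lives on that node, the leader runs it instead."""
    if any(m.node == target_node for m in members):
        return member.node == target_node
    return member is leader


def apply_log(member, leader, members, log, consumer):
    """Apply every entry the member has not applied yet, firing local effects."""
    for msg in log[member.applied:]:
        if executes_local_effect(member, leader, members, consumer.node):
            consumer.received.append(msg)
        member.applied += 1


a, b = Member("A"), Member("B")
members = [a, b]
consumer = Consumer("C")
log = ["m1", "m2", "m3"]

# Leader A and follower B apply the log; only A delivers to the consumer on C.
apply_log(a, a, members, log, consumer)
apply_log(b, a, members, log, consumer)
first_delivery = list(consumer.received)

# A member is added on C and catches up by replaying the untruncated log.
c = Member("C")
members.append(c)
apply_log(c, a, members, log, consumer)

print(first_delivery)     # ['m1', 'm2', 'm3']
print(consumer.received)  # ['m1', 'm2', 'm3', 'm1', 'm2', 'm3']
```

In this model the second batch of deliveries comes entirely from the new member on C replaying entries whose effects the leader had already executed.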

This means catching up could carry a performance penalty: the new member has to process all of these local effects, so it may take longer to catch up.

Reproduction steps

See above

Expected behavior

We need to find a "pivot" point at which a member starts executing local effects. It may be enough to check that the new member is in the cluster configuration, but if a member with the same name is added, removed and then added again this could still result in duplication, which may or may not be acceptable.

Alternatively, we could use the voter status in #375 as the pivot for whether a leader or local member executes the effect or not.
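
One way to picture the pivot idea is the Python sketch below. It assumes the pivot is a log index recorded when the member joins (it could equally be the index at which the member becomes a voter); this is an illustration of the concept, not a proposal for the actual implementation. The `pivot` field, `should_execute` and `apply_log` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Member:
    node: str
    pivot: int = 0    # hypothetical: log index at which this member joined
    applied: int = 0  # index of the last applied log entry (1-based)


def should_execute(member, leader, members, target_node, index):
    """The local member owns effects only for entries after its pivot;
    for anything at or before the pivot the leader remains responsible."""
    local = next((m for m in members if m.node == target_node), None)
    if local is not None and index > local.pivot:
        return member is local
    return member is leader


def apply_log(member, leader, members, log, target_node, delivered):
    """Apply outstanding entries, recording which messages get delivered."""
    for index in range(member.applied + 1, len(log) + 1):
        if should_execute(member, leader, members, target_node, index):
            delivered.append(log[index - 1])
        member.applied = index


a, b = Member("A"), Member("B")
members = [a, b]
log = ["m1", "m2", "m3"]
delivered = []

# No member on C yet, so leader A executes the effects for entries 1..3.
apply_log(a, a, members, log, "C", delivered)
apply_log(b, a, members, log, "C", delivered)

# C joins with its pivot at the current end of the log, then replays.
c = Member("C", pivot=len(log))
members.append(c)
apply_log(c, a, members, log, "C", delivered)

# A new entry arrives after the pivot; only C executes its effect.
log.append("m4")
for m in members:
    apply_log(m, a, members, log, "C", delivered)

print(delivered)  # ['m1', 'm2', 'm3', 'm4']
```

Because the decision depends only on the entry index and the pivot, replaying old entries on the new member produces no deliveries in this model. Which value is the right pivot (membership in the cluster configuration, or voter status) is the open question in this issue.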

Additional context

No response

@kjnilsson kjnilsson added the bug label Jul 3, 2023
@illotum
Collaborator

illotum commented Jul 3, 2023

For a generic replicated state machine, non-voters do not have the up-to-date state needed to execute commands faithfully. I agree that it would be nice to at least wait for the follower to catch up.
