[BUG] Cassandra node gets decommissioned forever if scaling is partially done #410

Open
srteam2020 opened this issue Aug 17, 2021 · 0 comments
Labels: bug (Something isn't working)

Describe the bug

When scaling down, the Cassandra operator always decommissions a Cassandra node (or a Cassandra pod) before deleting the pod. However, we find that the Cassandra node can sometimes be left in the decommissioned state forever, without ever being deleted, when the operator misses certain events.

The scaling down logic is implemented as follows:

if desiredSpecReplicas < currentSpecReplicas {
	...
	if len(decommissionedNodes) == 0 {
		// decommission one Cassandra node (pod)
	} else if len(decommissionedNodes) == 1 {
		// delete the decommissioned node (pod)
	}
}

Assume we have a Cassandra datacenter with three nodes (currentSpecReplicas) and the user wants to scale down to two (desiredSpecReplicas). On seeing desiredSpecReplicas < currentSpecReplicas, the operator first finds that there is no decommissioned node yet (len(decommissionedNodes) == 0), so it decommissions one of the Cassandra nodes and finishes this reconcile. The operator is then supposed to delete the decommissioned node in the next reconcile.

However, if the user changes the replica count back to three before the operator enters the next reconcile (which can happen when the operator runs slowly or encounters a crash), the operator will find that desiredSpecReplicas == currentSpecReplicas in that reconcile, and the decommissioned node will never be deleted. The node is therefore left in the decommissioned state until the user issues another scale-down later: only two Cassandra nodes are functioning, even though the StatefulSet still hosts three Cassandra pods.

To Reproduce

Steps to reproduce the behavior:

  1. Create a Cassandra datacenter with three replicas.
  2. Scale down: three -> two. The operator decommissions one Cassandra node but has not yet deleted its pod.
  3. Scale up: two -> three. The operator finds desiredSpecReplicas == currentSpecReplicas and leaves the node in the decommissioned state.

Expected behavior
The operator should check whether any node has already been decommissioned and bring it back if it is no longer supposed to be deleted.
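
A minimal sketch of such a check, assuming a hypothetical reconciler type with two stand-in hooks (decommissionedNodes and deletePod) for logic the operator already has internally; none of these names come from the actual codebase:

package operator

// reconciler is a stand-in for the operator's reconcile context; the two
// function fields are hypothetical hooks for logic the operator already has.
type reconciler struct {
	decommissionedNodes func() ([]string, error) // pods whose Cassandra node is decommissioned
	deletePod           func(name string) error  // delete a pod so the StatefulSet recreates it
}

// checkDecommissioned runs on every reconcile, regardless of how the desired
// and current replica counts compare.
func (r *reconciler) checkDecommissioned(desired, current int32) error {
	pods, err := r.decommissionedNodes()
	if err != nil {
		return err
	}
	if len(pods) == 0 || desired < current {
		// Either nothing was left half-scaled, or a scale-down is still in
		// progress and the existing logic will delete the pod itself.
		return nil
	}
	// The spec no longer asks for a scale-down, so bring the decommissioned
	// nodes back instead of leaving them dead inside the StatefulSet.
	for _, pod := range pods {
		if err := r.deletePod(pod); err != nil {
			return err
		}
	}
	return nil
}

Running this check on every reconcile would make the scale-down path tolerant of the spec flipping back between two reconciles.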

Environment

  • OS Linux
  • Kubernetes version v1.18.9
  • kubectl version v1.20.1
  • Go version 1.13.9
  • Cassandra version 3

Additional context
We are willing to help fix this bug. One potential fix is to delete the pod whose Cassandra node has been decommissioned. Since the pod is managed by the StatefulSet, it will be recreated automatically, and the fresh Cassandra node will rejoin the ring instead of staying decommissioned.
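
For illustration, a minimal client-go sketch of that fix, assuming the operator can resolve the namespace and name of the pod backing the decommissioned node (the function name is ours, not the operator's):

package operator

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// recreateDecommissionedPod deletes the pod backing the decommissioned
// Cassandra node. Because the pod is owned by the StatefulSet, the
// StatefulSet controller recreates it, and the fresh Cassandra process
// rejoins the ring instead of staying decommissioned.
func recreateDecommissionedPod(ctx context.Context, client kubernetes.Interface, namespace, podName string) error {
	return client.CoreV1().Pods(namespace).Delete(ctx, podName, metav1.DeleteOptions{})
}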
