Update ETCD volume to lower storage capacity #646

Open
shreyas-s-rao opened this issue Nov 14, 2022 · 3 comments
Labels
area/backup (Backup related)
area/cost (Cost related)
area/storage (Storage related)
kind/enhancement (Enhancement, improvement, extension)
lifecycle/stale (Nobody worked on this for 6 months (will further age))
platform/aws (Amazon web services platform/infrastructure)

Comments

@shreyas-s-rao
Contributor

How to categorize this issue?

/area backup
/area storage
/area cost
/kind enhancement
/platform aws

What would you like to be added:
Update ETCD volume (GP3) to use lower storage capacity.

Why is this needed:
The current storage capacity of 80Gi was chosen for GP2 volumes used for ETCD data storage, because the IOPS of a GP2 volume is determined by its size. 80Gi was therefore set as the volume size in order to achieve optimum IOPS for ETCD, even though the ETCD DB and additional operations do not use the entire 80Gi of storage. With the move to GP3 volumes this is no longer necessary: the storage capacity can be reduced to a value that is sufficient to host the ETCD DB and allow additional operations on it (such as defragmentation, restoration, etc.).
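
To make the size dependence concrete, here is a small arithmetic sketch. It assumes AWS's published figures as I understand them (GP2 baseline of 3 IOPS per GiB with a floor of 100 and a ceiling of 16,000; GP3 baseline of 3,000 IOPS independent of size) and ignores GP2 burst credits. Please verify the numbers against the current EBS documentation before relying on them. The 25Gi value is the reduced size suggested later in this thread.

```python
# Assumed figures, taken from AWS EBS documentation as I recall it.
GP2_IOPS_PER_GIB = 3
GP2_MIN_IOPS = 100
GP2_MAX_IOPS = 16000
GP3_BASELINE_IOPS = 3000  # does not depend on volume size


def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline (non-burst) IOPS of a GP2 volume of the given size."""
    return min(max(GP2_MIN_IOPS, GP2_IOPS_PER_GIB * size_gib), GP2_MAX_IOPS)


for size in (25, 80):
    print(f"{size}Gi: gp2={gp2_baseline_iops(size)} gp3={GP3_BASELINE_IOPS}")
```

Under these assumptions, shrinking a GP2 volume lowers its baseline IOPS, while a GP3 volume keeps the same baseline at either size.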

@gardener-robot gardener-robot added area/backup Backup related area/cost Cost related area/storage Storage related kind/enhancement Enhancement, improvement, extension platform/aws Amazon web services platform/infrastructure labels Nov 14, 2022
@shreyas-s-rao
Contributor Author

/assign

@unmarshall

unmarshall commented Nov 29, 2022

@shreyas-s-rao as you have already found out, a StatefulSet cannot be updated just by changing the storageCapacity. If one tries to do that, the following error is returned:

The StatefulSet "web" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', 'updateStrategy', 'persistentVolumeClaimRetentionPolicy' and 'minReadySeconds' are forbidden

For a multi-node etcd, if you wish to avoid downtime while also reducing the storageCapacity, there is one way to do it:

  1. Change the update strategy of the StatefulSet from RollingUpdate to OnDelete.
  2. Delete the StatefulSet with --cascade=orphan so that its pods are not deleted.
  3. Recreate the StatefulSet with the reduced storageCapacity (25Gi).
  4. The existing pods will continue to use their PVCs, which requested 80Gi of space.
  5. Delete one pod and ensure that its PVC is also deleted. For a multi-node etcd you will still retain quorum. Since the update strategy is now OnDelete, a new pod will be created with a new PVC of 25Gi storage capacity. The remaining pods will not be updated.

Repeat this for all the pods; this way you can update all etcd StatefulSet replicas while maintaining quorum. Since you have control over the update process, you can wait until the new etcd instance has been promoted to a member before deleting the next pod.
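
The steps above can be sketched as a list of kubectl invocations. This is only an illustration, not a tested runbook: the StatefulSet name, namespace, claim template name and manifest file name are hypothetical placeholders, and it assumes nothing else (for example an operator) recreates the StatefulSet in between. The function only builds the commands; it does not run them.

```python
import json


def resize_plan(sts, namespace, claim_template, replicas):
    """Return the kubectl commands (as argv lists) for the manual procedure.

    All arguments are placeholders supplied by the caller. Nothing is executed.
    """
    ns = ["-n", namespace]
    patch = json.dumps({"spec": {"updateStrategy": {"type": "OnDelete"}}})
    plan = [
        # Step 1: stop the controller from rolling pods on its own.
        ["kubectl", "patch", "statefulset", sts, *ns, "--type", "merge", "-p", patch],
        # Step 2: remove the StatefulSet object but leave its pods running.
        ["kubectl", "delete", "statefulset", sts, *ns, "--cascade=orphan"],
        # Step 3: recreate it from a manifest that already carries the smaller
        # storage request and OnDelete strategy (file name is an assumption).
        ["kubectl", "apply", *ns, "-f", f"{sts}-resized.yaml"],
    ]
    for ordinal in range(replicas):
        pod = f"{sts}-{ordinal}"
        pvc = f"{claim_template}-{pod}"  # StatefulSet PVC naming convention
        plan += [
            # Step 5: the PVC stays in Terminating while its pod exists, so
            # request its deletion without blocking, then delete the pod.
            ["kubectl", "delete", "pvc", pvc, *ns, "--wait=false"],
            ["kubectl", "delete", "pod", pod, *ns],
            # Pod readiness is only a proxy; etcd membership should still be
            # checked before moving on. This may need a retry if the
            # replacement pod has not been created yet.
            ["kubectl", "wait", "--for=condition=Ready", f"pod/{pod}", *ns, "--timeout=10m"],
        ]
    return plan
```

Printing each entry of, say, `resize_plan("etcd-main", "demo", "data", 3)` gives a checklist to run one command at a time, pausing between members to confirm the cluster is healthy.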

@unmarshall

In any case, we should change the update strategy from RollingUpdate to OnDelete to ensure that quorum is always maintained, which cannot be guaranteed when using RollingUpdate.
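
For reference, the quorum argument can be written as simple arithmetic. This is a minimal sketch assuming the usual majority rule for etcd (quorum is more than half of the configured members); the helper names are made up for illustration.

```python
def quorum(members: int) -> int:
    """Smallest number of healthy members that forms a majority."""
    return members // 2 + 1


def can_take_one_down(members: int) -> bool:
    """True if the cluster keeps quorum while exactly one member is down."""
    return members - 1 >= quorum(members)


for n in (1, 3, 5):
    print(n, quorum(n), can_take_one_down(n))
```

With OnDelete, the operator decides when the single unavailable member is replaced, so the check above can be confirmed before each pod deletion.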
