Skip to content
This repository has been archived by the owner on Oct 22, 2021. It is now read-only.

Multi-zone / Max-in-flight #1306

Open
andrew-edgar opened this issue Apr 28, 2021 · 1 comment
Open

Multi-zone / Max-in-flight #1306

andrew-edgar opened this issue Apr 28, 2021 · 1 comment
Labels
enhancement New feature or request unscheduled

Comments

@andrew-edgar
Copy link

andrew-edgar commented Apr 28, 2021

Working on how multi-zone and max-in-flight support should be done in quarks-operator and quarks-statefulset.

@manno @jandubois @rj1 I have been testing a couple of things with regards to “max-in-flight”. There were “several” issues that will never get solved until K8S supports a similar functionality as “maxUnavailable” in deployments.

I wanted to give you an update on what tests and functionality I have done.

  1. I have 2 pending PRs for mutli-zone support. (change the naming of instance_groups in multizone to use real zone name #1303 and Change the naming of stateful sets with multi zones to match the real zone name quarks-statefulset#65). The PRs changed from using “zX” in the stateful set to using “zonename“. The purpose of that was to enable our use case of having a single control plane and multiple “shard (minion)” clusters. We are planning to have each shard being a single “zone” but required to use the “multi-zone” functionality to ensure no conflicting cell_ids and naming across clusters. This functionality was working but I would prefer that you do NOT yet merge those PRs as my work on max-in-flight piggy backed on this change and did need some tweaking.

  2. I have another change I’m planning to PR that uses topologySpreadContraints to distribute across “azs” inside a K8S cluster. This would mean we would “not” use the multi-zone config which created multiple stateful sets as this feature causes the pods to correctly spread across the zones without having to require multiple statefulsets. This piece is independant and I’m scoping it to be added to the statfulset ONLY if there is no “zone” for the instance group

  3. now back to max-in-flight 🙂. This is a “hard” problem and we went about solving it by using “azs” in the instance_group definition. What we do now though is have “duplicate” azs. so the azs in the instance_group looks like this …

azs:
  - eu-de-1
  - eu-de-1
  - eu-de-1

With some tweaks to the “zones” support that exists now. I can cause 3 statefulsets to be created, for example for diego-cell I now get 3 statefulsets named …

diego-cell-eu-de-1-0
diego-cell-eu-de-1-1
diego-cell-eu-de-1-2

And if I have multiple instances (3) in each statefulset I now get pod naming like this …

diego-cell-eu-de-1-0-0
diego-cell-eu-de-1-0-1
diego-cell-eu-de-1-0-2
diego-cell-eu-de-1-1-0
diego-cell-eu-de-1-1-1
diego-cell-eu-de-1-1-2
diego-cell-eu-de-1-2-0
diego-cell-eu-de-1-2-1
diego-cell-eu-de-1-2-2

Leaving the thoughts of canaries behind. We now can see 3 “cells” being created/updated in parallel and the names are all different.

This is quite a conceptual change and I have not tested going from an old version of quarks to a new version using an update but on a “new” deploy this does look and work perfectly to allow our use case.

From a kubecf perspective we have in the helm chart the “multi-zone” feature which in theory when false will remove all “azs” from all instance_groups. I would like to have a hybrid model where “some” instance groups are multizone (to allow for max-in-flight) like “doppler” but some NOT multizone like diego-api (which would then use topologySpreadConstraints.

So long story short ... Should I submit the changes in PRs? What is the feeling from the dev team about merging these changes and will they be accepted.

All the code changes I have tested with have good "tests" developed to prove that they will do what is expected.

Thanks,

Andrew

@andrew-edgar andrew-edgar added the enhancement New feature or request label Apr 28, 2021
@cf-gitbot
Copy link

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/177950101

The labels on this github issue will be updated when the story is started.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement New feature or request unscheduled
Projects
None yet
Development

No branches or pull requests

2 participants