Separate proxy and controller Deployments #921

Open
rainest opened this issue Nov 1, 2023 · 10 comments

rainest commented Nov 1, 2023

Kong/kubernetes-ingress-controller#4671 talks about chart work but isn't in this repo (presumably for milestone reasons). This is a stub to track the same work on the charts side.

rainest self-assigned this Nov 1, 2023
rainest mentioned this issue Nov 2, 2023

rainest commented Nov 2, 2023

tl;dr

Using an umbrella chart results in a number of limitations on template
development (in general, we cannot share information between the two releases,
and thus cannot, for example, automatically configure the controller admin API
target based on the proxy's admin API Service configuration) and challenges to
release processes. If we no longer wish to support deploying the controller as
a sidecar, it makes sense to use a single chart that deploys the controller and
Kong each in their own Deployment.

This requires some redesign of the values.yaml to support configuring the
Deployments independently. There are several options for this:

  • Do not actually require changes or add new settings, and simply apply the
    same settings to both Deployments equally. This is a bad option.
  • Apply existing settings to the proxy Deployment and create new settings for
    the controller Deployment as needed. This requires fewer changes, but results
    in confusing configuration design, since like settings are configured in very
    different places.
  • Reorganize settings for consistency, clearing some debt where we structured
    them poorly in the past. This involves breaking changes. Most of these
    changes just move a key to a new location, however, so a tool to automate
    them shouldn't be difficult to write.

#923 is the companion to this analysis.
It annotates keys of interest in place within values.yaml.

I do not think we should attempt to support both models, as designing good
configuration and templates for that would be quite difficult. If we want to
keep support for the sidecar in any fashion, I'd recommend doing that in a
continued 2.x line while we use 3.x for split Deployments, and work with users
to figure out what they'd need to migrate if it's difficult.

Background

The current design of the kong chart uses a single Deployment, which can
contain a Kong container, controller container, or both. values.yaml only has a
single set of Deployment-level (and Pod-level) settings, as these are shared
equally among all containers.

Migrations Jobs often also share many of these settings even though they're a
separate workload. Since migrations use a Kong container, settings applicable
to the Kong Deployment are often applicable to the migrations, and sharing them
allows us to economize on settings.

To split Kong and the controller into separate Deployments, we thus need to
convert these settings into two settings in most cases. We may opt to fall back
to the original shared setting for both--it's not ideal and limits some
functionality (e.g. you probably don't want the same replica count and
scaling rules for a controller Deployment as a Kong Deployment), but could
allow for a simpler transition for existing users. Defaults for the new
settings in the base values.yaml will apply, however, so this option is
probably less appealing than it'd seem initially.
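
As a rough sketch (the key names here are hypothetical, not the final layout), a fallback would look something like the line below, and the caveat above is visible in it: if the new per-Deployment key already has a non-empty default in the base values.yaml, the legacy shared key never wins:

{{/* hypothetical sketch: prefer the new per-Deployment key, fall back to the legacy shared key */}}
{{/* note: if deployment.kong.replicas has a default in the chart's values.yaml, this fallback never triggers */}}
replicas: {{ .Values.deployment.kong.replicas | default .Values.replicaCount }}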

values.yaml organization

The chart has grown organically over time, with some historically less-informed
decisions about how to organize its configuration. We have, for
example, root-level .podAnnotations, .deploymentAnnotations, and
.deployment configuration keys. Although it'd make sense to nest these along
the lines of the actual resource structure, like

deployment:
    annotations: {}
    pod: 
        annotations: {}

we added one of the root-level annotations keys early, before we needed the
others, and didn't move it when it became disorganized.

Mirrored and consolidated Kubernetes resource configuration

Since adding a second Deployment will require separate configuration keys for
many existing settings, it'd make sense to organize them following our most
recent structure. For example:

deployment:
  controller:
    annotations: {} # Deployment-level annotations
    autoscaling: {} # A duplicate of the current root key, for the controller Deployment
    pod:
      annotations: {} # Pod-level annotations. currently there is a single root-level .podAnnotations used for the single unified Pod
      serviceAccount: {}
      container:
        env: {...} # controller environment
        image: {}
      sidecarContainers: [] # attach a tcpdump or what have you
  kong: {} # same as the controller section

admissionWebhook: {} # the current contents for .ingressController.admissionWebhook

This section should hold all configuration that lives under the Kubernetes
Deployment resource.

This section may also hold configuration for Deployment-associated resources,
like HorizontalPodAutoscaler (.deployment.*.autoscaling). Alternatively, we
could split those sections in the same way at the root (autoscaling.controller.*
and autoscaling.kong.*).
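
A sketch of that alternative (the fields below mirror the current root-level autoscaling keys; exact names TBD):

autoscaling:
  kong:
    enabled: true
    minReplicas: 2
    maxReplicas: 5
  controller:
    enabled: false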

Application-level configuration

We have several sections dedicated to application-specific configuration
unrelated to Kubernetes resources, such as the ingressController.konnect
settings and enterprise settings.

While these generally manipulate values in a container's environment, they
don't clearly fit into the Kubernetes resource structure from a user POV. As
such I think we should keep them mostly at their current locations, with the
end result that we have deployment.controller (or
deployment.ingressController if we want name consistency) for standard
Kubernetes resource settings and ingressController.konnect,
ingressController.gatewayDiscovery, and so on for our application-specific
configuration.
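
Roughly, the resulting top level would look like this (a sketch only; the deployment.controller vs. deployment.ingressController naming question is still open):

deployment:
  controller: {...}        # Kubernetes Deployment/Pod settings for the controller
  kong: {...}              # Kubernetes Deployment/Pod settings for the proxy
ingressController:
  konnect: {...}           # application-specific controller settings stay where they are
  gatewayDiscovery: {...}
env: {...}                 # Kong (proxy) application configuration, as today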

Migration strategies

Moving keys is a breaking change for existing settings. If we move
configuration that currently lives elsewhere (e.g. .podAnnotations to
.deployment.kong.pod.annotations), users will have to update their configuration.

We can provide a helper script, however, as looking up a YAML key and moving it
elsewhere isn't particularly hard.

We could alternatively continue to support existing keys, but this comes with
template complexity and non-obvious precedence rules. If, for example, you see
documentation saying "set .deployment.kong.pod.annotations" and do so without
realizing you already have .podAnnotations set, and we prefer the latter, your
setting will silently not take effect. We would probably need to prefer the
older setting when present because of the way values.yaml defaults work.
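
A minimal sketch of what that precedence logic would look like in a template, using the .podAnnotations example (the new key path is the proposed one above, not an existing key):

{{/* hypothetical sketch: the legacy root-level key wins when set, otherwise the new key applies */}}
{{- $annotations := .Values.deployment.kong.pod.annotations -}}
{{- if .Values.podAnnotations -}}
{{- $annotations = .Values.podAnnotations -}}
{{- end -}}
annotations:
  {{- toYaml $annotations | nindent 2 }}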

We can opt to use the newest organization pattern for new settings only (for
example, creating .deployment.controller.pod.annotations while retaining
.podAnnotations for the proxy), but this prolongs the problem.

Existing release migration

After changes to values.yaml, users will need to upgrade releases.

kong chart users

Going from the kong chart with a single Deployment to one with multiple
should not require much beyond updating values.yaml, and maybe choosing
different settings for the new separate Deployment (optional, but probably
advisable).

AFAIK the main task here is handling the change from an internal localhost
admin API listen to discovery: flipping the admin Service on with the
appropriate external listen and enabling discovery. There's some manual effort
required, but not much beyond a typical upgrade.
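
For reference, a sketch of the values involved, assuming the current kong chart key names carry over to 3.x:

admin:
  enabled: true      # expose the admin API as a Service instead of a localhost-only listen
  type: ClusterIP
  tls:
    enabled: true
ingressController:
  gatewayDiscovery:
    enabled: true    # the controller discovers admin API endpoints via the admin Service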

Multiple release kong chart users

Although some classes of separate release (namely separate controller releases)
should no longer be necessary, users can continue using them to avoid breaking
release continuity.

These users still need to migrate values.yaml and set up discovery if they were
using a single-Deployment Kong+controller release, but they do not need to
consolidate releases if they do not want to. The procedure to create, for
example, a hybrid DP release is still to create a release with the controller
disabled and DP settings configured.

Although the multi-Deployment configuration would make single-release hybrid
mode more viable, I'm not considering it at present.

ingress chart users

This state has no perfect migration path. Helm, broadly, cannot transfer
resources from one release to another (there's technically maybe a path by
uninstalling and keeping resources and then manipulating labels to trick it
into adopting them into a new release, but that's fiddly as heck and I don't
really want to entertain it as a generally acceptable plan of action).

Migrating existing ingress chart releases to the kong chart would thus
require deleting an ingress release and creating a kong release. This
requires an outage window in the simplest case, though it may be possible to
carefully orchestrate a handover with the old release still installed.

Since ingress is DB-less only, this approach should be fine: the new
instances will come up, see the same Ingress resources, and apply the same
configuration. This shouldn't be significantly different from running kubectl
rollout restart, beyond the fact that there are temporarily no replicas of the
proxy or controller.

Template refactoring

Using mirrored structure for the Kubernetes resource configuration should
technically allow us to use a shared template for them that we invoke for each
section. However, application-level configuration, though not directly part of
the resource configuration, ends up there, mostly in the form of environment
variables.

It may be simple enough to just inject the environment variable set generated
from another section into the Deployment template. I'd have to try it to be
sure.
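
A very rough sketch of the shape this could take (the helper names here are made up, not existing templates):

{{/* hypothetical sketch: one shared Deployment template rendered once per values section */}}
{{- include "kong.deployment" (dict "root" $ "name" "kong" "section" .Values.deployment.kong "extraEnv" (include "kong.env" .)) }}
---
{{- include "kong.deployment" (dict "root" $ "name" "controller" "section" .Values.deployment.controller "extraEnv" (include "kong.controllerEnv" .)) }}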


pmalek commented Nov 3, 2023

I do not think we should attempt to support both models, as designing good
configuration and templates for that would be quite difficult. If we want to
keep support for the sidecar in any fashion, I'd recommend doing that in a
continued 2.x line while we use 3.x for split Deployments

👍 I like the idea. Set a clear end-of-support date for 2.x to align user expectations, and add prominent links to 3.x and the migration guide.

We can provide a helper script, however, as looking up a YAML key and moving it
elsewhere isn't particularly hard.

That's the way I've done it in the past. Spoiler alert: it's always more work than anticipated.

This state has no perfect migration path. Helm, broadly, cannot transfer
resources from one release to another (there's technically maybe a path by
uninstalling and keeping resources and then manipulating labels to trick it
into adopting them into a new release, but that's fiddly as heck and I don't
really want to entertain it as a generally acceptable plan of action).

Tempting... but no. This will end badly. Fiddly is an understatement.


What I'm slightly concerned about is the testing situation. We had one approach to golden tests in #748, but it never took off. We could revive that idea and use it as we progress with the refactor and reorganisation of keys in values.yaml, to ensure we're getting the template output we expect.


rainest commented Nov 3, 2023

Spoiler alert: it's always more work than anticipated.

Any particular pitfalls you hit, or just that some items that needed migrations were missed initially?


pmalek commented Nov 3, 2023

Spoiler alert: it's always more work than anticipated.

Any particular pitfalls you hit, or just that some items that needed migrations were missed initially?

If it's a 1:1 move without transformation, it should be OK-ish. With any sort of transformation or pivoting, it's annoying to get right the first time.


rainest commented Nov 10, 2023

Since we are (along with simply rearranging settings) moving users from sidecar to gateway discovery, the admin API must be exposed, which is not something existing kong chart users have had to consider for their security model.

We did unfortunately omit restrictions on its usage in ingress, but I don't think we'd want to release a final (i.e. not alpha) chart 3.x without mTLS automation in place, since we'd otherwise bake a reduction in security into an upgrade.


pmalek commented Nov 14, 2023

If we take the lessons learnt from https://docs.google.com/document/d/14jwnYoSj5RGvOCaWGpPuqa1VvyjKnFmscnYpiIxfXQ0/edit#heading=h.csub82xg1smh and (I haven't figured out the details yet) generate certs via Helm's genSignedCert (or similar), just like we do now but with the proper subject alternative names, this should work, right?

Currently we're using

{{- $cn := printf "admin.%s.svc" ( include "kong.namespace" . ) -}}

which wouldn't work with Gateway Discovery, presumably because the controller connects to the discovered admin API endpoints directly rather than through that single Service name, so the certificate would fail verification.

We could then flip the tls skip verify to false.
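
If so, a rough sketch of the generation step (the SAN set and the helper names other than kong.namespace are assumptions, not the current template):

{{- /* hypothetical sketch: issue the admin API certificate with SANs that also cover discovered endpoints */ -}}
{{- $cn := printf "admin.%s.svc" (include "kong.namespace" .) -}}
{{- $altNames := list $cn (printf "*.%s-admin.%s.svc" (include "kong.fullname" .) (include "kong.namespace" .)) -}}
{{- $ca := genCA "kong-admin-ca" 3650 -}}
{{- $cert := genSignedCert $cn (list) $altNames 3650 $ca -}}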


rainest commented Nov 22, 2023

Practical update testing with a (slightly outdated) #947 indicates that the ingress->kong migration actually lacks one of the major expected hurdles: Helm just doesn't care if you change the chart used for a release. Running something like

helm upgrade --install ana kong/ingress
helm upgrade --install ana rainestcharts/kong

is entirely acceptable and functions no differently than upgrading to a new version of the same chart. Migrating an ingress install with defaults to kong 3.x with defaults as the same release works fine.

Changes to resource names do, however, result in issues. The default names for Deployments and Services in kong lack the -gateway bit from the ingress subchart. Services will effectively be created new (and may be assigned new IPs depending on their configuration) and the old Deployments instantly vanish and promptly terminate their replicas, often before the new Deployment's replicas come online. Setting nameOverride allows mimicking the existing ingress names and replacing replicas/keeping the same Services as one would normally expect. The Service type override won't be necessary in the final release, since the admin default has changed to ClusterIP.
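
For illustration, assuming the usual "release name plus nameOverride" fullname convention, something like this in the new release's values reproduces the ingress-era "<release>-gateway" names:

# hypothetical sketch: keep the names the gateway subchart produced under the ingress umbrella chart
nameOverride: "gateway"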

kong somewhat amusingly upgrades less smoothly in some respects. Although there are no naming issues, the new controller attempts to talk to the existing Pods, which do not have the admin API exposed, and never comes online. I'm not entirely sure why discovery doesn't find the new replica, but it doesn't. Exposing the admin API first appears necessary.

This doesn't yet cover the introduction of mutual TLS between the controller and admin API, which I'd like to have in place by default, and which will probably introduce additional complications.

Values migration from ingress is a separate, more difficult bag of worms. Keys no longer have their controller and gateway prefixes, and while we can simply flatten them, some keys (e.g. replicaCount or podAnnotations) may be present in either section. We can probably migrate these by stuffing the controller-section ones into the equivalent ingressController settings for the new Deployment in kong, but it requires more exhaustive review than simply migrating the keys that have changed.


rainest commented Nov 22, 2023

values.yaml migration strategy as of https://github.com/Kong/chart-migrate/tree/71146f9534e74a8195d205efa5e13d152db99b60

kong chart values.yaml are easy. You move the keys that moved and leave the rest in place. The root command does this in a single step.

ingress chart values.yaml files are more complicated. There are several cases:

  1. controller.ingressController values need to move to their new home under ingressController.
  2. controller keys that apply to the Deployment/Pod/etc., which should now configure the second Deployment in kong, need to move to their equivalent keys under ingressController.deployment.
  3. gateway keys generally need to move to the root.
  4. controller keys that do not apply to the Deployment and such also need to move to the root.
  5. Keys present in both gateway and controller that were not moved in 1 or 2 are kind of a mystery. IDK how you should interpret both subcharts configuring stuff under certificates, for example--it's something you technically can do, but probably shouldn't. These prefer whatever value is in gateway and print a warning, since they probably require some user review. Hopefully this does not apply to most configurations.

We can run the root migrate command with a separate key list to handle 1 and 2. From there, a second merge command flattens the values.yaml to handle steps 3-5. This step deletes several keys where the merge strategy (preferring gateway keys) may cause issues:

// should not exist. could maybe create the webhook or RBAC resources, but they'd be configured incorrectly
// we _do not_ want this key to use the value from gateway
gateway.ingressController
// unsure how we should handle this along with deployment.kong.enabled. ignoring it since we can reasonably assume
// that ingress installs are not doing split deployments for the most part. on-prem hybrid is probably the only
// major exception.
gateway.enabled
// ditto
controller.enabled
// contained some overrides that were a hack for subchart stuff, should not be needed
controller.proxy
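
For illustration, a much-simplified before/after of that flattening (key paths abbreviated and partly assumed, not an exhaustive mapping):

# before: ingress chart values.yaml, with controller and gateway subchart sections
controller:
  ingressController:
    env:
      log_level: debug
  podAnnotations:
    example.com/scrape: "true"
gateway:
  admin:
    enabled: true
  replicaCount: 2

# after: kong chart 3.x values.yaml, flattened to the root,
# with controller Deployment keys moved under ingressController.deployment
ingressController:
  env:
    log_level: debug
  deployment:
    pod:
      annotations:
        example.com/scrape: "true"
admin:
  enabled: true
replicaCount: 2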


rainest commented Nov 28, 2023

Following up on mTLS now that the split is staged. Some discussion earlier at #921 (comment)

The existing chart-managed mTLS implementation from #780 is a bit odd:

Configuration-wise, the controller and proxy configuration is not integrated. You can set the controller to use a generated certificate and CA, but the proxy will not require it unless you manually specify its name in the admin API configuration.

Implementation-wise, the admin Service template handles controller certificate generation and part of the proxy mount, though again, these systems do not integrate with one another. The proxy mount (from the admin API configuration) uses an unusual way to handle the configuration options. admin.tls.client.caBundle lets you dump a CA into values.yaml. There's no existing resource for this, so the chart creates a ConfigMap to hold it. This ConfigMap is also used if you provide a Secret name with secretName: the Helm client uses lookup to retrieve the Secret and extracts its contents, which it then copies to the ConfigMap, rather than mounting the Secret directly.

This implementation approach hits a chicken-egg problem with the generated certificate, since the lookup attempt precedes creating the Secret. To work around this without larger refactoring, I was able to use a separate if clause that mounts the Secret if we want to use the generated certificate.

I kinda want to simplify the UX and implementation here. Ideally we'd only use a single section, since there shouldn't be any reason to configure different CAs for the admin API and controller. I don't know that I'd want to make a breaking change for this, however, and it's not something where we could simply migrate key locations. On the implementation side, dropping support for providing the CA string in values.yaml would allow us to use a normal Secret mount--I figure extraResources is available if you really want to manage the CA from values.yaml.
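
To illustrate the direction (purely hypothetical key names, not a schema proposal):

# hypothetical sketch: one shared section consumed by both the controller client config and the admin API listen
adminMutualTLS:
  enabled: true
  # name of an existing CA Secret; when empty, the chart generates a CA and client certificate
  caSecretName: ""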


czeslavo commented Dec 8, 2023

I'll try to bring some context on why some things were done the way they are.

Configuration-wise, the controller and proxy configuration is not integrated. You can set the controller to use a generated certificate and CA, but the proxy will not require it unless you manually specify its name in the admin API configuration.

Originally, #780 was implemented with split releases in mind, as that was the only way to run Gateway Discovery at the time. Because of that, we couldn't do any cross-validation between the proxy and controller configuration.

This implementation approach hits a chicken-egg problem with the generated certificate, since the lookup attempt precedes creating the Secret. To work around this without larger refactoring, I was able to use a separate if clause that mounts the Secret if we want to use the generated certificate.

The idea was that one of the releases must be responsible for generating the key-pair Secret, and the controller was designated to be that one. Because of that, you had to first deploy the controller release and then use the generated Secret name in the gateway release (or the name you hardcoded in ingressController.adminApi.tls.client.caSecretName). It's described in the README (bottom of the Gateway discovery section):

On the controller release side, that can be achieved by setting ingressController.adminApi.tls.client.enabled to true. By default, Helm will generate a certificate Secret named -admin-api-keypair and a CA Secret named -admin-api-ca-keypair for you.
...
On the Gateway release side, set either admin.tls.client.secretName to the name of your CA Secret or set admin.tls.client.caBundle to the CA certificate string.
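
In values terms, the split-release setup described there looks roughly like this (illustrative only, using the keys quoted above):

# controller release: generate the client key pair and CA
ingressController:
  adminApi:
    tls:
      client:
        enabled: true

# gateway release: point the admin API at the CA Secret generated by the controller release
admin:
  tls:
    client:
      secretName: ""   # name of the CA Secret created by the controller release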
