This repository has been archived by the owner on Sep 14, 2020. It is now read-only.

Helm 3 change triggers create instead of update #359

Open

Carles-Figuerola opened this issue May 7, 2020 · 3 comments
Labels: question (Further information is requested)

@Carles-Figuerola

Long story short

Helm 3 updates trigger @kopf.on.create rather than @kopf.on.update.

Description

After a resource has been created using Helm 3 (I haven't tested with Helm 2) and its definition later changes, Kopf triggers the create handler rather than the update handler.

These are the debug logs from Helm:
I0507 14:27:07.862467   34642 round_trippers.go:443] GET https://XXXXXXXXXXXXXXXXXXXX.us-west-2.eks.amazonaws.com/apis/my-api-domain/v1/namespaces/default/my-resources/myresource 200 OK in 108 milliseconds
I0507 14:27:07.862500   34642 round_trippers.go:449] Response Headers:
I0507 14:27:07.862510   34642 round_trippers.go:452]     Audit-Id: xxxxx-xxxx-xxxx-xxxx-xxxxxx
I0507 14:27:07.862519   34642 round_trippers.go:452]     Content-Type: application/json
I0507 14:27:07.862526   34642 round_trippers.go:452]     Content-Length: 1615
I0507 14:27:07.862533   34642 round_trippers.go:452]     Date: Thu, 07 May 2020 19:27:07 GMT
I0507 14:27:07.863041   34642 request.go:1068] Response Body: {"apiVersion":"my-api-domain/v1","kind":"MyResource","metadata":{"annotations":{"kopf.zalando.org/last-handled-configuration":"{\"spec\": \"content\"}, \"metadata\": {\"labels\": {\"app.kubernetes.io/managed-by\": \"Helm\"}, \"annotations\": {\"meta.helm.sh/release-name\": \"myapp\", \"meta.helm.sh/release-namespace\": \"default\"}}}","meta.helm.sh/release-name":"myapp","meta.helm.sh/release-namespace":"default"},"creationTimestamp":"2020-05-07T19:06:28Z","finalizers":["kopf.zalando.org/KopfFinalizerMarker"],"generation":4,"labels":{"app.kubernetes.io/managed-by":"Helm"},"name":"myresource","namespace":"default","resourceVersion":"36598","selfLink":"/apis/my-api-domain/v1/namespaces/default/my-resources/myresource","uid":"daaf8839-9095-11ea-85ab-024b3556169a"},"spec":"content"}
I0507 14:27:07.863377   34642 request.go:1068] Request Body: {"apiVersion":"my-api-domain/v1","kind":"MyResource","metadata":{"annotations":{"meta.helm.sh/release-name":"myapp","meta.helm.sh/release-namespace":"default"},"labels":{"app.kubernetes.io/managed-by":"Helm"},"name":"myresource","namespace":"default","resourceVersion":"36598"},"spec":"content"}
I0507 14:27:07.863502   34642 round_trippers.go:423] curl -k -v -XPUT  -H "Accept: application/json" -H "Content-Type: application/json" 'https://XXXXXXXXXXXXXXXXXXXX.us-west-2.eks.amazonaws.com/apis/my-api-domain/v1/namespaces/default/my-resources/myresource'
I0507 14:27:07.990540   34642 round_trippers.go:443] PUT https://XXXXXXXXXXXXXXXXXXXX.us-west-2.eks.amazonaws.com/apis/my-api-domain/v1/namespaces/default/my-resources/myresource 200 OK in 127 milliseconds
I0507 14:27:07.990576   34642 round_trippers.go:449] Response Headers:
I0507 14:27:07.990584   34642 round_trippers.go:452]     Audit-Id: xxxxx-xxxx-xxxx-xxxx-xxxxxx
I0507 14:27:07.990589   34642 round_trippers.go:452]     Content-Type: application/json
I0507 14:27:07.990593   34642 round_trippers.go:452]     Content-Length: 894
I0507 14:27:07.990597   34642 round_trippers.go:452]     Date: Thu, 07 May 2020 19:27:07 GMT
I0507 14:27:07.990862   34642 request.go:1068] Response Body: {"apiVersion":"my-api-domain/v1","kind":"MyResource","metadata":{"annotations":{"meta.helm.sh/release-name":"myapp","meta.helm.sh/release-namespace":"default"},"creationTimestamp":"2020-05-07T19:06:28Z","generation":5,"labels":{"app.kubernetes.io/managed-by":"Helm"},"name":"myresource","namespace":"default","resourceVersion":"39209","selfLink":"/apis/my-api-domain/v1/namespaces/default/my-resources/myresource","uid":"daaf8839-9095-11ea-85ab-024b3556169a"},"spec":"modifiedContent"}
client.go:446: [debug] Replaced "myresource" with kind MyResource for kind MyResource


This is how my operator sees this update:
[2020-05-07 19:36:45,024] myclass              [INFO    ] This is the on_create function
[2020-05-07 19:36:45,374] kopf.objects         [ERROR   ] [default/myresource] Handler 'on_create' failed permanently: <redacted> already exists.
[2020-05-07 19:36:45,375] kopf.objects         [INFO    ] [default/myresource] All handlers succeeded for creation.
However, when I apply the same changes using `kubectl apply -f`, I get the expected result:
[2020-05-07 19:36:45,024] myclass              [INFO    ] This is the on_update function
[2020-05-07 19:30:14,944] kopf.objects         [INFO    ] [default/myresource] Handler 'on_update' succeeded.
[2020-05-07 19:30:14,944] kopf.objects         [INFO    ] [default/myresource] All handlers succeeded for update.
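
Roughly, the handlers in question look like this (a simplified sketch reconstructed from the logs above, not the exact code; the group/version/plural are taken from the redacted API paths):

import kopf


# Reconstructed for illustration; the resource coordinates match the redacted logs.
@kopf.on.create('my-api-domain', 'v1', 'my-resources')
def on_create(spec, logger, **_):
    logger.info('This is the on_create function')


@kopf.on.update('my-api-domain', 'v1', 'my-resources')
def on_update(spec, logger, **_):
    logger.info('This is the on_update function')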

Environment

  • Kopf version: 0.24
  • Kubernetes version: v1.14.9-eks-f459c0
  • Python version: 3.7.6
  • OS/platform: Debian 10.2
Python packages installed
pip freeze --all
aiohttp==3.6.2
aiojobs==0.2.2
async-timeout==3.0.1
attrs==19.3.0
boto3==1.10.50
botocore==1.13.50
certifi==2019.11.28
chardet==3.0.4
Click==7.0
docutils==0.15.2
idna==2.8
iso8601==0.1.12
jmespath==0.9.4
kopf==0.24
multidict==4.7.3
pip==19.3.1
pykube-ng==19.12.1
python-dateutil==2.8.1
pytz==2019.3
PyYAML==5.3
requests==2.22.0
s3transfer==0.2.1
setuptools==44.0.0
six==1.13.0
typing-extensions==3.7.4.1
urllib3==1.25.7
wheel==0.33.6
yarl==1.4.2
@Carles-Figuerola added the bug (Something isn't working) label on May 7, 2020
@Carles-Figuerola (Author)

I think the issue here is that I'm using --force in my helm commands. I don't know if that behaviour changed between Helm 2 and Helm 3, but I'm going to investigate further. For context, this is from Helm's help:

   --force             force resource updates through a replacement strategy

@nolar (Contributor) commented May 12, 2020

Thank you for reporting.

Well, a "replacement strategy" phrasing implies deletion and re-creation. Some quick googling also points to comments like helm/helm#5281 (comment) saying that "…helm upgrade —force is the equivalent of a kubectl replace…". kubectl replace docs and options also hint that it goes through deletion.

So, Kopf handles what actually happens at the Kubernetes level: a deletion followed by a creation.

Internally, Kopf distinguishes creation from updates by the presence or absence of an annotation named kopf.zalando.org/last-handled-configuration, which contains a JSON-serialised essence of the object as it was last handled, including during the first creation cycle. I believe there is no way to transfer third-party annotations across Helm upgrades, which means the continuity of the resource cannot be simulated.
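
For illustration, the presence of this marker can be observed from a low-level event handler (a hypothetical debugging snippet, not part of any solution; the resource coordinates are placeholders):

import kopf


@kopf.on.event('my-api-domain', 'v1', 'my-resources')
def debug_marker(annotations, logger, **_):
    # Kopf treats the object as previously handled iff this annotation exists.
    marker = annotations.get('kopf.zalando.org/last-handled-configuration')
    logger.info('seen before' if marker else 'looks brand new to Kopf')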

So, you have to implement your own global state, keyed e.g. by resource name or namespace+name. Then, point both the on-creation and on-update handlers to the same function, and check for the key's existence internally.

Conceptually, something like this:

import kopf

# An in-memory registry of the resources this operator has already seen.
RESOURCES = {}


@kopf.on.create(...)
@kopf.on.update(...)
def myresource_changed_fn(namespace, name, **_):
    key = f'{namespace}/{name}'
    if key in RESOURCES:
        print('it was an upgrade')
    else:
        print('it was a real creation')
        RESOURCES[key] = ...


@kopf.on.delete(...)
def myresource_deleted(namespace, name, **_):
    key = f'{namespace}/{name}'
    if key in RESOURCES:
        del RESOURCES[key]

@nolar (Contributor) commented May 12, 2020

As a slightly more advanced, but also slightly more complex, solution, you can utilise the recently added configurable storages for the "diff-bases", aka the "essences", of the resources (available since 0.27rc6; still a release candidate as of 2020-05-12):

from typing import MutableMapping, Optional

import kopf


class ByNameDiffBaseStorage(kopf.DiffBaseStorage):
    """An in-memory storage of the resources' essences, keyed by namespace/name."""

    _items: MutableMapping[str, kopf.BodyEssence]

    def __init__(self) -> None:
        super().__init__()
        self._items = {}

    def fetch(self, *, body: kopf.Body) -> Optional[kopf.BodyEssence]:
        key = f'{body.metadata.namespace}/{body.metadata.name}'
        return self._items.get(key, None)

    def store(self, *, body: kopf.Body, patch: kopf.Patch, essence: kopf.BodyEssence) -> None:
        key = f'{body.metadata.namespace}/{body.metadata.name}'
        self._items[key] = essence


@kopf.on.startup()
def configure(settings: kopf.OperatorSettings, **_):
    settings.persistence.diffbase_storage = kopf.MultiDiffBaseStorage([
        settings.persistence.diffbase_storage,  # the default Kopf's storage
        ByNameDiffBaseStorage(),
    ])


@kopf.on.create('zalando.org', 'v1', 'kopfexamples')
def create_fn(**_):
    pass


@kopf.on.update('zalando.org', 'v1', 'kopfexamples')
def update_fn(**_):
    pass


@kopf.on.delete('zalando.org', 'v1', 'kopfexamples')
def delete_fn(**_):
    pass

The magic happens in MultiDiffBaseStorage: if it can find a last-handled-configuration record on the resource itself (in its annotations), it uses it, which is the case for pre-existing resources. If the record is absent there, which happens when the resource is "replaced" by Helm, it tries the in-memory storage by the "namespace/name" key, even though it is technically a different resource, and so it detects an "update" cause instead of a "creation". A new last-handled-configuration is put into both storages after the on-update handler succeeds.

Note: I have only quick-tested this code, not tested it thoroughly. It is only a theory of how the problem can be hacked around. The code can lead to unexpected consequences, so better try it in an isolated cluster first (e.g. minikube or kind).

Two quite obvious problems:

  • If the new replacement resource essentially mismatches the original deleted one, this can confuse the business/domain logic. For example, some fields of some built-in resources (e.g. a Deployment's selectors) are normally immutable, so there is probably no logic for handling their changes; but the replacement will change them.

  • If the replacement happens while the operator is down or restarting, the in-memory storage will not carry the resource's essence over from the previous object (i.e. between the moment it is stored on the old resource for the last time and the moment it is fetched from the new resource for the first time). So, after the operator starts up, the replacement will be detected as a "creation". You probably need some external persistent storage rather than memory, one that survives the operator's restarts/downtime: e.g. a database, Redis, or the like; or shadow Kubernetes objects, with Kubernetes/etcd as the database (not sure if this is a good idea). See the sketch after this list.
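
For the persistence idea from the last point, a hypothetical sketch with the third-party redis package (untested, with serialisation edge cases ignored) could replace the in-memory ByNameDiffBaseStorage:

import json
from typing import Optional

import kopf
import redis  # an assumption: the third-party `redis` package is installed


class RedisDiffBaseStorage(kopf.DiffBaseStorage):
    """A diff-base storage that survives operator restarts (hypothetical)."""

    def __init__(self, url: str = 'redis://localhost:6379/0') -> None:
        super().__init__()
        self._redis = redis.Redis.from_url(url)

    def fetch(self, *, body: kopf.Body) -> Optional[kopf.BodyEssence]:
        raw = self._redis.get(f'{body.metadata.namespace}/{body.metadata.name}')
        return None if raw is None else json.loads(raw)

    def store(self, *, body: kopf.Body, patch: kopf.Patch, essence: kopf.BodyEssence) -> None:
        key = f'{body.metadata.namespace}/{body.metadata.name}'
        self._redis.set(key, json.dumps(essence))

It plugs into settings.persistence.diffbase_storage exactly like ByNameDiffBaseStorage above.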

One fancy effect:

  • If the resource is deleted and re-created without changes, the on-update handler is not called at all, because nothing was updated (the diff is empty). Only changing the resource or its YAML file triggers the on-update handler. This is as expected.

PS: You may also want to include the resource's kind/singular/plural/full name in the key, in case multiple resource kinds are served by the same operator. Otherwise, they can have identical "namespace/name" strings and will collide with each other in memory; a sketch follows below.
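
For example, the key construction could be extracted and extended like this (a hypothetical tweak to the sketch above; the kind is read from the body as reported by the API server):

import kopf


def make_key(body: kopf.Body) -> str:
    # Include the kind to avoid collisions between different resource kinds
    # that happen to share the same namespace/name pair.
    kind = body.get('kind', '')
    return f"{kind}/{body.metadata.namespace}/{body.metadata.name}"

Then use make_key(body) in place of the inline f-strings in fetch() and store().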

@nolar added the question (Further information is requested) label and removed the bug (Something isn't working) label on May 12, 2020