CSV stuck in install loop with auth-delegator already exists error #3187

Open
Gentoli opened this issue Mar 11, 2024 · 0 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Bug Report

What did you do?
Upgraded OLM to v0.27 on a cluster that already had operators installed.

What did you expect to see?
OLM picks up the already-installed operators.

What did you see instead? Under which circumstances?

The CSV is stuck in a retry loop, failing with: clusterrolebindings.rbac.authorization.k8s.io "<some-operator>:auth-delegator" already exists

Example log:

{"level":"error","ts":"2024-03-11T09:23:18Z","logger":"controllers.operator","msg":"Could not update Operator status","request":{"name":"cert-manager.cert-manager"},"error":"Operation cannot be fulfilled on operators.operators.coreos.com \"cert-manager.cert-manager\": the object has been modified; please apply your changes to the latest version and try again","stacktrace":"github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators.(*OperatorReconciler).Reconcile\n\t/home/runner/work/operator-lifecycle-manager/operator-lifecycle-manager/pkg/controller/operators/operator_controller.go:157\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/home/runner/work/operator-lifecycle-manager/operator-lifecycle-manager/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:119\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/home/runner/work/operator-lifecycle-manager/operator-lifecycle-manager/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:316\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/runner/work/operator-lifecycle-manager/operator-lifecycle-manager/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/home/runner/work/operator-lifecycle-manager/operator-lifecycle-manager/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227"}
time="2024-03-11T09:23:19Z" level=warning msg="needs reinstall: webhooks not installed" csv=cert-manager.v1.14.2 id=22GeL namespace=cert-manager phase=Failed strategy=deployment
I0311 09:23:19.150921       1 event.go:298] Event(v1.ObjectReference{Kind:"ClusterServiceVersion", Namespace:"cert-manager", Name:"cert-manager.v1.14.2", UID:"2ff6cc43-eacb-42cd-9607-3f58f7a8a00a", APIVersion:"operators.coreos.com/v1alpha1", ResourceVersion:"905345183", FieldPath:""}): type: 'Normal' reason: 'NeedsReinstall' webhooks not installed
time="2024-03-11T09:23:19Z" level=info msg="scheduling ClusterServiceVersion for install" csv=cert-manager.v1.14.2 id=a47Rd namespace=cert-manager phase=Pending
I0311 09:23:19.958772       1 event.go:298] Event(v1.ObjectReference{Kind:"ClusterServiceVersion", Namespace:"cert-manager", Name:"cert-manager.v1.14.2", UID:"2ff6cc43-eacb-42cd-9607-3f58f7a8a00a", APIVersion:"operators.coreos.com/v1alpha1", ResourceVersion:"905345206", FieldPath:""}): type: 'Normal' reason: 'AllRequirementsMet' all requirements found, attempting install
time="2024-03-11T09:23:20Z" level=info msg="No api or webhook descs to add CA to"
time="2024-03-11T09:23:20Z" level=info msg="No api or webhook descs to add CA to"
time="2024-03-11T09:23:20Z" level=warning msg="reusing existing cert cert-manager-webhook-service-cert"
time="2024-03-11T09:23:20Z" level=warning msg="could not create auth delegator clusterrolebinding cert-manager-webhook-service-system:auth-delegator"
I0311 09:23:20.385386       1 event.go:298] Event(v1.ObjectReference{Kind:"ClusterServiceVersion", Namespace:"cert-manager", Name:"cert-manager.v1.14.2", UID:"2ff6cc43-eacb-42cd-9607-3f58f7a8a00a", APIVersion:"operators.coreos.com/v1alpha1", ResourceVersion:"905345224", FieldPath:""}): type: 'Warning' reason: 'InstallComponentFailed' install strategy failed: clusterrolebindings.rbac.authorization.k8s.io "cert-manager-webhook-service-system:auth-delegator" already exists
{"level":"error","ts":"2024-03-11T09:23:20Z","logger":"controllers.operator","msg":"Could not update Operator status","request":{"name":"cert-manager.cert-manager"},"error":"Operation cannot be fulfilled on operators.operators.coreos.com \"cert-manager.cert-manager\": the object has been modified; please apply your changes to the latest version and try again","stacktrace":"github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators.(*OperatorReconciler).Reconcile\n\t/home/runner/work/operator-lifecycle-manager/operator-lifecycle-manager/pkg/controller/operators/operator_controller.go:157\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/home/runner/work/operator-lifecycle-manager/operator-lifecycle-manager/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:119\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/home/runner/work/operator-lifecycle-manager/operator-lifecycle-manager/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:316\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/runner/work/operator-lifecycle-manager/operator-lifecycle-manager/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/home/runner/work/operator-lifecycle-manager/operator-lifecycle-manager/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227"}
E0311 09:23:20.578578       1 queueinformer_operator.go:319] sync {"update" "cert-manager/cert-manager.v1.14.2"} failed: clusterrolebindings.rbac.authorization.k8s.io "cert-manager-webhook-service-system:auth-delegator" already exists
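
To see which resources the install loop keeps colliding with, the "already exists" errors can be pulled out of the OLM log. The following is a minimal Python sketch, assuming the error text has the same shape as in the log above. The function name and the regular expression are illustrative and not part of OLM, and the pattern only covers resources in a named API group.

```python
import re

# Matches e.g.:
#   clusterrolebindings.rbac.authorization.k8s.io "name" already exists
# The pattern is an assumption based on the log lines in this report. It also
# tolerates the JSON-escaped form (\"name\") that appears in structured logs.
ALREADY_EXISTS = re.compile(
    r'([a-z]+)\.([a-z0-9.]+) \\?"([^"\\]+)\\?" already exists'
)


def conflicting_resources(log_text):
    """Return a sorted list of unique (resource, api_group, name) tuples."""
    found = set()
    for match in ALREADY_EXISTS.finditer(log_text):
        found.add(match.groups())
    return sorted(found)


# One line copied from the log in this report.
sample = (
    'E0311 09:23:20.578578       1 queueinformer_operator.go:319] sync '
    '{"update" "cert-manager/cert-manager.v1.14.2"} failed: '
    'clusterrolebindings.rbac.authorization.k8s.io '
    '"cert-manager-webhook-service-system:auth-delegator" already exists'
)

for resource, group, name in conflicting_resources(sample):
    print(resource, group, name)
```

Because the loop repeats the same failure many times, the results are de-duplicated so that each conflicting object is listed once.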

Environment

  • operator-lifecycle-manager version: v0.27
  • Kubernetes version information:
PS > kubectl version
Client Version: v1.28.4
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.7-gke.1026000

The same error occurs on OKD:

$ oc version
Client Version: 4.14.0-202401111553.p0.g286cfa5.assembly.stream-286cfa5
Kustomize Version: v5.0.1
Server Version: 4.15.0-0.okd-2024-01-27-070424
Kubernetes Version: v1.28.2-3568+0fb47263bee9d4-dirty
  • Kubernetes cluster kind: GKE/OKD

Possible Solution

Revert to v0.25.0 (kubectl apply -f https://github.com/operator-framework/operator-lifecycle-manager/releases/download/v0.25.0/olm.yaml --server-side --force-conflicts). It seems that v0.26 and v0.27 are both broken.

OR

Add the olm.managed=true label to the existing resource (kubectl get rolebinding -n kube-system -o name | grep auth-reader | xargs -I {} kubectl label -n kube-system {} olm.managed=true)
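
The command above labels the auth-reader RoleBindings in kube-system, while the error in the log names a ClusterRoleBinding. The following is a hedged Python sketch that builds, but does not run, a `kubectl label` command for a single resource, so the commands can be reviewed before they are applied. The helper name `label_command` is hypothetical, and applying the same label to the ClusterRoleBinding is an assumption that this report does not confirm.

```python
import shlex


def label_command(kind, name, namespace=None, label="olm.managed=true"):
    """Build (but do not run) a kubectl command that adds `label` to one resource."""
    args = ["kubectl", "label", kind, name]
    if namespace is not None:
        # Namespaced resources such as RoleBindings need -n; cluster-scoped ones do not.
        args += ["-n", namespace]
    args.append(label)
    # Quote each argument so the printed line is safe to paste into a shell.
    return " ".join(shlex.quote(a) for a in args)


# Name taken from the error in the log above. Labeling the ClusterRoleBinding
# is an assumption; the workaround in this report only labels RoleBindings.
print(label_command(
    "clusterrolebinding",
    "cert-manager-webhook-service-system:auth-delegator",
))
```

Printing the command instead of executing it keeps the sketch side-effect free. Whether to apply it to a live cluster is left to the operator.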

Additional context
The catalog source has high CPU usage and timeouts. I think this is due to the constant retries.
