[Bug]: Operator upgrade 1.22.1 -> 1.22.2 -> 1.23.1 fails due to missing CRB #4518

tothf · 2024-05-13T04:02:13Z

Is there an existing issue already for this bug?

I have searched for an existing issue, and could not find anything. I believe this is a new bug.

I have read the troubleshooting guide

I have read the troubleshooting guide and I think this is a new bug.

I am running a supported version of CloudNativePG

I have read the troubleshooting guide and I think this is a new bug.

Contact Details

No response

Version

1.23.0

What version of Kubernetes are you using?

1.27

What is your Kubernetes environment?

Other

How did you install the operator?

Other

What happened?

We are running CNPG on OpenShift. We upgraded a lab instance on OpenShift 4.13.25 from 1.22.2 to 1.23.1 without issues.
We also have an OpenShift 4.14.17 instance with CNPG operator 1.22.1 installed, which was working fine. We upgraded it to 1.22.2 without issues. After that upgrade finished and all pods were up, we upgraded the operator to 1.23.1. The cluster instances did not start upgrading. On checking, we found that the controller-manager deployment was complaining that the cnpg-manager service account does not have permission to list ClusterImageCatalogs.
The reason was that ClusterImageCatalogs access is granted through a plain Role bound with a RoleBinding, which only grants access within the namespace where the operator is installed.
The problem is that ClusterImageCatalogs resources are cluster-scoped: they are not created in any namespace.
As a workaround, we created a ClusterRole with the same ClusterImageCatalogs access (get, list, watch) and a ClusterRoleBinding for the cnpg-manager service account. Once we applied them, the upgrade completed successfully.
Sorry, the logs were purged when the upgrade succeeded, so I cannot attach them.

Code to apply workaround:

# Cluster-wide read access to ClusterImageCatalogs
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: cnpg-clusterimagecatalogs
rules:
  - verbs:
      - get
      - list
      - watch
    apiGroups:
      - postgresql.cnpg.io
    resources:
      - clusterimagecatalogs
---
# Bind the ClusterRole to the operator's service account
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: cnpg-clusterimagecatalogs
subjects:
  - kind: ServiceAccount
    name: cnpg-manager
    # Namespace where the operator is installed ("postgres" in our case)
    namespace: postgres
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cnpg-clusterimagecatalogs
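
To confirm whether the service account can list the cluster-scoped resource, before and after applying the manifests above, a check along these lines should work. This is a minimal sketch: it assumes the operator runs in the `postgres` namespace (as in the ClusterRoleBinding above) and that the user running it is allowed to impersonate service accounts. On OpenShift, `oc auth can-i` should accept the same arguments.

```shell
#!/bin/sh
# Assumption: namespace and service account match the ClusterRoleBinding above.
NAMESPACE="postgres"
SERVICE_ACCOUNT="cnpg-manager"
SUBJECT="system:serviceaccount:${NAMESPACE}:${SERVICE_ACCOUNT}"

# Ask the API server whether the service account may list ClusterImageCatalogs.
# kubectl prints "yes" or "no" and exits non-zero on "no", hence the "|| true".
if command -v kubectl >/dev/null 2>&1; then
  kubectl auth can-i list clusterimagecatalogs.postgresql.cnpg.io --as="${SUBJECT}" || true
else
  echo "kubectl not found; skipping the permission check"
fi
```

A "no" before the workaround and a "yes" afterwards would be consistent with the missing ClusterRoleBinding described above.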

Cluster resource

No response

Relevant log output

No response

Code of Conduct

I agree to follow this project's Code of Conduct

tothf added the triage Pending triage label May 13, 2024

tothf assigned gbartolini May 13, 2024
