
AWX Operator cannot create pods after upgrading from k8s 1.21.3 to 1.22.0 #494

Closed
Kaauw opened this issue Aug 11, 2021 · 6 comments

Comments

@Kaauw

Kaauw commented Aug 11, 2021

ISSUE TYPE
  • Bug Report
SUMMARY

AWX Operator cannot create the AWX pods after upgrading from k8s 1.21.3 to 1.22.0.
It worked well before.

ENVIRONMENT
  • AWX version: 19.2.2
  • Operator version: 0.12.0
  • Kubernetes version: v1.22.0
  • AWX install method: on premise, Ubuntu 20.04 with Docker
STEPS TO REPRODUCE

From k8s 1.21.3, install awx-operator and set up AWX. It works well.
Upgrade to 1.22.0.
Kill and recreate the AWX deployment.
The awx postgres pod comes up.
The awx server pod does not come up.

EXPECTED RESULTS

AWX should be up and in a Running state.

ACTUAL RESULTS

Only the awx postgres pod is up.

NAME                                     READY   STATUS    RESTARTS      AGE
awx-itd-postgres-0                       1/1     Running   0             8m29s
awx-operator-545497f7d5-k88wr            1/1     Running   1 (34m ago)   55m
nfs-client-provisioner-5c95d8f86-9tm6k   1/1     Running   5 (34m ago)   5d4h
ADDITIONAL INFORMATION

YAML file I use to create the pods:

---
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx-itd
spec:
  service_type: LoadBalancer
  loadbalancer_protocol: http
  loadbalancer_port: 80
  loadbalancer_annotations: |
    metallb.universe.tf/address-pool: bde-172-17
  hostname: awx.bde.lab
  replicas: 2
  projects_persistence: true
  projects_storage_class: managed-nfs-storage
  postgres_storage_class: managed-nfs-storage
  #adminUser: admin
AWX-OPERATOR LOGS
PLAY RECAP *********************************************************************
localhost                  : ok=29   changed=2    unreachable=0    failed=1    skipped=27   rescued=0    ignored=0


-------------------------------------------------------------------------------
{"level":"error","ts":1628693554.202394,"logger":"controller-runtime.controller","msg":"Reconciler error","controller":"awx-controller","request":"default/awx-itd","error":"event runner on failed","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\tpkg/mod/github.com/go-logr/zapr@v0.1.1/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tpkg/mod/sigs.k8s.io/controller-runtime@v0.6.0/pkg/internal/controller/controller.go:258\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tpkg/mod/sigs.k8s.io/controller-runtime@v0.6.0/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\tpkg/mod/sigs.k8s.io/controller-runtime@v0.6.0/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\tpkg/mod/k8s.io/apimachinery@v0.18.2/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\tpkg/mod/k8s.io/apimachinery@v0.18.2/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\tpkg/mod/k8s.io/apimachinery@v0.18.2/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.Until\n\tpkg/mod/k8s.io/apimachinery@v0.18.2/pkg/util/wait/wait.go:90"}
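Each operator log entry is one dense JSON line; a small stdlib-only helper (my own illustrative snippet, not part of the operator) makes the interesting fields easier to scan:

```python
import json

def summarize_log_line(line):
    """Pull the fields that matter for triage out of one JSON log line."""
    rec = json.loads(line)
    return {k: rec[k] for k in ("level", "msg", "controller", "request", "error") if k in rec}

# Example: the reconciler error above, with the stacktrace elided.
line = ('{"level":"error","ts":1628693554.202394,'
        '"logger":"controller-runtime.controller","msg":"Reconciler error",'
        '"controller":"awx-controller","request":"default/awx-itd",'
        '"error":"event runner on failed","stacktrace":"..."}')
print(summarize_log_line(line))
```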
@duntonr

duntonr commented Aug 16, 2021

Observing the same.

After updating microk8s to 1.22 and then trying to deploy operator 0.13, the operator container upgraded, but it gets errors when it tries to upgrade the AWX pod:

"job":"6175742077372812453","name":"awx","namespace":"default","error":"exit status 2","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\tpkg/mod/github.com/go-logr/zapr@v0.1.1/zapr.go:128\ngithub.com/operator-framework/operator-sdk/pkg/ansible/runner.(*runner).Run.func1\n\tsrc/github.com/operator-framework/operator-sdk/pkg/ansible/runner/runner.go:239"}
{"level":"error","ts":1629084951.678016,"logger":"controller-runtime.controller","msg":"Reconciler error","controller":"awx-controller","request":"default/awx","error":"event runner on failed","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\tpkg/mod/github.com/go-logr/zapr@v0.1.1/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tpkg/mod/sigs.k8s.io/controller-runtime@v0.6.0/pkg/internal/controller/controller.go:258\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tpkg/mod/sigs.k8s.io/controller-runtime@v0.6.0/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\tpkg/mod/sigs.k8s.io/controller-runtime@v0.6.0/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\tpkg/mod/k8s.io/apimachinery@v0.18.2/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\tpkg/mod/k8s.io/apimachinery@v0.18.2/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\tpkg/mod/k8s.io/apimachinery@v0.18.2/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.Until\n\tpkg/mod/k8s.io/apimachinery@v0.18.2/pkg/util/wait/wait.go:90"}

The AWX YAML is extremely basic:

---
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
spec:
  task_privileged: true

@shanemcd
Member

Chatting with some folks from the operator-sdk team.

This seems like it might be caused by the fact that our operator is built on version 0.19 of the SDK.

cc @Spredzy @rooftopcellist

@shanemcd
Member

Confirmed with the operator-sdk team that operators built on 0.x are not going to work on the newer versions of Kubernetes. We've prioritized bumping this in the near term. Any updates will be posted here.
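For background: Kubernetes 1.22 removed a batch of long-deprecated beta APIs (CRDs under `apiextensions.k8s.io/v1beta1`, webhooks under `admissionregistration.k8s.io/v1beta1`, `rbac.authorization.k8s.io/v1beta1`, and others), and operators built on the 0.x SDK still rely on some of them. A rough way to see this on your cluster, sketched as a helper you could run against the output of `kubectl api-versions` (the function and the partial list are my own illustration, not an official tool):

```python
# Partial list of beta API groups that Kubernetes 1.22 stopped serving.
REMOVED_IN_1_22 = {
    "admissionregistration.k8s.io/v1beta1",
    "apiextensions.k8s.io/v1beta1",
    "extensions/v1beta1",
    "networking.k8s.io/v1beta1",
    "rbac.authorization.k8s.io/v1beta1",
}

def missing_beta_apis(api_versions_output):
    """Given `kubectl api-versions` output (one group/version per line),
    return which of the 1.22-removed beta APIs the cluster no longer serves."""
    served = set(api_versions_output.split())
    return sorted(REMOVED_IN_1_22 - served)
```

On a 1.22 cluster all five entries come back missing, which is why objects the old operator creates against those APIs can no longer be applied.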

@ggiinnoo
Contributor

I can confirm this issue exists on version 1.22.1.
Going back to version 1.21.3 resolves the issue 🥳

@rooftopcellist
Member

rooftopcellist commented Aug 30, 2021

With the addition of PR #508, I am able to deploy the awx-operator and the AWX app to an OpenShift 4.9 cluster (k8s v1.22.0).

$ oc version
Client Version: 4.6.8
Server Version: 4.9.0-0.nightly-2021-08-23-224104
Kubernetes Version: v1.22.0-rc.0+5c2f7cd

All containers are running and I am able to log in and run a job from the UI.


To use this fix you will need to build the awx-operator image from devel for now, as a release containing it has not yet been cut.

@duntonr

duntonr commented Aug 30, 2021

I was also able to deploy 0.19.3 by editing https://raw.githubusercontent.com/ansible/awx-operator/devel/deploy/awx-operator.yaml to:

...
      containers:
        - name: awx-operator
          image: 'quay.io/ansible/awx-operator:devel'
          env:
            - name: WATCH_NAMESPACE
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: OPERATOR_NAME
              value: awx-operator
            - name: ANSIBLE_GATHERING
              value: explicit
            - name: OPERATOR_VERSION
              value: devel
            - name: ANSIBLE_DEBUG_LOGS
              value: 'false'
...

Note the changes: the image tag and OPERATOR_VERSION are both set to devel. Then apply the manifest.
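The same two edits can be scripted; a rough string-level patch (the regexes assume the manifest layout shown above, and yq or kustomize would be sturdier):

```python
import re

def patch_to_devel(manifest_text):
    """Point the awx-operator image tag and OPERATOR_VERSION at devel."""
    # Replace a version-like image tag (e.g. awx-operator:0.13.0) with devel.
    text = re.sub(r"(awx-operator:)[0-9][\w.\-]*", r"\1devel", manifest_text)
    # Replace the value following the OPERATOR_VERSION env var name.
    text = re.sub(r"(name: OPERATOR_VERSION\s+value: )\S+", r"\1devel", text)
    return text
```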

Thanks!
