Waiting for postgres pod even though external postgres is defined #213

Closed
daenney opened this issue Apr 15, 2021 · 6 comments · Fixed by #231
Labels: component:docs, good first issue, help wanted

Comments

daenney commented Apr 15, 2021

---
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
  namespace: awx
spec:
  tower_postgres_configuration_secret: awx-postgres-configuration
  tower_old_postgres_configuration_secret: awx-old-postgres-configuration

$ kubectl -n awx get secrets
NAME                             TYPE                                  DATA   AGE
[..]
awx-old-postgres-configuration   Opaque                                5      21m
[..]
awx-postgres-configuration       Opaque                                5      21m
[..]
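
For context, an external-database secret like awx-postgres-configuration above might look roughly like the sketch below. The five key names are an assumption based on the operator's README at the time (they would match the DATA count of 5 shown above), and every value is a placeholder, not taken from this cluster.

```yaml
---
# Hypothetical sketch: key names are assumed, all values are placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: awx-postgres-configuration
  namespace: awx
stringData:
  host: example-db.example.com   # placeholder hostname of the external database
  port: "5432"                   # placeholder; quoted because stringData values must be strings
  database: awx                  # placeholder database name
  username: awx                  # placeholder
  password: change-me            # placeholder
type: Opaque
```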

Operator logs the following:

TASK [installer : Get the postgres pod information] ****************************\r\ntask path: /opt/ansible/roles/installer/tasks/migrate_data.yml:11\nfatal: [localhost]: FAILED! => {\"msg\": \"The conditional check 'postgres_pod['resources'][0]['status']['phase'] == 'Running'' failed. The error was: error while evaluating conditional (postgres_pod['resources'][0]['status']['phase'] == 'Running'): list object has no element 0\"}\n\r\nPLAY RECAP
*********************************************************\r\nlocalhost                  : ok=29   changed=0    unreachable=0    failed=1    skipped=13   rescued=0    ignored=0   \r\n\n","job":"2740376916591569721","name":"awx","namespace":"awx","error":"exit status 2","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\tpkg/mod/github.com/go-logr/zapr@v0.1.1/zapr.go:128\ngithub.com/operator-framework/operator-sdk/pkg/ansible/runner.(*runner).Run.func1\n\tsrc/github.com/operator-framework/operator-sdk/pkg/ansible/runner/runner.go:239"}

--------------------------- Ansible Task Status Event StdOut  -----------------

PLAY RECAP *********************************************************************
localhost                  : ok=29   changed=0    unreachable=0    failed=1    skipped=13   rescued=0    ignored=0   


-------------------------------------------------------------------------------
{"level":"error","ts":1618494205.2788725,"logger":"controller-runtime.controller","msg":"Reconciler error","controller":"awx-controller","request":"awx/awx","error":"event runner on failed","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\tpkg/mod/github.com/go-logr/zapr@v0.1.1/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tpkg/mod/sigs.k8s.io/controller-runtime@v0.6.0/pkg/internal/controller/controller.go:258\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tpkg/mod/sigs.k8s.io/controller-runtime@v0.6.0/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\tpkg/mod/sigs.k8s.io/controller-runtime@v0.6.0/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\tpkg/mod/k8s.io/apimachinery@v0.18.2/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\tpkg/mod/k8s.io/apimachinery@v0.18.2/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\tpkg/mod/k8s.io/apimachinery@v0.18.2/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.Until\n\tpkg/mod/k8s.io/apimachinery@v0.18.2/pkg/util/wait/wait.go:90"}

Given the postgres connection secrets are defined and passed in, why is the operator seemingly waiting (and thus erroring) on a postgres pod being available?

tchellomello commented Apr 15, 2021

@daenney If you look at the code, the task "Get the postgres pod information" is part of https://github.com/ansible/awx-operator/blob/devel/roles/installer/tasks/migrate_data.yml#L11, which is imported when these conditions are met:

https://github.com/ansible/awx-operator/blob/devel/roles/installer/tasks/database_configuration.yml#L90

 - name: Migrate data from old Openshift instance
   import_tasks: migrate_data.yml
   when:
     - old_pg_config['resources'] is defined
     - old_pg_config['resources'] | length
     - this_awx['resources'][0]['status']['towerMigratedFromSecret'] is not defined

So are you trying to upgrade from a locally managed Postgres instance to an external instance? If so, do you have a PostgreSQL pod running? Can we get the output of kubectl describe pod <pgsql-managed>?

If that is the case, it could be that your locally managed PostgreSQL pod does not have the expected labels, and since the reconciliation task runs after database_configuration, you are hitting this error.
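
To illustrate why the error reads "list object has no element 0": when no pod matches the lookup, resources is an empty list, so indexing [0] fails before the phase is ever compared. Below is a minimal sketch of a lookup that checks the list length first. It assumes the kubernetes.core.k8s_info module and a hypothetical label selector, and it is not the operator's actual task.

```yaml
# Hypothetical sketch, not the operator's actual task.
- name: Get the postgres pod information (guarded variant)
  kubernetes.core.k8s_info:
    kind: Pod
    namespace: awx
    label_selectors:
      - app=example-postgres   # hypothetical label; use whatever your pod carries
  register: postgres_pod
  # Check the list length first so an empty result is retried
  # instead of raising an index error on resources[0].
  until: "postgres_pod['resources'] | length > 0 and postgres_pod['resources'][0]['status']['phase'] == 'Running'"
  delay: 5
  retries: 12
```

With no managed pod at all, this would still fail once the retries run out; it only turns the index error into a clearer timeout.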

daenney commented Apr 15, 2021

No, there isn't a locally managed postgres. It's remote in both cases (RDS). It wasn't clear to me it could only upgrade from a locally managed instance.

daenney commented Apr 15, 2021

The old instance predates the 18.x release, and the migration docs suggested to me this is what we should do. Nothing in https://github.com/ansible/awx-operator/blob/devel/docs/migration.md indicated to me that it couldn't upgrade from an old, remote database.

tchellomello added the component:docs label Apr 17, 2021

tchellomello commented

@daenney I see your point. Having tower_old_postgres_configuration_secret set enables you to move the database from one place to another (see https://github.com/ansible/awx-operator/blob/devel/roles/installer/tasks/database_configuration.yml#L90-L95 and https://github.com/ansible/awx-operator/blob/devel/roles/installer/tasks/migrate_data.yml#L59-L70).

So in your case, remove tower_old_postgres_configuration_secret from your AWX resource. The database schema should then be upgraded by https://github.com/ansible/awx-operator/blob/devel/roles/installer/tasks/main.yml#L98-L108
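
As a sketch of that change, here is the manifest from the top of this issue with only the old-secret line dropped. It assumes the remaining secret already points at the existing external database:

```yaml
---
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
  namespace: awx
spec:
  # Connection details for the existing external database.
  tower_postgres_configuration_secret: awx-postgres-configuration
  # tower_old_postgres_configuration_secret is left out on purpose:
  # setting it triggers the data migration tasks, which look for a postgres pod.
```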

Let us know if that works. I agree that the documentation needs some refinement to make this clear.

If you want to send a PR, it is welcome!

daenney commented Apr 19, 2021

Thanks for the explanation. I tried pointing it at an existing external database and everything worked perfectly. I'll send a PR tomorrow with some doc updates!

lingenavd commented Apr 19, 2021

@tchellomello and @daenney,
brilliant, you were just a few hours ahead of me in my attempt to upgrade from 18.0.0 to 19.0.0 with an external Postgres db!
I had the same issue tonight on k8s. I also have a remote Postgres db that could not be checked:

From the awx-operator (0.8.0) logs

"msg": "The conditional check 'postgres_pod['resources'][0]['status']['phase'] == 'Running'' failed. The error was: error while evaluating conditional (postgres_pod['resources'][0]['status']['phase'] == 'Running'): list object has no element 0"

I removed the line "tower_old_postgres_configuration_secret" from my-awx.yml as suggested and that worked for me.
A plain and simple upgrade and a working AWX 19.0.0 instance. I have not found any issues with the credentials, so far so good.

grtz,
Andre
