awx-web missing #1832

Open
3 tasks done
Reign1 opened this issue Apr 18, 2024 · 12 comments

Comments

@Reign1

Reign1 commented Apr 18, 2024

Please confirm the following

  • I agree to follow this project's code of conduct.
  • I have checked the current issues for duplicates.
  • I understand that the AWX Operator is open source software provided for free and that I might not receive a timely response.

Bug Summary

Following the documentation at https://ansible.readthedocs.io/projects/awx-operator/en/latest/installation/helm-install-on-existing-cluster.html, I installed awx-operator with helm install and ended up with these resources:

NAME                                                   READY   STATUS    RESTARTS   AGE
pod/awx-operator-controller-manager-69d8f784d8-5llkl   2/2     Running   0          12h

NAME                                                      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/awx-operator-controller-manager-metrics-service   ClusterIP   10.101.89.100   <none>        8443/TCP   12h

NAME                                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/awx-operator-controller-manager   1/1     1            1           12h

NAME                                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/awx-operator-controller-manager-69d8f784d8   1         1         1       12h

On top of that, I created awx-demo.yaml:


apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx-demo
spec:
  service_type: nodeport

I applied it with "kubectl -n awx apply -f awx-demo.yaml" and got the output "awx.awx.ansible.com/awx-demo created".

I still see no awx-web pod. The output of "kubectl logs -f awx-operator-controller-manager-69d8f784d8-5llkl -n awx" is in the Operator Logs section below.

AWX Operator version

2.15

AWX version

24.2.0

Kubernetes platform

kubernetes

Kubernetes/Platform version

1.29.3

Modifications

no

Steps to reproduce

On a fresh k8s cluster (created with kubeadm) I'm trying to set up AWX. As per the documentation at https://ansible.readthedocs.io/projects/awx-operator/en/latest/installation/helm-install-on-existing-cluster.html, I ran helm install. That is it.

Expected results

A default AWX setup up and running, with the frontend exposed so that I can log in and try it out.

Actual results

awx-operator is deployed, but no awx-web pods are running.

Additional information

No response

Operator Logs

kubectl logs -f awx-operator-controller-manager-69d8f784d8-5llkl -n awx:

{"level":"info","ts":"2024-04-17T19:23:30Z","logger":"cmd","msg":"Version","Go Version":"go1.20.12","GOOS":"linux","GOARCH":"amd64","ansible-operator":"v1.34.0","commit":"d26c43bf94960d292152862a6685696be33190fb"}
{"level":"info","ts":"2024-04-17T19:23:30Z","logger":"cmd","msg":"Watching namespaces","namespaces":["awx"]}
{"level":"info","ts":"2024-04-17T19:23:30Z","logger":"watches","msg":"Environment variable not set; using default value","envVar":"ANSIBLE_VERBOSITY_AWX_AWX_ANSIBLE_COM","default":2}
{"level":"info","ts":"2024-04-17T19:23:30Z","logger":"watches","msg":"Environment variable not set; using default value","envVar":"ANSIBLE_VERBOSITY_AWXBACKUP_AWX_ANSIBLE_COM","default":2}
{"level":"info","ts":"2024-04-17T19:23:30Z","logger":"watches","msg":"Environment variable not set; using default value","envVar":"ANSIBLE_VERBOSITY_AWXRESTORE_AWX_ANSIBLE_COM","default":2}
{"level":"info","ts":"2024-04-17T19:23:30Z","logger":"watches","msg":"Environment variable not set; using default value","envVar":"ANSIBLE_VERBOSITY_AWXMESHINGRESS_AWX_ANSIBLE_COM","default":2}
{"level":"info","ts":"2024-04-17T19:23:30Z","logger":"ansible-controller","msg":"Watching resource","Options.Group":"awx.ansible.com","Options.Version":"v1beta1","Options.Kind":"AWX"}
{"level":"info","ts":"2024-04-17T19:23:30Z","logger":"ansible-controller","msg":"Watching resource","Options.Group":"awx.ansible.com","Options.Version":"v1beta1","Options.Kind":"AWXBackup"}
{"level":"info","ts":"2024-04-17T19:23:30Z","logger":"ansible-controller","msg":"Watching resource","Options.Group":"awx.ansible.com","Options.Version":"v1beta1","Options.Kind":"AWXRestore"}
{"level":"info","ts":"2024-04-17T19:23:30Z","logger":"ansible-controller","msg":"Watching resource","Options.Group":"awx.ansible.com","Options.Version":"v1alpha1","Options.Kind":"AWXMeshIngress"}
{"level":"info","ts":"2024-04-17T19:23:30Z","logger":"proxy","msg":"Starting to serve","Address":"127.0.0.1:8888"}
{"level":"info","ts":"2024-04-17T19:23:30Z","logger":"apiserver","msg":"Starting to serve metrics listener","Address":"localhost:5050"}
{"level":"info","ts":"2024-04-17T19:23:30Z","logger":"controller-runtime.metrics","msg":"Starting metrics server"}
{"level":"info","ts":"2024-04-17T19:23:30Z","logger":"controller-runtime.metrics","msg":"Serving metrics server","bindAddress":"127.0.0.1:8080","secure":false}
{"level":"info","ts":"2024-04-17T19:23:30Z","msg":"starting server","kind":"health probe","addr":"[::]:6789"}
I0417 19:23:30.391565 2 leaderelection.go:250] attempting to acquire leader lease awx/awx-operator...
E0417 19:24:00.393847 2 leaderelection.go:332] error retrieving resource lock awx/awx-operator: Get "https://10.96.0.1:443/apis/coordination.k8s.io/v1/namespaces/awx/leases/awx-operator": dial tcp 10.96.0.1:443: i/o timeout
...

@YaronL16
Contributor

Have you used a customized values.yaml file to enable the AWX resource?

Are the postgres and awx-task pods being created?

@Reign1
Author

Reign1 commented Apr 23, 2024

@YaronL16, I only did what's provided in the Helm install instructions here: https://ansible.readthedocs.io/projects/awx-operator/en/latest/installation/helm-install-on-existing-cluster.html, and also ran "kubectl -n awx apply -f awx-demo.yaml". The content of awx-demo.yaml is provided above. I would expect the Helm install document to be complete (e.g. you get the front end exposed). If it's not, what's missing? Thanks!

@YaronL16
Contributor

Well, technically you did install the Operator; you just haven't told it to set up the AWX resource.

But I agree the documentation is a bit lackluster. Anyway, as it says in the documentation, you should customize the installation with your own values file to override the default ones. Most importantly, set AWX.enabled to 'true'.

More info here:
https://github.com/ansible/awx-operator/blob/devel/.helm/starter/README.md
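
For illustration, a minimal sketch of what passing a values file could look like. The release name my-awx-operator, the chart reference awx-operator/awx-operator, the file name myvalues.yaml, and the helper function names are all assumptions; substitute whatever was used for the original install.

```shell
#!/bin/sh
# Sketch only: release name, chart reference, file name, and helper names are assumptions.

# Write a values file that turns on the AWX resource (AWX.enabled).
write_values() {
  cat > "$1" <<'EOF'
AWX:
  enabled: true
EOF
}

# Re-apply the release with the values file so the chart also renders the AWX resource.
apply_values() {
  helm upgrade --install "$1" "$2" --namespace awx -f "$3"
}

# Example (not run automatically):
#   write_values myvalues.yaml
#   apply_values my-awx-operator awx-operator/awx-operator myvalues.yaml
```

Using helm upgrade --install rather than a second helm install means the same command should work whether or not the release already exists.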

@Reign1
Author

Reign1 commented Apr 23, 2024

@YaronL16 thanks for the input, really helpful, and everything makes more sense now. Indeed, I ran helm install without passing my own values with -f. What is still not clear, though, is the content of myvalues.yaml. What is the very minimum needed to have the frontend exposed and be able to log in as admin?

AWX:
  enabled: true

Is this it?

@YaronL16
Contributor

YaronL16 commented Apr 23, 2024

I would have something like this at the minimum:

---
AWX:
  enabled: true
  name: awx-demo
  spec:
    service_type: ClusterIP

@kurokobo created a nice base values file as seen here:
https://github.com/kurokobo/awx-on-k3s/blob/main/base/awx.yaml

You could also define custom images and other configs
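
Once the AWX resource exists, something like the following could be used to watch the pods come up and read the generated admin password. This is only a sketch: it assumes the instance is named awx-demo in the awx namespace and that the operator stores the password in a secret named <name>-admin-password, as the operator documentation describes. The helper names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; namespace and instance name are arguments.

# List the pods in the namespace (the postgres, task and web pods should appear here).
list_awx_pods() {
  kubectl get pods --namespace "$1"
}

# Print the admin password stored in the <name>-admin-password secret.
admin_password() {
  kubectl get secret "$2-admin-password" --namespace "$1" \
    -o jsonpath='{.data.password}' | base64 --decode
  echo
}

# Example (not run automatically):
#   list_awx_pods awx
#   admin_password awx awx-demo
```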

@yyosha

yyosha commented May 21, 2024

I have a similar problem on an existing EKS cluster.
Kubernetes and the AWS nodes are up to date.

I am using the following kustomization:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

## Specify a custom namespace in which to install AWX
namespace: awx

generatorOptions:
  disableNameSuffixHash: true

secretGenerator:
### PostgreSQL secret was moved to awx-secrets.yaml which is included in resources

  - name: awx-admin-password
    type: Opaque
    literals:
      - password=BlaBlaBla

  - name: my-ca-bundle
    type: Opaque
    files:
      - bundle-ca.crt

resources:
  ## Find the latest tag here: https://github.com/ansible/awx-operator/releases
  - github.com/ansible/awx-operator/config/default?ref=2.16.1
  - awx-secrets.yaml
  - awx-custom-ee-docker-reg-secret.yaml
  - awx-coredns-cm.yaml
  - awx-gp3-sc-retain.yaml
  - awx-efs-sc.yaml
#  - awx-efs-pv.yaml
  - awx-efs-pv-pg15.yaml
  - awx-efs-pvc.yaml
  - awx-with-postgres.yaml

## Set the image tags to match the git version from above
images:
  - name: quay.io/ansible/awx-operator
    newTag: 2.16.1

Customizing resources with this manifest:

---
apiVersion: awx.ansible.com/v1beta1
kind: AWX

metadata:
  name: awx-dev

spec:
  ## These parameters are designed for use with:
  ## - AWX Operator: 2.10
  ##   https://github.com/ansible/awx-operator/blob/2.10.0/README.md
  ## - AWX: 23.6.0
  ##   https://github.com/ansible/awx/blob/23.6.0/INSTALL.md
  ##
  ## Upgraded to:
  ## - AWX Operator: 2.16.1
  ##   https://github.com/ansible/awx-operator/blob/2.16.1/README.md
  ## - AWX: 24.3.1
  ##   https://github.com/ansible/awx/blob/24.3.1/INSTALL.md

  ## This line controls the log output of the deployment
  no_log: false

  ## Disable ip_v6
  ipv6_disabled: true

  ##################################
  ##              awx             ##
  ##################################

  admin_user: admin
  admin_password_secret: awx-admin-password
  bundle_cacert_secret: my-ca-bundle

  ## hostname value is used in the ALB Listener rules
  ## if host is equal to <hostname value> then traffic will be forwarded to Target Group
  hostname: awx-dev.mydom.com

  ## Customized control-plane-ee
  control_plane_ee_image: myrepo/my-awx-ee:2.16.1_1

  ## Customized awx-ee
  ee_images:
    - name: custom-awx-ee
      image: myrepo/my-awx-ee:2.16.1_1

  ## Custom ee docker pull secret
  image_pull_secrets:
    - awx-custom-ee-docker-reg-secret

  ## console listens on nodes port so ALB ingress can be used
  service_type: NodePort
  nodeport_port: 30080

  ## make projects data persistent on EFS
  ## need storage class, filesystem & mount points on all subnets to be pre-configured
  projects_persistence: true
#  ## use either -
#  ## 'projects_storage_class' for dynamic allocation of persistent volume
#  ## 'projects_existing_claim' for pre-configured persistent volume claim
#  projects_storage_class: efs-projects-storageclass
#  projects_existing_claim: awx-projects-claim


  ##################################
  ##            ingress           ##
  ##################################

  ingress_type: ingress
  ingress_path: '/'
  ingress_path_type: Prefix
  ingress_annotations: |
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}, {"HTTP":80}]'
    alb.ingress.kubernetes.io/actions.ssl-redirect: '{"Type": "redirect", "RedirectConfig": { "Protocol": "HTTPS", "Port": "443", "StatusCode": "HTTP_301"}}'
    alb.ingress.kubernetes.io/certificate-arn: "arn:aws:acm:xxxxxxxxxxxxxxxxxx"
    alb.ingress.kubernetes.io/ssl-policy: 'ELBSecurityPolicy-TLS13-1-2-Res-2021-06'
    alb.ingress.kubernetes.io/scheme: 'internal'
    alb.ingress.kubernetes.io/target-type: 'instance'
    alb.ingress.kubernetes.io/ip-address-type: 'ipv4'
    alb.ingress.kubernetes.io/security-groups: 'sg-xxxxxxxxxxxxxxxxxx'
    alb.ingress.kubernetes.io/load-balancer-attributes: 'idle_timeout.timeout_seconds=360'
    alb.ingress.kubernetes.io/healthcheck-protocol: HTTP
    alb.ingress.kubernetes.io/healthcheck-port: traffic-port
    alb.ingress.kubernetes.io/healthcheck-interval-seconds: '15'
    alb.ingress.kubernetes.io/healthcheck-timeout-seconds: '5'
    alb.ingress.kubernetes.io/success-codes: '200'
    alb.ingress.kubernetes.io/healthy-threshold-count: '2'
    alb.ingress.kubernetes.io/unhealthy-threshold-count: '2'
    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: 'true'

  ##################################
  ##          postgresql          ##
  ##################################

  postgres_configuration_secret: awx-postgres-configuration

#  ## Select PostgreSQL image and image version
#  #
#  #  postgres_image: quay.io/sclorg/postgresql-15-c9s
#  #  postgres_image: postgres
#  #  postgres_image_version: 'latest'
#  image_pull_policy: Always

  ## make postgres db persistent on EFS
  ## need storage class, filesystem & mount points on all subnets to be pre-configured
  postgres_storage_class: efs-postgres-storageclass
  postgres_storage_requirements:
    requests:
      storage: 15Gi
    limits:
      storage: 35Gi


## EOF

This works perfectly with version 2.10.0, but when I deploy from scratch with version 2.16.1, the logs show that awx-dev-web is missing, and describing the pod gives:

Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Scheduled  35m                  default-scheduler  Successfully assigned awx/awx-dev-web-94cdf9d45-vkr54 to ip-10-167-0-76.ec2.internal
  Normal   Pulled     35m                  kubelet            Container image "quay.io/ansible/awx-ee:24.3.1" already present on machine
  Normal   Created    35m                  kubelet            Created container init
  Normal   Started    35m                  kubelet            Started container init
  Normal   Pulled     34m (x5 over 35m)    kubelet            Container image "quay.io/centos/centos:stream9" already present on machine
  Normal   Created    34m (x5 over 35m)    kubelet            Created container init-projects
  Normal   Started    34m (x5 over 35m)    kubelet            Started container init-projects
  Warning  BackOff    43s (x160 over 35m)  kubelet            Back-off restarting failed container init-projects in pod awx-dev-web-94cdf9d45-vkr54_awx(6c47c5a1-d1b6-4f9b-8b85-8a803da2df2c)

@YaronL16
Contributor

@yyosha you should probably look at the logs of the crashing init container.

@yyosha

yyosha commented May 21, 2024

@YaronL16

The pod is in CrashLoopBackOff status:

kc logs -f pod/awx-dev-web-567665cb76-hmc5q -c awx-dev-web -n awx
Error from server (BadRequest): container "awx-dev-web" in pod "awx-dev-web-567665cb76-hmc5q" is waiting to start: PodInitializing

@YaronL16
Contributor

Get logs from the container after it has failed, or from the previous container (--previous)
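
As a rough sketch of that suggestion: while the pod is still in PodInitializing the main container has no logs yet, so the container to query is the init container named in the events (init-projects in the output above). The helper name is hypothetical; --container and --previous are standard kubectl logs flags.

```shell
#!/bin/sh
# Sketch only: helper name is hypothetical; pod, container, and namespace are arguments.

# Print logs from the last terminated instance of one container in a pod.
previous_logs() {
  kubectl logs "pod/$1" --container "$2" --namespace "$3" --previous
}

# Example (not run automatically), using the pod name from the command above:
#   previous_logs awx-dev-web-567665cb76-hmc5q init-projects awx
```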

@yyosha

yyosha commented May 21, 2024

@YaronL16

kc logs -f pod/awx-dev-web-567665cb76-hmc5q -c awx-dev-web -n awx --previous
Error from server (BadRequest): previous terminated container "awx-dev-web" in pod "awx-dev-web-567665cb76-hmc5q" not found

From the operator logs I get this:

TASK [installer : Get the new resource pod information after updating resource.] ***
task path: /opt/ansible/roles/installer/tasks/resources_configuration.yml:258\nskipping: [localhost] => {\"changed\": false, \"false_condition\": \"this_deployment_result.changed\", \"skip_reason\": \"Conditional result was False\"}\n
TASK [installer : Update new resource pod as a variable.] **********************
task path: /opt/ansible/roles/installer/tasks/resources_configuration.yml:275\nskipping: [localhost] => {\"changed\": false, \"false_condition\": \"this_deployment_result.changed\", \"skip_reason\": \"Conditional result was False\"}\n
TASK [installer : Update new resource pod name as a variable.] *****************
task path: /opt/ansible/roles/installer/tasks/resources_configuration.yml:283\nskipping: [localhost] => {\"changed\": false, \"false_condition\": \"this_deployment_result.changed\", \"skip_reason\": \"Conditional result was False\"}\n
TASK [installer : Verify the resource pod name is populated.] ******************
task path: /opt/ansible/roles/installer/tasks/resources_configuration.yml:289\nfatal: [localhost]: FAILED! => {
    \"assertion\": \"awx_web_pod_name != ''\",
    \"changed\": false,
    \"evaluated_to\": false,
    \"msg\": \"Could not find the tower pod's name.\"
}\n
PLAY RECAP *********************************************************************
localhost                  : ok=69   changed=0    unreachable=0    failed=1    skipped=68   rescued=0    ignored=0   \n","job":"3522416367647485710","name":"awx-dev","namespace":"awx","error":"exit status 2","stacktrace":"github.com/operator-framework/ansible-operator-plugins/internal/ansible/runner.(*runner).Run.func1\n\tansible-operator-plugins/internal/ansible/runner/runner.go:269"}

Again, this works perfectly with version 2.10.0.

@fosterseth
Member

kc logs -f pod/awx-dev-web-567665cb76-hmc5q -c init-projects -n awx

does that return anything helpful?

@yyosha

yyosha commented May 21, 2024

@fosterseth
I re-deployed ver. 2.16.1 (this is a VERY test env.), hence the different pod name...

kc logs -f pod/awx-dev-web-6b4b544584-mqppn -c init-projects -n awx

Yielded nothing.

But since I now have this:

Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  4m6s                  default-scheduler  Successfully assigned awx/awx-dev-web-6b4b544584-mqppn to ip-10-167-0-76.ec2.internal
  Normal   Pulled     4m6s                  kubelet            Container image "quay.io/ansible/awx-ee:24.3.1" already present on machine
  Normal   Created    4m6s                  kubelet            Created container init
  Normal   Started    4m5s                  kubelet            Started container init
  Normal   Pulled     4m5s                  kubelet            Container image "quay.io/centos/centos:stream9" already present on machine
  Normal   Created    4m5s                  kubelet            Created container init-projects
  Normal   Started    4m5s                  kubelet            Started container init-projects
  Normal   Created    4m4s                  kubelet            Created container redis
  Normal   Pulled     4m4s                  kubelet            Container image "docker.io/redis:7" already present on machine
  Normal   Started    4m4s                  kubelet            Started container redis
  Normal   Pulled     4m4s                  kubelet            Container image "quay.io/ansible/awx:24.3.1" already present on machine
  Normal   Created    4m4s                  kubelet            Created container awx-dev-rsyslog
  Normal   Started    4m3s                  kubelet            Started container awx-dev-rsyslog
  Normal   Created    2m51s (x3 over 4m4s)  kubelet            Created container awx-dev-web
  Normal   Started    2m51s (x3 over 4m4s)  kubelet            Started container awx-dev-web
  Warning  BackOff    2m11s (x3 over 3m4s)  kubelet            Back-off restarting failed container awx-dev-web in pod awx-dev-web-6b4b544584-mqppn_awx(e5540567-38f8-4be9-86b3-8602ce7ff7d5)
  Normal   Pulled     2m (x4 over 4m4s)     kubelet            Container image "quay.io/ansible/awx:24.3.1" already present on machine

I ran this:

kc logs -f pod/awx-dev-web-6b4b544584-mqppn -c awx-dev-web -n awx

and got this very long log, which I have attached here.
awx-operator-2.16.1.txt
