Maximo Application Suite Core Installer failing on ROSA HCP #1193

Open
aneeshfuj opened this issue Jan 29, 2024 · 5 comments

@aneeshfuj

aneeshfuj commented Jan 29, 2024

Hi

I can't get Maximo Application Suite Core installed on ROSA HCP. This works fine on classic OpenShift 4.14.7 but not on HCP.

The Ansible playbook fails with the following error:

TASK [ibm.mas_devops.uds : Extract certificate chain into a variable] *******************************************************************************************************************************************************************************************************************
fatal: [localhost]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'cluster_ingress_tls_crt' is undefined. 'cluster_ingress_tls_crt' is undefined\n\nThe error appears to be in '/root/.ansible/collections/ansible_collections/ibm/mas_devops/common_tasks/get_signed_ingress_cert.yml': line 99, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n# Break up the certificate into an array\n- name: "Extract certificate chain into a variable"\n ^ here\n"}

On HCP, if you run the following in the openshift-ingress namespace:

kubectl get secrets --field-selector type=kubernetes.io/tls

you get the following output:

NAME                           TYPE                DATA   AGE
router-metrics-certs-default   kubernetes.io/tls   2      157m

On unmanaged OCP 4.14.7, you get the following output for the same command:

NAME                           TYPE                DATA   AGE
router-certs-default           kubernetes.io/tls   2      65m
router-metrics-certs-default   kubernetes.io/tls   2      65m

If you look at the file get_signed_ingress_cert.yml, you have the following code:

- name: "Record that we have found the {{ ocp_ingress_tls_secret_name }} cert secret"
  when:
    - router_certs_default_secret is defined
    - router_certs_default_secret.resources | length > 0
  set_fact:
    found_router_default_secret: true
    cluster_ingress_secret_name: "{{ ocp_ingress_tls_secret_name }}"
    cluster_ingress_tls_crt: "{{ router_certs_default_secret.resources[0].data['tls.crt'] | b64decode }}"

The Ansible code is looking for the first secret (router-certs-default), which is missing on HCP. As a result, cluster_ingress_tls_crt is never set, and I believe this is what causes the subsequent tasks to fail.
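
A quick way to confirm this from the command line is sketched below. It is only an illustration: `has_secret` is a hypothetical helper (not part of ibm.mas_devops) that scans an `oc get secrets` style listing for a given secret name.

```shell
# Hypothetical helper, not part of ibm.mas_devops.
# Reads an "oc get secrets" style listing on stdin (header line first)
# and succeeds only if a secret with the given name is listed.
has_secret() {
  awk -v name="$1" 'NR > 1 && $1 == name { found = 1 } END { exit !found }'
}

# Usage (requires a logged-in oc session):
#   oc get secrets -n openshift-ingress | has_secret router-certs-default \
#     || echo "router-certs-default is missing"
```

With the HCP listing above the check fails, and with the unmanaged OCP listing it succeeds.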

Can you please assist?

@andrercm
Contributor

andrercm commented Feb 2, 2024

Error preventing the pre-install check from passing: "'cluster_ingress_tls_crt' is undefined"

This issue normally happens when neither of the two expected secrets that contain the cluster's ingress certificates is found.

They are:

  1. A secret named 'router-certs-default' in the 'openshift-ingress' namespace.
  2. If #1 is not found, we look up a secret whose name is derived from the cluster's ingress endpoint.
    Example:
    Cluster's hostname = testcluster-6f1a1cac8216c06779b-0000.eu-gb.containers.appdomain.cloud
    then there should be a corresponding secret named testcluster-6f1a1cac8216c06779b-0000 in the openshift-ingress namespace

Thus, it seems that neither of the cluster's default ingress certificates is being found.
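
As a rough sketch of that lookup order (not the actual ibm.mas_devops implementation), the two candidate names can be derived like this. `ingress_secret_candidates` is a hypothetical helper, and it assumes the second name is simply the first DNS label of the cluster hostname, as in the example above.

```shell
# Hypothetical helper: print the candidate secret names in lookup order.
# Assumption: the second candidate is the first DNS label of the hostname.
ingress_secret_candidates() {
  printf '%s\n' "router-certs-default" "${1%%.*}"
}

ingress_secret_candidates testcluster-6f1a1cac8216c06779b-0000.eu-gb.containers.appdomain.cloud
# prints:
#   router-certs-default
#   testcluster-6f1a1cac8216c06779b-0000
```

Each printed name can then be checked with `oc get secret <name> -n openshift-ingress`.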

Could you please run the following commands for an initial diagnosis?

  1. This command outputs the cluster's ingress controller details; one of the fields shows the Default Certificate Name:

oc describe ingresscontroller/default -n openshift-ingress-operator

  2. Then run this second command to list the available secrets in the openshift-ingress namespace:

oc get secrets -n openshift-ingress

It's also worth mentioning that if the automation cannot find the default certificate to use (this varies from one cloud provider to another), users who know which secret contains the cluster's ingress TLS certificates can set it explicitly: either via the MAS CLI ('Advanced Settings' section > 'Change default cluster ingress certificate secret name' prompt) or via ansible-devops by exporting the OCP_INGRESS_TLS_SECRET_NAME variable.

@aneeshfuj
Author

Hi Andre

Thanks a lot for your response. When I ran the following command:

oc describe ingresscontroller/default -n openshift-ingress-operator

I was able to get the following default certificate name:

Default Certificate:
Name: 297hfu7rptqvore48abe4qh5i01lgeju-primary-cert-bundle-secret

I was able to get this secret from the openshift-ingress namespace with the following command

oc get secrets -n openshift-ingress
NAME                                                          TYPE     DATA   AGE
297hfu7rptqvore48abe4qh5i01lgeju-primary-cert-bundle-secret   Opaque   2      4m34s

I then set this value to the OCP_INGRESS_TLS_SECRET_NAME environment variable, before running the MAS Core installation.

This time, I did not get the undefined error and the installation did complete.
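
For anyone repeating this, the steps can be sketched roughly as follows. This is an illustration only: `default_ingress_secret_name` is a hypothetical helper, and it assumes the "Default Certificate: Name" shown by `oc describe` corresponds to the `spec.defaultCertificate.name` field of the IngressController, which may be empty on clusters that do not set it.

```shell
# Hypothetical helper: print the default certificate secret name recorded
# on the default IngressController (assumed field: spec.defaultCertificate.name).
# OC can be overridden, e.g. to point at a different client binary.
default_ingress_secret_name() {
  ${OC:-oc} get ingresscontroller/default \
    -n openshift-ingress-operator \
    -o jsonpath='{.spec.defaultCertificate.name}'
}

# Usage (requires a logged-in oc session), before running the MAS Core install:
#   OCP_INGRESS_TLS_SECRET_NAME=$(default_ingress_secret_name)
#   export OCP_INGRESS_TLS_SECRET_NAME
```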

However, I noticed that MAS is still using a self-signed certificate. If I log in to the OpenShift console, I can see that the CA signed certificate is used and the browser does not report a warning.

When I log in to MAS, however, the browser warns about an insecure connection. When I checked the certificate, it was self-signed (I think it is generated by Cert Manager).

@andrercm
Contributor

andrercm commented Feb 5, 2024

@aneeshfuj glad it worked! I have raised a fix to also look up the default ingress secret based on ingresscontroller/default; hopefully that will solve most cases.

As to your question, you need to add the MAS URL/route CA certificate to your local keychain so that the browser can trust your server connection.

To do that, look up the CA certificate content from any MAS route. You can download it directly from your browser, or if you want to get it from OCP there are a few ways, for example:

export MAS_INSTANCE_ID=mymas
oc get routes $MAS_INSTANCE_ID-admin -n mas-$MAS_INSTANCE_ID-core -o jsonpath="{.spec.tls.caCertificate}{'\n'}" > $MAS_INSTANCE_ID-ca.pem

The above downloads the CA certificate for your MAS instance to a .pem file that you can add to your local keychain.
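
If the route has no spec.tls.caCertificate set, the redirect still creates the file, but it contains only a newline. A small sanity check before importing it could look like the sketch below; `count_pem_certs` is a hypothetical helper name.

```shell
# Hypothetical helper: count the PEM certificates in a file by counting
# their BEGIN markers. Prints 0 (and returns non-zero) if there are none.
count_pem_certs() {
  grep -c -- '-----BEGIN CERTIFICATE-----' "$1"
}

# Usage:
#   count_pem_certs "$MAS_INSTANCE_ID-ca.pem"
```

A result of 0 means the file does not contain a certificate and should not be added to the keychain.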

Or, if you would rather do it from the OCP console UI:

  1. Log in to the OCP console.
  2. Go to Networking > Routes and search for any MAS route in your MAS project/namespace, such as $MAS_INSTANCE_ID-admin.
  3. Go to the YAML tab and copy the certificate content under spec.tls.caCertificate. The output should look something like:
      -----BEGIN CERTIFICATE-----
      MIID3TCCAsWgAwIBAgIRALLDrdvuTbDUWL9sW2UrLJgwDQYJKoZIhvcNAQELBQAw
      gYcxCzAJBgNVBAYTAkdCMQ8wDQYDVQQHEwZMb25kb24xDzANBgNVBAkTBkxvbmRv
      bjEuMCwGA1UECxMlSUJNIE1heGltbyBBcHBsaWNhdGlvbiBTdWl0ZSAoUHVibGlj
      CSqGSIb3DQEBCwUAA4IBAQCr2A0oI7xlde/C7pGfxtMgerjJX7E2zGy2L8hVLEoF
      ueWI3CiZWYGF79s4nLiKLRcplFF8YZp7jpxrc0os0iiT16dK3mN6ZRpxG1o2dAvx
      CV0gTamFgVL7SNJSuhcFKzNGe2JbrpEQ8c5yM4W69VIwBG2S6mkHJ483B9Zx580S
      iAOdWu6pJSOUQ+1RR7XXgfpg0/zY+1fhtCkUIVONZZ5J3C6XRl6VTgLa/uQAzPA2
      Z2zTVaxix5UGnOeqRjtWYjaPKlz7rmdwxtVE3TjdKqsU/FCeY0SPKNmoMb/tImLV
      iRlJn5L30JxKWdfUKjIsEHOVJnxTNm0NmA6Zn0Z3Z1ns
      -----END CERTIFICATE-----
  4. Save that to a local .pem file and add it to your local keychain.

Also, a temporary workaround is to open the MAS URL, change the browser's URL to api.<your-mas-domain>, and press Enter. Then accept the browser's warning to indicate that you trust this connection. Once the browser trusts the connection, you can log in to MAS normally using the admin.<your-mas-domain> or home.<your-mas-domain> URLs.

@aneeshfuj
Author

Thanks Andre. Fantastic, looking forward to the fix! I think once the fix is in, MAS should automatically use the wildcard certificate signed by the CA (Let's Encrypt in the case of ROSA HCP). This will also be aligned with the OpenShift console.

@aneeshfuj
Author

Just wanted to correct my comment above. It is not possible to use the OpenShift default ingress wildcard certificate (for the .apps subdomain), because Maximo Application Suite sits under a subdomain named after the Maximo instance ID (.masinst1.apps) and not directly under the apps subdomain, and a wildcard certificate only covers a single subdomain level.
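
To illustrate the point about subdomain levels, here is a minimal sketch of the usual wildcard matching rule, where `*` stands for exactly one DNS label. `wildcard_covers` is a hypothetical helper and the example.com hostnames are placeholders, not real cluster names.

```shell
# Hypothetical helper: succeed if a certificate name (optionally starting
# with "*.") covers the given hostname, treating "*" as exactly one label.
wildcard_covers() {
  pattern=$1
  host=$2
  case $pattern in
    '*.'*) suffix=${pattern#\*.} ;;
    *) [ "$pattern" = "$host" ]; return ;;
  esac
  case $host in
    *."$suffix") label=${host%."$suffix"} ;;
    *) return 1 ;;
  esac
  # The part replaced by "*" must be a single, non-empty label.
  case $label in
    ''|*.*) return 1 ;;
  esac
  return 0
}

wildcard_covers '*.apps.example.com' console.apps.example.com && echo "console: covered"
wildcard_covers '*.apps.example.com' admin.masinst1.apps.example.com || echo "MAS admin: not covered"
```

Under this rule the console route is covered by the .apps wildcard, while a MAS route one level deeper is not.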

Therefore the only solution is to use a custom certificate for your Maximo instance domain. You can easily create one using certbot. To pass it to the Maximo installer, set the MAS_MANUAL_CERT_MGMT environment variable to True and save the certificate files as tls.crt, tls.key and ca.crt under the $MAS_CONFIG_DIR/certs directory. Alternatively, you can update the certificate manually from within Maximo Application Suite after the installation completes, by clicking on the Certificates menu option within the Suite Admin console. This is what I did.
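
For the installer route, a pre-flight check along these lines may help. It is only a sketch: `missing_manual_certs` is a hypothetical helper that lists which of the three expected files are absent from a directory.

```shell
# Hypothetical helper: print the name of each expected certificate file
# (tls.crt, tls.key, ca.crt) that is missing from the given directory.
missing_manual_certs() {
  dir=$1
  for f in tls.crt tls.key ca.crt; do
    [ -f "$dir/$f" ] || echo "$f"
  done
}

# Usage, before running the MAS Core install:
#   missing=$(missing_manual_certs "$MAS_CONFIG_DIR/certs")
#   [ -z "$missing" ] && export MAS_MANUAL_CERT_MGMT=True
```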

Hope this helps!
