Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OCPBUGS-33631: show "Debug container" link for pods with status.phase… #13850

Merged
merged 1 commit into from
May 31, 2024

Conversation

rhamilto
Copy link
Member

… of "Succeeded"

This aligns 4.16 with the behavior in 3.x. See https://github.com/openshift/origin-web-console/blob/c37982397087036321312172282e139da378eff2/app/scripts/directives/resources.js#L33-L53 and note 'Completed' is not a valid value for pod.status.phase (see https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-phase)

Screen.Recording.2024-05-14.at.9.03.36.AM.mov

@openshift-ci-robot openshift-ci-robot added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. labels May 14, 2024
@openshift-ci-robot
Copy link
Contributor

@rhamilto: This pull request references Jira Issue OCPBUGS-33631, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.16.0) matches configured target version for branch (4.16.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @yapei

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

… of "Succeeded"

This aligns 4.16 with the behavior in 3.x. See https://github.com/openshift/origin-web-console/blob/c37982397087036321312172282e139da378eff2/app/scripts/directives/resources.js#L33-L53 and note 'Completed' is not a valid value for pod.status.phase (see https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-phase)

Screen.Recording.2024-05-14.at.9.03.36.AM.mov

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested a review from yapei May 14, 2024 13:07
@openshift-ci openshift-ci bot added component/core Related to console core functionality approved Indicates a PR has been approved by an approver from all required OWNERS files. labels May 14, 2024
@XiyunZhao
Copy link

XiyunZhao commented May 15, 2024

Hi @rhamilto, for those Pods under 'openshift-*' namespace with 'Complete' state, the debug container always failed to launch with different kinds of errors? Do you think it is acceptable?
I think this might confuse the users and make them feel as if there is a bug in the openshift basic pod when they are trying to debug something
eg:

  • Pod under openshift-kube-controller-manager, openshift-etcd, openshift-kube-apiserver

    Error: /bin/sh: -=: invalid option Usage: /bin/sh [GNU long option] [option] ... /bin/sh [GNU long option] [option] script-file ... GNU long options: --debug --debugger --dump-po-strings --dump-strings --help --init-file --login --noediting --noprofile --norc --posix --pretty-print --rcfile --rpm-requires --restricted --verbose --version Shell options: -ilrsD or -c command or -O shopt_option (invocation only) -abefhkmnptuvxBCHP or -o option

  • Pod under opneshift-operator-lifecycle-manager

    Error: /bin/sh: openshift-operator-lifecycle-manager: No such file or directory

@rhamilto
Copy link
Member Author

Hi @rhamilto, for those Pods under 'openshift-*' namespace with 'Complete' state, the debug container always failed to launch with different kinds of errors? Do you think it is acceptable? I think this might confuse the users and make them feel as if there is a bug in the openshift basic pod when they are trying to debug something eg:

  • Pod under openshift-kube-controller-manager, openshift-etcd
    Error: /bin/sh: -=: invalid option Usage: /bin/sh [GNU long option] [option] ... /bin/sh [GNU long option] [option] script-file ... GNU long options: --debug --debugger --dump-po-strings --dump-strings --help --init-file --login --noediting --noprofile --norc --posix --pretty-print --rcfile --rpm-requires --restricted --verbose --version Shell options: -ilrsD or -c command or -O shopt_option (invocation only) -abefhkmnptuvxBCHP or -o option
  • Pod under opneshift-operator-lifecycle-manager
    Error: /bin/sh: openshift-operator-lifecycle-manager: No such file or directory
  • Pod under openshift-kube-apiserver
    Error: /bin/sh: -=: invalid option Usage: /bin/sh [GNU long option] [option] ... /bin/sh [GNU long option] [option] script-file ... GNU long options: --debug --debugger --dump-po-strings --dump-strings --help --init-file --login --noediting --noprofile --norc --posix --pretty-print --rcfile --rpm-requires --restricted --verbose --version Shell options: -ilrsD or -c command or -O shopt_option (invocation only) -abefhkmnptuvxBCHP or -o option

@jhadvig, WDYT?

@XiyunZhao
Copy link

@jhadvig Any thought?

@jhadvig
Copy link
Member

jhadvig commented May 20, 2024

@rhamilto @XiyunZhao I think its understandable, cause if you check most of the pods in the mentioned NS, they have a quite a long list of configmaps which are mounted into the pods when created. It looks liked when you create a debug pods those volumes will be missing.
@spadgett thought on this ?

@XiyunZhao
Copy link

XiyunZhao commented May 21, 2024

@jhadvig Thanks for the response, I think you are right. Let's wait for the response from @spadgett

@spadgett
Copy link
Member

Are the volume mounts gone because the config maps no longer exist? Or are we simply not mounting the config maps into the debug pod?

Let's compare the behavior here to what happens with oc debug. Can you successfully launch a debug pod using oc?

@rhamilto
Copy link
Member Author

Let's compare the behavior here to what happens with oc debug. Can you successfully launch a debug pod using oc?

Appears so.

❯ oc debug pod/installer-5-ip-10-0-33-186.us-west-1.compute.internal -c installer 
Starting pod/installer-5-ip-10-0-33-186us-west-1computeinternal-debug-7tdvv, command was: cluster-kube-controller-manager-operator installer -v=2 --revision=5 --namespace=openshift-kube-controller-manager --pod=kube-controller-manager-pod --resource-dir=/etc/kubernetes/static-pod-resources --pod-manifest-dir=/etc/kubernetes/manifests --configmaps=kube-controller-manager-pod --configmaps=config --configmaps=cluster-policy-controller-config --configmaps=controller-manager-kubeconfig --optional-configmaps=cloud-config --configmaps=kube-controller-cert-syncer-kubeconfig --configmaps=serviceaccount-ca --configmaps=service-ca --configmaps=recycler-config --secrets=service-account-private-key --optional-secrets=serving-cert --secrets=localhost-recovery-client-token --cert-dir=/etc/kubernetes/static-pod-resources/kube-controller-manager-certs --cert-configmaps=aggregator-client-ca --cert-configmaps=client-ca --optional-cert-configmaps=trusted-ca-bundle --cert-secrets=kube-controller-manager-client-cert-key --cert-secrets=csr-signer
Pod IP: 10.130.0.68
If you don't see a command prompt, try pressing enter.
sh-5.1# 

@rhamilto
Copy link
Member Author

@rhamilto @XiyunZhao I think its understandable, cause if you check most of the pods in the mentioned NS, they have a quite a long list of configmaps which are mounted into the pods when created. It looks liked when you create a debug pods those volumes will be missing.

I don't think missing configmaps is the issue. Based on the errors, I believe the issue is related to the container command. No idea what the fix is though. Am thinking oc debug may provide a the answer, but the Go code is unfamiliar to me.

@spadgett
Copy link
Member

The default command for Linux containers in the CLI is /bin/sh

https://github.com/openshift/oc/blob/a5a43b9e6407f8ddf4a73fe995c5eca75a9ac5ab/pkg/cli/debug/debug.go#L70
https://github.com/openshift/oc/blob/a5a43b9e6407f8ddf4a73fe995c5eca75a9ac5ab/pkg/cli/debug/debug.go#L321-L323

although it can be changed by the user.

Are we passing additional args to the command, or just running /bin/sh? The invalid option message is suspicious. Pods can have both a command and args. We should make sure we clear or replace both.

@spadgett
Copy link
Member

@rhamilto
Copy link
Member Author

@rhamilto It looks like we're not clearing the args value?

https://github.com/openshift/console/blob/master/frontend/public/components/debug-terminal.tsx#L44-L51

Good catch!

Debug pod created with console:
localhost_9000_k8s_ns_openshift-operator-lifecycle-manager_pods_collect-profiles-28616520-h5mwc-debug-rgqt9_containers_collect-profiles

Debug pod created with oc:
localhost_9000_k8s_ns_openshift-operator-lifecycle-manager_pods_collect-profiles-28616520-h5mwc-debug-mf2bp_containers_collect-profiles

@rhamilto
Copy link
Member Author

@XiyunZhao, I believe the issue with openshift-* namespaced pods is fixed; great catch on that!

Many thanks to @spadgett for the assist in resolving the issue.

Screen.Recording.2024-05-29.at.11.10.19.AM.mov

@rhamilto
Copy link
Member Author

/retest

2 similar comments
@rhamilto
Copy link
Member Author

/retest

@rhamilto
Copy link
Member Author

/retest

Copy link
Contributor

openshift-ci bot commented May 30, 2024

@rhamilto: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@XiyunZhao
Copy link

@rhamilto @spadgett Thank you for all your efforts in discussing and making modifications. Your expertise has been invaluable to me~
This PR has completed the pre-testing, and no other issues were found
/label qe-approved

@openshift-ci openshift-ci bot added the qe-approved Signifies that QE has signed off on this PR label May 30, 2024
@openshift-ci-robot openshift-ci-robot removed the jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. label May 30, 2024
@openshift-ci-robot openshift-ci-robot added the jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. label May 30, 2024
@openshift-ci-robot
Copy link
Contributor

@rhamilto: This pull request references Jira Issue OCPBUGS-33631, which is invalid:

  • expected the bug to target either version "4.17." or "openshift-4.17.", but it targets "4.16.0" instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

… of "Succeeded"

This aligns 4.16 with the behavior in 3.x. See https://github.com/openshift/origin-web-console/blob/c37982397087036321312172282e139da378eff2/app/scripts/directives/resources.js#L33-L53 and note 'Completed' is not a valid value for pod.status.phase (see https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-phase)

Screen.Recording.2024-05-14.at.9.03.36.AM.mov

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link
Member

@spadgett spadgett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@rhamilto
Copy link
Member Author

/jira refresh

@openshift-ci-robot openshift-ci-robot added the jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. label May 30, 2024
@openshift-ci-robot
Copy link
Contributor

@rhamilto: This pull request references Jira Issue OCPBUGS-33631, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.17.0) matches configured target version for branch (4.17.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @XiyunZhao

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot removed the jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. label May 30, 2024
@openshift-ci openshift-ci bot requested a review from XiyunZhao May 30, 2024 12:58
@rhamilto
Copy link
Member Author

/cherrypick release-4.16

@openshift-cherrypick-robot

@rhamilto: once the present PR merges, I will cherry-pick it on top of release-4.16 in a new PR and assign it to you.

In response to this:

/cherrypick release-4.16

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label May 30, 2024
Copy link
Contributor

openshift-ci bot commented May 30, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rhamilto, spadgett

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot bot merged commit 52f60f5 into openshift:master May 31, 2024
6 checks passed
@openshift-ci-robot
Copy link
Contributor

@rhamilto: Jira Issue OCPBUGS-33631: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-33631 has been moved to the MODIFIED state.

In response to this:

… of "Succeeded"

This aligns 4.16 with the behavior in 3.x. See https://github.com/openshift/origin-web-console/blob/c37982397087036321312172282e139da378eff2/app/scripts/directives/resources.js#L33-L53 and note 'Completed' is not a valid value for pod.status.phase (see https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-phase)

Screen.Recording.2024-05-14.at.9.03.36.AM.mov

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-cherrypick-robot

@rhamilto: new pull request created: #13914

In response to this:

/cherrypick release-4.16

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. component/core Related to console core functionality jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. qe-approved Signifies that QE has signed off on this PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants