
GracePeriodSeconds failure problem #96090

Closed
Aaron-23 opened this issue Nov 2, 2020 · 10 comments
Labels
kind/bug: Categorizes issue or PR as related to a bug.
lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.
needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.
sig/node: Categorizes an issue or PR as relevant to SIG Node.

Comments

@Aaron-23

Aaron-23 commented Nov 2, 2020

I set terminationGracePeriodSeconds: 30 for my pod, but the pod is still not stopped after that time has passed. The test image I used is nginx, and the YAML is as follows:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  creationTimestamp: "2020-11-02T02:21:38Z"
  generateName: gr958699
  generation: 2
  labels:
    creater_id: "1604283698077855691"
    creator: Rainbond
    name: gr958699
    service_alias: gr958699
    service_id: 0510d4cc1a7e4ed3a172a7e6c6958699
    tenant_id: ef46e94724614478bd27eed9c1f22046
    tenant_name: ntelixhb
    version: "20201030105559"
  name: gr958699
  namespace: ef46e94724614478bd27eed9c1f22046
  resourceVersion: "920464"
  selfLink: /apis/apps/v1/namespaces/ef46e94724614478bd27eed9c1f22046/statefulsets/gr958699
  uid: 00072858-d78f-4a8d-b206-79797d7cc0b6
spec:
  podManagementPolicy: OrderedReady
  replicas: 2
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      name: gr958699
      service_id: 0510d4cc1a7e4ed3a172a7e6c6958699
      tenant_id: ef46e94724614478bd27eed9c1f22046
  serviceName: gr958699
  template:
    metadata:
      creationTimestamp: null
      labels:
        creater_id: "1604283698077855691"
        creator: Rainbond
        name: gr958699
        service_alias: gr958699
        service_id: 0510d4cc1a7e4ed3a172a7e6c6958699
        tenant_id: ef46e94724614478bd27eed9c1f22046
        tenant_name: ntelixhb
        version: "20201030105559"
      name: 0510d4cc1a7e4ed3a172a7e6c6958699-pod-spec
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: beta.kubernetes.io/os
                operator: NotIn
                values:
                - windows
      containers:
      - env:
        - name: LOGGER_DRIVER_NAME
          value: streamlog
        - name: PORT
          value: "80"
        - name: PROTOCOL
          value: http
        - name: DOMAIN_80
          value: 80.gr958699.ntelixhb.b62add.grapps.cn
        - name: DOMAIN_PROTOCOL_80
          value: http
        - name: DOMAIN
          value: 80.gr958699.ntelixhb.b62add.grapps.cn
        - name: DOMAIN_PROTOCOL
          value: http
        - name: MONITOR_PORT
          value: "80"
        - name: CUR_NET
          value: midonet
        - name: NGINX_VERSION
          value: 1.19.3
        - name: NJS_VERSION
          value: 0.4.4
        - name: PKG_RELEASE
          value: 1~buster
        - name: TENANT_ID
          value: ef46e94724614478bd27eed9c1f22046
        - name: SERVICE_ID
          value: 0510d4cc1a7e4ed3a172a7e6c6958699
        - name: MEMORY_SIZE
          value: medium
        - name: SERVICE_NAME
          value: gr958699
        - name: SERVICE_POD_NUM
          value: "2"
        - name: HOST_IP
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: status.hostIP
        - name: POD_IP
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: status.podIP
        image: goodrain.me/0510d4cc1a7e4ed3a172a7e6c6958699:20201030105559
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          initialDelaySeconds: 2
          periodSeconds: 3
          successThreshold: 1
          tcpSocket:
            port: 80
          timeoutSeconds: 20
        name: 0510d4cc1a7e4ed3a172a7e6c6958699
        ports:
        - containerPort: 80
          protocol: TCP
        resources:
          limits:
            cpu: 640m
            memory: 512Mi
          requests:
            cpu: 120m
            memory: 512Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      imagePullSecrets:
      - name: rbd-hub-credentials
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 0
  updateStrategy:
    type: RollingUpdate
status:
  collisionCount: 0
  currentReplicas: 2
  currentRevision: gr958699-844656fc5
  observedGeneration: 2
  readyReplicas: 2
  replicas: 2
  updateRevision: gr958699-844656fc5
  updatedReplicas: 2

pod details

[root@master ~]# kubectl describe po -n ef46e94724614478bd27eed9c1f22046   gr958699-1
Name:                      gr958699-1
Namespace:                 ef46e94724614478bd27eed9c1f22046
Priority:                  0
Node:                      172.24.206.55/172.24.206.55
Start Time:                Mon, 02 Nov 2020 10:39:30 +0800
Labels:                    controller-revision-hash=gr958699-9699857cd
                           creater_id=1604283698077855691
                           creator=Rainbond
                           name=gr958699
                           service_alias=gr958699
                           service_id=0510d4cc1a7e4ed3a172a7e6c6958699
                           statefulset.kubernetes.io/pod-name=gr958699-1
                           tenant_id=ef46e94724614478bd27eed9c1f22046
                           tenant_name=ntelixhb
                           version=20201030105559
Annotations:               <none>
Status:                    Terminating (lasts 2m8s)
Termination Grace Period:  30s
IP:                        172.20.2.72
IPs:
  IP:           172.20.2.72
Controlled By:  StatefulSet/gr958699
Containers:
  0510d4cc1a7e4ed3a172a7e6c6958699:
    Container ID:   docker://81c7de83d664fc7dac6b2e7725ab11e955e4e51926593f9864a7620a4a521389
    Image:          goodrain.me/0510d4cc1a7e4ed3a172a7e6c6958699:20201030105559
    Image ID:       docker-pullable://goodrain.me/0510d4cc1a7e4ed3a172a7e6c6958699@sha256:4949aa7259aa6f827450207db5ad94cabaa9248277c6d736d5e1975d200c7e43
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Mon, 02 Nov 2020 10:39:31 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     640m
      memory:  512Mi
    Requests:
      cpu:     120m
      memory:  512Mi
    Liveness:  tcp-socket :80 delay=2s timeout=20s period=3s #success=1 #failure=3
    Environment:
      LOGGER_DRIVER_NAME:  streamlog
      PORT:                80
      PROTOCOL:            http
      DOMAIN_80:           80.gr958699.ntelixhb.b62add.grapps.cn
      DOMAIN_PROTOCOL_80:  http
      DOMAIN:              80.gr958699.ntelixhb.b62add.grapps.cn
      DOMAIN_PROTOCOL:     http
      MONITOR_PORT:        80
      CUR_NET:             midonet
      NGINX_VERSION:       1.19.3
      NJS_VERSION:         0.4.4
      PKG_RELEASE:         1~buster
      TENANT_ID:           ef46e94724614478bd27eed9c1f22046
      SERVICE_ID:          0510d4cc1a7e4ed3a172a7e6c6958699
      MEMORY_SIZE:         medium
      SERVICE_NAME:        gr958699
      SERVICE_POD_NUM:     2
      HOST_IP:              (v1:status.hostIP)
      POD_IP:               (v1:status.podIP)
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-psdqs (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-token-psdqs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-psdqs
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 10s
                 node.kubernetes.io/unreachable:NoExecute for 10s
Events:
  Type    Reason     Age        From                    Message
  ----    ------     ----       ----                    -------
  Normal  Scheduled  <unknown>  default-scheduler       Successfully assigned ef46e94724614478bd27eed9c1f22046/gr958699-1 to 172.24.206.55
  Normal  Pulled     3m32s      kubelet, 172.24.206.55  Container image "goodrain.me/0510d4cc1a7e4ed3a172a7e6c6958699:20201030105559" already present on machine
  Normal  Created    3m32s      kubelet, 172.24.206.55  Created container 0510d4cc1a7e4ed3a172a7e6c6958699
  Normal  Started    3m32s      kubelet, 172.24.206.55  Started container 0510d4cc1a7e4ed3a172a7e6c6958699

Environment:

  • Kubernetes version (use kubectl version): "v1.16.2"
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Network plugin and version (if this is a network-related bug): flannel:v0.11.0
  • Others:
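For context on the expected behavior: when a pod is deleted, the kubelet sends the container SIGTERM, waits up to terminationGracePeriodSeconds, then sends SIGKILL. A minimal sketch of that sequence (grace shortened to 2 seconds for illustration; the stand-in process that traps and ignores SIGTERM is a hypothetical example, not the nginx container from this issue):

```shell
# Sketch of the kubelet's container stop sequence: SIGTERM first,
# then SIGKILL once the grace period (2s here) has elapsed.

# Stand-in "container" that ignores SIGTERM.
sh -c 'trap "" TERM; sleep 60' &
pid=$!

kill -TERM "$pid"                  # polite shutdown request
sleep 2                            # wait out the (shortened) grace period
if kill -0 "$pid" 2>/dev/null; then
  kill -KILL "$pid"                # grace expired: force kill
  wait "$pid" 2>/dev/null          # reap; exit status reflects SIGKILL
  echo "killed after grace period"
else
  echo "exited within grace period"
fi
```

Note that none of this runs if the node itself is down: the kubelet is the component that enforces the grace period, so a pod on an unreachable node cannot complete termination this way.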
@Aaron-23 Aaron-23 added the kind/bug Categorizes issue or PR as related to a bug. label Nov 2, 2020
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Nov 2, 2020
@k8s-ci-robot
Contributor

@Aaron-23: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Nov 2, 2020
@Aaron-23
Author

Aaron-23 commented Nov 2, 2020

/sig Cluster Lifecycle

@k8s-ci-robot
Contributor

@Aaron-23: The label(s) sig/cluster, sig/lifecycle cannot be applied, because the repository doesn't have them

In response to this:

/sig Cluster Lifecycle

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@neolit123
Member

/sig node

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Nov 3, 2020
@qiutongs
Contributor

qiutongs commented Nov 5, 2020

Is it possible that the node is unreachable? According to the Kubernetes docs, Kubernetes (version 1.5 or newer) will not delete Pods just because a Node is unreachable; the Pods running on an unreachable Node enter the 'Terminating' or 'Unknown' state after a timeout.

Also, here is a similar long-running issue: #51835
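For anyone hitting this: once it is confirmed the node is permanently gone, a pod stuck in Terminating can be removed from the API server with a force deletion. A sketch using the pod name and namespace from this issue (use with care: for StatefulSets, force deletion can violate the at-most-one-pod-per-identity guarantee if the node later comes back):

```shell
kubectl delete pod gr958699-1 -n ef46e94724614478bd27eed9c1f22046 \
  --grace-period=0 --force
```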

@Aaron-23
Author

Aaron-23 commented Nov 6, 2020

Yes, I shut down the node directly.
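That would explain the behavior: with the node shut down, the kubelet is never around to enforce the grace period, so the pod object stays in Terminating indefinitely. The 10s NoExecute tolerations visible in the describe output above only control when eviction of the pod from the unreachable node begins; a pod-spec fragment (a sketch, not taken from this issue's manifest) looks like:

```yaml
# Sketch: tolerationSeconds bounds how long a pod stays bound to a
# NotReady/unreachable node before NoExecute eviction starts.
tolerations:
- key: node.kubernetes.io/unreachable
  operator: Exists
  effect: NoExecute
  tolerationSeconds: 10
- key: node.kubernetes.io/not-ready
  operator: Exists
  effect: NoExecute
  tolerationSeconds: 10
```

Even after eviction begins, StatefulSet pods are never force-deleted automatically; the object remains until the node returns or an operator force-deletes it.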

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 4, 2021
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 6, 2021
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-contributor-experience at kubernetes/community.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-contributor-experience at kubernetes/community.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
