
[BUG] [v1.5.5] v2 volume fails to mount on pod #8583

Open
roger-ryao opened this issue May 16, 2024 · 4 comments
Assignees
Labels
  • area/v2-data-engine: v2 data engine (SPDK)
  • investigation-needed: Need to identify the case before estimating and starting the development
  • kind/bug
  • priority/0: Must be fixed in this release (managed by PO)
  • require/backport: Require backport. Only used when the specific versions to backport have not been defined.
  • require/qa-review-coverage: Require QA to review coverage
Milestone

Comments

@roger-ryao

Describe the bug

I created a v2 volume vol-155 via the UI and created a pod to mount it. The volume is continuously stuck in an attaching/detaching loop.
(Screen recording attached: 2024-05-16 at 14:44:16)

Events:
  Type     Reason              Age                 From                     Message
  ----     ------              ----                ----                     -------
  Warning  FailedScheduling    33m                 default-scheduler        0/4 nodes are available: persistentvolumeclaim "vol-155" not found. preemption: 0/4 nodes are available: 4 Preemption is not helpful for scheduling..
  Warning  FailedScheduling    32m                 default-scheduler        running PreFilter plugin "VolumeBinding": error getting PVC "default/vol-155": could not find v1.PersistentVolumeClaim "default/vol-155"
  Normal   Scheduled           32m                 default-scheduler        Successfully assigned default/ubuntu-mountvol155 to ryao-155-w3-adbaac7c-2bbx2
  Warning  FailedAttachVolume  25m (x10 over 31m)  attachdetach-controller  AttachVolume.Attach failed for volume "vol-155" : rpc error: code = DeadlineExceeded desc = volume vol-155 failed to attach to node ryao-155-w3-adbaac7c-2bbx2 with attachmentID csi-3dfb0e84faa594ee833d39cb079041840eb012873609065baf3d7066bdfd7d5b
  Warning  FailedMount         85s (x14 over 30m)  kubelet                  Unable to attach or mount volumes: unmounted volumes=[vol-155], unattached volumes=[vol-155], failed to process volumes=[]: timed out waiting for the condition
  Warning  FailedAttachVolume  36s (x13 over 27m)  attachdetach-controller  AttachVolume.Attach failed for volume "vol-155" : rpc error: code = Aborted desc = volume vol-155 is not ready for workloads
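
For anyone triaging this, the attach/detach loop can also be watched from the Longhorn CRDs. The commands below are only a sketch; they assume the default longhorn-system namespace and use the volume name from this report:

# Watch the Longhorn volume resource cycle between attaching and detaching states
kubectl -n longhorn-system get volumes.longhorn.io vol-155 -w

# Inspect the engine object backing the v2 volume
kubectl -n longhorn-system get engines.longhorn.io | grep vol-155

# Check the Kubernetes VolumeAttachment referenced in the FailedAttachVolume event
kubectl get volumeattachments | grep vol-155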

To Reproduce

  1. Create one 1 GiB v2 volume "vol-155" in Longhorn

  2. Create the volume's PV & PVC from the Longhorn UI (a sketch of the equivalent manifests is shown after the pod YAML below)

  3. Deploy the pod "ubuntu-mountvol155" via the CLI: kubectl apply -f pod_mount_vol155.yaml

    pod_mount_vol155.yaml
    kind: Pod
    apiVersion: v1
    metadata:
      name: ubuntu-mountvol155
      namespace: default
    spec:
      containers:
        - name: ubuntu
          image: ubuntu
          command: ["/bin/sleep", "3650d"]
          volumeMounts:
          - mountPath: "/data/"
            name: vol-155
      volumes:
        - name: vol-155
          persistentVolumeClaim:
            claimName: vol-155
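
For context on step 2, the PV & PVC are created from the Longhorn UI; a rough sketch of the equivalent manifests is shown below. The file name, storageClassName, and fsType values are assumptions (not taken from the support bundle), and the UI may set additional attributes:

    pv_pvc_vol155.yaml (hypothetical file name)
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: vol-155
    spec:
      capacity:
        storage: 1Gi
      accessModes:
        - ReadWriteOnce
      persistentVolumeReclaimPolicy: Retain
      storageClassName: longhorn-static   # assumption: class name used for statically created PVs
      csi:
        driver: driver.longhorn.io
        fsType: ext4                      # assumption
        volumeHandle: vol-155
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: vol-155
      namespace: default
    spec:
      accessModes:
        - ReadWriteOnce
      storageClassName: longhorn-static   # assumption
      volumeName: vol-155
      resources:
        requests:
          storage: 1Gi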

Expected behavior

The v2 volume should be mounted on the pod.

Support bundle for troubleshooting

v1.5.5-supportbundle_190a0fc0-7784-4579-9986-778b8bdb9dfb_2024-05-16T06-26-57Z.zip

Environment

  • Longhorn version: v1.5.5
  • Installation method (e.g. Rancher Catalog App/Helm/Kubectl): Kubectl
  • Kubernetes distro (e.g. RKE/K3s/EKS/OpenShift) and version: v1.27.13+k3s1
    • Number of management nodes in the cluster: 1
    • Number of worker nodes in the cluster: 3
  • Node config
    • OS type and version: Ubuntu 22.04
    • Kernel version:
    • CPU per node:
    • Memory per node:
    • Disk type(e.g. SSD/NVMe/HDD): SSD
    • Network bandwidth between the nodes:
  • Underlying Infrastructure (e.g. on AWS/GCE, EKS/GKE, VMWare/KVM, Baremetal): AWS EC2 t2.xlarge
  • Number of Longhorn volumes in the cluster: 1
  • Impacted Longhorn resources:
    • Volume names: vol-155

Additional context

NA

@roger-ryao roger-ryao added the kind/bug, require/qa-review-coverage, and require/backport labels May 16, 2024
@roger-ryao roger-ryao added this to the v1.5.6 milestone May 16, 2024
@derekbit
Member

Does our test plan include v2 volume tests for the v1.5.5 release?

@roger-ryao
Author

Does our test plan include v2 volume tests for the v1.5.5 release?

After checking the previous test records, we did not test v2 volumes for v1.5.5.

@derekbit
Copy link
Member

Thanks @roger-ryao for the update.
We need to make sure the test plan includes both v1 and v2 volumes in the future.

cc @longhorn/qa @khushboo-rancher

@chriscchien
Contributor

Just double-checked. I am using SLES sp15, and the pod can mount the v2 volume successfully.
supportbundle_c6830ce5-fef0-4b17-83cf-a0acff335e86_2024-05-16T07-29-45Z.zip

  1. Set up the v2 environment
  2. Create v2 volume vol1 and create PVC 'vol1' via the UI
  3. Apply the YAML below
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-deployment
  namespace: default
  labels:
    name: test-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      name: test-deployment
  template:
    metadata:
      labels:
        name: test-deployment
    spec:
      containers:
        - name: test-deployment
          image: ubuntu:20.04
          stdin: true
          stdinOnce: false
          tty: true
          volumeMounts:
            - name: vol1
              mountPath: /mnt/data
      volumes:
        - name: vol1
          persistentVolumeClaim:
            claimName: vol1
  4. The pod is running
k get pods | grep deploy
test-deployment-754dd9fc66-8fqb7       1/1     Running   0          3m13s
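
As an extra check (not part of the original steps), the mount inside the pod could be verified with something like:

# Confirm a volume is mounted at the declared mountPath
kubectl exec deploy/test-deployment -- df -h /mnt/data

# Simple write/read probe on the mounted volume
kubectl exec deploy/test-deployment -- sh -c 'echo probe > /mnt/data/probe && cat /mnt/data/probe'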


@derekbit derekbit added the priority/0 and area/v2-data-engine labels May 17, 2024
@derekbit derekbit self-assigned this May 17, 2024
@derekbit derekbit added the investigation-needed label May 17, 2024