
CSI Driver 2.5.4: error "failed to get shared datastores in kubernetes cluster" #2377

Open
jsoule6 opened this issue May 11, 2023 · 6 comments


jsoule6 commented May 11, 2023

Is this a BUG REPORT or FEATURE REQUEST?:

Uncomment only one, leave it on its own line:

/kind bug

What happened:

Installed the CSI driver and Cloud Controller Manager on a Kubernetes cluster running on vSphere VMs. Everything installed cleanly; the driver connects to vSphere and successfully retrieves node information. However, when we try to deploy a pod on the cluster using the new StorageClass we created, we get the following error:

failed to get shared datastores in kubernetes cluster. Error: no shared datastores found for nodeVm.

We have a single vCenter and are not using a topology-aware setup. We have checked the permissions on the vSphere side for the account we are using, and everything looks good. The only thing we can think of that might be causing this is that the host the control-plane node runs on does not have access to the same datastore that all the worker nodes do. We tried applying and using a Storage Policy as well, but got the same result.
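
For reference, a minimal sketch of the kind of StorageClass being used, assuming the standard csi.vsphere.vmware.com provisioner; the class name and storage policy name below are placeholders, not the actual names from our cluster:

```yaml
# Minimal sketch of a vSphere CSI StorageClass tied to a storage policy.
# The class name and storagepolicyname value are placeholders.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: vsphere-csi-example
provisioner: csi.vsphere.vmware.com
parameters:
  storagepolicyname: "example-storage-policy"
```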

Is it a requirement that all nodes, including the control plane, have access to at least one shared datastore?

What you expected to happen:

I would expect the PVC to be created.

How to reproduce it (as minimally and precisely as possible):

Create a K8s cluster with CSI driver 2.5.4 and the other versions listed in the Environment section below.

Anything else we need to know?:

Environment:
Using the following versions:

vSphere: 6.7 Update 3
Kubernetes: 1.21
Cloud Controller Manager: 1.21
CSI Driver: 2.5.4

divyenpatel (Member) commented:

> The only thing we can think of that might be causing this is that the host that the Control Plane node is on does not have access to the same datastore that the Worker nodes all do.

This is the reason the driver is not able to find a shared datastore accessible to all nodes. There must be at least one datastore accessible to every node in the cluster, including the control-plane nodes.


sba30 commented Jun 21, 2023

We encountered the same issue. When we deploy our workers to a single vSphere cluster using vSAN storage it works fine, but when we split the workers across two vSphere clusters, each with its own vSAN storage, we get the same error when creating the PVC.

Is there a way in this setup for the PVC to be provisioned on only one of the vSphere clusters and its vSAN storage?

In our setup it is not possible for the two vSphere clusters to have shared storage; they each have their own vSAN storage.

divyenpatel (Member) commented:

@sba30 you can define topology on the nodes and use the volume topology feature to provision volumes on a specific vSphere cluster:
https://docs.vmware.com/en/VMware-vSphere-Container-Storage-Plug-in/3.0/vmware-vsphere-csp-getting-started/GUID-162E7582-723B-4A0F-A937-3ACE82EAFD31.html
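
For illustration, a rough sketch of a topology-aware StorageClass that restricts provisioning to one zone. The topology key and zone name below are assumptions/placeholders; the linked guide describes the exact node labels and vSphere configuration required for your driver version.

```yaml
# Hypothetical topology-aware StorageClass: volumes are provisioned only in
# the zone mapped to one of the two vSphere clusters. The zone value and
# topology key are placeholders; check the linked documentation for the keys
# your deployment actually uses.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: vsan-zone-a
provisioner: csi.vsphere.vmware.com
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
  - matchLabelExpressions:
      - key: topology.csi.vmware.com/k8s-zone
        values:
          - zone-a   # placeholder zone backed by one vSphere cluster's vSAN
```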

k8s-triage-robot commented:

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on Jan 23, 2024
k8s-triage-robot commented:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Feb 22, 2024
divyenpatel (Member) commented:

/remove-lifecycle rotten

@k8s-ci-robot removed the lifecycle/rotten label on Mar 20, 2024