Sometimes missing LUN Mapping #442

Closed
Numblesix opened this issue Sep 2, 2020 · 5 comments

Comments

@Numblesix

Describe the bug
The LUN mapping doesn't get created.

Environment
Provide accurate information about the environment to help us reproduce the issue.

  • Trident version: 20.07.0
  • Trident installation flags used: operator install with silenceAutosupport
  • Container runtime: CRIO
  • Kubernetes version: v1.18.3+2cf11e2
  • Kubernetes orchestrator: OpenShift 4.5.7
  • Kubernetes enabled feature gates: Default
  • OS: RHEL CoreOS
  • NetApp backend types: ONTAP SAN & ONTAP NAS

To Reproduce
Create a PVC with an iSCSI backend.

Expected behavior
The volume should be created with a LUN mapping.

Additional context
Also, please note that this issue doesn't happen every time; I have successfully created other PVCs with the same Trident version and the same backend configs.
When I log in to the NetApp, I can see that the volume was created, but the LUN failed to create, so the mapping wasn't created either. In the PVC's events I can see the following:

failed to provision volume with StorageClass "netapp-csi-block": rpc error: code = Unknown desc = encountered error(s) in creating the volume: [Failed to create volume pvc-3117739c on storage pool foo_72k from backend ontap_san: backend cannot satisfy create request for volume osd1_iscsi_pvc_3117739c: (ONTAP-SAN pool foo_72k/foo_72k; error creating volume osd1_iscsi_pvc_3117739c: Post "https://1.2.3.4/servlets/netapp.servlets.admin.XMLrequest_filer": context deadline exceeded (Client.Timeout exceeded while awaiting headers))]

failed to provision volume with StorageClass "netapp-csi-block": rpc error: code = Unknown desc = encountered error(s) in creating the volume: [Failed to create volume pvc-3117739c on storage pool data4_nsad0014_72k from backend ontap_san: problem mapping LUN /vol/osd1_iscsi_pvc_3117739c/lun0: results: {http://www.netapp.com/filer/admin results} status,attr: failed reason,attr: No such LUN exists errno,attr: 9017 lun-id-assigned: nil ]
@Numblesix Numblesix added the bug label Sep 2, 2020
@gnarl gnarl added the tracked label Sep 2, 2020
@gnarl
Contributor

gnarl commented Sep 2, 2020

Hi @Numblesix,

If the volume create operation in Trident failed, then there should not be an empty FlexVol. We will investigate why Trident is failing to clean up the FlexVol when a failure occurs during the create operation. However, please examine the Trident logs to see why the LUN creation is failing. Make sure that you have debug turned on in Trident and look for errors after this log statement.

@Numblesix
Author

Hi @gnarl,

I checked the logs and found some more information, but nothing showed Trident attempting to delete the FlexVol after the failed mapping.

After the creation of the volume, the logs show the following lines, which I found quite strange:

I0902 08:32:56.685744       1 controller.go:634] CreateVolume failed, supports topology = false, node selected false => may reschedule = false => state = Finished: rpc error: code = Unknown desc = encountered error(s) in creating the volume: [Failed to create volume pvc-3117739c on storage pool foo_72k from backend ontap_san: problem mapping LUN /vol/osd1_iscsi_pvc_3117739c/lun0: results: {http://www.netapp.com/filer/admin results}

time="2020-09-02T08:38:07Z" level=debug msg="LUN already mapped." id=8 igroup=trident_iqn lun=/vol/osd1_iscsi_pvc_3117739c/lun0

time="2020-09-02T08:38:07Z" level=warning msg="LUN attribute fstype not found, using default." LUN=/vol/osd1_iscsi_pvc_3117739c/lun0 fstype=ext4

time="2020-09-02T08:38:07Z" level=debug msg="Attempting volume publish." backend=ontap_san backendUUID=0d721b76-f727-458c-a4da-f57bd5e90bcd volume=pvc-3117739c volumeInternal=osd1_iscsi_pvc_3117739c

@gnarl
Contributor

gnarl commented Sep 3, 2020

@Numblesix, we confirmed yesterday that in the ontap-san driver the FlexVol is created first and, if that succeeds, the LUN is created. If the LUN creation fails, though, Trident isn't deleting the FlexVol. We will fix that issue.
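
As a rough illustration of the cleanup described above, here is a minimal Go sketch of a create-then-roll-back pattern. It is not Trident's actual code: the `Backend` interface, `createVolume`, and `fakeBackend` are hypothetical names invented for this example, and the `/lun0` path simply mirrors the LUN path seen in the error messages.

```go
package main

import (
	"errors"
	"fmt"
)

// Backend is a hypothetical stand-in for the storage API.
// It is NOT the real ontap-san driver interface.
type Backend interface {
	CreateFlexVol(name string) error
	CreateLUN(path string) error
	DestroyFlexVol(name string) error
}

// createVolume creates the FlexVol first, then the LUN. If any step after
// the FlexVol creation fails, the deferred function destroys the FlexVol
// so that no empty FlexVol is left behind.
func createVolume(b Backend, name string) (err error) {
	if err = b.CreateFlexVol(name); err != nil {
		return fmt.Errorf("error creating volume %s: %w", name, err)
	}
	defer func() {
		if err == nil {
			return // success: keep the FlexVol
		}
		if cleanupErr := b.DestroyFlexVol(name); cleanupErr != nil {
			err = fmt.Errorf("%w; cleanup of %s also failed: %v", err, name, cleanupErr)
		}
	}()

	lunPath := "/vol/" + name + "/lun0"
	if err = b.CreateLUN(lunPath); err != nil {
		return fmt.Errorf("error creating LUN %s: %w", lunPath, err)
	}
	return nil
}

// fakeBackend is an in-memory test double used only for this demonstration.
type fakeBackend struct {
	failLUN bool
	vols    map[string]bool
}

func (f *fakeBackend) CreateFlexVol(name string) error { f.vols[name] = true; return nil }

func (f *fakeBackend) CreateLUN(path string) error {
	if f.failLUN {
		return errors.New("simulated LUN failure")
	}
	return nil
}

func (f *fakeBackend) DestroyFlexVol(name string) error { delete(f.vols, name); return nil }

func main() {
	b := &fakeBackend{failLUN: true, vols: map[string]bool{}}
	err := createVolume(b, "osd1_iscsi_pvc_3117739c")
	fmt.Println("error:", err)
	fmt.Println("FlexVols left behind:", len(b.vols))
}
```

The named return value lets the deferred function see whichever error ended the create path, so any later step added after the LUN creation would trigger the same rollback without extra code.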

I was expecting to see an "error creating LUN" or "error saving file system type" string in the above error messages. From the error messages you provided, it appears that LUN creation actually worked at create time.

Can you open a support case with NetApp Support so that we can collect more information? Details on contacting support are:

To open a case with NetApp, please go to https://mysupport.netapp.com/site/.

  • At the bottom left, click on 'Contact Support'.
  • Find the appropriate number for your region to call in, or log in.
  • Note: Trident is not listed on the page, but it is a product supported by NetApp based on a supported NetApp storage SN.
  • Open the case on the NetApp storage SN, and provide a description of the problem.
  • Be sure to mention that the product is Trident on Kubernetes, and provide the details. Mention this GitHub issue.
  • The case will be directed to Trident support engineers for response.

@Numblesix
Author

I will open a case then :).

I will also check again whether I can find a log entry; either way, I will add the whole log file to the case :)

@gnarl
Contributor

gnarl commented Oct 24, 2020

This fix will be included in the Trident 20.10.0 release.
