[Storage] MD5 missing when uploading large blobs #2717

Closed
6 tasks
wahyuen opened this issue May 6, 2019 · 7 comments
Labels
  • Client: This issue points to a problem in the data-plane of the library.
  • customer-reported: Issues that are reported by GitHub users external to the Azure organization.
  • question: The issue doesn't require a change to the product in order to be resolved. Most issues start as that.
  • Service Attention: This issue is responsible by Azure service team.
  • Storage: Storage Service (Queues, Blobs, Files)

Comments

@wahyuen

wahyuen commented May 6, 2019

  • Package Name: @azure/storage-blob
  • Package Version: 10.3.0
  • Operating system:
  • nodejs
    • version:
  • browser
    • name/version: Chrome (Version 74.0.3729.131)
  • typescript
    • version: 3.1.6
  • Is the bug related to documentation in

Describe the bug
When uploading blobs larger than 256 MB, the uploaded blob appears to be missing its MD5 value.

To Reproduce
Steps to reproduce the behavior:

  1. Upload a file larger than 256 MB using the uploadBrowserDataToBlockBlob function call.

Expected behavior
The blob is uploaded and its ContentMD5 value is populated.

Screenshots
image

Additional context
Unsure whether this is related to the recent issue "MD5 is missing from uploaded files".

@kurtzeborn added the Client and Storage labels May 6, 2019
@kurtzeborn added the Service Attention label May 6, 2019
@kurtzeborn
Member

kurtzeborn commented May 6, 2019

Thank you for opening this issue! We are routing it to the appropriate team for follow up.

CC: @jeremymeng, @XiaoningLiu, @vinjiang

@amarzavery added the customer-reported label May 6, 2019
@XiaoningLiu
Member

@wahyuen Please refer to this thread Azure/azure-storage-js#40

@wahyuen
Author

wahyuen commented May 7, 2019

@XiaoningLiu We are not currently setting the MD5 programmatically. We were expecting the value to be set for us once the blob has been uploaded.

This appears to work fine for blobs under 256 MB (I believe that beyond that size the upload switches to a block list implementation).

@XiaoningLiu
Member

XiaoningLiu commented May 8, 2019

This is expected behavior. For blobs smaller than 256 MB, the Azure Storage server calculates the MD5 in the background. For larger blobs, you need to calculate and provide the MD5 yourself.
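As an illustration of "calculate and provide the MD5", here is a minimal sketch that hashes a payload chunk by chunk, so a large file never has to be hashed in a single call. It uses Node's built-in `crypto` module; in a browser a JavaScript MD5 library would be needed instead, because Web Crypto does not offer MD5. The helper names `md5OfChunks` and `toBase64` are hypothetical. Handing the digest to the SDK through `blobHTTPHeaders.blobContentMD5` in the upload options is an assumption about the v10 API and should be checked against the SDK reference.

```typescript
import { createHash } from "crypto";

// Hypothetical helper: feeds each chunk into one running MD5 hash and
// returns the 16-byte digest.
function md5OfChunks(chunks: Iterable<Uint8Array>): Uint8Array {
  const hash = createHash("md5");
  for (const chunk of chunks) {
    hash.update(chunk);
  }
  return new Uint8Array(hash.digest());
}

// Hypothetical helper: Content-MD5 values are conventionally shown as base64.
function toBase64(bytes: Uint8Array): string {
  return Buffer.from(bytes).toString("base64");
}

// Assumption (not confirmed in this thread): the digest would then be passed
// to the upload call, e.g. as { blobHTTPHeaders: { blobContentMD5: digest } }.
const exampleDigest = md5OfChunks([Buffer.from("hello "), Buffer.from("world")]);
console.log(toBase64(exampleDigest));
```

The digest covers the whole payload rather than individual blocks, which is why the chunks go into a single running hash.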

@wahyuen
Author

wahyuen commented May 8, 2019

This seems inconsistent with performing the same upload via AzCopy. Does that imply that AzCopy programmatically calculates the MD5 before the data is sent to storage? Is this behaviour only exhibited when performing a block list transfer, and might it go away if we were to increase maxSingleShotSize?

@XiaoningLiu
Member

Yes, you are right. AzCopy is a tool designed for high-performance data transfer, and it calculates the MD5 regardless of blob size.

The Azure Storage server creates the MD5 only for blobs uploaded via a single Put Blob request. In the SDK, a blob smaller than maxSingleShotSize is uploaded via one Put Blob request.

The upper limit for maxSingleShotSize is 256 MB; this limit is defined by the Azure Storage server. See https://docs.microsoft.com/en-us/rest/api/storageservices/put-blob#remarks
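The rule described above can be sketched as a small predicate. This is only an illustration of the reasoning in this thread, not SDK code: the function name is hypothetical, and treating the size boundary as inclusive is an assumption. It also shows why raising maxSingleShotSize beyond 256 MB would not help, since the value is clamped to the service limit.

```typescript
// Service-side limit for a single Put Blob request, as stated in this thread.
const MAX_SINGLE_SHOT_LIMIT_BYTES = 256 * 1024 * 1024;

// Hypothetical helper: returns true when the blob would go up in one
// Put Blob request, which is the case where the service fills in the MD5.
function serviceComputesMd5(
  blobSizeBytes: number,
  maxSingleShotSize: number = MAX_SINGLE_SHOT_LIMIT_BYTES
): boolean {
  // Values above the service limit cannot take effect, so clamp them.
  const effectiveLimit = Math.min(maxSingleShotSize, MAX_SINGLE_SHOT_LIMIT_BYTES);
  // Assumption: the boundary is inclusive.
  return blobSizeBytes <= effectiveLimit;
}

// A 300 MiB blob with a requested 512 MiB single-shot size.
console.log(serviceComputesMd5(300 * 1024 * 1024, 512 * 1024 * 1024));
```

When the predicate is false, the upload goes through the block list path and the caller is responsible for supplying the MD5.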

@XiaoningLiu self-assigned this May 9, 2019
@zhusijia26 added the question label Jun 4, 2019
@zhusijia26

Closing this item based on the last comment and idle time.

@github-actions bot locked and limited conversation to collaborators Apr 12, 2023

5 participants