md5 reported in list_path is wrong for files on S3 uploaded as multipart #105

Open
kmichel-aiven opened this issue May 10, 2023 · 1 comment

Comments

@kmichel-aiven
Collaborator

What happened?

rohmu assumes the ETag reported by S3 ListObjectsV2 is an MD5 digest. This is false if the
file was uploaded as multipart, or in some cases of server-side encryption:

https://docs.aws.amazon.com/AmazonS3/latest/API/API_Object.html#AmazonS3-Type-Object-ETag

https://github.com/aiven/rohmu/blob/8ae24adf491f26b676ed1ae0339aaa2dd87d96ae/rohmu/object_storage/s3.py#L248
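
To illustrate the problem, here is a minimal sketch of a more defensive reading of the ETag. The helper name `possible_md5_from_etag` is hypothetical and not part of rohmu. It relies on the commonly observed convention that multipart ETags carry a `-<part count>` suffix, so they never look like a bare 32-character hex digest. It can only rule out values that are clearly not an MD5: an object encrypted with SSE-C or SSE-KMS may still have a 32-character hex ETag that is not the MD5 of its content.

```python
import re
from typing import Optional

# A plain MD5 digest is exactly 32 hexadecimal characters.
_PLAIN_MD5_RE = re.compile(r"[0-9a-f]{32}")


def possible_md5_from_etag(etag: str) -> Optional[str]:
    """Return the ETag as a hex string if it could be an MD5, else None.

    Hypothetical helper, not part of rohmu. A None result means the ETag
    is definitely not an MD5 (for example a multipart ETag such as
    "<hex>-3"). A non-None result is only a candidate: with some
    server-side encryption modes the ETag is 32 hex characters but is
    not the MD5 of the object data.
    """
    # S3 returns the ETag wrapped in double quotes.
    value = etag.strip('"').lower()
    if _PLAIN_MD5_RE.fullmatch(value):
        return value
    return None
```

A caller could then leave the md5 field unset when the helper returns `None`, rather than reporting a value that is known to be wrong.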

@giacomo-alzetta-aiven
Contributor

I believe @fingon knew this very well when adding this. He chose to use md5 for "hash" because it was already used by other backends. As another example, the local backend reports the sha256 hash in the md5 field.
Swift is the only backend using a generic hash key instead of md5 (though I'm not sure how that hash is computed).

We could probably review this handling.
