GCS multipart upload behaviour #547

Open
larshagencognite opened this issue Aug 18, 2023 · 0 comments

I have some questions regarding the multipart upload behaviour for GCS.

I am working on getting FoundationDB to write backups to GCS through S3Proxy. The backups previously went through the MinIO gateway, which is now deprecated, so I am looking for a replacement.

When S3Proxy receives a multipart PUT request, the debug log shows:

[s3proxy] D 08-18 11:04:56.100 S3Proxy-Jetty-48 o.gaul.s3proxy.S3ProxyHandler:301 |::] request: Request(PUT http://<s3proxy-endpoint>/<bucket>/<path>?partNumber=5&uploadId=<uploadId>)@26638b8f
[s3proxy] D 08-18 11:04:56.100 S3Proxy-Jetty-48 o.gaul.s3proxy.S3ProxyHandler:326 |::] header: Authorization: AWS <identity:credential>
[s3proxy] D 08-18 11:04:56.100 S3Proxy-Jetty-48 o.gaul.s3proxy.S3ProxyHandler:326 |::] header: Accept: application/xml
[s3proxy] D 08-18 11:04:56.100 S3Proxy-Jetty-48 o.gaul.s3proxy.S3ProxyHandler:326 |::] header: Host: <s3proxy-endpoint>
[s3proxy] D 08-18 11:04:56.100 S3Proxy-Jetty-48 o.gaul.s3proxy.S3ProxyHandler:326 |::] header: Content-Length: 5242880
[s3proxy] D 08-18 11:04:56.100 S3Proxy-Jetty-48 o.gaul.s3proxy.S3ProxyHandler:326 |::] header: Date: 20230818T110456Z
[s3proxy] D 08-18 11:04:56.100 S3Proxy-Jetty-48 o.gaul.s3proxy.S3ProxyHandler:326 |::] header: Content-MD5: <md5>
[s3proxy] D 08-18 11:04:56.100 S3Proxy-Jetty-48 o.j.r.i.InvokeHttpMethod:56 |::] >> invoking Object:get
[s3proxy] D 08-18 11:04:56.100 S3Proxy-Jetty-48 o.j.h.i.JavaUrlHttpCommandExecutorService:56 |::] Sending request 753038757: GET https://storage.googleapis.com/storage/v1/b/<bucket>/o/<uploadId>_00000001 HTTP/1.1
[s3proxy] D 08-18 11:04:56.101 S3Proxy-Jetty-48 jclouds.headers:56 |::] >> GET https://storage.googleapis.com/storage/v1/b/<bucket>/o/<uploadId>_00000001 HTTP/1.1
[s3proxy] D 08-18 11:04:56.101 S3Proxy-Jetty-48 jclouds.headers:56 |::] >> Accept: application/json
[s3proxy] D 08-18 11:04:56.118 S3Proxy-Jetty-48 jclouds.headers:56 |::] >> Authorization: Bearer <token>
[s3proxy] D 08-18 11:04:56.198 S3Proxy-Jetty-48 o.j.h.i.JavaUrlHttpCommandExecutorService:56 |::] Receiving response 753038757: HTTP/1.1 404 Not Found
[s3proxy] D 08-18 11:04:56.198 S3Proxy-Jetty-48 jclouds.headers:56 |::] << HTTP/1.1 404 Not Found
[s3proxy] D 08-18 11:04:56.198 S3Proxy-Jetty-48 jclouds.headers:56 |::] << Server: UploadServer
[s3proxy] D 08-18 11:04:56.198 S3Proxy-Jetty-48 jclouds.headers:56 |::] << X-GUploader-UploadID: <uploadId>
[s3proxy] D 08-18 11:04:56.198 S3Proxy-Jetty-48 jclouds.headers:56 |::] << Vary: X-Origin
[s3proxy] D 08-18 11:04:56.198 S3Proxy-Jetty-48 jclouds.headers:56 |::] << Vary: Origin
[s3proxy] D 08-18 11:04:56.198 S3Proxy-Jetty-48 jclouds.headers:56 |::] << Pragma: no-cache
[s3proxy] D 08-18 11:04:56.198 S3Proxy-Jetty-48 jclouds.headers:56 |::] << Date: Fri, 18 Aug 2023 11:04:56 GMT
[s3proxy] D 08-18 11:04:56.198 S3Proxy-Jetty-48 jclouds.headers:56 |::] << Cache-Control: no-cache, no-store, max-age=0, must-revalidate
[s3proxy] D 08-18 11:04:56.198 S3Proxy-Jetty-48 jclouds.headers:56 |::] << Content-Type: application/json; charset=UTF-8
[s3proxy] D 08-18 11:04:56.199 S3Proxy-Jetty-48 jclouds.headers:56 |::] << Content-Length: 383
[s3proxy] D 08-18 11:04:56.199 S3Proxy-Jetty-48 jclouds.headers:56 |::] << Expires: Mon Jan 01 00:00:00 UTC 1990
[s3proxy] D 08-18 11:04:56.219 S3Proxy-Jetty-48 o.j.r.i.InvokeHttpMethod:56 |::] >> invoking Object:simpleUpload
[s3proxy] D 08-18 11:04:56.219 S3Proxy-Jetty-48 o.j.h.i.JavaUrlHttpCommandExecutorService:56 |::] Sending request 1133015402: POST https://storage.googleapis.com/upload/storage/v1/b/<bucket>/o?uploadType=media&name=<uploadId>_00000001_00000005 HTTP/1.1
[s3proxy] D 08-18 11:04:56.231 S3Proxy-Jetty-48 o.j.http.internal.HttpWire:56 |::] over limit 5242880/262144: wrote temp file
[s3proxy] D 08-18 11:05:56.021 S3Proxy-Jetty-48 jclouds.headers:56 |::] >> POST https://storage.googleapis.com/upload/storage/v1/b/<bucket>/o?uploadType=media&name=<uploadId>_00000001_00000005 HTTP/1.1
[s3proxy] D 08-18 11:05:56.021 S3Proxy-Jetty-48 jclouds.headers:56 |::] >> Accept: application/json
[s3proxy] D 08-18 11:05:56.024 S3Proxy-Jetty-48 jclouds.headers:56 |::] >> Authorization: Bearer <token>
[s3proxy] D 08-18 11:05:56.030 S3Proxy-Jetty-48 jclouds.headers:56 |::] >> Content-Type: application/unknown
[s3proxy] D 08-18 11:05:56.032 S3Proxy-Jetty-48 jclouds.headers:56 |::] >> Content-Length: 5242880
[s3proxy] D 08-18 11:05:56.036 S3Proxy-Jetty-48 jclouds.headers:56 |::] >> Content-MD5: lB8VF7jGtzaAuI0Rp/yPMg==
[s3proxy] D 08-18 11:05:56.492 S3Proxy-Jetty-48 o.j.h.i.JavaUrlHttpCommandExecutorService:56 |::] Receiving response -56785471: HTTP/1.1 200 OK
[s3proxy] D 08-18 11:05:56.493 S3Proxy-Jetty-48 jclouds.headers:56 |::] << HTTP/1.1 200 OK
[s3proxy] D 08-18 11:05:56.493 S3Proxy-Jetty-48 jclouds.headers:56 |::] << Server: UploadServer
[s3proxy] D 08-18 11:05:56.493 S3Proxy-Jetty-48 jclouds.headers:56 |::] << ETag: <etag>
[s3proxy] D 08-18 11:05:56.493 S3Proxy-Jetty-48 jclouds.headers:56 |::] << X-GUploader-UploadID: <uploadId>
[s3proxy] D 08-18 11:05:56.493 S3Proxy-Jetty-48 jclouds.headers:56 |::] << Vary: X-Origin
[s3proxy] D 08-18 11:05:56.493 S3Proxy-Jetty-48 jclouds.headers:56 |::] << Vary: Origin
[s3proxy] D 08-18 11:05:56.493 S3Proxy-Jetty-48 jclouds.headers:56 |::] << Pragma: no-cache
[s3proxy] D 08-18 11:05:56.493 S3Proxy-Jetty-48 jclouds.headers:56 |::] << Date: Fri, 18 Aug 2023 11:05:56 GMT
[s3proxy] D 08-18 11:05:56.493 S3Proxy-Jetty-48 jclouds.headers:56 |::] << Cache-Control: no-cache, no-store, max-age=0, must-revalidate
[s3proxy] D 08-18 11:05:56.493 S3Proxy-Jetty-48 jclouds.headers:56 |::] << Content-Type: application/json; charset=UTF-8
[s3proxy] D 08-18 11:05:56.500 S3Proxy-Jetty-48 jclouds.headers:56 |::] << Content-Length: 1045
[s3proxy] D 08-18 11:05:56.503 S3Proxy-Jetty-48 jclouds.headers:56 |::] << Expires: Mon Jan 01 00:00:00 UTC 1990
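One thing that stands out in the log above is the roughly 60-second gap between the "wrote temp file" line (11:04:56.231) and the first header line of the POST (11:05:56.021). I don't know whether that is related to the stalled backup. As a rough sketch for spotting such stalls in a longer log, the snippet below scans the timestamps and reports consecutive lines that are further apart than a threshold. The function name `find_gaps`, the 5-second threshold, and the year (the log omits it) are my own assumptions, not anything S3Proxy provides.

```python
import re
from datetime import datetime, timedelta

# Matches the "MM-dd HH:mm:ss.SSS" timestamp at the start of each log line.
TIMESTAMP = re.compile(r"(\d{2}-\d{2}) (\d{2}:\d{2}:\d{2}\.\d{3})")


def find_gaps(lines, threshold=timedelta(seconds=5), year=2023):
    """Return (gap, previous_line, line) for consecutive timestamped
    lines that are further apart than `threshold`."""
    gaps = []
    prev_time, prev_line = None, None
    for line in lines:
        match = TIMESTAMP.search(line)
        if match is None:
            continue  # skip lines that carry no timestamp
        # The log omits the year, so an assumed one is supplied here.
        stamp = datetime.strptime(
            f"{year}-{match.group(1)} {match.group(2)}",
            "%Y-%m-%d %H:%M:%S.%f",
        )
        if prev_time is not None and stamp - prev_time > threshold:
            gaps.append((stamp - prev_time, prev_line, line))
        prev_time, prev_line = stamp, line
    return gaps


# Two consecutive lines copied from the log above.
sample = [
    "[s3proxy] D 08-18 11:04:56.231 S3Proxy-Jetty-48 o.j.http.internal.HttpWire:56 |::] over limit 5242880/262144: wrote temp file",
    "[s3proxy] D 08-18 11:05:56.021 S3Proxy-Jetty-48 jclouds.headers:56 |::] >> POST https://storage.googleapis.com/upload/storage/v1/b/<bucket>/o?uploadType=media&name=<uploadId>_00000001_00000005 HTTP/1.1",
]

gaps = find_gaps(sample)
for gap, before, after in gaps:
    print(f"{gap.total_seconds():.3f}s gap after: {before[:70]}")
```

Running this over the full debug log should show whether every part upload has a similar pause or only some of them.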

I don't understand the purpose of the GET call. It targets <uploadId>_00000001, and after that returns a 404, S3Proxy issues a POST for <uploadId>_00000001_00000005, presumably creating a new object for the part.
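To make the GET-then-POST pattern easier to follow across many parts, here is a small sketch that pulls the method and target object name out of the "Sending request" lines. It relies only on the URL shapes visible in the log above (the object name after `/o/` for the GET, the `name` query parameter for the POST). The function name `gcs_requests` is hypothetical, and the snippet says nothing about how S3Proxy or jclouds choose these names internally.

```python
import re
from urllib.parse import parse_qs, unquote, urlsplit

# Matches lines such as "Sending request 753038757: GET <url> HTTP/1.1".
SENDING = re.compile(r"Sending request -?\d+: (\w+) (\S+) HTTP/1\.1")


def gcs_requests(lines):
    """Yield (method, object_name) for each outgoing GCS request in the log."""
    for line in lines:
        match = SENDING.search(line)
        if match is None:
            continue
        method, url = match.groups()
        parts = urlsplit(url)
        query = parse_qs(parts.query)
        if "name" in query:
            # Upload requests carry the object name as a query parameter.
            name = query["name"][0]
        else:
            # Other requests put the object name after ".../o/" in the path.
            name = unquote(parts.path.partition("/o/")[2])
        yield method, name


# The two "Sending request" lines from the log above.
sample_requests = [
    "[s3proxy] D 08-18 11:04:56.100 S3Proxy-Jetty-48 o.j.h.i.JavaUrlHttpCommandExecutorService:56 |::] Sending request 753038757: GET https://storage.googleapis.com/storage/v1/b/<bucket>/o/<uploadId>_00000001 HTTP/1.1",
    "[s3proxy] D 08-18 11:04:56.219 S3Proxy-Jetty-48 o.j.h.i.JavaUrlHttpCommandExecutorService:56 |::] Sending request 1133015402: POST https://storage.googleapis.com/upload/storage/v1/b/<bucket>/o?uploadType=media&name=<uploadId>_00000001_00000005 HTTP/1.1",
]

observed = list(gcs_requests(sample_requests))
for method, name in observed:
    print(method, name)
```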

Are these parts later combined so that one can read the whole object?

The backup is currently failing to make progress, and I am trying to find out whether this behaviour is the cause.
