Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Storage and url encoding #121

Closed
soujiro32167 opened this issue Feb 9, 2020 · 15 comments
Closed

Storage and url encoding #121

soujiro32167 opened this issue Feb 9, 2020 · 15 comments
Assignees
Labels
api: storage Issues related to the googleapis/java-storage API. priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@soujiro32167
Copy link

Environment details

  1. Specify the API at the beginning of the title (for example, "BigQuery: ...")
    General, Core, and Other are also allowed as types
  • Storage
  1. OS type and version:
  • MacOS Catalina
  1. Java version:
  • 11.0.4
  • Scala 2.13.1
  1. google-cloud-java version(s):
  • 1.103.1

Steps to reproduce

  1. upload a file named 1+1 2 to my-bucket
  2. list files in my-bucket
  3. for each file, download

Code example

storage.list(bucket).iterateAll().asScala.foreach(blob => blob.downloadTo(blob.getName()))

Stack trace

com.google.cloud.RetryHelper$RetryHelperException: com.google.cloud.storage.StorageException: 404 Not Found
No such object: eli-test/1+1+2
	at com.google.cloud.RetryHelper.runWithRetries(RetryHelper.java:54)
	at com.google.cloud.storage.Blob.downloadTo(Blob.java:233)
	at com.google.cloud.storage.Blob.downloadTo(Blob.java:217)
	at Main$.$anonfun$download$1(Main.scala:73)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
	at zio.blocking.Blocking$Service.$anonfun$effectBlocking$5(Blocking.scala:134)
	at zio.internal.FiberContext.evaluateNow(FiberContext.scala:386)
	at zio.internal.FiberContext.$anonfun$fork$2(FiberContext.scala:655)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: com.google.cloud.storage.StorageException: 404 Not Found
No such object: eli-test/1+1+2
	at com.google.cloud.storage.spi.v1.HttpStorageRpc.translate(HttpStorageRpc.java:229)
	at com.google.cloud.storage.spi.v1.HttpStorageRpc.read(HttpStorageRpc.java:675)
	at com.google.cloud.storage.Blob$2.run(Blob.java:238)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:105)
	at com.google.cloud.RetryHelper.run(RetryHelper.java:76)
	at com.google.cloud.RetryHelper.runWithRetries(RetryHelper.java:50)
	... 10 more
Caused by: com.google.api.client.http.HttpResponseException: 404 Not Found
No such object: eli-test/1+1+2
	at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1113)
	at com.google.api.client.googleapis.media.MediaHttpDownloader.executeCurrentRequest(MediaHttpDownloader.java:255)
	at com.google.api.client.googleapis.media.MediaHttpDownloader.download(MediaHttpDownloader.java:185)
	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeMediaAndDownloadTo(AbstractGoogleClientRequest.java:685)
	at com.google.api.services.storage.Storage$Objects$Get.executeMediaAndDownloadTo(Storage.java:6996)
	at com.google.cloud.storage.spi.v1.HttpStorageRpc.read(HttpStorageRpc.java:671)
	... 15 more

External references such as API reference guides used

  • ?

Any additional information below

If I try to encode the blob name using blob.toBuilder.setBlodId(...).build, I see double encoding: 1%2B1%202 => 1%252B1%25202
Also, using blob.signUrl produces a valid url that allows download

@chingor13 chingor13 transferred this issue from googleapis/google-cloud-java Feb 9, 2020
@product-auto-label product-auto-label bot added the api: storage Issues related to the googleapis/java-storage API. label Feb 9, 2020
@yoshi-automation yoshi-automation added the triage me I really want to be triaged. label Feb 10, 2020
@elharo
Copy link
Contributor

elharo commented Feb 10, 2020

Can you reproduce this without scala?

@JesseLovelace JesseLovelace added priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. and removed triage me I really want to be triaged. labels Feb 10, 2020
@soujiro32167
Copy link
Author

soujiro32167 commented Feb 10, 2020

With a single file names 1 1

import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;

import java.nio.file.Path;

public class Yava {
  public static void main(String[] args) {
    Storage storage = StorageOptions.getDefaultInstance().getService();
    storage.list("eli-test").iterateAll().forEach(blob -> blob.downloadTo(Path.of("target", blob.getName())));
  }
}
Exception in thread "main" com.google.cloud.RetryHelper$RetryHelperException: com.google.cloud.storage.StorageException: 404 Not Found
No such object: eli-test/1+1
	at com.google.cloud.RetryHelper.runWithRetries(RetryHelper.java:54)
	at com.google.cloud.storage.Blob.downloadTo(Blob.java:233)
	at com.google.cloud.storage.Blob.downloadTo(Blob.java:217)
	at com.google.cloud.storage.Blob.downloadTo(Blob.java:260)
	at Yava.lambda$main$0(Yava.java:9)
	at java.base/java.lang.Iterable.forEach(Iterable.java:75)
	at Yava.main(Yava.java:9)
Caused by: com.google.cloud.storage.StorageException: 404 Not Found
No such object: eli-test/1+1
	at com.google.cloud.storage.spi.v1.HttpStorageRpc.translate(HttpStorageRpc.java:229)
	at com.google.cloud.storage.spi.v1.HttpStorageRpc.read(HttpStorageRpc.java:675)
	at com.google.cloud.storage.Blob$2.run(Blob.java:238)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:105)
	at com.google.cloud.RetryHelper.run(RetryHelper.java:76)
	at com.google.cloud.RetryHelper.runWithRetries(RetryHelper.java:50)
	... 6 more
Caused by: com.google.api.client.http.HttpResponseException: 404 Not Found
No such object: eli-test/1+1
	at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1113)
	at com.google.api.client.googleapis.media.MediaHttpDownloader.executeCurrentRequest(MediaHttpDownloader.java:255)
	at com.google.api.client.googleapis.media.MediaHttpDownloader.download(MediaHttpDownloader.java:185)
	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeMediaAndDownloadTo(AbstractGoogleClientRequest.java:685)
	at com.google.api.services.storage.Storage$Objects$Get.executeMediaAndDownloadTo(Storage.java:6996)
	at com.google.cloud.storage.spi.v1.HttpStorageRpc.read(HttpStorageRpc.java:671)
	... 11 more

@elharo
Copy link
Contributor

elharo commented Feb 10, 2020

Do you have a pom.xml or build.gradle for this?

@soujiro32167
Copy link
Author

I have sbt: "com.google.cloud" % "google-cloud-storage" % "1.103.1"

@soujiro32167
Copy link
Author

the only thing needed is the gcloud storage dependency

@athakor
Copy link
Contributor

athakor commented Feb 11, 2020

@soujiro32167 Thanks for pointing out this bug, but it's related to google-http-java-client.

FYI, A string of characters that do not need to be encoded when used in URI query strings, as
specified in RFC 3986. Note that some of these characters do need to be escaped when used in
other parts of the URI and space is one of them which needs to be escape.

you can check list of rules for an encoding a string which is parts of the URI.

Rules :
https://github.com/googleapis/google-http-java-client/blob/master/google-http-client/src/main/java/com/google/api/client/util/escape/PercentEscaper.java#L32

Also look at the below links which replacse the space with {'+'}

https://github.com/googleapis/google-http-java-client/blob/master/google-http-client/src/main/java/com/google/api/client/util/escape/PercentEscaper.java#L218

https://github.com/googleapis/google-http-java-client/blob/master/google-http-client/src/main/java/com/google/api/client/util/escape/PercentEscaper.java#L93

@elharo
Copy link
Contributor

elharo commented Feb 11, 2020

The problem is not that the space is not being encoded. It's that it's being encoded incorrectly as a plus sign. This is wrong. The plus sign is the plus sign. It is not a space and does not represent one in URLs. This is a common confusion between URL encoding and application-x-www-formurl-encoding. The latter only applies in HTML forms.

This is complicated because there's a lot of broken code out there that does not follow the spec. To make matters worse, some of the GCS code is trying to be bug-for-bug compatible with other non-Google code.

@dmitry-fa
Copy link
Contributor

I thinks this is a duplicate of #57
it's still reproducible with google-http-client 1.34.1

Workaround to add this dependency:

        <dependency>
            <groupId>com.google.http-client</groupId>
            <artifactId>google-http-client</artifactId>
            <version>1.33.0</version>
        </dependency>

@athakor
Copy link
Contributor

athakor commented Feb 11, 2020

The problem is not that the space is not being encoded. It's that it's being encoded incorrectly as a plus sign. This is wrong. The plus sign is the plus sign. It is not a space and does not represent one in URLs. This is a common confusion between URL encoding and application-x-www-formurl-encoding. The latter only applies in HTML forms.

This is complicated because there's a lot of broken code out there that does not follow the spec. To make matters worse, some of the GCS code is trying to be bug-for-bug compatible with other non-Google code.

Correct. Spaces should be encoded as %20 instead of a plus sign.

@elharo
Copy link
Contributor

elharo commented Feb 11, 2020

The bug is in com.google.api.client.http.UriTemplate#expand or something it calls. This is where the + sign is incorrectly introduced.

@dmitry-fa
Copy link
Contributor

I built google-http-client 1.34.2-SNAPSHOT, the problem with spaces is gone!

@athakor athakor self-assigned this Feb 13, 2020
@elharo
Copy link
Contributor

elharo commented Feb 15, 2020

FYI if you import com.google.cloud:libraries-bom:4.1.0 or later the bug is fixed. We still need to release a version of the google-cloud-storage artifact that updates its dependencies.

  <dependencyManagement>
	 <dependencies>
	  <dependency>
	    <groupId>com.google.cloud</groupId>
	    <artifactId>libraries-bom</artifactId>
	    <version>4.1.0</version>
	    <type>pom</type>
	    <scope>import</scope>
	   </dependency>
	 </dependencies>
  </dependencyManagement>

  <dependencies>
    <dependency>
  	  <groupId>com.google.cloud</groupId>
  	  <artifactId>google-cloud-storage</artifactId>
  	</dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.13</version>
      <scope>test</scope>
    </dependency>
  </dependencies>

@frankyn
Copy link
Member

frankyn commented Feb 20, 2020

@soujiro32167 could you try using the latest release:
https://search.maven.org/artifact/com.google.cloud/google-cloud-storage-parent/1.104.0/pom

Thank you for your patience folks.

@frankyn
Copy link
Member

frankyn commented Feb 20, 2020

Verified the latest version has resolved the issue (GoogleCloudPlatform/java-docs-samples#2064). Closing issue, please reopen if the issue persists.

@frankyn frankyn closed this as completed Feb 20, 2020
@soujiro32167
Copy link
Author

@frankyn thanks! I'll try it out

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: storage Issues related to the googleapis/java-storage API. priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.
Projects
None yet
Development

No branches or pull requests

7 participants