Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add disableGzipContent option for create from InputStream #82

Merged

Conversation

nblair
Copy link
Contributor

@nblair nblair commented Jan 23, 2020

Is your feature request related to a problem? Please describe.

I have a use case that necessitates using the Storage/Bucket write method variants with InputStream arguments. The inputs for my use case to store in Cloud Storage are:

  • predominantly already compressed,
  • have a wide range of file sizes, from a few KB to GB+,
  • we don't always know the size in advance.

Describe the solution you'd like

Given that our content is already compressed, I would prefer to avoid spending the CPU time on compressing the content again en route to the Bucket.

Describe alternatives you've considered

We have used the byte[] variants, with BlobTargetOption.disableGzipContent and the Compose request. This is suitable but leaves us with a tuning challenge:

  • Buffering these streams into memory (byte[] chunks) requires additional heap space be available
  • Compose is limited to only 32 chunks. If the file content we intend to store is far larger than 32 * bufferSize, we will upload 31 small sized chunks and 1 large chunk that we have to use the InputStream variant for anyways, and pay the additional overhead of gzip compression.

Additional context

I have done some exploration and it appears that the values on the BlobWriteOption (used on InputStream variants) and BlobTargetOption (used on byte[] variants) enums both are translated into StorageRpc.Option. I have a small contribution to offer in the form of a pull request to follow.

Fixes #36; ported from googleapis/google-cloud-java#7057

@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Jan 23, 2020
@nblair nblair force-pushed the blobwriteoption-disableGzipCompression branch from 375fdb3 to d0ef86f Compare January 23, 2020 17:26
@elharo elharo added the kokoro:run Add this label to force Kokoro to re-run the tests. label Jan 28, 2020
@yoshi-kokoro yoshi-kokoro removed the kokoro:run Add this label to force Kokoro to re-run the tests. label Jan 28, 2020
@codecov
Copy link

codecov bot commented Jan 28, 2020

Codecov Report

Merging #82 into master will increase coverage by 0.01%.
The diff coverage is 100%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master      #82      +/-   ##
============================================
+ Coverage      63.3%   63.32%   +0.01%     
  Complexity      528      528              
============================================
  Files            30       30              
  Lines          4720     4722       +2     
  Branches        450      450              
============================================
+ Hits           2988     2990       +2     
  Misses         1571     1571              
  Partials        161      161
Impacted Files Coverage Δ Complexity Δ
...rc/main/java/com/google/cloud/storage/Storage.java 80.16% <100%> (+0.1%) 0 <0> (ø) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b93106b...183d0c6. Read the comment docs.

…s#36)

Previously, only the methods to create blobs that take a byte[] argument offer the option to disable gzip compression; the methods that accept an InputStream argument do not. This is due to the BlobWriteOption enum missing a matching constant for BlobTargetOption.IF_DISABLE_GZIP_CONTENT.

This change set adds a matching IF_DISABLE_GZIP_CONTENT constant to BlobWriteOption including the correct translation to StorageRpc.Option. The net result is that the Storage create functions that accept an InputStream now offer the option to disable gzip compression.
@nblair nblair force-pushed the blobwriteoption-disableGzipCompression branch from b84e5db to 183d0c6 Compare January 29, 2020 05:00
@athakor athakor added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Jan 29, 2020
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Jan 29, 2020
@nblair
Copy link
Contributor Author

nblair commented Feb 13, 2020

Hi there! I'm curious if you have any feedback or questions on the idea and the approach?

Thanks in advance!

@frankyn frankyn self-requested a review February 14, 2020 05:37
@frankyn frankyn added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Feb 14, 2020
@frankyn
Copy link
Member

frankyn commented Feb 14, 2020

Change LGTM, pending tests to finish and will merge. Thank you for your contribution @nblair!

@frankyn frankyn added kokoro:force-run Add this label to force Kokoro to re-run the tests. kokoro:run Add this label to force Kokoro to re-run the tests. and removed kokoro:force-run Add this label to force Kokoro to re-run the tests. labels Feb 14, 2020
@yoshi-kokoro yoshi-kokoro removed kokoro:run Add this label to force Kokoro to re-run the tests. kokoro:force-run Add this label to force Kokoro to re-run the tests. labels Feb 14, 2020
@frankyn frankyn merged commit 65d3739 into googleapis:master Feb 14, 2020
@nblair
Copy link
Contributor Author

nblair commented Feb 15, 2020

Thanks for your time @frankyn - cheers!

@nblair nblair deleted the blobwriteoption-disableGzipCompression branch February 15, 2020 14:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cloud Storage: provide option to disableGzipContent on InputStream variants
6 participants