Conversion from InputStream -> ByteBuffer on gRPC writes creates many byte[] allocations. #1123

Open
frankyn opened this issue Feb 26, 2024 · 1 comment

Comments

@frankyn
Contributor

frankyn commented Feb 26, 2024

Hi team,

While investigating memory usage in the gRPC write path, I found that a significant share of allocations comes from the InputStream -> ByteBuffer conversion code in the GCS connector: https://storage.googleapis.com/anima-frank/large-writes-grpc/grpc_100_write_100MiB_t_4_profile.html

Note: The workload runs Fsbenchmark, uploading 10k 100MiB objects across 4 threads on an n2-standard-4 GCE instance using DirectPath.

~72% of allocations come from the InputStream -> ByteBuffer conversion, which creates two 2MiB byte[] arrays for every write.

Separately, java-storage contributes 19% of allocations, and I'm digging into this as well. My current suspicion is that java-storage creates a buffer per upload and not per message.
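
For illustration only, here is a minimal sketch of how an InputStream -> ByteBuffer conversion can end up allocating two 2MiB byte[] arrays per write, next to a variant that refills one reusable heap buffer. This is not the connector's actual code: the class name `StreamToBufferSketch`, the methods `copyingChunk` and `fillReusable`, and the 2MiB `CHUNK_SIZE` are assumptions chosen to match the sizes reported in the profile.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

// Hypothetical sketch; names and structure are assumptions, not connector code.
public class StreamToBufferSketch {
    // 2 MiB, matching the byte[] size reported in the profile.
    static final int CHUNK_SIZE = 2 * 1024 * 1024;

    // Allocation-heavy pattern: a staging array plus a freshly allocated
    // heap ByteBuffer (backed by a second array) on every call.
    static ByteBuffer copyingChunk(InputStream in) throws IOException {
        byte[] staging = new byte[CHUNK_SIZE];            // allocation 1
        int n = in.readNBytes(staging, 0, CHUNK_SIZE);    // Java 9+
        ByteBuffer out = ByteBuffer.allocate(CHUNK_SIZE); // allocation 2
        out.put(staging, 0, n);
        out.flip();
        return out;
    }

    // Reuse pattern: read straight into the backing array of a heap buffer
    // that the caller allocates once. Returns the number of bytes read
    // (0 at end of stream).
    static int fillReusable(InputStream in, ByteBuffer reusable) throws IOException {
        if (!reusable.hasArray()) {
            throw new IllegalArgumentException("expected a heap ByteBuffer");
        }
        reusable.clear();
        int n = in.readNBytes(reusable.array(), reusable.arrayOffset(), reusable.capacity());
        reusable.limit(n);
        return n;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[5 * 1024 * 1024]);
        ByteBuffer reusable = ByteBuffer.allocate(CHUNK_SIZE); // allocated once
        int chunks = 0;
        while (fillReusable(in, reusable) > 0) {
            // The buffer must be fully consumed here, before the next refill.
            chunks++;
        }
        System.out.println("chunks read with one buffer: " + chunks);
    }
}
```

Reusing the buffer is only safe if whatever consumes it (for example, the code that builds the gRPC message) is done with the bytes before the next refill. Whether that holds in the connector's write path is not something this sketch establishes.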

@frankyn
Contributor Author

frankyn commented Feb 27, 2024

Update: I tried changing the code to work around this issue, but the overall wall time is not acceptable:
For a sequential write of 10k 100MiB objects, the existing implementation takes 2+ hours using gRPC DP, while my prototype is still running after 9+ hours.

@arunkumarchacko could you investigate alternatives?

cc: @schannahalli, @danielduhh
