GCS resumable upload chunker bug #3108

Open
jroper opened this issue Feb 12, 2024 · 0 comments

While browsing the source code for the GCS resumable upload implementation, I noticed a bug. This code is supposed to split the stream into ByteString chunks of equal size, with any remaining bytes in the last element of the stream. Here is the code:

  private def chunker(chunkSize: Int) = Flow[ByteString].map(Some(_)).concat(Source.single(None)).statefulMapConcat {
    () =>
      val chunkBuilder = ByteString.newBuilder
      bytes =>
        bytes.fold(Some(chunkBuilder.result()).filter(_.nonEmpty).toList) { bytes =>
          chunkBuilder ++= bytes
          if (chunkBuilder.length < chunkSize) {
            Nil
          } else if (chunkBuilder.length == chunkSize) {
            val chunk = chunkBuilder.result()
            chunkBuilder.clear()
            chunk :: Nil
          } else { // chunkBuilder.length > chunkSize
            val result = chunkBuilder.result()
            chunkBuilder.clear()
            val (chunk, init) = result.splitAt(chunkSize)
            chunkBuilder ++= init
            chunk :: Nil
          }
        }
  }

The issue is in the last few lines: if the chunkBuilder buffer holds multiple chunks' worth of bytes, they should all be emitted, but this code emits only the first chunk. As a result, the additional chunks aren't emitted until the next ByteString is received, and even then only one chunk is emitted per incoming element. Possible consequences include unbounded buffer growth (if the incoming ByteStrings are consistently larger than the chunk size) and a final chunk that is larger, potentially much larger, than the chunk size.
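
To make the difference concrete, here is a minimal, stdlib-only sketch of the per-element step. It is a model, not the library's code: `Vector[Byte]` stands in for `ByteString`, and the names `ChunkerSketch`, `buggyStep`, `fixedStep` and `run` are hypothetical. `buggyStep` mirrors the logic quoted above; `fixedStep` shows one possible way to emit every full chunk and keep only the remainder buffered.

```scala
// Stdlib-only model of the chunker's per-element step.
// Assumption: Vector[Byte] stands in for ByteString; all names here are illustrative.
object ChunkerSketch {
  type Bytes = Vector[Byte]

  // Mirrors the quoted code: emits at most one chunk per incoming element.
  def buggyStep(buffer: Bytes, in: Bytes, chunkSize: Int): (List[Bytes], Bytes) = {
    val all = buffer ++ in
    if (all.length < chunkSize) (Nil, all)
    else {
      val (chunk, rest) = all.splitAt(chunkSize)
      (chunk :: Nil, rest) // rest may still hold one or more full chunks
    }
  }

  // Possible fix: emit every full chunk, buffer only the remainder (< chunkSize bytes).
  def fixedStep(buffer: Bytes, in: Bytes, chunkSize: Int): (List[Bytes], Bytes) = {
    val all = buffer ++ in
    val fullLength = all.length - all.length % chunkSize
    val (full, rest) = all.splitAt(fullLength)
    (full.grouped(chunkSize).toList, rest)
  }

  // Feeds all inputs through a step, then flushes any non-empty remainder,
  // like the trailing None element does in the original flow.
  def run(
      step: (Bytes, Bytes, Int) => (List[Bytes], Bytes),
      inputs: List[Bytes],
      chunkSize: Int
  ): List[Bytes] = {
    val (emitted, rest) =
      inputs.foldLeft((List.empty[Bytes], Vector.empty[Byte])) {
        case ((out, buf), in) =>
          val (chunks, next) = step(buf, in, chunkSize)
          (out ++ chunks, next)
      }
    if (rest.nonEmpty) emitted :+ rest else emitted
  }

  def main(args: Array[String]): Unit = {
    // Two 10-byte elements with a chunk size of 4.
    val inputs = List(Vector.fill(10)(1.toByte), Vector.fill(10)(2.toByte))
    println(run(buggyStep, inputs, 4).map(_.length)) // List(4, 4, 12)
    println(run(fixedStep, inputs, 4).map(_.length)) // List(4, 4, 4, 4, 4)
  }
}
```

In the model, the buggy step ends with an oversized 12-byte final chunk, while the fixed step never buffers more than `chunkSize - 1` bytes between elements. In the real flow, a comparable change would presumably be to loop on `splitAt(chunkSize)` in the final `else` branch until fewer than `chunkSize` bytes remain, though that is untested here.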
