New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use mapChunks
in CirceInstances.streamedJsonArray
#7010
base: series/0.23
Are you sure you want to change the base?
Use mapChunks
in CirceInstances.streamedJsonArray
#7010
Conversation
mapChunks
in CirceInstances.streamedJsonArray
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really like this refactoring 👍🏻 but wouldn't you mind adding a small micro-benchmark to the CirceJsonBench
suite to prove we aren't losing any microseconds with that implementation?
val bldr = Chunk.newBuilder[Byte] | ||
c.foreach { o => | ||
bldr += CirceInstances.comma | ||
bldr += fromJsonToChunk(printer)(o) | ||
} | ||
bldr.result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps this can be a local def nonFirst(c: Chunk[Json]): Chunk[Byte]
function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally, rather than using the fromJsonToChunk
on every element, and using chunk concatenation (in the bldr builder), would it also help to use one bytebuffer per source chunk, directly print each json to that buffer, and use one view ?
Edit perhaps that would be an optimisation to do at the circe
library instead, with a printInterspersed
or printCommaSeparated
method that takes any iterable of Json, or similar. https://github.com/circe/circe/blob/6bbb7e73a529916faec89f1455d11356af03ea64/modules/core/shared/src/main/scala/io/circe/Printer.scala#L209-L221.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally, rather than using the fromJsonToChunk on every element, and using chunk concatenation (in the bldr builder), would it also help to use one bytebuffer per source chunk, directly print each json to that buffer, and use one view ?
Benchmarking streamJsonArrayEncoder
and jsonEncoder
makes it clear that there is a lot of room for improvement. That feels a bit out of scope of my small refactoring in this PR.
Finally found some time to add a benchmark. Previous
This PR:
|
09a06f4
to
05cb6d1
Compare
Benchmark streamJsonArrayEncoder (vs jsonEncoder)
import java.util.concurrent.TimeUnit | ||
|
||
// sbt "bench/jmh:run -i 10 -wi 10 -f 2 -t 1 org.http4s.bench.CirceJsonStreamBench" | ||
@BenchmarkMode(Array(Mode.AverageTime)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some benchmarks often use a "Throughput" mode, rather than average. Perhaps that would help smooth out results error?
Small clean up to use
mapChunks
instead of an implementation usingrepeatPull
.