Benchmark against the C implementation of OpenEXR #185

Open
Shnatsel opened this issue Jan 3, 2023 · 2 comments
Labels
enhancement New feature or request

Shnatsel commented Jan 3, 2023

What can be improved or is missing?

Provide benchmarks comparing the performance of this crate to the OpenEXR reference implementation.

Implementation Approach

The openexr crate provides high-level, mostly safe bindings to the C implementation.

Shnatsel added the enhancement label Jan 3, 2023

Shnatsel commented Jan 3, 2023

It would be interesting to have benchmarks on x86 as well as ARM.

I ran the benchmarks from #181 on a 16-core Ampere Altra machine on Google Cloud, and exrs in parallel mode absolutely rips 🚀

running 3 tests
test read_image_rgba_f32_to_f16 ... bench:  32,518,506 ns/iter (+/- 176,580)
test read_image_rgba_f32_to_f32 ... bench:  12,701,451 ns/iter (+/- 205,549)
test read_image_rgba_f32_to_u32 ... bench:  13,428,159 ns/iter (+/- 93,722)

test result: ok. 0 passed; 0 failed; 0 ignored; 3 measured

     Running benches/profiling.rs (target/release/deps/profiling-ddc84dc3d9a8e9fd)

running 2 tests
test read_single_image_all_channels             ... bench:  22,543,779 ns/iter (+/- 598,279)
test read_single_image_from_buffer_all_channels ... bench:  19,874,611 ns/iter (+/- 439,578)

test result: ok. 0 passed; 0 failed; 0 ignored; 2 measured

     Running benches/read.rs (target/release/deps/read-35e4db800494d5a6)

running 8 tests
test read_single_image_rle_all_channels               ... bench:  23,277,019 ns/iter (+/- 3,901,982)
test read_single_image_rle_non_parallel_all_channels  ... bench:  33,362,265 ns/iter (+/- 293,883)
test read_single_image_rle_non_parallel_rgba          ... bench:  36,240,049 ns/iter (+/- 247,417)
test read_single_image_rle_rgba                       ... bench:  16,579,403 ns/iter (+/- 301,560)
test read_single_image_uncompressed_non_parallel_rgba ... bench:  12,898,483 ns/iter (+/- 171,469)
test read_single_image_uncompressed_rgba              ... bench:  13,151,095 ns/iter (+/- 162,987)
test read_single_image_zips_non_parallel_rgba         ... bench: 137,174,659 ns/iter (+/- 417,130)
test read_single_image_zips_rgba                      ... bench:  13,942,807 ns/iter (+/- 319,728)

test result: ok. 0 passed; 0 failed; 0 ignored; 8 measured

     Running benches/write.rs (target/release/deps/write-35aaf83004c9be4c)

running 5 tests
test write_nonparallel_zip1_to_buffered      ... bench: 445,788,833 ns/iter (+/- 2,537,143)
test write_parallel_any_channels_to_buffered ... bench:  31,390,870 ns/iter (+/- 998,324)
test write_parallel_zip16_to_buffered        ... bench:  43,083,441 ns/iter (+/- 2,608,785)
test write_parallel_zip1_to_buffered         ... bench:  33,796,498 ns/iter (+/- 1,998,909)
test write_uncompressed_to_buffered          ... bench:  21,775,658 ns/iter (+/- 413,420)

Decoding a zipped image in under 14 milliseconds, how cool is that?

That is, as long as you don't have to do any pixel format conversions, and don't run into #178 and #182. Those things really rain on our parade if you try to decode into RGBA f16 like the reference OpenEXR does.
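
For a rough sense of scale, the ratios implied by the numbers above can be worked out directly. This is only a small arithmetic sketch: the `ratio` helper is a made-up name for illustration, and the inputs are the ns/iter figures copied from the benchmark output in this comment.

```rust
/// Ratio of two timings given in ns/iter.
/// Hypothetical helper, for illustration only.
fn ratio(slower_ns: u64, faster_ns: u64) -> f64 {
    slower_ns as f64 / faster_ns as f64
}

fn main() {
    // Figures copied from the benchmark output above (16-core Ampere Altra).
    // Non-parallel vs. parallel read of the ZIPS-compressed image.
    let zips_read = ratio(137_174_659, 13_942_807);
    // Non-parallel vs. parallel ZIP1 write.
    let zip1_write = ratio(445_788_833, 33_796_498);
    // Reading f32 data into f16 vs. reading it into f32 (no conversion).
    let f16_cost = ratio(32_518_506, 12_701_451);

    println!("zips read, parallel speedup:  {:.1}x", zips_read);
    println!("zip1 write, parallel speedup: {:.1}x", zip1_write);
    println!("f32 -> f16 conversion cost:   {:.1}x", f16_cost);
}
```

So on this machine the parallel path is roughly 10x faster for ZIPS reads and roughly 13x faster for ZIP1 writes, while converting to f16 makes the read about 2.6x slower than the no-conversion case.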

johannesvollmer (Owner) commented

We should distinguish between the following two comparisons, and do both:

  • Performance with equal settings (as close as possible)
  • Performance out of the box
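
One way to keep the two comparisons apart is to label every measurement with the setup it belongs to. The following is a minimal, std-only sketch under stated assumptions: `Setup`, `measure`, and the placeholder workload are hypothetical names invented here, not part of exrs or the openexr bindings. Real decode calls for each library would replace the placeholder closure.

```rust
use std::time::{Duration, Instant};

/// Which of the two comparisons a measurement belongs to (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Setup {
    /// Both libraries configured as similarly as possible.
    EqualSettings,
    /// Each library used with its default settings.
    OutOfTheBox,
}

/// Runs `work` exactly `iters` times and returns the median wall-clock duration.
fn measure<F: FnMut()>(iters: usize, mut work: F) -> Duration {
    assert!(iters > 0, "need at least one iteration");
    let mut samples: Vec<Duration> = (0..iters)
        .map(|_| {
            let start = Instant::now();
            work();
            start.elapsed()
        })
        .collect();
    samples.sort();
    samples[samples.len() / 2]
}

fn main() {
    // Placeholder workload standing in for a real image decode call.
    let decode_placeholder = || {
        let sum: u64 = (0..100_000u64).sum();
        std::hint::black_box(sum);
    };

    for setup in [Setup::EqualSettings, Setup::OutOfTheBox] {
        let median = measure(10, decode_placeholder);
        // Record the CPU architecture so x86 and ARM runs stay distinguishable.
        println!("{:?} on {}: {:?}", setup, std::env::consts::ARCH, median);
    }
}
```

Reporting the median rather than the mean is a design choice made for this sketch, since it is less sensitive to a single slow iteration; an existing bench harness could of course be used instead.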

Shnatsel mentioned this issue Jan 7, 2023