Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Encoding speed lags behind libpng #224

Open
shuffle2 opened this issue Jul 24, 2022 · 8 comments
Open

Encoding speed lags behind libpng #224

shuffle2 opened this issue Jul 24, 2022 · 8 comments
Labels
bug Something isn't working confirmed

Comments

@shuffle2
Copy link
Contributor

shuffle2 commented Jul 24, 2022

Describe the bug
I'm moving a project to use zlib-ng and spng instead of zlib(mainline) and libpng. Overall the perf improvement is great (e.g. for decoding png), but I notice spng actually lags behind libpng when encoding large-ish images.
For encoding image with properties:
17547840 bytes (2708 * 2160 * 3), color_type = SPNG_COLOR_TYPE_TRUECOLOR, bit_depth=8, SPNG_IMG_COMPRESSION_LEVEL=6

I'm seeing these times:
zlib + libpng: ~1600ms
zlib-ng + libpng: ~510ms
zlib-ng + spng: ~740ms (SPNG_SSE=1 or 4 seems about the same)

So zlib-ng is lifting the most weight, and spng is actually a bit of a regression.

Looking at some quick perf stats, the top functions (in the entire program, which is doing some other stuff besides png encoding) are:

Function Name Total CPU [unit, %] Self CPU [unit, %] Module
 longest_match_avx2 1957 (11.14%) 1952 (11.11%) dolphin
 lzma_decode 1250 (7.12%) 1219 (6.94%) dolphin
 encode_scanline 4485 (25.54%) 1159 (6.60%) dolphin
 paeth 788 (4.49%) 784 (4.46%) dolphin

(lzma_decode is not spng related, just showing that spng is up there near the top. longest_match_avx2 is from zlib-ng, but from spng calls into it)

or as flame graph, to show which parts are spng related better:
image

To Reproduce
The libpng version of the code is here and here (C part).

This is the function being timed, for spng case:

bool SavePNG(const std::string& path, const u8* input, ImageByteFormat format, u32 width,
             u32 height, u32 stride, int level)
{
  spng_color_type color_type;
  switch (format)  {
  case ImageByteFormat::RGB:
    color_type = SPNG_COLOR_TYPE_TRUECOLOR;
    break;
  case ImageByteFormat::RGBA:
    color_type = SPNG_COLOR_TYPE_TRUECOLOR_ALPHA;
    break;
  default:
    return false;
  }

  auto ctx = make_spng_ctx(SPNG_CTX_ENCODER);
  if (!ctx)
    return false;

  auto outfile = File::IOFile(path, "wb");
  if (spng_set_png_file(ctx.get(), outfile.GetHandle()))
    return false;

  if (spng_set_option(ctx.get(), SPNG_IMG_COMPRESSION_LEVEL, level))
    return false;

  spng_ihdr ihdr{};
  ihdr.width = width;
  ihdr.height = height;
  ihdr.color_type = color_type;
  ihdr.bit_depth = 8;
  if (spng_set_ihdr(ctx.get(), &ihdr))
    return false;

  if (spng_encode_image(ctx.get(), nullptr, 0, SPNG_FMT_PNG, SPNG_ENCODE_PROGRESSIVE))
    return false;
  for (u32 row = 0; row < height; row++)  {
    const int err = spng_encode_row(ctx.get(), &input[row * stride], stride);
    if (err == SPNG_EOI)
      break;
    if (err)
      return false;
  }
  return true;
}

Expected behavior
I expect spng to be faster than libpng :)

I'm mostly curious if these results are expected, or if I've made some simple error in my use of spng which would enable faster processing.

Platform (please complete the following information):

  • Architecture: x86-64 (amd zen2)
  • OS: Windows 11
  • Version v0.7.2
@shuffle2 shuffle2 added the bug Something isn't working label Jul 24, 2022
@randy408
Copy link
Owner

It's probably filtering that's slower. Filtering and the process for finding the best filter is the same for both libraries, but the code for spng doesn't optimize as well. At some point SIMD optimizations will be added for x86 and ARM which will make that process faster and won't rely on compiler optimizations.

@shuffle2
Copy link
Contributor Author

btw, spng has now replaced libpng in the project: dolphin-emu/dolphin@a9edf12 🎉

@ocpalo
Copy link
Contributor

ocpalo commented Sep 18, 2022

I have added libspng to OpenCV as optional PNG codec. I do experience encoding speed lags in OpenCV implementation too. Here are my results

spng

Note: Google Test filter = *PNG*
[==========] Running 2 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 2 tests from PNG
[ RUN      ] PNG.decode
[ PERFSTAT ]    (samples=13   mean=21.16   median=21.10   min=20.96   stddev=0.21 (1.0%))
[       OK ] PNG.decode (280 ms)
[ RUN      ] PNG.encode
[ PERFSTAT ]    (samples=10   mean=72.29   median=71.78   min=71.15   stddev=1.18 (1.6%))
[       OK ] PNG.encode (745 ms)
[----------] 2 tests from PNG (1025 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 1 test case ran. (1025 ms total)
[  PASSED  ] 2 tests.

png

Note: Google Test filter = *PNG*
[==========] Running 2 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 2 tests from PNG
[ RUN      ] PNG.decode
[ PERFSTAT ]    (samples=13   mean=27.85   median=27.66   min=27.26   stddev=0.56 (2.0%))
[       OK ] PNG.decode (366 ms)
[ RUN      ] PNG.encode
[ PERFSTAT ]    (samples=10   mean=54.36   median=54.17   min=53.88   stddev=0.43 (0.8%))
[       OK ] PNG.encode (572 ms)
[----------] 2 tests from PNG (938 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 1 test case ran. (938 ms total)
[  PASSED  ] 2 tests.

If i can solve this encoding speed problem, OpenCV might consider to make spng default png codec. Do you have any suggestions? @randy408

@randy408
Copy link
Owner

Double-check the same options are set for both libraries, quick way to verify that is to compare the the file sizes. But slower encoding is a known issue and performance isn't on par even if the settings are identical.

Do you have any suggestions?

Wait till encoding is optimized, I don't have a timeline for it right now.

@ocpalo
Copy link
Contributor

ocpalo commented Sep 19, 2022

fyi, both tests have the same options are set and have the same image.

@randy408
Copy link
Owner

Are the produced PNG's identical in size for both libraries? I just haven't seen it perform 30% worse before.

@ocpalo
Copy link
Contributor

ocpalo commented Sep 19, 2022

Edit : Sorry I misunderstood, I need to check if produced PNG`s identical in size. I will let you know.

Yes, iirc it is 3024x2050 or something like this.

OpenCV stores images as BGR for hysterical reasons. So I do some manual stuff to convert RGB->BGR and some conversions when format conversions are not supported. Also there might be a performance loss because of these. But libpng also does RGB->BGR in the same tests because it is how its implemented in OpenCV. I have tried to optimize my code, but I did not gain more than 5ms.

@ocpalo
Copy link
Contributor

ocpalo commented Sep 20, 2022

@randy408 Sorry for the delay.

Yes, the produced png files are have identical size. Their IHDR chunk is also identical. I just checked to be sure.

karlvr added a commit to cactuslab/JVips that referenced this issue May 7, 2023
But we’ll continue using libpng as it seems that libspng may not currently be faster at encoding randy408/libspng#224
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working confirmed
Projects
None yet
Development

No branches or pull requests

3 participants