Speed and memory use

John Cupitt edited this page Jan 5, 2023 · 33 revisions

We've written programs using a number of different image processing systems to load a TIFF image, crop 100 pixels off every edge, shrink by 10% with bilinear interpolation, sharpen with a 3x3 convolution and save again. It's a trivial test, but it does give some idea of the speed and memory behaviour of these libraries (and it's also quite fun to compare the code).
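
To make the task concrete, here is a minimal pure-Python sketch of the three processing steps on a single-band image held as a list of rows. It is not how libvips or any of the benchmarked systems implement them, and it skips the TIFF load and save. The function names, the sharpening mask and the edge handling are all assumptions made for illustration; the real benchmark works on 8-bit RGB images.

```python
# Hypothetical sharpening mask (sums to 8, so it is divided by 8 below).
SHARPEN = [[-1, -1, -1], [-1, 16, -1], [-1, -1, -1]]
SHARPEN_SCALE = 8


def crop(img, margin):
    """Remove `margin` pixels from every edge of a list-of-rows image."""
    return [row[margin:len(row) - margin] for row in img[margin:len(img) - margin]]


def shrink_bilinear(img, factor):
    """Resample to `factor` times the size (0.9 means 10% smaller)."""
    src_h, src_w = len(img), len(img[0])
    dst_h, dst_w = max(1, round(src_h * factor)), max(1, round(src_w * factor))
    out = []
    for y in range(dst_h):
        # Map the output pixel centre back into input coordinates.
        sy = min(max((y + 0.5) / factor - 0.5, 0.0), src_h - 1.0)
        y0 = int(sy)
        y1 = min(y0 + 1, src_h - 1)
        fy = sy - y0
        row = []
        for x in range(dst_w):
            sx = min(max((x + 0.5) / factor - 0.5, 0.0), src_w - 1.0)
            x0 = int(sx)
            x1 = min(x0 + 1, src_w - 1)
            fx = sx - x0
            # Blend the four nearest input pixels.
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


def convolve3x3(img, mask, scale):
    """3x3 convolution; edge pixels are replicated, results clamped to 0..255."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += mask[dy + 1][dx + 1] * img[yy][xx]
            row.append(min(255.0, max(0.0, total / scale)))
        out.append(row)
    return out


def benchmark_pipeline(img, margin=100, factor=0.9):
    """Crop, shrink by 10% and sharpen, in the order the test describes."""
    shrunk = shrink_bilinear(crop(img, margin), factor)
    return convolve3x3(shrunk, SHARPEN, SHARPEN_SCALE)
```

A flat grey input should come out flat, since the mask sums to its scale. This version walks every pixel in an interpreter loop, so it is only useful for checking what the steps do, not for timing.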

See also our main "Benchmarks" page for a more complex benchmark and timings on a variety of machines. The "Why is libvips quick" page tries to explain why libvips does well on this test.

Results

AMD Ryzen Threadripper PRO 3955WX 16-Cores, Ubuntu 22.10

| Software | Run time (secs real) | Memory (peak RSS MB) | Times slower |
|---|---|---|---|
| libvips 8.14 Lua | 0.37 | 76 | 0.97 |
| libvips 8.14 C/C++ | 0.38 | 84 | 1.00 |
| libvips 8.14 PHP | 0.41 | 103 | 1.07 |
| tiffcp | 0.45 | 581 | 1.18 |
| libvips 8.14 Python | 0.47 | 133 | 1.24 |
| libvips 8.14 Ruby | 0.49 | 107 | 1.29 |
| libvips 8.14, JPEG images | 0.98 | 177 | 2.57 |
| libvips 8.14 CLI | 1.08 | 83 | 2.84 |
| Pillow-SIMD 9.0.0 (see 1) | 1.1 | 985 | 2.89 |
| GraphicsMagick 1.4 | 1.42 | 1982 | 3.73 |
| libvips 8.14, one thread | 1.49 | 52 | 3.92 |
| nip2 | 1.59 | 158 | 4.18 |
| GEGL 0.4.38-1, JPEG images (see 4) | 6.13 | 751 | 5.0 |
| libgd 2.3.3-6, JPEG images (see 2) | 4.9 | 1040 | 5.00 |
| NetPBM 1.13 | 1.70 | 289 | 5.15 |
| ImageJ 1.53 | 2.59 | 542 | 6.82 |
| OpenCV 4.6 | 2.85 | 791 | 7.5 |
| rmagick 6.9.11-60 | 2.87 | 2785 | 7.55 |
| OpenImageIO 2.3.18 (see 8) | 2.91 | 2223 | 7.65 |
| convert 6.9.11-60 | 2.99 | 2002 | 7.86 |
| imwand.py 6.9.11-60 | 3.08 | 2015 | 8.11 |
| ExactImage 1.0.2-9 (see 3) | 3.21 | 475 | 8.44 |
| Imlib2 1.7.4-2 (see 7) | 3.39 | 1018 | 8.92 |
| FreeImage 3.18.0 (see 6) | 4.21 | 752 | 11.08 |
| imagick 6.9.11-60 | 6.61 | 2018 | 17.39 |
| Pike 8.0.702 | 5.90 | 1376 | 17.88 |
| GMIC 2.9.4-4 | 9.87 | 2995 | 25.97 |
| Octave 7.2.0-1 (see 5) | 27.08 | 6505 | 71.26 |
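
The "times slower" column appears to be each run time divided by the libvips C/C++ run time (or by the libvips JPEG run time for the libgd row, as the notes explain). A small sketch of that arithmetic, assuming that reading is right; `times_slower` is a name made up for this example:

```python
def times_slower(run_time, baseline):
    """Ratio of a run time to the baseline run time, to two decimal places."""
    return round(run_time / baseline, 2)


TIFF_BASELINE = 0.38  # libvips 8.14 C/C++ row
JPEG_BASELINE = 0.98  # libvips 8.14, JPEG images row

# A few rows copied from the table above.
for name, secs, baseline in [
    ("libvips 8.14 Python", 0.47, TIFF_BASELINE),
    ("ImageJ 1.53", 2.59, TIFF_BASELINE),
    ("Octave 7.2.0-1", 27.08, TIFF_BASELINE),
    ("libgd 2.3.3-6, JPEG images", 4.9, JPEG_BASELINE),
]:
    print(f"{name}: {times_slower(secs, baseline)}")
```

Ratios recomputed from the rounded times in the table will not always match the published column in the last digit, presumably because the published figures were worked out from unrounded timings.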

Graphically

This graph was made by running ps very quickly and piping the output to a simple script that calculated total RSS for all processes associated with a task.
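
As a rough illustration of that approach (not the script actually used), the sketch below sums the RSS column of `ps` output for every process whose command line contains a given string, and keeps the largest total seen. The function names and the substring matching are assumptions; `ps` reports RSS in KiB on Linux, so the result is divided by 1024 to get MB.

```python
import subprocess


def sample_ps():
    """One snapshot of 'RSS COMMAND' lines for all processes, without headers."""
    result = subprocess.run(
        ["ps", "-e", "-o", "rss=", "-o", "args="],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def total_rss_kb(ps_output, name):
    """Sum RSS (KiB) over every line whose command contains `name`."""
    total = 0
    for line in ps_output.splitlines():
        fields = line.split(None, 1)
        if len(fields) == 2 and fields[0].isdigit() and name in fields[1]:
            total += int(fields[0])
    return total


def peak_rss_mb(samples, name):
    """Largest summed RSS across a series of snapshots, in MB."""
    return max((total_rss_kb(s, name) for s in samples), default=0) / 1024
```

In use you would call `sample_ps()` in a tight loop while the benchmark runs and pass the collected snapshots to `peak_rss_mb`. Sampling like this can miss short spikes between snapshots, which is why it needs to run very quickly.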

This is a fancier one generated by vipsprofile showing the memory behaviour of libvips. The bottom graph shows total memory, the upper traces show threads calculating useful results (green), threads blocked on synchronisation (red) and memory allocations (white ticks). There's a blog post with some more detail on how this was made.

Notes

The benchmarks plus a simple driver program are in a GitHub repository. See the README for details.

Except where noted, all timings are for a 10,000 by 10,000 pixel 8-bit RGB image in uncompressed tiled TIFF format. Each test was run with something like:

time ./vips.sh tmp/x.tif tmp/x2.tif

The tests were run on a quiet system, and the quickest real time of five runs was recorded. There's no attempt to clear the disc cache, so disc speed should not be a factor. The peak memory column was found by sampling RSS with ps using this script. I used the systems as packaged for Ubuntu unless otherwise indicated. I last ran these tests in January 2023 and used the current stable version of every package except where otherwise noted. Tracker was disabled.
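
The "quickest of five runs" rule can be sketched as below. This is not the driver program from the repository, just a minimal stand-in; `best_of` is a made-up name, and it measures wall-clock time around the whole child process, which is roughly what `time` reports as real.

```python
import subprocess
import sys
import time


def best_of(cmd, runs=5):
    """Run `cmd` `runs` times and return the quickest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; swap in the real benchmark,
    # for example ["./vips.sh", "tmp/x.tif", "tmp/x2.tif"].
    print(f"{best_of([sys.executable, '-c', 'pass']):.2f}s")
```

Taking the minimum rather than the mean discards runs that were slowed by other activity on the machine, and the first run also warms the disc cache for the rest.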

The benchmark hardware has 16 cores, so systems like Pillow, ymagine, OpenCV and ImageScience, which do not thread automatically, are lower in the table than they should be. On a single-core machine the table would look quite different. There's a separate entry for libvips with a single worker thread for comparison, although even when running with just a single worker libvips will still use a separate write-behind thread.

This test does a lot of file IO and relatively little processing, which flatters libvips.

Some systems, like ImageJ, GEGL and nip2, have relatively long start-up times and this hurts their position in the table.

The libvips command-line version generates a huge amount of disc traffic which makes it unsuitable for certain applications. This is not really considered in this table.

  1. Pillow is single-threaded, so the fairest comparison for raw processing speed would be against the "libvips 8.14, one thread" entry (vips-1thread).

  2. libgd will not read TIFF, so I used JPEG. Its "times slower" figure is against libvips with a JPEG source. A lot of time is therefore being spent in libjpeg, which is slightly unfair to libvips.

  3. ExactImage will not read tiled tiff, so the benchmark uses a strip tiff for this test.

  4. GEGL does not really focus on batch-style processing -- it targets interactive applications, like paint programs. It was run with JPEG images, with 16 threads, and timed against a pyvips program which exactly matches GEGL's processing.

  5. Octave aims to be a very high-level prototyping language and is not primarily targeting speed.

  6. FreeImage does not have a sharpening or convolution operation so I skipped that part of the benchmark.

  7. Imlib2 is spending almost all its time in image input and output.

  8. The OpenImageIO test uses oiiotool, which may not be the best way to test the library.

Implementations

See the repository.
