Benchmarks

Kirk Martinez edited this page Dec 2, 2022 · 33 revisions

This benchmark is useful for testing the VIPS multi-threading system, for comparing generations of processors and for testing for performance regressions or improvements between versions of VIPS. We have a separate set of benchmarks comparing VIPS to other image processing systems on the Speed-and-memory-use page.

VIPS (from version 7.11.12) includes a benchmark adapted from the system used to generate images for The National Gallery's Print on Demand service. The PARSEC Benchmark Suite includes this benchmark as one of its tests. There's a description of the benchmark, including some detail on application and performance, in the PARSEC Technical Report.

In this benchmark, images from a 10k by 10k studio digital camera are colour processed, resized, cropped and sharpened. You can see the exact sequence of operations the benchmark performs in the source code. The original system processed images from a remote server over a 100 Mbit/s network, so no attempt was made to make it quick (there was no point); you could make im_benchmark() a lot faster very easily if that was your aim. By 2020 the time taken had fallen below 0.5 s, so it may become harder to measure reliably without scaling up the workload.

Interesting things to learn from these include the fact that a slightly slower CPU with more cores is better (and easier to keep cool!): for example, the 4-core i7-6700 (0.89 s) is slower than the 6-core i7-3930K (0.79 s). Hyperthreading is also not very useful for float-intensive tasks such as this. The speed-up in the table is relative to running one thread (on one core); most CPUs now clock faster in that single-thread case, which skews the results. On Intel CPUs, -O2 is slightly faster than -O3.
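As a rough illustration of the hyperthreading point, the sketch below takes the per-thread timings recorded further down this page for the i7-6700 (4 cores, 8 hardware threads) and compares the best time on the physical cores with the best time overall. The function name `ht_gain` and the dictionary layout are illustrative choices of ours, not part of VIPS.

```python
# Real times (seconds) copied from the i7-6700 log on this page,
# keyed by the --vips-concurrency value used for the run.
I7_6700 = {
    1: 2.90, 2: 1.53, 3: 1.16, 4: 0.89,
    5: 0.95, 6: 0.84, 7: 0.93, 8: 0.91,
}


def ht_gain(times, physical_cores):
    """Return (speedup on physical cores, extra gain from hyperthreads).

    Hypothetical helper: `times` maps thread count to real time in seconds.
    """
    single = times[1]
    # Best time using no more threads than there are physical cores.
    best_physical = min(t for n, t in times.items() if n <= physical_cores)
    # Best time over every thread count, hyperthreads included.
    best_overall = min(times.values())
    return single / best_physical, best_physical / best_overall


core_speedup, extra = ht_gain(I7_6700, physical_cores=4)
print(f"speedup on 4 cores: {core_speedup:.2f}x")
print(f"extra gain from hyperthreads: {extra:.2f}x")
```

On these numbers the four physical cores give about a 3.3x speedup, while the four extra hardware threads add only around 6%.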

After building and installing vips, try:

cd vips-x.y.z
cd benchmark
./benchmarkn.sh

And see results for your system. Feel free to email them to km@ecs.soton.ac.uk for inclusion here.
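For readers curious about what the script measures: the logs further down show it running `vips --vips-concurrency=N im_benchmarkn temp.v temp2.v 1` for each thread count and reporting the best of three runs. The Python sketch below mimics that loop. It is not the actual `benchmarkn.sh`; the helper names are hypothetical, and it assumes `temp.v` has already been built and that the installed vips still provides `im_benchmarkn`.

```python
import subprocess
import time


def time_command(cmd):
    """Run cmd once and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def best_of(cmd, runs=3, timer=time_command):
    """Return the lowest time over `runs` runs (the logs say best of three)."""
    return min(timer(cmd) for _ in range(runs))


def sweep(max_cpus, timer=time_command):
    """Time the benchmark for 1..max_cpus threads; returns {cpus: seconds}."""
    results = {}
    for cpus in range(1, max_cpus + 1):
        # Command line as printed in the logs on this page.
        cmd = ["vips", f"--vips-concurrency={cpus}",
               "im_benchmarkn", "temp.v", "temp2.v", "1"]
        results[cpus] = best_of(cmd, timer=timer)
    return results
```

Calling `sweep(os.cpu_count())` from the benchmark directory would then give a table comparable to the `cpus real-time` listings below. The `timer` parameter exists only so the loop can be exercised without vips installed.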

Results summary

| Processor | Clock (GHz) | Cores | Real time (s) | Speedup |
| --- | --- | --- | --- | --- |
| Ryzen 7950x | 5.4 | 16 (32 ht) | 0.15 | 9.5 x |
| Ryzen 7900x | 5.4 | 12 (24 ht) | 0.18 | 7.8 x |
| Ryzen Threadripper 3970x | 3.7 | 32 (64 ht) | 0.20 | 10 x |
| Ryzen Threadripper 3955x | 4.3 | 16 (32 ht) | 0.22 | 10 x |
| i7-10700K | 4.8-5 | 8 (16 ht) | 0.40 | 5.7 x |
| E5-2630V3 dual | 2.4 | 2x8 (32 ht) | 0.44 | 8.4 x |
| Ryzen 7 3700X | 3.6 | 8 (16 ht) | 0.44 | 5.6 x |
| AMD EPYC 7401P | 2 | 24 | 0.48 | 10.2 x |
| i9-9900K | 5 | 8 (16 ht) | 0.55 | 4.4 x |
| E5-2695V3 dual | 2.3 | 28 (56 ht) | 0.56 | 10.4 x |
| i7-6700K | 4.4 | 4 (8 ht) | 0.7 | 4 x |
| i7-3930K | 3.2 | 6 (12 ht) | 0.79 | 5.9 x |
| dual E5649 6core | 2.5 | 12 (24 ht) | 0.80 | 10.8 x |
| E5-1650 | 3.2 | 6 (12 ht) | 0.87 | 5.94 x |
| i7-6700 | 3.2 | 4 (8 ht) | 0.89 | 3.3 x |
| i7-8550U (XPS laptop) | 1.8 | 4 (8 ht) | 0.91 | 3.2 x |
| Xeon X5560 | 2.8 | 8 (16 ht) | 1.08 | 13 x |
| Itanium2 | ? | 64 | 1.1 (est.) | 39.4 x |
| Intel i7 MacBook Pro | 2.6 | 4 (8 ht) | 1.31 | 3.7 x |
| i5-3470S (iMac 27") | 2.9 | 4 | 1.47 | 3.1 x |
| Xeon E5405 (64 bit) | 2.0 | 8 | 1.88 | 7.3 x |
| Opteron 8220 (64 bit) | 3.0 | 8 | 1.96 | 7.6 x |
| Phenom II X6 | 3.2 | 6 | 2.39 | 4 x |
| i7-3540m laptop (64b) | 3.0 | 2 (4 ht) | 2.58 | 2.2 x |
| Dual quad-core Intel (64 bit) | 3.0 | 8 | 2.8 | 7 x |
| i5-3210M | 2.5 | 2 (4 ht) | 3.51 | 1.8 x |
| Core 2 Extreme Quad (32 bit) | 2.66 | 4 | 3.69 | 3.8 x |
| i5-5200U | 2.20 | 2 (4 ht) | 3.75 | 1.9 x |
| Opteron 850 (HP server) | 2.4 | 4 | 4.25 | 3.7 x |
| Raspberry Pi4 (Pi OS) | 1.5 | 4 | 5.13 | 3.1 x |
| Core 2 Duo (MacBook) | 2.26 | 2 | 5.81 | 1.9 x |
| Opteron 254 (HP workstation) | 2.7 | 2 | 6.14 | 1.9 x |
| P4 Xeon (64 bit) | 3.6 | 2 (4 ht) | 7 | 2.4 x |
| Core Duo (iMac) | 2.0 | 2 | 11.5 | 1.85 x |
| ARM A15 Exynos 5 Chromebook | 1.7 | 2 | 12.6 | 1.7 x |
| P4 Xeon (32 bit) | 3.0 | 2 (4 ht) | 19.7 | 1.6 x |
| ARM Exynos 5420 (-O0) | 1.8-9 or 1.3 | 8 (4 big + 4 little) | 21.31 | 2.2 x |
| ARM A7 quad core Raspberry Pi 2 B (32 bit) | 1 | 4 | 21.6 | 3.9 x |
| PM (HP laptop) | 1.8 | 1 | 31.8 | -- |
| P4 (Dell desktop) | 2.4 | 1 | 36.6 | -- |
| EeePC atom/ssd | 1.6 | 1 (2 ht) | 41.5 | 1.6 x |

Time is the lowest real (wall-clock) time in seconds. Speedup is (real-1-cpu-time / lowest-real-time), in other words the speedup from using all cores rather than just one. When we say CPUs, these days that really means "cores".
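To make the two summary columns concrete, here is a small sketch that parses the `cpus real-time` rows printed by the benchmark script and divides the one-CPU time by the lowest real time. The sample is the i7-3930K log from further down this page; `parse_results` and `summarise` are illustrative names of our own.

```python
# "cpus real-time" rows copied from the i7-3930K log on this page.
SAMPLE = """\
cpus real-time
1 4.69
2 2.50
3 1.72
4 1.38
5 1.16
6 0.91
7 0.97
8 0.81
9 0.79
10 0.86
11 0.86
12 0.83
"""


def parse_results(text):
    """Return {cpus: real_time} from the rows of a benchmark listing."""
    times = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows are "<integer> <seconds>"; skip the header and anything else.
        if len(parts) == 2 and parts[0].isdigit():
            times[int(parts[0])] = float(parts[1])
    return times


def summarise(times):
    """Return (lowest real time, one-CPU time divided by that lowest time)."""
    best = min(times.values())
    return best, times[1] / best


best, speedup = summarise(parse_results(SAMPLE))
print(f"real time {best:.2f} s, speedup {speedup:.1f} x")
# prints: real time 0.79 s, speedup 5.9 x
```

This matches the i7-3930K row in the summary table (0.79 s, 5.9 x).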

Results in detail

The results we've collected. Please paste more here.

For each one we've noted uname -a, gcc --version and vips --version.
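Several of the older logs below use a different layout: an `IM_CONCURRENCY=N` line followed by two `time -p` runs, each printing `real`, `user` and `sys`. As a sketch (the function name is our own, and it handles only the plain `real 12.34` form, not the `0m13.802s` variant), the lowest real time per thread count can be pulled out like this:

```python
import re

# Excerpt (1 and 4 threads only) of the 4 x Opteron 850 log on this page.
LOG = """\
IM_CONCURRENCY=1
time -p vips im_benchmarkn temp.v temp2.v 1
real 16.19
user 15.48
sys 0.59
real 15.81
user 15.36
sys 0.52
vips im_avg temp2.v
120.134
IM_CONCURRENCY=4
time -p vips im_benchmarkn temp.v temp2.v 1
real 4.35
user 16.11
sys 0.55
real 4.25
user 15.86
sys 0.56
vips im_avg temp2.v
120.134
"""


def best_real_times(log):
    """Map each IM_CONCURRENCY value to its lowest `real` time in seconds."""
    best = {}
    cpus = None
    for line in log.splitlines():
        line = line.strip()
        m = re.match(r"IM_CONCURRENCY=(\d+)$", line)
        if m:
            cpus = int(m.group(1))
            continue
        m = re.match(r"real\s+([\d.]+)$", line)
        if m and cpus is not None:
            seconds = float(m.group(1))
            # Keep the faster of the repeated runs at this thread count.
            best[cpus] = min(seconds, best.get(cpus, seconds))
    return best


times = best_real_times(LOG)
print(times)
print(f"speedup {times[1] / times[4]:.2f} x")
```

For this excerpt the result is 15.81 s on one thread and 4.25 s on four, a speedup of about 3.7 x, in line with the Opteron 850 row of the summary table.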

AMD Ryzen 9 7900x, 12 cores boosted to 5.4 GHz, running Ubuntu 22.04 with the November distribution of libvips

AMD Ryzen Threadripper PRO 3955WX 16-Cores, 4.2 GHz

Linux banana 5.8.0-53-generic #60-Ubuntu SMP Thu May 6 07:46:32 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
gcc (Ubuntu 10.2.0-13ubuntu1) 10.2.0
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
vips-8.11.0-Fri May 14 15:50:05 UTC 2021
building test image ...
tile=13
test image is 3770 by 5746 pixels
max cpus = 32
starting benchmark ...
/usr/bin/time -f %e vips --vips-concurrency=xx im_benchmarkn temp.v temp2.v 1
reported real-time is best of three runs
cpus real-time
1 2.27
2 1.19
3 0.83
4 0.64
5 0.51
6 0.44
7 0.39
8 0.34
9 0.32
10 0.29
11 0.28
12 0.28
13 0.25
14 0.25
15 0.24
16 0.24
17 0.24
18 0.25
19 0.25
20 0.22
21 0.22
22 0.23
23 0.23
24 0.23
25 0.23
26 0.23
27 0.22
28 0.24
29 0.24
30 0.24
31 0.23
32 0.25

i7-6700, 3.2 GHz

$ ./benchmarkn.sh 
Linux yingna 4.13.0-25-generic #29-Ubuntu SMP Mon Jan 8 21:14:41 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
gcc (Ubuntu 7.2.0-8ubuntu3) 7.2.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

vips-8.7.0-Fri Jan  5 16:29:51 UTC 2018
building test image ...
tile=13
test image is 3770 by 5746 pixels
max cpus = 8
starting benchmark ...
/usr/bin/time -f %e vips --vips-concurrency=xx im_benchmarkn temp.v temp2.v 1
reported real-time is best of three runs
cpus real-time
1 2.90
2 1.53
3 1.16
4 0.89
5 0.95
6 0.84
7 0.93
8 0.91

i7-3930K, 3.20 GHz

$ ./benchmarkn.sh
Linux oleg 3.10.7 #1 SMP Sat Aug 17 17:06:03 EEST 2013 x86_64 x86_64 x86_64 GNU/Linux
gcc (Ubuntu/Linaro 4.7.3-1ubuntu1) 4.7.3
Copyright (C) 2012 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
vips-7.34.2-Sat Aug 17 14:17:10 EEST 2013
building test image ...
tile=13
test image is 3770 by 5746 pixels
max cpus = 12
starting benchmark ...
/usr/bin/time -f %e vips --vips-concurrency=xx im_benchmarkn temp.v temp2.v 1
reported real-time is best of three runs
cpus real-time
1 4.69
2 2.50
3 1.72
4 1.38
5 1.16
6 0.91
7 0.97
8 0.81
9 0.79
10 0.86
11 0.86
12 0.83

E5-1650, 3.20 GHz

Mid-range 2013 HP workstation.

$ ./benchmarkn.sh 
Linux mm-jcupitt3 3.8.0-27-generic #40-Ubuntu SMP Tue Jul 9 00:17:05 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
gcc (Ubuntu/Linaro 4.7.3-1ubuntu1) 4.7.3
Copyright © 2012 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
vips-7.34.2-Mon Aug  5 12:31:56 BST 2013
building test image ...
tile=13
test image is 3770 by 5746 pixels
max cpus = 12
starting benchmark ...
/usr/bin/time -f %e vips --vips-concurrency=xx im_benchmarkn temp.v temp2.v 1
reported real-time is best of three runs
cpus real-time
1 5.17
2 2.88
3 2.18
4 2.04
5 1.79
6 1.40
7 1.62
8 0.97
9 1.55
10 1.53
11 0.87
12 1.23

i5-3210M, 2.5 GHz

A cheap Dell laptop from 2012.

Linux bambam 3.11.0-13-generic #20-Ubuntu SMP Wed Oct 23 07:38:26 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
gcc (Ubuntu/Linaro 4.8.1-10ubuntu9) 4.8.1
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
vips-7.37.0-Fri Nov 29 22:34:55 GMT 2013
building test image ...
tile=13
test image is 3770 by 5746 pixels
max cpus = 4
starting benchmark ...
/usr/bin/time -f %e vips --vips-concurrency=xx im_benchmarkn temp.v temp2.v 1
reported real-time is best of three runs
cpus real-time
1 6.45
2 4.25
3 3.51
4 3.65

2 x Xeon X5560 (64bit), 2.8GHz running Ubuntu 9.04 server

gcc (Ubuntu 4.3.3-5ubuntu4) 4.3.3
vips-7.18.1-Fri May  8 15:01:54 BST 2009
IM_CONCURRENCY=1
13.25user 0.36system 0:13.97elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k
368inputs+126936outputs (3major+34118minor)pagefaults 0swaps
13.15user 0.39system 0:13.98elapsed 96%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+126936outputs (0major+34121minor)pagefaults 0swaps
IM_CONCURRENCY=2
13.31user 0.31system 0:07.02elapsed 193%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+126944outputs (0major+29047minor)pagefaults 0swaps
9.50user 0.19system 0:04.99elapsed 194%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+126936outputs (0major+38221minor)pagefaults 0swaps
IM_CONCURRENCY=3
10.38user 0.29system 0:03.69elapsed 288%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+126936outputs (0major+29355minor)pagefaults 0swaps
8.76user 0.22system 0:03.18elapsed 282%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+126936outputs (0major+29056minor)pagefaults 0swaps
IM_CONCURRENCY=4
9.40user 0.32system 0:02.48elapsed 390%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+126936outputs (0major+28875minor)pagefaults 0swaps
13.51user 0.21system 0:03.55elapsed 385%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+126936outputs (0major+28957minor)pagefaults 0swaps
IM_CONCURRENCY=5
8.68user 0.19system 0:01.86elapsed 475%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+126936outputs (0major+28349minor)pagefaults 0swaps
7.71user 0.20system 0:01.74elapsed 454%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+126936outputs (0major+36092minor)pagefaults 0swaps
120.151
IM_CONCURRENCY=6
9.10user 0.13system 0:01.75elapsed 526%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+126936outputs (0major+33039minor)pagefaults 0swaps
9.81user 0.16system 0:01.79elapsed 556%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+126936outputs (0major+31385minor)pagefaults 0swaps
IM_CONCURRENCY=7
10.69user 0.26system 0:01.72elapsed 636%CPU (0avgtext+0avgdata 0maxresident)k
8inputs+126936outputs (0major+33137minor)pagefaults 0swaps
8.64user 0.24system 0:01.38elapsed 643%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+126936outputs (0major+34416minor)pagefaults 0swaps
IM_CONCURRENCY=8
10.25user 0.26system 0:01.56elapsed 671%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+126936outputs (0major+30208minor)pagefaults 0swaps
9.55user 0.30system 0:01.39elapsed 707%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+126936outputs (0major+32512minor)pagefaults 0swaps
IM_CONCURRENCY=9
8.74user 0.26system 0:01.08elapsed 831%CPU (0avgtext+0avgdata 0maxresident)k
8inputs+126936outputs (0major+34851minor)pagefaults 0swaps
8.55user 0.17system 0:01.14elapsed 758%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+126944outputs (0major+32756minor)pagefaults 0swaps

2 * quad core Xeon E5405 2.0GHz

cc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-44)
vips-7.18.1-Tue May 12 14:18:37 BST 2009
building test image ...
tile=13
test image is 3770 by 5746 pixels
starting benchmark ...
chain=1
IM_CONCURRENCY=1
time -p vips im_benchmarkn temp.v temp2.v 1

real	0m13.802s
user	0m13.664s
sys	0m0.287s

real	0m13.882s
user	0m13.714s
sys	0m0.303s
vips im_avg temp2.v
120.151
IM_CONCURRENCY=2
time -p vips im_benchmarkn temp.v temp2.v 1

real	0m6.972s
user	0m13.732s
sys	0m0.273s

real	0m6.995s
user	0m13.722s
sys	0m0.316s
vips im_avg temp2.v
120.151
IM_CONCURRENCY=3
time -p vips im_benchmarkn temp.v temp2.v 1

real	0m4.720s
user	0m13.767s
sys	0m0.284s

real	0m4.685s
user	0m13.725s
sys	0m0.331s
vips im_avg temp2.v
120.151
IM_CONCURRENCY=4
time -p vips im_benchmarkn temp.v temp2.v 1

real	0m3.534s
user	0m13.776s
sys	0m0.267s

real	0m3.564s
user	0m13.862s
sys	0m0.307s
vips im_avg temp2.v
120.151
IM_CONCURRENCY=5
time -p vips im_benchmarkn temp.v temp2.v 1

real	0m2.903s
user	0m13.886s
sys	0m0.350s

real	0m2.842s
user	0m13.753s
sys	0m0.288s
vips im_avg temp2.v
120.151
IM_CONCURRENCY=6
time -p vips im_benchmarkn temp.v temp2.v 1

real	0m2.403s
user	0m13.829s
sys	0m0.312s

real	0m2.391s
user	0m13.808s
sys	0m0.262s
vips im_avg temp2.v
120.151
IM_CONCURRENCY=7
time -p vips im_benchmarkn temp.v temp2.v 1

real	0m2.090s
user	0m13.860s
sys	0m0.322s

real	0m2.104s
user	0m13.878s
sys	0m0.331s
vips im_avg temp2.v
120.151
IM_CONCURRENCY=8
time -p vips im_benchmarkn temp.v temp2.v 1

real	0m1.880s
user	0m13.867s
sys	0m0.344s

real	0m1.880s
user	0m13.833s
sys	0m0.303s
vips im_avg temp2.v
120.151

2 x Opteron 254 (64 bit), 2.7 GHz

Linux mm-jcupitt2 3.5.0-22-generic #34-Ubuntu SMP Tue Jan 8 21:47:00 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
gcc (Ubuntu/Linaro 4.7.2-2ubuntu1) 4.7.2
vips-7.32.0-Wed Jan 23 12:05:28 GMT 2013
building test image ...
tile=13
test image is 3770 by 5746 pixels
max cpus = 2
starting benchmark ...
/usr/bin/time -f %e vips --vips-concurrency=xx im_benchmarkn temp.v temp2.v 1
reported real-time is best of three runs
cpus real-time
1 11.75
2 6.14

Pentium M (32 bit), 1.8 GHz

Linux banana 2.6.17-11-386 #2 Thu Feb 1 19:50:13 UTC 2007 i686 GNU/Linux
gcc (GCC) 4.1.2 20060928 (prerelease) (Ubuntu 4.1.1-13ubuntu5)
vips-7.11.20-Tue Feb 13 13:47:53 GMT 2007
building test image ...
tile=13
test image is 3770 by 5746 pixels
starting benchmark ...
chain=1
IM_CONCURRENCY=1
time -p vips im_benchmarkn temp.v temp2.v 1
real 31.83
user 31.41
sys 0.41
real 31.91
user 31.52
sys 0.37
vips im_avg temp2.v
120.134

Core Duo (32 bit), 2 GHz

Darwin pineapple.Belkin 9.4.0 Darwin Kernel Version 9.4.0: Mon Jun  9 19:30:53 PDT 2008; root:xnu1228.5.20~1/RELEASE_I386 i386 
i686-apple-darwin9-gcc-4.0.1 (GCC) 4.0.1 (Apple Inc. build 5465)
vips-7.16.0-Thu Sep  4 11:43:34 BST 2008
building test image ...
tile=13
test image is 3770 by 5746 pixels
starting benchmark ...
chain=1
IM_CONCURRENCY=1
time -p vips im_benchmarkn temp.v temp2.v 1
real 21.35
user 20.09
sys 1.34
real 21.37
user 20.09
sys 1.35
vips im_avg temp2.v
120.134
IM_CONCURRENCY=2
time -p vips im_benchmarkn temp.v temp2.v 1
real 11.62
user 20.76
sys 1.80
real 11.67
user 20.76
sys 1.86
vips im_avg temp2.v
120.134

4 x Opteron 850 (64 bit), 2.4 GHz

Linux roundtable 2.6.15-27-amd64-generic #1 SMP PREEMPT Fri Dec 8 17:50:54 UTC 2006 x86_64 GNU/Linux
gcc (GCC) 4.0.3 (Ubuntu 4.0.3-1ubuntu5)
vips-7.11.20-Mon Feb 12 18:05:51 GMT 2007
building test image ...
tile=13
test image is 3770 by 5746 pixels
starting benchmark ...
chain=1
IM_CONCURRENCY=1
time -p vips im_benchmarkn temp.v temp2.v 1
real 16.19
user 15.48
sys 0.59
real 15.81
user 15.36
sys 0.52
vips im_avg temp2.v
120.134
IM_CONCURRENCY=2
time -p vips im_benchmarkn temp.v temp2.v 1
real 8.19
user 15.77
sys 0.47
real 8.33
user 15.95
sys 0.49
vips im_avg temp2.v
120.134
IM_CONCURRENCY=3
time -p vips im_benchmarkn temp.v temp2.v 1
real 6.18
user 15.82
sys 0.46
real 6.04
user 15.95
sys 0.53
vips im_avg temp2.v
120.134
IM_CONCURRENCY=4
time -p vips im_benchmarkn temp.v temp2.v 1
real 4.35
user 16.11
sys 0.55
real 4.25
user 15.86
sys 0.56
vips im_avg temp2.v
120.134

2 x Xeon (32 bit), 3 GHz

2.6.9-42.0.3.ELsmp 
gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-3)
vips-7.11.12-Fri Oct  6 13:15:22 BST 2006

IM_CONCURRENCY=1
time vips im_benchmark temp.v temp2.v
real    0m35.270s
user    0m34.366s
sys     0m0.934s
IM_CONCURRENCY=2
time vips im_benchmark temp.v temp2.v
real    0m21.914s
user    0m41.269s
sys     0m1.681s
IM_CONCURRENCY=3
time vips im_benchmark temp.v temp2.v
real    0m20.598s
user    0m57.306s
sys     0m2.765s
IM_CONCURRENCY=4
time vips im_benchmark temp.v temp2.v
real    0m19.781s
user    1m11.393s
sys     0m4.246s

2 x Xeon (64 bit), 3.6 GHz

Linux turner 2.6.17-10-generic #2 SMP Tue Dec 5 21:16:35 UTC 2006 x86_64 GNU/Linux
gcc (GCC) 4.1.2 20060928 (prerelease) (Ubuntu 4.1.1-13ubuntu5)
vips-7.11.18-Mon Dec 18 18:19:27 GMT 2006

building test image ...
tile=13
test image is 3770 by 5746 pixels
starting benchmark ...
chain=1
IM_CONCURRENCY=1
time -p vips im_benchmarkn temp.v temp2.v 1
real 17.60
user 16.58
sys 0.65
real 17.12
user 16.63
sys 0.59
vips im_avg temp2.v
120.134
IM_CONCURRENCY=2
time -p vips im_benchmarkn temp.v temp2.v 1
real 9.01
user 17.18
sys 0.78
real 8.99
user 17.12
sys 0.76
vips im_avg temp2.v
120.134
IM_CONCURRENCY=3
time -p vips im_benchmarkn temp.v temp2.v 1
real 7.78
user 22.02
sys 0.83
real 7.79
user 21.99
sys 1.00
vips im_avg temp2.v
120.134
IM_CONCURRENCY=4
time -p vips im_benchmarkn temp.v temp2.v 1
real 7.03
user 25.74
sys 1.16
real 7.02
user 25.60
sys 1.25
vips im_avg temp2.v
120.134

1 x P4, 2.4 GHz

MINGW32_NT-5.1 MM-DDAVIES1 1.0.10(0.46/3/2) 2004-03-15 07:17 i686 unknown 
gcc.exe (GCC) 3.4.2 (mingw-special)
vips-7.11.17-Wed Nov 29 12:01:14 GMTST 2006

building test image ...
tile=13
test image is 3770 by 5746 pixels
starting benchmark ...
chain=1
IM_CONCURRENCY=1
time -p vips im_benchmarkn temp.v temp2.v 1
real 36.59
user 0.01
sys 0.01
real 36.68
user 0.01
sys 0.01
vips im_avg temp2.v
120.072

Intel Core 2 Extreme Quad Core (QX6700), 2.66 GHz

A quick benchmark (11x11 unsharp mask of a 10k x 10k image) shows:

1 thread 166 s
2 threads 82 s
3 threads 55 s
4 threads 42 s

i.e. a near-linear speed-up (about 2.0x, 3.0x and 4.0x).

Linux degas.ecs.soton.ac.uk 2.6.19-1.2911.fc6 #1 SMP Sat Feb 10 15:51:47 EST 2007 i686 i686 i386 GNU/Linux
gcc (GCC) 4.1.1 20070105 (Red Hat 4.1.1-51)
vips-7.11.20-Fri Mar  2 12:47:29 GMT 2007
building test image ...
tile=13
test image is 3770 by 5746 pixels
starting benchmark ...
chain=1
IM_CONCURRENCY=1
time -p vips im_benchmarkn temp.v temp2.v 1
real 15.73
user 14.70
sys 0.30
real 13.96
user 13.86
sys 0.27
vips im_avg temp2.v
120.134
IM_CONCURRENCY=2
time -p vips im_benchmarkn temp.v temp2.v 1
real 7.15
user 14.02
sys 0.23
real 7.12
user 13.96
sys 0.29
vips im_avg temp2.v
120.134
IM_CONCURRENCY=3
time -p vips im_benchmarkn temp.v temp2.v 1
real 4.77
user 13.98
sys 0.26
real 4.78
user 13.97
sys 0.25
vips im_avg temp2.v
120.134
IM_CONCURRENCY=4
time -p vips im_benchmarkn temp.v temp2.v 1
real 4.28
user 13.65
sys 0.27
real 3.69
user 14.06
sys 0.28
vips im_avg temp2.v
120.134

8 x Opteron 8220 (64 bit), 3.0 GHz

Linux raphael 2.6.22-10-generic #1 SMP Wed Aug 22 07:42:05 GMT 2007 x86_64 GNU/Linux
gcc (GCC) 4.1.3 20070825 (prerelease) (Ubuntu 4.1.2-15ubuntu3)
vips-7.12.4-Fri Aug 31 12:02:06 BST 2007
building test image ...
tile=13
test image is 3770 by 5746 pixels
starting benchmark ...
chain=1
IM_CONCURRENCY=1
time -p vips im_benchmarkn temp.v temp2.v 1
real 15.04
user 14.64
sys 0.62
real 15.22
user 14.83
sys 0.72
vips im_avg temp2.v
120.134
IM_CONCURRENCY=2
time -p vips im_benchmarkn temp.v temp2.v 1
real 7.44
user 14.29
sys 0.61
real 7.01
user 13.36
sys 0.44
vips im_avg temp2.v
120.134
IM_CONCURRENCY=3
time -p vips im_benchmarkn temp.v temp2.v 1
real 4.58
user 13.29
sys 0.44
real 4.92
user 14.22
sys 0.44
vips im_avg temp2.v
120.134
IM_CONCURRENCY=4
time -p vips im_benchmarkn temp.v temp2.v 1
real 3.65
user 13.90
sys 0.60
real 3.93
user 14.59
sys 0.51
vips im_avg temp2.v
120.134
IM_CONCURRENCY=5
time -p vips im_benchmarkn temp.v temp2.v 1
real 2.98
user 14.25
sys 0.40
real 2.79
user 13.28
sys 0.38
vips im_avg temp2.v
120.134
IM_CONCURRENCY=6
time -p vips im_benchmarkn temp.v temp2.v 1
real 2.45
user 13.95
sys 0.42
real 2.32
user 13.07
sys 0.46
vips im_avg temp2.v
120.134
IM_CONCURRENCY=7
time -p vips im_benchmarkn temp.v temp2.v 1
real 12.57
user 13.43
sys 0.50
real 3.06
user 17.55
sys 0.58
vips im_avg temp2.v
120.134
IM_CONCURRENCY=8
time -p vips im_benchmarkn temp.v temp2.v 1
real 1.97
user 14.00
sys 0.44
real 2.16
user 15.31
sys 0.58
vips im_avg temp2.v
120.134

Asus Eee PC 1000 atom n270 1.6GHz SSD

Linux km-bigee 2.6.27-7-eeepc #1 SMP Fri Oct 31 11:36:36 MDT 2008 i686 GNU/Linux
IM_CONCURRENCY=1
time -p vips im_benchmarkn temp.v temp2.v 1
real 69.22
user 67.38
sys 1.03
real 70.64
user 67.76
sys 1.04
IM_CONCURRENCY=2
time -p vips im_benchmarkn temp.v temp2.v 1
real 42.09
user 76.66
sys 1.12
real 41.52
user 76.45
sys 1.10

Dual quad-core Intel E5320 (64-bit), 3 GHz

Intel Xeon E5320 x 2 so 8 cores.

vips-7.11.20-Wed Nov 28 11:39:32 GMT 2007
building test image ...
tile=13
test image is 3770 by 5746 pixels
starting benchmark ...
chain=1
IM_CONCURRENCY=1
time -p vips im_benchmarkn temp.v temp2.v 1
real 19.89
user 19.43
sys 0.33
real 20.41
user 19.45
sys 0.35
vips im_avg temp2.v
120.134
IM_CONCURRENCY=2
time -p vips im_benchmarkn temp.v temp2.v 1
real 10.01
user 19.61
sys 0.43
real 10.39
user 19.58
sys 0.41
vips im_avg temp2.v
120.134
IM_CONCURRENCY=3
time -p vips im_benchmarkn temp.v temp2.v 1
real 6.81
user 19.78
sys 0.38
real 6.78
user 19.80
sys 0.37
vips im_avg temp2.v
120.134
IM_CONCURRENCY=4
time -p vips im_benchmarkn temp.v temp2.v 1
real 5.17
user 19.87
sys 0.41
real 6.82
user 19.62
sys 0.37
vips im_avg temp2.v
120.134
IM_CONCURRENCY=5
time -p vips im_benchmarkn temp.v temp2.v 1
real 4.31
user 19.96
sys 0.46
real 5.16
user 19.79
sys 0.39
vips im_avg temp2.v
120.134
IM_CONCURRENCY=6
time -p vips im_benchmarkn temp.v temp2.v 1
real 3.48
user 19.83
sys 0.43
real 7.03
user 19.94
sys 0.40
vips im_avg temp2.v
120.134
IM_CONCURRENCY=7
time -p vips im_benchmarkn temp.v temp2.v 1
real 3.76
user 19.98
sys 0.44
real 3.68
user 19.86
sys 0.41
vips im_avg temp2.v
120.134
IM_CONCURRENCY=8
time -p vips im_benchmarkn temp.v temp2.v 1
real 2.84
user 20.06
sys 0.43
real 4.76
user 20.07
sys 0.48
vips im_avg temp2.v
120.134
IM_CONCURRENCY=9
time -p vips im_benchmarkn temp.v temp2.v 1
real 4.80
user 19.97
sys 0.47
real 2.79
user 20.04
sys 0.46
vips im_avg temp2.v
120.134

Intel Core2Duo P7550 (64 bit), 2.26 GHz

This is an Apple Macbook 6,1 running Ubuntu 11.04.

./benchmarkn.sh 
Linux banana 2.6.38-10-generic #46-Ubuntu SMP Tue Jun 28 15:07:17 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
gcc (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2
vips-7.26.2-Wed Aug 10 10:02:22 BST 2011
building test image ...
tile=13
test image is 3770 by 5746 pixels
max cpus = 2
starting benchmark ...
/usr/bin/time -f %e vips --vips-concurrency=xx --vips-tile-width=64 --vips-tile-height=64 im_benchmarkn temp.v temp2.v 1
reported real-time is best of three runs
cpus real-time
1 10.95
2 5.81

Phenom II X6 1090T 3.2GHZ

Linux X7DWT-B 2.6.35-22-generic #35-Ubuntu SMP Sat Oct 16 20:45:36 UTC 2010 x86_64 GNU/Linux
gcc (Ubuntu/Linaro 4.4.4-14ubuntu5) 4.4.5
vips-7.24.5-Sat May  7 14:44:21 UTC 2011
building test image ...
tile=13
test image is 3770 by 5746 pixels
starting benchmark ...
chain=1
IM_CONCURRENCY=1
time -p vips im_benchmarkn temp.v temp2.v 1
real 9.54
user 11.84
sys 0.30
real 9.47
user 11.93
sys 0.21
vips im_avg temp2.v
120.151
IM_CONCURRENCY=2
time -p vips im_benchmarkn temp.v temp2.v 1
real 5.21
user 11.34
sys 0.30
real 5.23
user 11.37
sys 0.40
vips im_avg temp2.v
120.151
IM_CONCURRENCY=3
time -p vips im_benchmarkn temp.v temp2.v 1
real 3.90
user 11.60
sys 0.39
real 4.04
user 11.85
sys 0.32
vips im_avg temp2.v
120.151
IM_CONCURRENCY=4
time -p vips im_benchmarkn temp.v temp2.v 1
real 3.23
user 11.59
sys 0.32
real 3.20
user 11.60
sys 0.29
vips im_avg temp2.v
120.151
IM_CONCURRENCY=5
time -p vips im_benchmarkn temp.v temp2.v 1
real 2.69
user 11.28
sys 0.28
real 2.72
user 11.26
sys 0.30
vips im_avg temp2.v
120.151
IM_CONCURRENCY=6
time -p vips im_benchmarkn temp.v temp2.v 1
real 2.39
user 10.21
sys 0.13
real 2.48
user 10.44
sys 0.10
vips im_avg temp2.v
120.151

i5-3470S @ 2.9 GHz --- late 2012 27" iMac

Darwin katamata.local 12.2.1 Darwin Kernel Version 12.2.1: Thu Oct 18 12:13:47 PDT 2012; root:xnu-2050.20.9~1/RELEASE_X86_64 x86_64
i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00)
Copyright (C) 2007 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

vips-7.32.0-Fri Jan 25 11:03:06 GMT 2013
building test image ...
tile=13
test image is 3770 by 5746 pixels
max cpus = 4
starting benchmark ...
/usr/bin/time vips --vips-concurrency=xx im_benchmarkn temp.v temp2.v 1
cpus = 1
        4.52 real         4.33 user         0.22 sys
cpus = 2
        2.49 real         4.57 user         0.33 sys
cpus = 3
        1.83 real         4.88 user         0.41 sys
cpus = 4
        1.47 real         5.00 user         0.47 sys

i5-5200U CPU @ 2.20GHz --- Dell XPS 13 from 2014

Linux kiwi 4.2.0-18-generic #22-Ubuntu SMP Fri Nov 6 18:25:50 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
gcc (Ubuntu 5.2.1-22ubuntu2) 5.2.1 20151010
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

vips-8.2.0-Thu Nov 12 09:31:37 GMT 2015
building test image ...
tile=13
test image is 3770 by 5746 pixels
max cpus = 4
starting benchmark ...
/usr/bin/time -f %e vips --vips-concurrency=xx im_benchmarkn temp.v temp2.v 1
reported real-time is best of three runs
cpus real-time
1 6.87
2 3.92
3 3.72
4 3.63

ARM Exynos 5420

Built with no optimisation (-O0) so that it compiles on the platform.

root@linaro-server:~/vips-7.40.2/benchmark# ./benchmarkn.sh 
Linux linaro-server 3.14.0-1-linaro-arndale-octa #1ubuntu1~ci+140417073620-Ubuntu SMP Thu Apr 17 07:57:24 UTC 201 armv7l armv7l armv7l GNU/Linux
gcc (Ubuntu/Linaro 4.8.1-10ubuntu8) 4.8.1
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

vips-7.40.2-Mon Jun 30 18:51:43 UTC 2014
building test image ...
tile=13
test image is 3770 by 5746 pixels
max cpus = 4
starting benchmark ...
/usr/bin/time -f %e vips --vips-concurrency=xx im_benchmarkn temp.v temp2.v 1
reported real-time is best of three runs
cpus real-time
1 47.59
2 31.59
3 24.44
4 21.31

SGI Origin2000 supercomputer

VIPS 7.11.20 has also been run on a 64-CPU supercomputer (an SGI Origin2000) at Princeton. The results are:

| CPUs | Run time (s) | Speed up |
| --- | --- | --- |
| 1 | 651.85 | 1 |
| 2 | 335.9 | 1.94 |
| 4 | 170.07 | 3.83 |
| 8 | 86.06 | 7.57 |
| 16 | 44.56 | 14.63 |
| 32 | 24.06 | 27.09 |
| 64 | 16.54 | 39.41 |

So about a 40 x speedup for 64 CPUs.

If you graph these numbers (graph not reproduced here), the speedup is pretty much linear up to about 30 CPUs (a 27x speedup at 32 CPUs). The image being processed is 1.3 GB, so perhaps we are starting to see I/O bandwidth limits beyond that point.
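One way to quantify where the scaling tails off is parallel efficiency (speedup divided by CPU count) together with the Karp-Flatt metric, which estimates the serial fraction implied by a measured speedup. Neither appears in the original analysis; the sketch below simply applies the standard formulas to the Origin2000 run times in the table above, and the function names are our own.

```python
# Run times (seconds) from the SGI Origin2000 table, keyed by CPU count.
ORIGIN2000 = {
    1: 651.85, 2: 335.9, 4: 170.07, 8: 86.06,
    16: 44.56, 32: 24.06, 64: 16.54,
}


def efficiency(times, cpus):
    """Parallel efficiency: measured speedup divided by the CPU count."""
    return times[1] / times[cpus] / cpus


def karp_flatt(times, cpus):
    """Karp-Flatt serial fraction: e = (1/S - 1/p) / (1 - 1/p)."""
    speedup = times[1] / times[cpus]
    return (1 / speedup - 1 / cpus) / (1 - 1 / cpus)


for p in sorted(ORIGIN2000):
    if p == 1:
        continue  # both metrics are only meaningful for more than one CPU
    print(f"{p:2d} CPUs: efficiency {efficiency(ORIGIN2000, p):.2f}, "
          f"serial fraction {karp_flatt(ORIGIN2000, p):.4f}")
```

On these figures efficiency is still roughly 85% at 32 CPUs and falls to roughly 62% at 64, while the estimated serial fraction rises between the two. That is what you would expect if some shared resource such as I/O bandwidth were starting to limit the run, though the numbers alone cannot say which resource it is.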

12 core Dell

A 2011 dual 6 core server is fairly linear up to 12 threads, then improves more slowly for the next 12 hyperthreaded cores.
