The project uses CMake as its build system. The following flags control which optimizations are enabled when the build is generated:
- USE_FMA - Use fused multiply–add instructions. (default ON)
- USE_VECTORIZE - Use SIMD instructions for computation. (default ON)
- USE_CACHE_OPT - Use a cache-friendly implementation of the algorithm, based on precomputation. (default ON)
- USE_BULK_COMPUTE - Use special functions for computing average pixel values that return the values for two levels of recursion at once. Can only be used in addition to the cache optimization. (default ON)
- RDTSC_FAILBACK - Use the RDTSC hardware counter for measuring performance. If disabled, Intel PCM is used, which requires additional kernel modules and root (sudo) access to the machine. (default ON)
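To illustrate the idea behind USE_BULK_COMPUTE, here is a minimal sketch of one plausible reading of "two levels of recursion at once": a single pass over a square block that yields both the block average and the averages of its four quadrants. This is not the project's actual code. The names `BulkAverages` and `bulk_average` are hypothetical, and a row-major 8-bit grayscale image layout is assumed.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type: the average of a square block plus the averages
// of its four quadrants (the next level of a quadtree-style recursion).
struct BulkAverages {
    double block;                    // average over the whole block
    std::array<double, 4> quadrant;  // top-left, top-right, bottom-left, bottom-right
};

// Computes both levels in a single pass over the pixels.
// Assumptions: `image` is row-major grayscale with `stride` pixels per row,
// the block starts at (x0, y0), and `size` is even and non-zero.
BulkAverages bulk_average(const std::vector<std::uint8_t>& image, std::size_t stride,
                          std::size_t x0, std::size_t y0, std::size_t size) {
    const std::size_t half = size / 2;
    std::array<std::uint64_t, 4> sums{};  // one running sum per quadrant

    for (std::size_t y = 0; y < size; ++y) {
        for (std::size_t x = 0; x < size; ++x) {
            // Quadrant index: bit 1 selects bottom half, bit 0 selects right half.
            const std::size_t q = (y >= half ? 2 : 0) + (x >= half ? 1 : 0);
            sums[q] += image[(y0 + y) * stride + (x0 + x)];
        }
    }

    BulkAverages out{};
    const double quad_pixels = static_cast<double>(half * half);
    std::uint64_t total = 0;
    for (std::size_t q = 0; q < 4; ++q) {
        out.quadrant[q] = sums[q] / quad_pixels;
        total += sums[q];
    }
    // The block average reuses the quadrant sums instead of a second pass.
    out.block = total / (4.0 * quad_pixels);
    return out;
}
```

For a 4x4 block whose quadrants are constant 0, 10, 20 and 30, this returns those four values as the quadrant averages and 15 as the block average. Each pixel is read only once, which is the kind of saving a bulk variant would aim for.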
Building and running the project:
$ cmake -DCMAKE_BUILD_TYPE=Release -DUSE_FMA=ON -DRDTSC_FAILBACK=ON -DGENERATE_FLOP_COUNT=ON -DUSE_VECTORIZE=ON -DUSE_CACHE_OPT=ON -DUSE_BULK_COMPUTE=OFF
$ make
$ ./fractal-compression <path_to_image.bmp>
- Intel Intrinsics - Intel's intrinsics reference
- Agner's table - Agner Fog's instruction tables
- Manual - Intel's software optimization manual
Implementation used as a reference: https://github.com/kennberg/fractal-compression