graysky2/kernel_compiler_patch

kernel_compiler_patch

This patch adds further optimization/tuning for kernel builds by exposing more micro-architecture options, accessible under:

 Processor type and features  --->
 Processor family --->

Why a specific patch?

The kernel uses its own set of CFLAGS, KCFLAGS. For example, see:

Alternative way to define a -march= option without this patch

As pointed out by codemac in this topic, one can simply export the desired values for KCFLAGS and KCPPFLAGS before calling make to achieve the same result; see here.

export KCFLAGS=' -march=znver3 -mtune=znver3'
export KCPPFLAGS=' -march=znver3 -mtune=znver3'
make all
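
Before exporting a -march= value, it can help to confirm that the compiler actually accepts it. The following is a minimal sketch, not part of this patch: the function name cc_accepts_march is made up for illustration, and it assumes the compiler is taken from the CC variable (falling back to gcc) and understands -fsyntax-only.

```shell
# cc_accepts_march VALUE (hypothetical helper, not part of this patch):
# succeeds when the compiler accepts -march=VALUE.
# The compiler comes from $CC and defaults to gcc (an assumption).
cc_accepts_march() {
    # Check an empty C source; -fsyntax-only avoids writing any output file.
    ${CC:-gcc} -march="$1" -fsyntax-only -x c /dev/null 2>/dev/null
}

# Example use:
#   if cc_accepts_march znver3; then
#       export KCFLAGS=' -march=znver3 -mtune=znver3'
#       export KCPPFLAGS=' -march=znver3 -mtune=znver3'
#   fi
```

Probing the compiler directly avoids guessing from version numbers, which is useful when a distribution backports support for a newer CPU.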

Expanded CPUs include

| CPU Family | -march= | Min GCC Ver | Min Clang Ver |
|---|---|---|---|
| Native optimizations autodetected by GCC | native | 4.2 | 3.8 |
| Generic 64-bit level v2 | x86-64-v2 | 11.1 | 12.0 |
| Generic 64-bit level v3 | x86-64-v3 | 11.1 | 12.0 |
| Generic 64-bit level v4 | x86-64-v4 | 11.1 | 12.0 |
| AMD Improved K8-family | k8-sse3 | 9.3 | 9.0 |
| AMD K10-family | amdfam10 | 9.3 | 9.0 |
| AMD Family 10h (Barcelona) | barcelona | 9.3 | 9.0 |
| AMD Family 14h (Bobcat) | btver1 | 9.3 | 9.0 |
| AMD Family 16h (Jaguar) | btver2 | 9.3 | 9.0 |
| AMD Family 15h (Bulldozer) | bdver1 | 9.3 | 9.0 |
| AMD Family 15h (Piledriver) | bdver2 | 9.3 | 9.0 |
| AMD Family 15h (Steamroller) | bdver3 | 9.3 | 9.0 |
| AMD Family 15h (Excavator) | bdver4 | 9.3 | 9.0 |
| AMD Family 17h (Zen) | znver1 | 9.3 | 9.0 |
| AMD Family 17h (Zen 2) | znver2 | 9.3 | 9.0 |
| AMD Family 19h (Zen 3) | znver3 | 10.3 | 12.0 |
| AMD Family 19h (Zen 4) | znver4 | 13.0 | 17.0 |
| AMD Family 19h (Zen 5) | znver5 | 14.1 | ??? |
| Intel Bonnell family Atom | bonnell | 9.3 | 9.0 |
| Intel Silvermont family Atom | silvermont | 9.3 | 9.0 |
| Intel Goldmont family Atom (Apollo Lake and Denverton) | goldmont | 9.3 | 9.0 |
| Intel Goldmont Plus family Atom (Gemini Lake) | goldmont-plus | 9.3 | 9.0 |
| Intel 1st Gen Core i3/i5/i7-family (Nehalem) | nehalem | 9.3 | 9.0 |
| Intel 1.5 Gen Core i3/i5/i7-family (Westmere) | westmere | 9.3 | 9.0 |
| Intel 2nd Gen Core i3/i5/i7-family (Sandybridge) | sandybridge | 9.3 | 9.0 |
| Intel 3rd Gen Core i3/i5/i7-family (Ivybridge) | ivybridge | 9.3 | 9.0 |
| Intel 4th Gen Core i3/i5/i7-family (Haswell) | haswell | 9.3 | 9.0 |
| Intel 5th Gen Core i3/i5/i7-family (Broadwell) | broadwell | 9.3 | 9.0 |
| Intel 6th Gen Core i3/i5/i7-family (Skylake) | skylake | 9.3 | 9.0 |
| Intel 6th Gen Core i7/i9-family (Skylake X) | skylake-avx512 | 9.3 | 9.0 |
| Intel 8th Gen Core i3/i5/i7-family (Cannon Lake) | cannonlake | 9.3 | 9.0 |
| Intel 10th Gen Core i7/i9-family (Ice Lake) | icelake-client | 9.3 | 9.0 |
| Intel Xeon (Cascade Lake) | cascadelake | 10.2 | 10.0 |
| Intel Xeon (Cooper Lake) | cooperlake | 10.2 | 10.0 |
| Intel 3rd Gen 10nm++ i3/i5/i7/i9-family (Tiger Lake) | tigerlake | 10.2 | 10.0 |
| Intel 4th Gen 10nm++ Xeon (Sapphire Rapids) | sapphirerapids | 11.1 | 12.0 |
| Intel 11th Gen i3/i5/i7/i9-family (Rocket Lake) | rocketlake | 11.1 | 12.0 |
| Intel 12th Gen i3/i5/i7/i9-family (Alder Lake) | alderlake | 11.1 | 12.0 |
| Intel 13th Gen i3/i5/i7/i9-family (Raptor Lake) | raptorlake | 13.0 | 15.0.5 |
| Intel 5th Gen 10nm++ Xeon (Emerald Rapids) | emeraldrapids | 13.0 | ??? |
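
The minimum versions listed above can be checked against an installed compiler with a small version comparison. This is only a sketch: version_ge is a hypothetical helper name, and it assumes a sort that supports -V (version sort), as GNU coreutils does.

```shell
# version_ge A B (hypothetical helper): succeeds when version A >= version B.
# Assumes `sort -V` (version sort) is available, e.g. GNU coreutils.
version_ge() {
    # Version-sort both values; A >= B exactly when B sorts first.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example: znver3 is listed with a minimum GCC version of 10.3.
#   version_ge "$(gcc -dumpfullversion)" 10.3 && echo "znver3 should be available"
```

A version check is only a first approximation, since distributions sometimes backport CPU support; asking the compiler whether it accepts the flag is the more direct test.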

Benchmarks

Intro

Three different machines were each tested with a generic x86-64 kernel and with an otherwise identical kernel built with the optimized gcc options, using a make-based endpoint.

Conclusion

There are small but real speed increases from running with this patch, as judged by a make endpoint. The increases are on par with the speed increase that the upstream-sanctioned core2 option gives users, so not including additional options seems somewhat arbitrary to me.

Details

  1. Three test machines: Intel Xeon X3360, Intel Core i7-2620M, Intel Core i7-3770K.
  2. All ran the make benchmark (linked below) 35 times while booted into a 'generic' kernel. Then all ran the same make benchmark 35 times after booting into an optimized kernel. Below are the optimizations chosen for each machine.
    • X3360 = core2
    • i7-2620M = sandybridge
    • i7-3770K = ivybridge
  3. Results were analyzed for statistical significance via ANOVA plots that clearly show statistically significant albeit small differences.
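
The repeated-run procedure above can be sketched roughly as follows. This is not the benchmark script used for the results here; time_runs is a hypothetical helper, and it assumes GNU date, whose %N format prints nanoseconds.

```shell
# time_runs N CMD... (hypothetical helper, not the benchmark used above):
# run CMD N times and print each wall-clock time in milliseconds, one per line.
# Assumes GNU date, whose %N format gives nanoseconds.
time_runs() {
    n=$1
    shift
    i=0
    while [ "$i" -lt "$n" ]; do
        start=$(date +%s%N)
        "$@" >/dev/null 2>&1
        end=$(date +%s%N)
        # Convert the nanosecond difference to whole milliseconds.
        echo $(( (end - start) / 1000000 ))
        i=$((i + 1))
    done
}

# Example: 35 timed runs of a build, saved for later analysis.
#   time_runs 35 make all > generic.txt
```

Run it once under each kernel and keep the two result files separate, so the runs can be compared afterwards.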

Discussion

  1. All the assumptions for ANOVA are met:
    • Data are normally distributed, as shown in the normal quantile plots.
    • The population variances are fairly equal (Levene and Bartlett tests).
  2. The ANOVA plots clearly show significance.
    • Pair-wise analysis by Tukey-Kramer shows significance at the 0.05 level for all CPUs compared.

Below are the differences in median values:

| CPU | Difference in median value |
|---|---|
| core2 | +87.5 ms |
| sandybridge | +79.7 ms |
| ivybridge | +257.2 ms |
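
A difference in medians like the ones above can be computed from two files of per-run timings, one value per line. This is a sketch only: median and median_diff are hypothetical helper names, and the sign convention (first file minus second file) is an assumption, not necessarily the one used for the table.

```shell
# median (hypothetical helper): read one number per line on stdin, print the median.
median() {
    sort -n | awk '
        { v[NR] = $1 }
        END {
            if (NR == 0) exit 1
            if (NR % 2) print v[(NR + 1) / 2]
            else print (v[NR / 2] + v[NR / 2 + 1]) / 2
        }'
}

# median_diff FILE_A FILE_B (hypothetical helper):
# print median(FILE_A) - median(FILE_B). The direction is an assumption.
median_diff() {
    awk -v a="$(median < "$1")" -v b="$(median < "$2")" 'BEGIN { print a - b }'
}

# Example, with made-up file names:
#   median_diff generic.txt optimized.txt
```

The median is less sensitive than the mean to an occasional slow run, which makes it a reasonable summary for repeated build timings.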

References

Credit

Legacy support

Support for older versions of the Linux kernel and of gcc can be found in the outdated_versions directory.

Data

Sandybridge vs. Generic

[Plot: 2620_m]

Ivybridge vs. Generic

[Plot: 3770_k]

Core2 vs. Generic

[Plot: x3360]
