Performance improvements #163

Open
g-bauer opened this issue Jun 27, 2023 · 0 comments
Labels
enhancement New feature or request

g-bauer commented Jun 27, 2023

Following some guidelines from the Rust Performance Book, here are some things we can try to improve performance:

  • Add codegen-units = 1 to release build
  • Use a faster allocator, e.g. mimalloc, which works on all operating systems
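
As a rough sketch of the first item, the setting could look like this in Cargo.toml. The contents of the custom `release-lto` profile are assumed here for illustration and may differ from the project's actual configuration.

```toml
# Cargo.toml (sketch; profile contents are assumptions, not the project's settings)
[profile.release]
codegen-units = 1   # one codegen unit per crate: slower builds, more room for optimization

# Hypothetical custom profile that inherits the setting and enables LTO.
[profile.release-lto]
inherits = "release"
lto = "fat"
```

A profile defined this way is selected with `cargo build --profile release-lto`.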

Not so easy:

  • properly profile to identify hot parts
  • remove clones/allocations where not needed
  • use profile-guided optimization (e.g. via cargo-pgo)
    • unfortunately, this currently does not work with LTO, and the PGO version is 10-20% slower than the LTO version
    • might be available in the future in maturin directly, see here
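
One possible way to spot unneeded clones/allocations before reaching for a full profiler is a counting allocator. The following is a minimal sketch using only the standard library; `CountingAlloc`, `allocations_during`, `sum_cloned` and `sum_borrowed` are hypothetical names for illustration and not part of this project.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::cell::Cell;

thread_local! {
    // Per-thread allocation counter. Const-initialized and without a destructor,
    // so reading it is not expected to allocate on common desktop targets.
    static ALLOCATIONS: Cell<usize> = const { Cell::new(0) };
}

/// Hypothetical wrapper around the system allocator that counts calls to `alloc`.
struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.with(|c| c.set(c.get() + 1));
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Runs `f` and returns its result plus the number of allocations
/// made on the current thread while it ran.
fn allocations_during<T>(f: impl FnOnce() -> T) -> (T, usize) {
    let before = ALLOCATIONS.with(|c| c.get());
    let out = f();
    let after = ALLOCATIONS.with(|c| c.get());
    (out, after - before)
}

/// Copies the input before summing (one avoidable allocation).
fn sum_cloned(v: &[f64]) -> f64 {
    let copy = v.to_vec();
    copy.iter().sum()
}

/// Sums in place without allocating.
fn sum_borrowed(v: &[f64]) -> f64 {
    v.iter().sum()
}
```

Wrapping a suspicious call as `allocations_during(|| sum_cloned(&data))` and comparing the count with the borrowed variant shows whether a clone is worth removing. This only counts allocations and does not replace proper profiling of hot paths.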

Quick tests with codegen-units = 1 added to release-lto (see here) show benchmark performance improvements of up to 12% (mean about 7%), while for the dual_numbers benchmark the changes are a bit smaller (see below).

Proper benchmarks (across all benchmarks) comparing against the current release workflow are needed, but this might be an easy-to-get improvement if it turns out to be faster in all cases.

  • Benchmark: dual_numbers
  • System: methane/CO2
  • main: main branch + lto
  • main_codegen: main branch + lto + codegen-units = 1
  • develop, develop_codegen: like main and main_codegen, but on the develop branch

Execution times in µs

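| name            | f64    | dual   | dual2  | hyperdual | dual3  |
|-----------------|--------|--------|--------|-----------|--------|
| main            | 1.1382 | 1.2325 | 1.4539 | 1.6267    | 1.7563 |
| main_codegen    | 1.0229 | 1.1741 | 1.3708 | 1.5777    | 1.6316 |
| develop         | 1.0138 | 1.1989 | 1.4465 | 1.589     | 1.7549 |
| develop_codegen | 0.9761 | 1.1681 | 1.4195 | 1.5446    | 1.6304 |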

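Slowdown t_d/t_f64 for each branch/option

| name            | f64 | dual    | dual2   | hyperdual | dual3   |
|-----------------|-----|---------|---------|-----------|---------|
| main            | 1   | 1.08285 | 1.27737 | 1.42919   | 1.54305 |
| main_codegen    | 1   | 1.14782 | 1.34011 | 1.54238   | 1.59507 |
| develop         | 1   | 1.18258 | 1.42681 | 1.56737   | 1.73101 |
| develop_codegen | 1   | 1.1967  | 1.45426 | 1.58242   | 1.67032 |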

Relative difference in % w.r.t. main + lto for each dual number: (t_d_branch - t_d_main) / t_d_main * 100

| name            | f64    | dual  | dual2 | hyperdual | dual3 |
|-----------------|--------|-------|-------|-----------|-------|
| main_codegen    | -10.13 | -4.74 | -5.72 | -3.01     | -7.10 |
| develop         | -10.93 | -2.73 | -0.51 | -2.32     | -0.08 |
| develop_codegen | -14.24 | -5.23 | -2.37 | -5.05     | -7.17 |
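
For reference, the two derived quantities can be recomputed from the execution times with a few lines of Rust. This is only a sketch to make the formulas explicit; the function names are made up for illustration, and the slowdown is taken as the dual-number time divided by the f64 time of the same branch, which is what the tabulated values correspond to.

```rust
/// Slowdown of a dual-number evaluation relative to plain f64 on the same branch.
fn slowdown(t_d: f64, t_f64: f64) -> f64 {
    t_d / t_f64
}

/// Relative difference in percent of a branch/option w.r.t. main + lto.
/// Negative values mean the branch is faster than main.
fn relative_difference_percent(t_d_branch: f64, t_d_main: f64) -> f64 {
    (t_d_branch - t_d_main) / t_d_main * 100.0
}
```

For example, `relative_difference_percent(1.0229, 1.1382)` compares the f64 timing of main_codegen against main, and `slowdown(1.2325, 1.1382)` gives the dual slowdown on main.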
@g-bauer g-bauer added the enhancement New feature or request label Jun 27, 2023
@g-bauer g-bauer changed the title Benchmark codegen-units Easy performance improvements Oct 18, 2023
@g-bauer g-bauer changed the title Easy performance improvements Performance improvements Nov 24, 2023