Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compilation time and how to improve it #559

Open
kwikius opened this issue Feb 26, 2024 · 17 comments
Open

compilation time and how to improve it #559

kwikius opened this issue Feb 26, 2024 · 17 comments

Comments

@kwikius
Copy link
Contributor

kwikius commented Feb 26, 2024

Some comments re the current status here

I am thinking to propose mp-units for use with Ardupilot at some point (I think it might be a too hard sell but... ) and one subject that may come up is compilation time.

The following is maybe not a like for like comparison, but is only intended to get the ball rolling

I tested mp-units in the simplest possible compilation against my quan and pqs libraries. pqs came out on top with a compilation time of 2s. quan came out with a compilation time of 5s , but mp-units came out with a compilation time of 10s.

I hope that mp-units compilation time can be improved on ( sure maybe not using the examples below) , so how to do it?


#include <mp-units/systems/si/si.h>
#include <iostream>

int main()
{
   mp_units::quantity<mp_units::si::metre,float> length = { 1, mp_units::si::metre};

   float value = length.numerical_value_in( mp_units::si::metre);

   std::cout << value << '\n';

   return 0;
}
-------------- Build: Release in mp_units_test (compiler: GNU GCC13 Compiler)---------------

g++-13 -Wall -std=c++23 -O3 -I/home/andy/cpp/projects/mp-units/src/core/include -I/home/andy/cpp/projects/mp-units/src/systems/include -I/home/andy/cpp/projects/gsl-lite/include -c /home/andy/cpp/projects/mp_units_test/main.cpp -o obj/Release/main.o
g++  -o bin/Release/mp_units_test obj/Release/main.o  -s  
Output file is bin/Release/mp_units_test with size 14.17 KB
Process terminated with status 0 (0 minute(s), 10 second(s))
0 error(s), 0 warning(s) (0 minute(s), 10 second(s))
 
---------------------------------------------------------------------------------------------

#include <pqs/systems/si/length.hpp>
#include <iostream>

int main()
{
   pqs::si::length::m<float> length{1};

   float value = length.numeric_value();

   std::cout << value << '\n';

   return 0;
}

-------------- Build: Release in pqs (compiler: GNU GCC13 Compiler)---------------

g++-13 -Wall -std=c++23 -O3 -I/home/andy/cpp/projects/pqs/src/include -c /home/andy/cpp/projects/pqs/examples/test.cpp -o obj/Release/examples/test.o
g++  -o bin/Release/pqs obj/Release/examples/test.o  -s  
Output file is bin/Release/pqs with size 14.17 KB
Process terminated with status 0 (0 minute(s), 2 second(s))
0 error(s), 0 warning(s) (0 minute(s), 2 second(s))

------------------------------------------


#include <quan/length.hpp>
#include <iostream>

int main()
{
   quan::length_<float>::m length{1};

   float value = length.numeric_value();

   std::cout << value << '\n';

   return 0;
}

-------------- Build: Release in test (compiler: GNU GCC13 Compiler)---------------

g++-13 -Wall -std=c++23 -O3 -I/home/andy/cpp/projects/quan-trunk -c /home/andy/cpp/projects/test/main.cpp -o obj/Release/main.o
g++  -o bin/Release/test obj/Release/main.o  -s  
Output file is bin/Release/test with size 14.17 KB
Process terminated with status 0 (0 minute(s), 5 second(s))
0 error(s), 0 warning(s) (0 minute(s), 5 second(s))











 



@mpusz
Copy link
Owner

mpusz commented Feb 26, 2024

I am thinking to propose mp-units for use with Ardupilot at some point (I think it might be a too hard sell but... )

It would be great. Thanks!

@mpusz
Copy link
Owner

mpusz commented Feb 26, 2024

so how to do it?

Could you please try:

#include <mp-units/systems/si/units.h>

instead of si.h? This skips redefining all the unit symbols, which are quite expensive to compile.

@mpusz
Copy link
Owner

mpusz commented Feb 26, 2024

This library was created with C++ modules in mind. Predefining a lot of instances in the library and compiling the module once should result in faster compilation times for the end users. However, we just got clang to play with, GCC is still not there, and MSVC does not like out code (Compiler Internal Errors 😉). Moreover, none of the compilers supports import std; as of today.

@kwikius
Copy link
Contributor Author

kwikius commented Feb 26, 2024

That gets mp-units to 7 seconds, which is a good start! (Bear in mind my tests are quite unscientific. I am running ccache, so I clear that before each run. and clean the objects, but quite a bit of variability, for example when I switch or rerun a project, maybe headers are being cached etc)

-------------- Build: Release in mp_units_test (compiler: GNU GCC13 Compiler)---------------

g++-13 -Wall -std=c++23 -O3 -I/home/andy/cpp/projects/mp-units/src/core/include -I/home/andy/cpp/projects/mp-units/src/systems/include -I/home/andy/cpp/projects/gsl-lite/include -c /home/andy/cpp/projects/mp_units_test/main.cpp -o obj/Release/main.o
g++ -o bin/Release/mp_units_test obj/Release/main.o -s
Output file is bin/Release/mp_units_test with size 14.17 KB
Process terminated with status 0 (0 minute(s), 7 second(s))
0 error(s), 0 warning(s) (0 minute(s), 7 second(s))

Note that in quan, I used a per quantity header with an all-types header for the lazy. EDIT: That might be worth thinking about for mp-units

@mpusz
Copy link
Owner

mpusz commented Feb 26, 2024

We will definitely need to optimize compile times anyway. I implemented many features in the most straightforward way in many cases to save time and improve readability. I am aware that there are many areas for improvement.

However, first, I would like to finalize the design of the library. We still have a few big questions to solve first:

@mpusz
Copy link
Owner

mpusz commented Feb 26, 2024

Note that in quan, I used a per quantity header with an all-types header for the lazy. EDIT: That might be worth thinking about for mp-units

I am not sure what you mean by this? When I read your note I thought that you included something like quan.h but it is not the case. Do you mean that all length-related units are already provided?

mp-units is intended to be used with just

import mp_units;

You can find such use cases in our library.

@mpusz
Copy link
Owner

mpusz commented Feb 26, 2024

Actually, it would be interesting to check what are the compile times in case modules are being used. Unfortunately, GCC does not support those in gcc-13. Can you check clang-17 or clang-18 with modules?

Of course, we would just need to measure the example compilation (e.g., compile the mp-units module first) and then measure the rest.

@kwikius
Copy link
Contributor Author

kwikius commented Feb 26, 2024

Note that in quan, I used a per quantity header with an all-types header for the lazy. EDIT: That might be worth thinking about for mp-units

I am not sure what you mean by this? When I read your note I thought that you included something like quan.h but it is not the case. Do you mean that all length-related units are already provided?

So just include the headers for the individual quantities you need e.g length https://github.com/kwikius/quan-trunk/blob/master/quan/length.hpp. Typically you will only need maybe 5 or 10, depending how you break up your source files

https://github.com/kwikius/fg_ext_fdm/blob/master/src/include/autoconv_net_fdm.hpp

Sure the modules support looks great, but too young yet to be useful for me. I pretty much exclusively use gcc, since it compiles better for ARM , which is hardware I run on most. ( I think clang is trying hard though and maybe I should test it at some point)

@mpusz
Copy link
Owner

mpusz commented Feb 26, 2024

Also, there is a new C++26 pack indexing feature (https://en.cppreference.com/w/cpp/language/pack_indexing) that should improve compile times here. It should be available in clang-19.

@kwikius
Copy link
Contributor Author

kwikius commented Feb 26, 2024

Actually, it would be interesting to check what are the compile times in case modules are being used. Unfortunately, GCC does not support those in gcc-13. Can you check clang-17 or clang-18 with modules?

It will take me a while to set those up, but see what I can do.

@mpusz
Copy link
Owner

mpusz commented Feb 26, 2024

So just include the headers for the individual quantities you need e.g length

Yes, I thought about it, but it would be hundreds of header files for ISQ. This is why I divided ISQ into chapters as they are defined in ISO-80000: https://github.com/mpusz/mp-units/tree/master/src/systems/include/mp-units/systems/isq. For many users it would be enough to just include space_and_time.h.

@kwikius
Copy link
Contributor Author

kwikius commented Feb 26, 2024

The other problem for mp-units is that projects like Ardupilot are pretty wary of upgrading their toolchain. I think they are currently using C++17 , so if you use the latest features it will be harder again to sell.

@mpusz
Copy link
Owner

mpusz commented Feb 26, 2024

Sure, I totally agree. This is why this library will always have C++20 compatibility (at least I hope so). Newer features will be used, though, when available to improve the user experience.

@chiphogg
Copy link
Collaborator

@kwikius, could you please indulge me and also test the compile time for including this file with the following code?

#include "au.hh" // the above file

#include <iostream>

int main()
{
   auto length = au::meters(1);

   float value = length.in(au::meters);

   std::cout << value << '\n';

   return 0;
}

Note that Au is compatible all the way back to C++14. 😄

@kwikius
Copy link
Contributor Author

kwikius commented Feb 26, 2024

Again very subjective, but appears that on a good run AU does around 4 seconds

-------------- Build: Release in test (compiler: GNU GCC13 Compiler)---------------

g++-13 -Wall -std=c++23 -O3 -c /home/andy/cpp/projects/test/chiphogg.cpp -o obj/Release/chiphogg.o
g++ -o bin/Release/test obj/Release/chiphogg.o -s
Output file is bin/Release/test with size 14.17 KB
Process terminated with status 0 (0 minute(s), 4 second(s))
0 error(s), 0 warning(s) (0 minute(s), 4 second(s))

@chiphogg
Copy link
Collaborator

Hmm... I was expecting better. Would be interesting to dig into the compile time profile at some point.

Still, thanks very much for indulging me!

@kwikius
Copy link
Contributor Author

kwikius commented Feb 26, 2024

@chiphogg, here is the cpuinfo fwiw.

Lenovo-G50-45:~$ lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         40 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  4
  On-line CPU(s) list:   0-3
Vendor ID:               AuthenticAMD
  Model name:            AMD A8-6410 APU with AMD Radeon R5 Graphics
    CPU family:          22
    Model:               48
    Thread(s) per core:  1
    Core(s) per socket:  4
    Socket(s):           1
    Stepping:            1
    Frequency boost:     enabled
    CPU max MHz:         2000.0000
    CPU min MHz:         1000.0000
    BogoMIPS:            3992.22
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_o
                         pt pdpe1gb rdtscp lm constant_tsc rep_good acc_power nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse
                         3 cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowpre
                         fetch osvw ibs skinit wdt topoext perfctr_nb bpext ptsc perfctr_llc cpb hw_pstate ssbd vmmcall bmi1 xsaveopt arat npt lbrv sv
                         m_lock nrip_save tsc_scale flushbyasid decodeassists pausefilter pfthreshold overflow_recov
Virtualisation features: 
  Virtualisation:        AMD-V
Caches (sum of all):     
  L1d:                   128 KiB (4 instances)
  L1i:                   128 KiB (4 instances)
  L2:                    2 MiB (1 instance)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants