
Segfaults at random locations #42

Open
nh2 opened this issue Feb 19, 2020 · 1 comment



nh2 commented Feb 19, 2020

First off, thanks for zapcc; it looks like a substantial piece of engineering.

Last week I tried to introduce it as an alternative compiler in my project, and I did get an integer-factor speedup for incremental recompiles, just as I had hoped.

Unfortunately, I also found some problems that prevented me from using zapcc productively:

  • Nondeterministic compiler output when parallel builds are used. If I use more than -j1 on my build, then even adding a comment to a C++ file results in changed .o files (it looks like the sections differ).
  • Binaries that sometimes segfault. The binaries created by zapcc sometimes segfault at seemingly random, but reproducible, locations. That is, a given binary produced by zapcc always crashes at the same place during my program's execution. Adding some comments and compiling again sometimes creates a different binary that segfaults at a different location (but, again, reproducibly so there).
    • This does not happen with plain clang++ 7.
    • Because of the nondeterminism problem above, it is extremely difficult to just diff the created binaries to try and spot what zapcc introduces that makes them crash (see the sketch after this list).
    • Trying to use gdb on it does not help much; the crashes happen deep inside libraries I use, hinting that invalid memory is at play (assertions about the data that always hold with clang or gcc are also violated).
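To make that comparison more tractable, I would probably start by diffing just the section headers of two object files built from the same translation unit, rather than the whole binaries. A minimal sketch of such a helper, assuming binutils' readelf is on PATH (the script and file names are placeholders, not part of my build):

```python
#!/usr/bin/env python3
# Hypothetical helper (a sketch): compare the section headers of two object
# files built from the same source to see where the layout diverges.
# Assumes binutils' readelf is on PATH; the file names are placeholders.
import subprocess
import sys

def section_headers(path):
    """Return readelf's section-header listing for one object file."""
    result = subprocess.run(["readelf", "-S", "-W", path],
                            capture_output=True, text=True, check=True)
    return result.stdout.splitlines()

first, second = sys.argv[1], sys.argv[2]
for line_a, line_b in zip(section_headers(first), section_headers(second)):
    if line_a != line_b:
        print(f"- {line_a}")
        print(f"+ {line_b}")
```

This could be invoked, for example, as ./diff_sections.py foo.o foo_rebuilt.o on two builds of the same file.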

My project is a medium-sized proprietary C++ code base depending on eigen, ceres, CGAL and other large libraries, so unfortunately it is difficult for me to provide a reproducer without considerable effort.

I just wanted to report this; perhaps you have some ideas of where the problem might be.

Also, I believe that making zapcc deterministic would be hugely beneficial, as it would let me diff the crashing and non-crashing binaries much more easily.

yrnkrn (Owner) commented Feb 21, 2020

zapcc is non-deterministic since it keeps state between compilations and may use it when beneficial; for example, zapcc can inline a function from a previously compiled source file, very similar to a link-time optimization phase. In such a case it will remember the dependency on the other source file. Even with -j1 the binary may not be identical, depending on compilation order.

The usual way to debug such a problem is to use creduce. We have done maybe 1000 reductions of similar problems. Even very, very big projects were reduced to 1-3 files of a few lines each and then turned into zapcc regression tests: single-file cases went into the single directory and multi-file tests into multi. Take a look.
The reduction process takes several hours to several days to complete and requires some manual help where the human outsmarts creduce; the final manual reduction is something of a C++ puzzle.
With the final reduced example it is possible to start debugging zapcc and see what it does wrong.
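If it helps to get started, here is a minimal sketch of a creduce interestingness test, assuming the crash can be reproduced from a single repro.cpp and that zapcc++ and clang++ are on PATH; the file name, flags and timeouts are placeholders, not part of zapcc's own test setup:

```python
#!/usr/bin/env python3
# Hypothetical creduce "interestingness" test (a sketch, not zapcc's own
# tooling): exit 0 while the reduced file still shows the zapcc-only crash.
# The file name repro.cpp, the compiler names and the flags are assumptions.
import subprocess
import sys

SRC = "repro.cpp"

def run_with(compiler):
    """Build SRC with one compiler and return the program's exit status,
    or None if it fails to build or times out."""
    try:
        build = subprocess.run([compiler, "-O2", "-o", "a.out", SRC],
                               capture_output=True, timeout=120)
        if build.returncode != 0:
            return None
        prog = subprocess.run(["./a.out"], capture_output=True, timeout=120)
        return prog.returncode
    except subprocess.TimeoutExpired:
        return None

zapcc_status = run_with("zapcc++")
clang_status = run_with("clang++")

# Interesting only if the zapcc build dies on a signal (negative status on
# POSIX, e.g. SIGSEGV) while the plain clang build compiles and runs cleanly,
# so creduce does not drift into generically invalid code.
if zapcc_status is not None and zapcc_status < 0 and clang_status == 0:
    sys.exit(0)
sys.exit(1)
```

creduce would then be run as creduce ./interesting.py repro.cpp; it keeps a candidate reduction of repro.cpp only while the script exits 0.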
