
Segfaults at random locations #42

Open
nh2 opened this issue Feb 19, 2020 · 1 comment



nh2 commented Feb 19, 2020

First off, thanks for zapcc; it looks like a substantial piece of engineering.

Last week I tried to introduce it as an alternative compiler in my project, and I did get an integer-factor speedup for incremental recompiles, just as I had hoped.

Unfortunately, I also found some problems that prevented me from using zapcc productively:

  • Nondeterministic compiler output when parallel builds are used. If I use more than -j1 on my build, then even adding a comment to a C++ file results in changed .o files (it looks like the sections differ).
  • Binaries that sometimes segfault. The binaries created by zapcc sometimes segfault at seemingly random, but reproducible, locations. That is, a given binary produced by zapcc always crashes at the same place during my program's execution. Adding some comments and compiling again sometimes creates a different binary that segfaults at a different location (but, again, reproducibly so there).
    • This does not happen with plain clang++ 7.
    • Because of the nondeterminism problem above, it is extremely difficult to just diff the created binaries to try and spot what zapcc introduces that makes them crash (see the sketch after this list).
    • Trying to use gdb on it does not help much; the crashes happen deep inside libraries I use, hinting that invalid memory is at play (assertions about the data that always hold with clang or gcc are also violated).
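To make that comparison more tractable, I would probably start by diffing just the section headers of two object files built from the same translation unit, rather than the whole binaries. A minimal sketch of such a helper, assuming binutils' readelf is on PATH (the script and file names are placeholders, not part of my build):

```python
#!/usr/bin/env python3
# Hypothetical helper (a sketch): compare the section headers of two object
# files built from the same source to see where the layout diverges.
# Assumes binutils' readelf is on PATH; the file names are placeholders.
import subprocess
import sys

def section_headers(path):
    """Return readelf's section-header listing for one object file."""
    result = subprocess.run(["readelf", "-S", "-W", path],
                            capture_output=True, text=True, check=True)
    return result.stdout.splitlines()

first, second = sys.argv[1], sys.argv[2]
for line_a, line_b in zip(section_headers(first), section_headers(second)):
    if line_a != line_b:
        print(f"- {line_a}")
        print(f"+ {line_b}")
```

This could be invoked, for example, as ./diff_sections.py foo.o foo_rebuilt.o on two builds of the same file.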

My project is a medium-sized proprietary C++ code base depending on eigen, ceres, CGAL and other large libraries, so unfortunately it is difficult for me to provide a reproducer without considerable effort.

I just wanted to report this; perhaps you have some ideas of where the problem might be.

Also, I believe that making zapcc deterministic would be hugely beneficial, as it would let me diff the crashing and non-crashing binaries much more easily.

yrnkrn (Owner) commented Feb 21, 2020

zapcc is non-deterministic since it keeps state between compilations and may use it when beneficial; for example, zapcc can inline a function from a previously compiled source file, very similar to a link-time optimization phase. In such a case it will remember the dependency on the other source file. Even with -j1 the binary may not be identical, depending on compilation order.

The usual way to debug such a problem is to use creduce. We have done maybe 1000 reductions of similar problems. Even very, very big projects were reduced to 1-3 files of a few lines each and then turned into zapcc regression tests: single-file cases went into the single directory and multi-file tests into multi. Take a look.
The reduction process takes several hours to several days to complete and requires some manual help where the human outsmarts creduce; the final manual reduction is something of a C++ puzzle.
With the final reduced example it is possible to start debugging zapcc and see what it does wrong.
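If it helps to get started, here is a minimal sketch of a creduce interestingness test, assuming the crash can be reproduced from a single repro.cpp and that zapcc++ and clang++ are on PATH; the file name, flags and timeouts are placeholders, not part of zapcc's own test setup:

```python
#!/usr/bin/env python3
# Hypothetical creduce "interestingness" test (a sketch, not zapcc's own
# tooling): exit 0 while the reduced file still shows the zapcc-only crash.
# The file name repro.cpp, the compiler names and the flags are assumptions.
import subprocess
import sys

SRC = "repro.cpp"

def run_with(compiler):
    """Build SRC with one compiler and return the program's exit status,
    or None if it fails to build or times out."""
    try:
        build = subprocess.run([compiler, "-O2", "-o", "a.out", SRC],
                               capture_output=True, timeout=120)
        if build.returncode != 0:
            return None
        prog = subprocess.run(["./a.out"], capture_output=True, timeout=120)
        return prog.returncode
    except subprocess.TimeoutExpired:
        return None

zapcc_status = run_with("zapcc++")
clang_status = run_with("clang++")

# Interesting only if the zapcc build dies on a signal (negative status on
# POSIX, e.g. SIGSEGV) while the plain clang build compiles and runs cleanly,
# so creduce does not drift into generically invalid code.
if zapcc_status is not None and zapcc_status < 0 and clang_status == 0:
    sys.exit(0)
sys.exit(1)
```

creduce would then be run as creduce ./interesting.py repro.cpp; it keeps a candidate reduction of repro.cpp only while the script exits 0.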
