CodeChecker store fails for report directories above ~10GB (report deduplication on the disk) #4129

Open
dkrupp opened this issue Dec 11, 2023 · 0 comments
Labels
analyzer 📈 Related to the analyze commands (analysis driver) bug 🐛 performance 🏃

dkrupp commented Dec 11, 2023

CodeChecker cannot store report directories larger than about 10GB. Unfortunately, this can be a common case for C/C++ projects: some reports for headers are repeated for almost every TU, which can cause a report count explosion for certain checkers.

When CodeChecker executes the analyzers, it stores every finding into the output report directory.

Some checkers report problems for C/C++ types that are commonly used across the whole code base. Such reports are repeated at every usage, which generates a huge number of redundant findings. An example of this is the cppcoreguidelines-special-member-functions clang-tidy checker, which reports classes that define some but not all of the special member functions (copy constructor, copy assignment, move constructor, move assignment, destructor).

These reports are repeated redundantly in many PLIST files, producing an excessively large report directory (above ~10GB). Such report directories make the diff and parse commands very slow and prevent storing the results to the server (using CodeChecker store), which would throw away the duplicate findings anyway.
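To see which checkers are responsible for the redundancy described above, the repeated reports can be counted per checker. The following is a minimal sketch, not CodeChecker's own logic. It assumes each PLIST file has already been parsed into a dict with Clang-style `files` and `diagnostics` keys, and that each diagnostic carries `check_name`, `description` and a `location` with `file`, `line` and `col`. The function names `report_key` and `redundancy_by_checker` are hypothetical.

```python
from collections import Counter


def report_key(diag, files):
    """Identify a report by checker, resolved file path, position and message.

    Assumption: `location["file"]` is an index into the plist's own `files`
    list, so it must be resolved to a path before comparing across plists.
    """
    loc = diag["location"]
    return (
        diag.get("check_name", ""),
        files[loc["file"]],
        loc["line"],
        loc["col"],
        diag.get("description", ""),
    )


def redundancy_by_checker(plists):
    """Return {checker name: number of redundant copies} over parsed plists."""
    seen = Counter()
    for plist in plists:
        files = plist.get("files", [])
        for diag in plist.get("diagnostics", []):
            seen[report_key(diag, files)] += 1

    redundant = Counter()
    for key, count in seen.items():
        if count > 1:
            # Every occurrence after the first one is a redundant copy.
            redundant[key[0]] += count - 1
    return dict(redundant)
```

A checker with a high count here is a candidate for the header-report explosion; checkers whose reports are all unique do not appear in the result.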

If the report directory were more compact, the storage could succeed and the stored content would be significantly smaller.

Deduplicating the reports before storage would make the zipped content much smaller and the parsing on the server side much faster.
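One way such a pre-storage deduplication could look is sketched below. This is only an illustration under assumptions, not how CodeChecker implements or plans to implement it. It assumes the report directory contains XML or binary PLIST files readable by Python's `plistlib`, with Clang-style `files` and `diagnostics` keys, and it treats two reports as duplicates when checker name, resolved file path, line, column and message all match. The names `load_plists`, `fingerprint` and `deduplicate` are hypothetical.

```python
import hashlib
import plistlib
from pathlib import Path


def load_plists(report_dir):
    """Yield (path, parsed plist dict) for every .plist file in report_dir."""
    for path in sorted(Path(report_dir).glob("*.plist")):
        with path.open("rb") as handle:
            data = plistlib.load(handle)
        yield path, data


def fingerprint(diag, files):
    """Hash the fields that are assumed to make a report unique."""
    loc = diag["location"]
    parts = [
        diag.get("check_name", ""),
        files[loc["file"]],  # resolve the per-plist file index to a path
        str(loc["line"]),
        str(loc["col"]),
        diag.get("description", ""),
    ]
    return hashlib.sha256("\0".join(parts).encode("utf-8")).hexdigest()


def deduplicate(report_dir, out_dir):
    """Copy plists to out_dir, dropping reports already seen in an earlier file.

    Returns the number of dropped (duplicate) reports.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    seen = set()
    dropped = 0
    for path, plist in load_plists(report_dir):
        files = plist.get("files", [])
        kept = []
        for diag in plist.get("diagnostics", []):
            digest = fingerprint(diag, files)
            if digest in seen:
                dropped += 1
            else:
                seen.add(digest)
                kept.append(diag)
        # The `files` list is left untouched, so remaining indices stay valid.
        plist["diagnostics"] = kept
        with (out / path.name).open("wb") as handle:
            plistlib.dump(plist, handle)
    return dropped
```

Writing to a separate output directory keeps the original analysis results intact, so the deduplicated copy can be compared against them before anything is stored.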

CodeChecker version
6.23.0

To Reproduce
Analyze the xerces project with --enable-all

CodeChecker analyze --enable-all ./compile_commands.json -o ./reports

Expected behaviour
I would expect a more efficient report directory structure that is smaller in size, can be stored, and can be handled by parse and diff.


dkrupp added this to the release 6.24.0 milestone on Dec 11, 2023
dkrupp changed the title from "CodeChecker store fails for report directories above ~10GB" to "CodeChecker store fails for report directories above ~10GB (report deduplication on the disk)" on Dec 11, 2023
whisperity added the "analyzer 📈 Related to the analyze commands (analysis driver)" label on Dec 11, 2023