Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

introducing clang-format #1248

Open
tigercosmos opened this issue Nov 29, 2023 · 21 comments
Open

introducing clang-format #1248

tigercosmos opened this issue Nov 29, 2023 · 21 comments

Comments

@tigercosmos
Copy link
Collaborator

I found that the coding format always trouble me, also it's really troublesome to manually format the code.

for example

if {
 ///
}

// or

if
{
///
}

Maybe we can introduce clang-format, so people can easily follow the format standard. CI can also easily check the format. People can also let IDE to format for them.

@seladb
Copy link
Owner

seladb commented Nov 29, 2023

We can add clang-format if we can create a custom format that doesn't change the entire code base 🤯

clang-format supports 6 predefined styles: LLVM, Google, Chromium, Mozilla, WebKit, Microsoft.

I think I went over them before and each one will create tons of changes compared to the style we currently have...

@egecetin
Copy link
Collaborator

egecetin commented Nov 30, 2023

I think custom clang-format will be great, manually formatting can be problematic and both reviewers and contributors miss some minor things. A simple example,

image

Mixed use of tab and spaces (4 tabs + spaces equivalent for another 4 tabs).

Another example is from my PRs which I noticed yesterday, the alignment of public private keyword. Some PRs indent them with spaces but most of the code base not indent them. Probably both of us miss them during merge @seladb

@seladb
Copy link
Owner

seladb commented Nov 30, 2023

@egecetin I totally agree! Unfortunately I know there are many formatting issues in the current code base 😕

@egecetin @tigercosmos would you consider working on a custom clang-format style?

@tigercosmos
Copy link
Collaborator Author

I will find some time to work on this. Let's see how we can customize the style.

@clementperon
Copy link
Collaborator

clementperon commented Dec 7, 2023

@seladb what's the issue to have one big MR?
If the script is runned by clang-format by several contributors and we agree it doesn't change the code for me it's fine no?

My experience is that maintaining a custom styles is really painfull and sometimes not properly supported by Clang...
And for each rule people could agree/disagree...

@tigercosmos
Copy link
Collaborator Author

@clementperon
The intent behind incorporating clang-format is to standardize the coding style.

At present, the CI system does not verify the code format, and relying on manual inspection is not consistently reliable. Additionally, with the integration of a code formatter, individuals can effortlessly adhere to the official standard by using IDEs.

While acknowledging that a substantial merge request may ensue, I contend that the advantages outweigh the disadvantages.

@clementperon
Copy link
Collaborator

clementperon commented Dec 7, 2023

Sorry if I was not clear, but I'm 100% in favor to have a clang-format checker in the CI.

I'm not convince to have a custom clang-format coding style.

IMO I would stick with the LLVM default coding style

@tigercosmos
Copy link
Collaborator Author

I'm unsure if we're referring to the same "custom style." In this context, the term "custom" style doesn't involve creating new rules; rather, I mean selecting options from the predefined choices offered by clang-format.

@clementperon
Copy link
Collaborator

Yes but some rules are working properly but others have sometimes "cross-effect" between them.
It will also be more sensitive beetwen updates and could sometimes breaks.

IMO only predefined styles could be considered as "stable", so if there is one predefined styles that match our needs I will prefer to stuck to it as much as possible.

@seladb
Copy link
Owner

seladb commented Dec 8, 2023

@clementperon @tigercosmos the main issue I have is that any predefined clang-format style (LLVM, Google, Chromium, Mozilla, WebKit, Microsoft) will cause changes in almost every line in every file, so:

  • The PR will be impossible to review - how can we ensure no logic was changed?
  • This makes git history (or "git blame") less effective because almost every line will be seen as impacted from this PR, making it harder to track who introduced the change and when. We can still dig deeper into "git blame" but it less convenient

If we have a custom style that is very close to what we currently have, there should be a lot fewer changes and we can enforce this style in the future

@tigercosmos
Copy link
Collaborator Author

The PR will be impossible to review - how can we ensure no logic was changed?

We use scripts to generate code, and we also run the tests, there should be no issues.

This makes git history (or "git blame") less effective

I don't think this is a strong support for stopping us, since people can easily check the history before the format.

If we have a custom style that is very close to what we currently have, there should be a lot fewer changes and we can enforce this style in the future

I agree with @clementperon, following the predefined styles is simpler for unifying the code. Not to mention that the current codebase includes many different coding styles from different people. Yes, a big MR is inevitable, but later on, life becomes easy.

@egecetin
Copy link
Collaborator

egecetin commented Dec 8, 2023

Since the changes are made by automated scripts, just a quick glance will be enough I think. I'm not sure it will break any code. And unit tests, fuzzers and also compilation itself eliminates a lot of risks. The only problem is "git blame" but it is not a big deal

@clementperon
Copy link
Collaborator

clementperon commented Dec 8, 2023

I also think that the checksum of the file generated will be identical.

CMake should not even trig a recompilation (maybe not)

@seladb
Copy link
Owner

seladb commented Dec 9, 2023

I also think that the checksum of the file generated will be identical.

CMake should not even trig a recompilation (maybe not)

If this is true then it's a good way to test the migration script!

@tigercosmos
Copy link
Collaborator Author

@seladb @clementperon @egecetin I really want to have this feature.
Here is my plan:

  1. Choose a default clang-format setting to utilize. As per @clementperon's suggestion, opting for the default setting is both safe and reliable.
  2. Firstly, execute clang-format on all code and verify the checksum of the compiled binary (which ideally should remain unchanged). As we refrain from altering the test code at this stage, we can verify the code's correctness. (Perhaps, there's no need for comparing the checksum.)
  3. Secondly, apply clang-format to all test code. Given that the non-test code should already be correct, any errors arising from the test code will become apparent.
  4. We can apply the same method for python codes using flake8

@clementperon
Copy link
Collaborator

@tigercosmos, just to be more clear on my opinion. As we don't have a strong opinion on the coding style, let's enforce with a properly define one and properly supported by clang-format.

I'm personally used to the default LLVM one but I think PcapPlusPlus coding style looks like Microsoft coding style.
So if we all agree we can choose the Microsoft one.

The checksum should be indeed identical if you remove debug symbols, SHA1 and date version info of the build. I can help on that if you need.

@tigercosmos
Copy link
Collaborator Author

tigercosmos commented Mar 26, 2024

I think PcapPlusPlus coding style looks like Microsoft coding style.

I agree to use "Microsoft", after playing around with LLVM, Google, Webkit, etc, I think it's the most similar to PCPP.

The checksum should be indeed identical if you remove debug symbols, SHA1 and date version info of the build. I can help on that if you need.

@clementperon Sure, if you can help with this, it will be great.

@seladb
Copy link
Owner

seladb commented Mar 26, 2024

@clementperon @tigercosmos do you have an example of how the Microsoft format looks like?

@egecetin
Copy link
Collaborator

@seladb
Copy link
Owner

seladb commented Mar 26, 2024

@seladb Check this site https://zed0.co.uk/clang-format-configurator/

Nice!! It does look quite similar to the current formatting

@tigercosmos
Copy link
Collaborator Author

Seems we all agree with using "Microsoft" for clang-format. That's great.

@tigercosmos tigercosmos added this to the Augest 2024 Release milestone Jun 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants