Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

THRIFT-5766: Replace std::endl with "\n" #2943

Closed
wants to merge 9 commits into from

Conversation

CJCombrink
Copy link
Contributor

@CJCombrink CJCombrink commented Mar 8, 2024

Replaced usage of endl with \n directly

  • Using endl for linebreaks is bad practice
  • This is confirmed by the static const string endl = "\n" (removed)
  • Did you create an Apache Jira ticket? (Request account here, not required for trivial changes)
  • If a ticket exists: Does your pull request title follow the pattern "THRIFT-NNNN: describe my issue"?
  • Did you squash your changes to a single commit? (not required, but preferred)
  • Did you do your best to avoid breaking changes? If one was needed, did you label the Jira ticket with "Breaking-Change"?
  • If your change does not involve any code, include [skip ci] anywhere in the commit message to free up build resources.

Reference to my comment here

@emmenlau
Copy link
Member

emmenlau commented Mar 8, 2024

Looks good to me, albeit I do not have a strong opinion with respect to this change. I guess its a matter of personal taste whether \n or << endl << are more readable...

@CJCombrink
Copy link
Contributor Author

(Personally)I don't think it is a matter of taste.
std::endl's claim to fame (if I can phrase it like that) is that it forces a newline and flush the stream.
At some point in the past someone decided that the "flush the stream" is undesirable and redefined endl to just force a new line.
Thus the usage of endl enforces a known bad habit, or rather misplaced/misguided habit.
The more appropriate thing to do at that time was make this change, instead of enforcing/entertaining/hiding a bad habit.

I am also bringing up C++ Core Guidelines SL.io.50 (PS Which Bjarne commented on himself).
And then if you pay attention to the following comment on SL.io.50:

Note Apart from the (occasionally important) issue of performance, the choice between '\n' and endl is almost completely aesthetic.

I would argue if it was only aesthetic why was the overload added in the first place...

PS: I am probably passionate about the wrong things, shoot down the PR if this is 'out of scope' for this repo

@CJCombrink
Copy link
Contributor Author

I doubled down and applied the changes to the whole codebase in my fork @ CJCombrink#1
If there is any appetite for this type of change, please indicate a preference between baby steps (just "t_cpp_generator.cc") or all or nothing as per the linked PR.

CJCombrink referenced this pull request in CJCombrink/thrift Mar 8, 2024
Both the default constructor and operator== implementations reference
certain member functions of the class' members. As an example, the default
constructor references (i.e., "uses") the default constructors of its
members.

If a class contains a std::vector<Foo>, and Foo has only been *forward*-
declared (which happens often in Thrift-generated code), this creates
undefined behavior: The std::vector specification states that as long as
Foo is an incomplete type, it is fine to reference std::vector<Foo>, but
not any members (such as its default constructor).

Thus, we must defer our default constructor's implementation (which references
the default constructor of std::vector<Foo>) to a point where Foo is a
complete type. That is the case in the .cpp file.

The same holds for operator==.
@emmenlau
Copy link
Member

emmenlau commented Mar 8, 2024

(Personally)I don't think it is a matter of taste. std::endl's claim to fame (if I can phrase it like that) is that it forces a newline and flush the stream.

Well we are certainly on the same page when it comes to std::endl, as this has a number of significant drawbacks! But the code in question is already using \n rather than std::endl. The main difference is that it is concatenating every time with << which may have a negative performance impact. The impact on readability is (in my humble opinion) debatable, because I can see also advantages of recognizing << endl << as a clear sign of newline, whereas "hello\nworld" may not jump as much into the eye of the reader.

In any case, I'm ok with the PR, but without a strong opinion pro or con on my side :-)

@Jens-G
Copy link
Member

Jens-G commented Mar 8, 2024

static const string endl = "\n"; // avoid ostream << std::endl flushes

That line was introduced intentionally in all generators. You may look up git and JIRA history and find out why, including the whole debate about why this is a good change, before removing it.

I would also argue that forcing people into the literal inclusion of "\n" by removing the above most certainly will not be respected by people. Simply because nobody will think about it and just use endl instead, which due to the removal effectively results in the exact opposite of what is intended. You are setting it up for failure IMHO.

Using endl for linebreaks is bad practice

Not quite. Using std::endl is bad practice for the reasons linked above. But we're not doing that at all. That is the whole point why we have that endl constant, so nobody falls into that trap by accident.

@Jens-G Jens-G added the c++ label Mar 8, 2024
@CJCombrink
Copy link
Contributor Author

CJCombrink commented Mar 9, 2024

@Jens-G
Sorry I did not do the research before making the PR, but the linked ticket confirms my suspicion 100%:
A bad practice was followed and instead of fixing the actual issue, by not using std::endl, the PR decided to entertain a bad practice with the motivation of "minimizing code churn" by changing expected behaviour with hidden unexpected behaviour (PS I was fully aware of that constant before making this PR).

11 Years later I am hoping that at least we can get to a point where sanity is restored and bad practices are not encouraged/entertained in such important code bases.

PS: I come from a background where juniors used std::endl on embedded devices that flushed over serial. On confrontation the response was "we were taught that way". We should be better in teaching and educating on these matters, not hide it with consts that change documented behaviour and make bad practices acceptable.

I am saying this again, I realise I am passionate about the wrong things thus I will not take offence if this/these PR(s) is/are declined, but I might keep stating the point :P

@emmenlau

advantages of recognizing << endl << as a clear sign of newline

Depends on perspective: I see it as: "why do you need flushing" and my editor (VS Code with no custom modification) highlights \n differently in string literals, thus I can argue it stands out even more and communicates the intend clearly (no hidden oh but we manually changed behaviour).

@Jens-G
Copy link
Member

Jens-G commented Mar 12, 2024

+1 Go on then, change all the files and add a comment to the ticket, to put an end on this fruitless discussion. Question remains, why bad practice can't be fixed where it belongs - by properly deprecating std::endl

We should be better in teaching and educating on these matters,

Usually deprecation and removal teaches quite well.

It is bad, and most people know it,

Agree, I wasn't fully aware of it.


PS: Waiting for the great Rust rewrite, to save more walls.

@CJCombrink
Copy link
Contributor Author

I Updated this PR with replacing std::endl in most of the std::cout calls.
I did go through most of the changes not in the generators folder manually.
For the generators I did spot check and I did test the new output vs the old output for at least C++, there were no changes in the generated files.

Do I still need to create a JIRA ticket for this change? I did request a user already thus waiting for that.

@CJCombrink CJCombrink changed the title Remove usage of endl in the cpp generator THRIFT-5766 Replace std::endl with "\n" Mar 13, 2024
@CJCombrink CJCombrink changed the title THRIFT-5766 Replace std::endl with "\n" THRIFT-5766: Replace std::endl with "\n" Mar 13, 2024
@@ -221,11 +221,11 @@ void t_cl_generator::package_def(std::ostream &out) {
}
out << ")";
}
out << ")" << endl << endl;
out << ")\n\n";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in cases like this, the usage of endl/std::endl makes it much more obvious how many lines we are adding while \n being part of the longer string just meshes everything together.

I'd suggest still explicitly adding every \n individually, like this:

Suggested change
out << ")\n\n";
out << ")" << "\n" << "\n";

but I don't feel strongly either way

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fishy Thanks for the review and feedback.

Although I do understand the suggestion I specifically went through the effort to combine these. It does sort of stand out, but some editors color the \n different and I deem that good enough. I find myself looking at the generated code much more often compared to the generator code thus there I quickly see how much (or few) newlines are added.

If there is enough pushback against the combination I can revert that combination change and (force)push the updates.

I don't think speed matters that much but I guess calling the operator<< multiple times, more than the needed amount of time, would have a performance and speed implication.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm slightly with @fishy here, as my comment in #2943 (comment) was actually going in the same direction: Having dedicated calls to << "\n" jumps much more to the eye and conveyes how many newlines are added.

If you would be willing to change that, to make every newline jump out more, I'm two thumbs up for this PR! But without that change, I'm still ok to merge, albeit with minor hesitation...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emmenlau Thank you again for the feedback.
I think there is enough reason to put back the << "\n thus I will put in the effort (tonight) to 'revert' that change.

@CJCombrink
Copy link
Contributor Author

@fishy @emmenlau
Is there anything else that I need to address in this PR, something else that stands out as 'not so nice' or blockers?

Also, I am not sure that I am responsible for the build failures, is there a way to know?

@emmenlau
Copy link
Member

@CJCombrink from my point of view, everything else is good, thanks a lot for the nice work!

The build failures are a continuous nightmare. The best you could do (and this is what I typically do) is check them individually, and see if the error seems related. Often builds fail because some third party dependency package can not be installed in some build machine, and those are the obvious ones to ignore. The others are harder, for those you can just compare the latest master build status with yours, to see if "you" broke it, or the person before you...

In any case, we will also take a look at those before merging, but its great if you validate as much as you can upfront...

@CJCombrink
Copy link
Contributor Author

CJCombrink commented Mar 14, 2024

I updated the branch to use << '\n' instead.
TBH I am starting to appreciate the clarity it brings on many of the places. I still need more time to process, or rather get used to << '\n' << '\n' ...

@fishy
Copy link
Member

fishy commented Mar 15, 2024

FYI the lib-java-kotlin failure in CI is being addressed in #2945

@CJCombrink
Copy link
Contributor Author

@fishy I think I have done what I can.

I did check the test, master vs my changes for the parts that my environment is set up for and I don't see any regression.
I did run into #2937 when I started this work, which would enable more test to be run probably.

I have also compared the generated code for the unit tests with those against master and see no difference.

What else can I do and what are the next steps for this PR?

@fishy
Copy link
Member

fishy commented Mar 18, 2024

@CJCombrink sorry I'm in some family emergency and won't have time to review it any further for the next few weeks. None of my comment is blocking anyways :)

Mario, Jens, and/or others should be able to guide you through the process.

- Moves away from the bad habid of using endl which is known to flush
- Remove references to endl and replace with line break
@CJCombrink
Copy link
Contributor Author

I did some thinking...
For a major project like this, a PR touching 56 files with more than 10,135 changes from a random person might not be the easiest thing to consider.
Thus, would it not be better if we keep the changes local to one generator and then maybe in future consider this type of change?
Obviously I would want the changes in t_cpp_generator.cc

Or maybe just make the changes in the generators. I did go all the way with all other files, but baby steps might be the best or safest course of action.

PS: I started a new ticket: THRIFT-5772 and if I have to choose I would rather have energy put there compared to this change, although this would be nice to have (in the C++ generator :P )

Copy link
Member

@Jens-G Jens-G left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plus some files were totally missed

  • contrib\fb303\TClientInfo.cpp
  • lib\cpp\src\thrift\async\TEvhttpClientChannel.cpp
  • lib\cpp\src\thrift\async\TEvhttpServer.cpp
  • lib\cpp\src\thrift\transport\TFileTransport.cpp
  • lib\cpp\test\concurrency\ThreadManagerTests.h
  • lib\cpp\test\concurrency\TimerManagerTests.h
  • lib\cpp\test\concurrency\Tests.cpp

Most of the missing single lines are cerr outputs. I mean, if we want to save walls, we should be more consequent and include cerr outputs as well, maybe doing the flush explicitly if needed. Or would cerr be the exception from the rule that endl is generally evil?

compiler/cpp/src/thrift/generate/t_dart_generator.cc Outdated Show resolved Hide resolved
compiler/cpp/src/thrift/generate/t_cpp_generator.cc Outdated Show resolved Hide resolved
compiler/cpp/src/thrift/generate/t_js_generator.cc Outdated Show resolved Hide resolved
contrib/fb303/cpp/ServiceTracker.cpp Show resolved Hide resolved
lib/cpp/test/concurrency/Tests.cpp Outdated Show resolved Hide resolved
test/cpp/src/TestServer.cpp Show resolved Hide resolved
test/cpp/src/StressTestNonBlocking.cpp Show resolved Hide resolved
test/cpp/src/StressTest.cpp Show resolved Hide resolved
lib/cpp/test/OneWayHTTPTest.cpp Show resolved Hide resolved
@Jens-G
Copy link
Member

Jens-G commented Apr 1, 2024

For a major project like this, a PR touching 56 files with more than 10,135 changes from a random person might not be the easiest thing to consider.

That's right but I'm fine in this particular case. If you could just add the commits re above changes at the end, that would make followup review easier for me. I will squash them before merge myself.

@Jens-G
Copy link
Member

Jens-G commented Apr 11, 2024

Any news here?

@CJCombrink
Copy link
Contributor Author

Any news here?

Hi Sorry, I was busy with a few other things.
Will work through the feedback today.

- Fixed a few places that were then literal + literal to ensure at least one is a std::string
- PR Review feedback
- PR Review feedback
- Initially skipped cerr but after some reading, cerr will flush after every << thus endl can also be avoided.
@CJCombrink
Copy link
Contributor Author

@Jens-G

Most of the missing single lines are cerr outputs. I mean, if we want to save walls, we should be more consequent and include cerr outputs as well, maybe doing the flush explicitly if needed. Or would cerr be the exception from the rule that endl is generally evil?

Initially I thought it might be good to keep cerr to flush. After your comment I did some thinking and reading and I see std::cerr is unbuffered, thus every << will flush and endl will then be a double flush, thus I think it is safe to conclude it is evil there as well.

Copy link
Member

@Jens-G Jens-G left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks! Two minor bugs left though.

@Jens-G Jens-G closed this in 4a280d5 Apr 15, 2024
@CJCombrink
Copy link
Contributor Author

@Jens-G Thank you for your effort and meticulous attention to the details

@emmenlau
Copy link
Member

nice work @CJCombrink and (as usually) @Jens-G !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
4 participants