Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rethink thread_local (take 2) #29952

Open
maflcko opened this issue Apr 24, 2024 · 12 comments
Open

Rethink thread_local (take 2) #29952

maflcko opened this issue Apr 24, 2024 · 12 comments

Comments

@maflcko
Copy link
Member

maflcko commented Apr 24, 2024

After dbfca4a from the 23.x release (and fe1b325 from 25.x), I think the --disable-threadlocal can now be dropped, as those commits removed the need for it?

cc @hebasto @fanquake

@fanquake
Copy link
Member

Possibly. Someone will first need to test it on FreeBSD and mingw-w64. Given we still assume those implementations to be broken.

@maflcko
Copy link
Member Author

maflcko commented Apr 24, 2024

I couldn't get wine running locally, so my testing failed. But given that we require a C++20 compiler, I'd be surprised if there is still a platform out there that hasn't correctly implemented C++11 at this point.

I presume, at least for the mingw issue, the existing unit tests should detect the issue, as I presume the RenameEnMasse unit test is equivalent to the C++ shared in the linked gist?

@laanwj
Copy link
Member

laanwj commented Apr 25, 2024

Concept ACK (if indeed all platforms we care about can do this now, including mingw)

@maflcko
Copy link
Member Author

maflcko commented Apr 25, 2024

I couldn't get wine running locally

I hopped into the win64 CI container and ran https://gist.github.com/jamesob/fe9a872051a88b2025b1aa37bfa98605 there. It passed.

But given that the previous test was done on Trusty (14.04?), it will probably be hard to check when it started passing.

@laanwj
Copy link
Member

laanwj commented Apr 26, 2024

@vasild Do you know if C++11 thread-local storage is still broken on FreeBSD?

@maflcko
Copy link
Member Author

maflcko commented Apr 26, 2024

FreeBSD

I am not familiar with the *BSD, but is there indication that OpenBSD or NetBSD are unaffected? What are the steps to test this?

@vasild
Copy link
Contributor

vasild commented Apr 26, 2024

bitcoin/configure.ac

Lines 1054 to 1057 in 7973a67

*freebsd*)
dnl FreeBSD's implementation of thread_local is also buggy (per
dnl https://groups.google.com/d/msg/bsdmailinglist/22ncTZAbDp4/Dii_pII5AwAJ)
AC_MSG_RESULT([no])

The file https://github.com/freebsd/freebsd-src/blob/master/lib/libc/stdlib/cxa_thread_atexit_impl.c has not been changed since 2017.

Running the unit tests on FreeBSD with thread_local enabled (had to edit configure.ac) prints a bunch of those:

__cxa_thread_call_dtors: dtr 0x375164b2d70 from unloaded dso, skipping

which is at least annoying and scary.

We only use thread_local for storing the thread name in std::string. Can use dumb char array for this:

[patch] avoid thread_local with std::string
diff --git i/configure.ac w/configure.ac
index febb352cdb..3eb58e2558 100644
--- i/configure.ac
+++ w/configure.ac
@@ -1047,17 +1047,12 @@ if test "$use_thread_local" = "yes" || test "$use_thread_local" = "auto"; then
        *mingw*)
           dnl mingw32's implementation of thread_local has also been shown to behave
           dnl erroneously under concurrent usage; see:
           dnl https://gist.github.com/jamesob/fe9a872051a88b2025b1aa37bfa98605
           AC_MSG_RESULT([no])
           ;;
-        *freebsd*)
-          dnl FreeBSD's implementation of thread_local is also buggy (per
-          dnl https://groups.google.com/d/msg/bsdmailinglist/22ncTZAbDp4/Dii_pII5AwAJ)
-          AC_MSG_RESULT([no])
-          ;;
         *)
           AC_DEFINE([HAVE_THREAD_LOCAL], [1], [Define if thread_local is supported.])
           AC_MSG_RESULT([yes])
           ;;
       esac
     ],
diff --git i/src/logging.cpp w/src/logging.cpp
index 578650f856..1582bc1a17 100644
--- i/src/logging.cpp
+++ w/src/logging.cpp
@@ -358,13 +358,13 @@ void BCLog::Logger::LogPrintStr(const std::string& str, const std::string& loggi
     if (m_log_sourcelocations && m_started_new_line) {
         str_prefixed.insert(0, "[" + RemovePrefix(source_file, "./") + ":" + ToString(source_line) + "] [" + logging_function + "] ");
     }
 
     if (m_log_threadnames && m_started_new_line) {
         const auto& threadname = util::ThreadGetInternalName();
-        str_prefixed.insert(0, "[" + (threadname.empty() ? "unknown" : threadname) + "] ");
+        str_prefixed.insert(0, std::string{"["} + (std::strlen(threadname) == 0 ? "unknown" : threadname) + "] ");
     }
 
     str_prefixed = LogTimestampStr(str_prefixed);
 
     m_started_new_line = !str.empty() && str[str.size()-1] == '\n';
 
diff --git i/src/sync.cpp w/src/sync.cpp
index a8bdfc1dea..5fa9fbb7c0 100644
--- i/src/sync.cpp
+++ w/src/sync.cpp
@@ -34,13 +34,13 @@
 struct CLockLocation {
     CLockLocation(
         const char* pszName,
         const char* pszFile,
         int nLine,
         bool fTryIn,
-        const std::string& thread_name)
+        const char* thread_name)
         : fTry(fTryIn),
           mutexName(pszName),
           sourceFile(pszFile),
           m_thread_name(thread_name),
           sourceLine(nLine) {}
 
@@ -57,13 +57,13 @@ struct CLockLocation {
     }
 
 private:
     bool fTry;
     std::string mutexName;
     std::string sourceFile;
-    const std::string& m_thread_name;
+    const std::string m_thread_name;
     int sourceLine;
 };
 
 using LockStackItem = std::pair<void*, CLockLocation>;
 using LockStack = std::vector<LockStackItem>;
 using LockStacks = std::unordered_map<std::thread::id, LockStack>;
diff --git i/src/util/threadnames.cpp w/src/util/threadnames.cpp
index 91883fe4ff..2bb12f9da1 100644
--- i/src/util/threadnames.cpp
+++ w/src/util/threadnames.cpp
@@ -39,23 +39,26 @@ static void SetThreadName(const char* name)
 }
 
 // If we have thread_local, just keep thread ID and name in a thread_local
 // global.
 #if defined(HAVE_THREAD_LOCAL)
 
-static thread_local std::string g_thread_name;
-const std::string& util::ThreadGetInternalName() { return g_thread_name; }
+static thread_local char g_thread_name[128] = {'\0'};
+const char* util::ThreadGetInternalName() { return g_thread_name; }
 //! Set the in-memory internal name for this thread. Does not affect the process
 //! name.
-static void SetInternalName(std::string name) { g_thread_name = std::move(name); }
+static void SetInternalName(std::string name)
+{
+    std::memcpy(g_thread_name, name.c_str(), std::min(sizeof(g_thread_name), name.length() + 1));
+    g_thread_name[sizeof(g_thread_name) - 1] = '\0';
+}
 
 // Without thread_local available, don't handle internal name at all.
 #else
 
-static const std::string empty_string;
-const std::string& util::ThreadGetInternalName() { return empty_string; }
+const char* util::ThreadGetInternalName() { return ""; }
 static void SetInternalName(std::string name) { }
 #endif
 
 void util::ThreadRename(std::string&& name)
 {
     SetThreadName(("b-" + name).c_str());
diff --git i/src/util/threadnames.h w/src/util/threadnames.h
index 64b2689cf1..a5b7581e99 100644
--- i/src/util/threadnames.h
+++ w/src/util/threadnames.h
@@ -16,11 +16,11 @@ void ThreadRename(std::string&&);
 
 //! Set the internal (in-memory) name of the current thread only.
 void ThreadSetInternalName(std::string&&);
 
 //! Get the thread's internal (in-memory) name; used e.g. for identification in
 //! logging.
-const std::string& ThreadGetInternalName();
+const char* ThreadGetInternalName();
 
 } // namespace util
 
 #endif // BITCOIN_UTIL_THREADNAMES_H

@maflcko
Copy link
Member Author

maflcko commented Apr 27, 2024

We only use thread_local for storing the thread name in std::string. Can use dumb char array for this:

If this works, then string_view may also work? Yet another alternative may be to just completely nuke thread_local with a map from id to $obj (See #18678 (comment))

@maflcko maflcko changed the title Require thread_local Rethink thread_local (take 2) Apr 27, 2024
@laanwj
Copy link
Member

laanwj commented Apr 27, 2024

i think the main problem with the map approach is that it doesn't clean up data when threads disappear, this is something that TLS handles automatically, and also why it's so hard for platforms to get right

@vasild
Copy link
Contributor

vasild commented Apr 30, 2024

It just occurred to me on Friday evening and I forgot about this during the weekend - we may have a bug in our code and FreeBSD may just be the messenger - we return a reference to the thread_local, store it in CLockLocation, from there in the global lockdata / lock_stack. It looks like the reference in lockdata may still be existent after the thread has exited.

@maflcko, thanks for the string_view hint!

@vasild
Copy link
Contributor

vasild commented May 2, 2024

This is indeed some problem in FreeBSD's libc which I reported upstream with a minimal, reproducable test case:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=278701

Is anybody interested in reviewing the patch I posted above if I PR it:
#29952 (comment)
reworked to return std::string_view to the callers, but still storing the thread name in a thread_local char []? That would work around the FreeBSD weird message printout because the thread_local variable will not have a destructor.

As for storing a reference to thread_local in a global variable - that is dangerous and should be avoided IMO. I checked that there is no bug currently in the code but it looks fragile.

@laanwj
Copy link
Member

laanwj commented May 2, 2024

reworked to return std::string_view to the callers, but still storing the thread name in a thread_local char []?

Yes. i think (for this one very rare exception) it's acceptable to store a string in a fixed-size buffer. To not need a destructor and heap deallocation when a thread goes away, works around a large part of the complexity of handling thread-local data.

And making it use string_view throughout the changed functions instead of char* is a good idea, a lot less ugly.

As for storing a reference to thread_local in a global variable - that is dangerous and should be avoided IMO. I checked that there is no bug currently in the code but it looks fragile.

Agree. The memory can go away at any time when the thread goes away, and it will be a dangling reference. It's brittle.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants