
ThreadSanitizer reports destruction of locked mutex in threadpool_free. #106

Open
nephatrine opened this issue Sep 23, 2022 · 4 comments

@nephatrine

Describe the bug

When I call discord_cleanup in a build using TSAN, I get diagnostic messages stating a locked mutex is being destroyed.

WARNING: ThreadSanitizer: destroy of a locked mutex (pid=24515)
    #0 pthread_mutex_destroy ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:1244 (libtsan.so.0+0x39398)
    #1 threadpool_free <null> (game.so+0xcea1d)
    [redacted]

  and:
    #0 pthread_mutex_lock ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:4240 (libtsan.so.0+0x53908)
    #1 threadpool_free <null> (game.so+0xcea15)
    [redacted]

Looking at the threadpool_free function in threadpool.c, it seems you are intentionally locking the mutex just prior to destroying it. I don't think this impacts the program's functionality in any way, but it is undefined behaviour according to the spec.
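
For reference, here is a minimal standalone example (not taken from concord, just an illustration of the pattern) that should reproduce the same diagnostic when compiled with -fsanitize=thread:

    /* repro.c -- build with: gcc -fsanitize=thread -pthread repro.c */
    #include <pthread.h>

    int main(void)
    {
        pthread_mutex_t lock;

        pthread_mutex_init(&lock, NULL);
        pthread_mutex_lock(&lock);
        /* destroying a mutex that is still locked is undefined
           behaviour per POSIX; TSAN reports it as "destroy of a
           locked mutex" */
        pthread_mutex_destroy(&lock);
        return 0;
    }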

Expected behavior

If the lock is just there to make sure anything else that could have been using it has terminated, then you should be able to add an unlock immediately after the lock and prior to the destroy. If the concern is that something else could try to acquire the mutex again at the point in the code where you're destroying it, then I'd imagine that is a completely separate concurrency issue that would need to be resolved elsewhere.

Version

commit 1da91a6 (HEAD, tag: v2.1.0)

Distributor ID: Ubuntu
Description: Ubuntu 22.04.1 LTS
Release: 22.04
Codename: jammy

Additional context

I am not attempting to debug or QA concord's threading implementation. I am merely debugging my own threaded code in a library that uses concord and was very confused upon seeing these diagnostics. The similar orca library has the same "issue", FYI. Given that it does not seem to have any negative impact aside from the noise in TSAN when closing my application, it's understandable if you elect to just leave it as it is.

nephatrine added the bug label Sep 23, 2022
@lcsmuller
Collaborator

lcsmuller commented Sep 23, 2022

I suspect that this is not an actual bug, but a consequence of pthread's optimization, see: https://stackoverflow.com/a/68514238

I had also come across a similar error when running my application under valgrind, but after some research I came to the conclusion that it's a harmless "leak". You can probably circumvent it by having your process sleep for a little while before termination. If you believe that your issue is unrelated to this, feel free to re-open this issue.

@nephatrine
Author

It's not just a consequence of optimization; the lock is literally right before the destroy call in the code. No amount of sleeping will change that.

        /* Because we allocate pool->threads after initializing the
           mutex and condition variable, we're sure they're
           initialized. Let's lock the mutex just in case. */
        pthread_mutex_lock(&(pool->lock));
        pthread_mutex_destroy(&(pool->lock));

https://linux.die.net/man/3/pthread_mutex_destroy
https://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_mutex_destroy.html

Attempting to destroy a locked mutex results in undefined behavior.

I'd imagine that this is actually "harmless" in practice, but for what reason is it being locked before being destroyed? If we need that lock there, then just inserting an unlock immediately afterwards would suffice to comply with the POSIX standard. The only concern I see is if you think something else could lock it between the unlock and the destroy, but if you've got other threads still trying to use the mutex at that point then you've got bigger issues.
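
To illustrate, here is a minimal sketch of what I mean, keeping the quoted comment and lock from threadpool_free and only adding the unlock:

        /* Because we allocate pool->threads after initializing the
           mutex and condition variable, we're sure they're
           initialized. Let's lock the mutex just in case. */
        pthread_mutex_lock(&(pool->lock));
        /* release the lock again so the mutex is unlocked when
           destroyed, as POSIX requires */
        pthread_mutex_unlock(&(pool->lock));
        pthread_mutex_destroy(&(pool->lock));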

lcsmuller reopened this Sep 23, 2022
@lcsmuller
Collaborator

I'll take a better look at this once I get home; I may have mixed up your issue with something else I was having. My bad for just assuming.

> It's not just a consequence of optimization; the lock is literally right before the destroy call in the code. No amount of sleeping will change that.

In the meantime, can you confirm that changing the ordering fixes the issue? If so, could you please open a PR fixing it?

@lcsmuller
Collaborator

Just to give a bit of context, the issue arises from how this third-party threadpool lib manages its cleanup. We can either look for an alternative (C99-compliant) threadpool implementation, write our own, or fix this one (and send a fix PR to the original out of courtesy).
