Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Synchronization on shutdown. #13

Open
l90lpa opened this issue Jun 11, 2018 · 5 comments
Open

Synchronization on shutdown. #13

l90lpa opened this issue Jun 11, 2018 · 5 comments
Assignees

Comments

@l90lpa
Copy link

l90lpa commented Jun 11, 2018

I have noticed that there seems to be a race condition in the shutdown process of ThreadPool. By example let Thread1 be the main thread executing shutdown() and let Thread2 be a ThreadWorker executing operator(). If we take the following interleaving of calls (assuming that the queue is empty by the start):

Thread2: checks the while condition and returns true.
Thread1: sets m_shutdown to false.
Thread1: calls m_conditional_lock.notify_all().
Thread2: takes ownership of m_pool->m_conditional_mutex.
Thread2: call m_pool->m_conditional_lock.wait( lock ).

In the above scenario it can be seen that the notify_all() is completed before Thread2 calls wait which means that the ThreadWorker remains waiting indefinitely (unless a spurious wake-up occurs) and so the call to join() in shutdown() can't be completed stopping the shutdown of the thread pool. One suggestion to fix this would be to provide a predicate to the wait(lock), such as m_pool->m_conditional_lock.wait( lock , [this]{ return (!this->m_pool->m_queue.empty()) || this->m_pool->m_shutdown;});

@mtrebi mtrebi self-assigned this Jun 11, 2018
@kermado
Copy link

kermado commented Jan 16, 2019

I don't think that this is sufficient. The wait predicate AFAIK is to guard against spurious wakeup. It's probably recommended, but I don't think that it solves the race condition.

Your call operator should look more like this:

void operator()()
{
  std::unique_lock<std::mutex> lock(m_pool->m_conditional_mutex);
  while (!m_pool->m_shutdown) {
  {
    if (m_pool->m_queue.empty()) {
      m_pool->m_conditional_lock.wait(lock, [this] { return this->m_pool->m_shutdown || this->m_pool->m_queue.empty() == false; });
    }
    std::function<void()> func;
    if (m_pool->m_queue.dequeue(func)) {
      func();
    }
  }
}

The problem described is then fixed by locking the mutex in your shutdown() function. Note that wait() releases the lock when called and acquires the lock when woken. You need to acquire the lock in order to prevent wait() being called after the call to notify_all() in shutdown().

void shutdown()
{
  {
    std::lock_guard<std::mutex> lock(m_conditional_mutex);
    m_shutdown = true;
    m_conditional_lock.notify_all();
  }
  
  for (int i = 0; i < m_threads.size(); ++i)
  {
    if (m_threads[i].joinable())
    {
      m_threads[i].join();
    }
  }
}

Similarly, in submit(), you need to acquire the lock before calling notify_all() to prevent notify_all() being called between checking whether the queue is empty and calling wait().

{
  std::lock_guard<std::mutex> lock(m_conditional_mutex);
  m_conditional_lock.notify_one();
}

@yvoinov
Copy link
Contributor

yvoinov commented Jan 16, 2019

@kermado
Copy link

kermado commented Jan 16, 2019

@yvoinov In the case of submit(), if we don't lock the mutex then I believe that notify_one() could be called between checking whether the queue is empty and wait(). In that case, we will wait forever, or until something else notifies the condition variable. I placed the lock to ensure that we only notify if the other thread is currently waiting. If you don't lock then how else do you prevent this race condition?

https://stackoverflow.com/questions/20982270/sync-is-unreliable-using-stdatomic-and-stdcondition-variable

@yvoinov
Copy link
Contributor

yvoinov commented Jan 17, 2019

@yvoinov In the case of submit(), if we don't lock the mutex then I believe that notify_one() could be called between checking whether the queue is empty and wait(). In that case, we will wait forever, or until something else notifies the condition variable. I placed the lock to ensure that we only notify if the other thread is currently waiting. If you don't lock then how else do you prevent this race condition?

https://stackoverflow.com/questions/20982270/sync-is-unreliable-using-stdatomic-and-stdcondition-variable

After a lot of testing, I'm using quite different thread pool implementation:

https://github.com/yvoinov/thread-pool-cpp

There is no race condition on submit. Without lock notify_*. This implementation works perfectly, no deadlocks, no races.

To prevent lost wakeup it is enough to lock conditional variable mutex during notify_one(). It described here:

http://www.modernescpp.com/index.php/c-core-guidelines-be-aware-of-the-traps-of-condition-variables

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants