Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data race during shutdown #26

Open
michihenning opened this issue Apr 24, 2023 · 6 comments
Open

Data race during shutdown #26

michihenning opened this issue Apr 24, 2023 · 6 comments

Comments

@michihenning
Copy link

I have a client implemented with NTF that talks to a pre-existing server. Everything works fine. I can resolve, connect, and read and write messages. The code runs clean under address sanitizer. However, thread sanitizer complains about data races whenever I shut down the interface. The interface is created like so:

    ntca::InterfaceConfig icf;
    string name = "prx_cli_" + to_string(id_);
    icf.setThreadName(name);
    icf.setMinThreads(1);
    icf.setMaxThreads(1);

    ntca::ResolverConfig resolver_config;
    resolver_config.setClientTimeout(10); // We wait for up to 10 seconds for the DNS to respond.
    icf.setResolverConfig(resolver_config);
    icf.setResolverEnabled(true);

    {
        lock_guard<decltype(mutex_)> lock(mutex_);
        interface_ = ntcf::System::createInterface(icf);
        interface_->start();
        strand_ = interface_->createStrand();  // Single strand to serialize callbacks.
    }

    auto promise = make_shared<std::promise<void>>();
    auto fut = promise->get_future();
    interface_->execute([this, promise]() {
        thread_id_ = this_thread::get_id();
        promise->set_value();
    });

    fut.get();  // Wait until the interface thread has come up.

    interface_->execute([this]() { resolve_and_connect(true); });

There is a single thread in the interface. I have assertions all through the code to verify that callbacks are invoked on the single interface thread. All callbacks are created using the appropriate createCallback functions, and they specify the single strand for the interface.

To shut down the interface, I have this code:

    bsl::shared_ptr<ntci::Interface> interface;
    {
        lock_guard<decltype(mutex_)> lock(mutex_);
        interface = interface_;
    }

    interface->resolver()->shutdown();
    interface->resolver()->linger(
    interface->closeAll();
    interface->shutdown();
    interface->linger();

At the time I call stop(), there is no activity, other than a single receive() call pending, whose completion handler has not popped yet because the server has not written anything. There are no pending send() calls or other async operations.

In case it matters, my reads specify a timeout:

    ntca::ReceiveOptions r_opts;
    r_opts.setMinSize(GW_HDR_SIZE);
    r_opts.setMaxSize(GW_HDR_SIZE);
    const int64_t secs =  2 * heartbeat_interval_sec;
    bsls::TimeInterval deadline(socket_->currentTime().addSeconds(secs));
    r_opts.setDeadline(deadline);

    auto receive_done = [this](auto receiver, auto blob, auto event) {
        BSLS_ASSERT(receiver == socket_);
        assert(this_thread::get_id() == thread_id_.load());
        if (event.isError()) {
            if (state_ != ShuttingDown) {
                // Print error, close the connection, and reconnect...
                try_reconnect();
            }
            return;
        }
        read_gw_hdr_handler(std::move(blob));
    };
    auto callback = socket_->createReceiveCallback(receive_done, strand_);
    auto error = socket_->receive(r_opts, callback);
    if (error) {

        CS_LOG(FATAL) << "Cannot read: " << error.text();
        abort();
    }

Once I call my stop code, thread sanitizer complains. I have pasted some of the salient output here. T11 is the interface thread.

==================
WARNING: ThreadSanitizer: data race (pid=1625761)
  Read of size 8 at 0x7b2400004f38 by main thread:
    #0 bsl::shared_ptr<BloombergLP::ntci::Authorization>::~shared_ptr() /opt/bb/include/bslstl_sharedptr.h:4702 (ProxyClientStream_test+0x1d6f5b)
    #1 BloombergLP::ntci::Invoker<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)>::~Invoker() /opt/bb/include/ntci_invoker.h:324 (ProxyClientStream_test+0x1d44e5)
    #2 void BloombergLP::bslma::SharedPtrInplaceRep_ImpUtil::dispose<BloombergLP::ntci::Invoker<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> >(BloombergLP::ntci::Invoker<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> const&) /opt/bb/include/bslma_sharedptrinplacerep.h:332 (ProxyClientStream_test+0x1dfadb)
    #3 BloombergLP::bslma::SharedPtrInplaceRep<BloombergLP::ntci::Invoker<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> >::disposeObject() /opt/bb/include/bslma_sharedptrinplacerep.h:371 (ProxyClientStream_test+0x1dee5b)
    #4 BloombergLP::bslma::SharedPtrRep::releaseRef() <null> (ProxyClientStream_test+0x4a6b85)
    #5 ProxyClientStreamTest_connect_after_send_Test::TestBody() /home/mhenning/code/cpp/ts-gateway/test/ProxyClientStream/ProxyClientStream_test.cc:205 (ProxyClientStream_test+0x8f165)
    #6 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null> (ProxyClientStream_test+0x15c619)
    #7 void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null> (ProxyClientStream_test+0x151809)
    #8 testing::Test::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2508 (ProxyClientStream_test+0x12331b)
    #9 testing::TestInfo::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2684 (ProxyClientStream_test+0x123fd6)
    #10 testing::TestSuite::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2816 (ProxyClientStream_test+0x124a88)
    #11 testing::internal::UnitTestImpl::RunAllTests() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:5338 (ProxyClientStream_test+0x132d28)
    #12 bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2433 (ProxyClientStream_test+0x15e571)
    #13 bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) <null> (ProxyClientStream_test+0x15344b)
    #14 testing::UnitTest::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:4925 (ProxyClientStream_test+0x130e8f)
    #15 RUN_ALL_TESTS() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/include/gtest/gtest.h:2473 (ProxyClientStream_test+0x179df2)
    #16 main /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googlemock/src/gmock_main.cc:63 (ProxyClientStream_test+0x179ced)

  Previous write of size 8 at 0x7b2400004f38 by thread T11 (mutexes: write M1597):
    #0 operator new(unsigned long) ../../../../src/libsanitizer/tsan/tsan_new_delete.cpp:64 (libtsan.so.0+0x8c032)
    #1 operator new(unsigned long, BloombergLP::bslma::Allocator&) /opt/bb/include/bslma_allocator.h:632 (ProxyClientStream_test+0x1be4cc)
    #2 void bsl::shared_ptr<BloombergLP::ntci::Invoker<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> >::createInplace<bsl::function<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> const&, BloombergLP::bslma::Allocator*&>(BloombergLP::bslma::Allocator*, bsl::function<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> const&, BloombergLP::bslma::Allocator*&) /opt/bb/include/bslstl_sharedptr.h:4944 (ProxyClientStream_test+0x1cbca8)
    #3 BloombergLP::ntci::Callback<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)>::Callback(bsl::function<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> const&, bsl::shared_ptr<BloombergLP::ntci::Strand> const&, BloombergLP::bslma::Allocator*) /home/mhenning/code/ntf-core/groups/ntc/ntci/ntci_callback.h:940 (ProxyClientStream_test+0x3141a0)
    #4 BloombergLP::ntci::TimerCallbackFactory::createTimerCallback(bsl::function<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> const&, BloombergLP::bslma::Allocator*) /home/mhenning/code/ntf-core/groups/ntc/ntci/ntci_timercallbackfactory.h:97 (ProxyClientStream_test+0x3141a0)
    #5 BloombergLP::ntcr::StreamSocket::receive(BloombergLP::ntca::ReceiveOptions const&, BloombergLP::ntci::Callback<void (bsl::shared_ptr<BloombergLP::ntci::Receiver> const&, bsl::shared_ptr<BloombergLP::bdlbb::Blob> const&, BloombergLP::ntca::ReceiveEvent const&)> const&) /home/mhenning/code/ntf-core/groups/ntc/ntcr/ntcr_streamsocket.cpp:4923 (ProxyClientStream_test+0x3141a0)
    #6 operator() /home/mhenning/code/cpp/ts-gateway/src/ProxyClientStream.cc:1133 (ProxyClientStream_test+0x1ae284)
    #7 invoke /opt/bb/include/bslstl_function_invokerutil.h:927 (ProxyClientStream_test+0x1b78ef)
    #8 BloombergLP::bslstl::Function_Variadic<void ()>::operator()() const /opt/bb/include/bslstl_function.h:1518 (ProxyClientStream_test+0x3e6fd9)
    #9 BloombergLP::ntcs::Chronology::announce() /home/mhenning/code/ntf-core/groups/ntc/ntcs/ntcs_chronology.cpp:1157 (ProxyClientStream_test+0x3e6fd9)

  As if synchronized via sleep:
    #0 sleep ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:348 (libtsan.so.0+0x6314b)
    #1 ProxyClientStreamTest_connect_after_send_Test::TestBody() /home/mhenning/code/cpp/ts-gateway/test/ProxyClientStream/ProxyClientStream_test.cc:203 (ProxyClientStream_test+0x8f14e)
    #2 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null> (ProxyClientStream_test+0x15c619)
    #3 void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null> (ProxyClientStream_test+0x151809)
    #4 testing::Test::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2508 (ProxyClientStream_test+0x12331b)
    #5 testing::TestInfo::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2684 (ProxyClientStream_test+0x123fd6)
    #6 testing::TestSuite::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2816 (ProxyClientStream_test+0x124a88)
    #7 testing::internal::UnitTestImpl::RunAllTests() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:5338 (ProxyClientStream_test+0x132d28)
    #8 bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2433 (ProxyClientStream_test+0x15e571)
    #9 bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) <null> (ProxyClientStream_test+0x15344b)
    #10 testing::UnitTest::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:4925 (ProxyClientStream_test+0x130e8f)
    #11 RUN_ALL_TESTS() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/include/gtest/gtest.h:2473 (ProxyClientStream_test+0x179df2)
    #12 main /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googlemock/src/gmock_main.cc:63 (ProxyClientStream_test+0x179ced)
==================
WARNING: ThreadSanitizer: data race (pid=1625761)
  Read of size 8 at 0x7b2400004ed0 by main thread:
    #0 BloombergLP::bslma::SharedPtrInplaceRep<BloombergLP::ntci::Invoker<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> >::disposeRep() /opt/bb/include/bslma_sharedptrinplacerep.h:378 (ProxyClientStream_test+0x1dee90)
    #1 BloombergLP::ntcs::Chronology::Timer::close() /opt/bb/include/bslstl_sharedptr.h:4703 (ProxyClientStream_test+0x3e3e85)
    #2 ProxyClientStreamTest_connect_after_send_Test::TestBody() /home/mhenning/code/cpp/ts-gateway/test/ProxyClientStream/ProxyClientStream_test.cc:205 (ProxyClientStream_test+0x8f165)
    #3 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null> (ProxyClientStream_test+0x15c619)
    #4 void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null> (ProxyClientStream_test+0x151809)
    #5 testing::Test::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2508 (ProxyClientStream_test+0x12331b)
    #6 testing::TestInfo::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2684 (ProxyClientStream_test+0x123fd6)
    #7 testing::TestSuite::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2816 (ProxyClientStream_test+0x124a88)
    #8 testing::internal::UnitTestImpl::RunAllTests() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:5338 (ProxyClientStream_test+0x132d28)
    #9 bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2433 (ProxyClientStream_test+0x15e571)
    #10 bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) <null> (ProxyClientStream_test+0x15344b)
    #11 testing::UnitTest::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:4925 (ProxyClientStream_test+0x130e8f)
    #12 RUN_ALL_TESTS() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/include/gtest/gtest.h:2473 (ProxyClientStream_test+0x179df2)
    #13 main /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googlemock/src/gmock_main.cc:63 (ProxyClientStream_test+0x179ced)

  Previous write of size 8 at 0x7b2400004ed0 by thread T11 (mutexes: write M1597):
    #0 operator new(unsigned long) ../../../../src/libsanitizer/tsan/tsan_new_delete.cpp:64 (libtsan.so.0+0x8c032)
    #1 operator new(unsigned long, BloombergLP::bslma::Allocator&) /opt/bb/include/bslma_allocator.h:632 (ProxyClientStream_test+0x1be4cc)
    #2 void bsl::shared_ptr<BloombergLP::ntci::Invoker<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> >::createInplace<bsl::function<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> const&, BloombergLP::bslma::Allocator*&>(BloombergLP::bslma::Allocator*, bsl::function<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> const&, BloombergLP::bslma::Allocator*&) /opt/bb/include/bslstl_sharedptr.h:4944 (ProxyClientStream_test+0x1cbca8)
    #3 BloombergLP::ntci::Callback<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)>::Callback(bsl::function<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> const&, bsl::shared_ptr<BloombergLP::ntci::Strand> const&, BloombergLP::bslma::Allocator*) /home/mhenning/code/ntf-core/groups/ntc/ntci/ntci_callback.h:940 (ProxyClientStream_test+0x3141a0)
    #4 BloombergLP::ntci::TimerCallbackFactory::createTimerCallback(bsl::function<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> const&, BloombergLP::bslma::Allocator*) /home/mhenning/code/ntf-core/groups/ntc/ntci/ntci_timercallbackfactory.h:97 (ProxyClientStream_test+0x3141a0)
    #5 BloombergLP::ntcr::StreamSocket::receive(BloombergLP::ntca::ReceiveOptions const&, BloombergLP::ntci::Callback<void (bsl::shared_ptr<BloombergLP::ntci::Receiver> const&, bsl::shared_ptr<BloombergLP::bdlbb::Blob> const&, BloombergLP::ntca::ReceiveEvent const&)> const&) /home/mhenning/code/ntf-core/groups/ntc/ntcr/ntcr_streamsocket.cpp:4923 (ProxyClientStream_test+0x3141a0)
    #6 operator() /home/mhenning/code/cpp/ts-gateway/src/ProxyClientStream.cc:1133 (ProxyClientStream_test+0x1ae284)
    #7 invoke /opt/bb/include/bslstl_function_invokerutil.h:927 (ProxyClientStream_test+0x1b78ef)
    #8 BloombergLP::bslstl::Function_Variadic<void ()>::operator()() const /opt/bb/include/bslstl_function.h:1518 (ProxyClientStream_test+0x3e6fd9)
    #9 BloombergLP::ntcs::Chronology::announce() /home/mhenning/code/ntf-core/groups/ntc/ntcs/ntcs_chronology.cpp:1157 (ProxyClientStream_test+0x3e6fd9)

  As if synchronized via sleep:
    #0 sleep ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:348 (libtsan.so.0+0x6314b)
    #1 ProxyClientStreamTest_connect_after_send_Test::TestBody() /home/mhenning/code/cpp/ts-gateway/test/ProxyClientStream/ProxyClientStream_test.cc:203 (ProxyClientStream_test+0x8f14e)
    #2 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null> (ProxyClientStream_test+0x15c619)
    #3 void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null> (ProxyClientStream_test+0x151809)
    #4 testing::Test::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2508 (ProxyClientStream_test+0x12331b)
    #5 testing::TestInfo::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2684 (ProxyClientStream_test+0x123fd6)
    #6 testing::TestSuite::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2816 (ProxyClientStream_test+0x124a88)
    #7 testing::internal::UnitTestImpl::RunAllTests() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:5338 (ProxyClientStream_test+0x132d28)
    #8 bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2433 (ProxyClientStream_test+0x15e571)
    #9 bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) <null> (ProxyClientStream_test+0x15344b)
    #10 testing::UnitTest::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:4925 (ProxyClientStream_test+0x130e8f)
    #11 RUN_ALL_TESTS() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/include/gtest/gtest.h:2473 (ProxyClientStream_test+0x179df2)
    #12 main /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googlemock/src/gmock_main.cc:63 (ProxyClientStream_test+0x179ced)
==================
WARNING: ThreadSanitizer: data race (pid=1625761)
  Write of size 8 at 0x7b2400004ec0 by main thread:
    #0 operator delete(void*) ../../../../src/libsanitizer/tsan/tsan_new_delete.cpp:126 (libtsan.so.0+0x8b2c8)
    #1 BloombergLP::bslma::SharedPtrInplaceRep<BloombergLP::ntci::Invoker<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> >::disposeRep() /opt/bb/include/bslma_sharedptrinplacerep.h:378 (ProxyClientStream_test+0x1deee9)
    #2 BloombergLP::ntcs::Chronology::Timer::close() /opt/bb/include/bslstl_sharedptr.h:4703 (ProxyClientStream_test+0x3e3e85)
    #3 ProxyClientStreamTest_connect_after_send_Test::TestBody() /home/mhenning/code/cpp/ts-gateway/test/ProxyClientStream/ProxyClientStream_test.cc:205 (ProxyClientStream_test+0x8f165)
    #4 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null> (ProxyClientStream_test+0x15c619)
    #5 void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null> (ProxyClientStream_test+0x151809)
    #6 testing::Test::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2508 (ProxyClientStream_test+0x12331b)
    #7 testing::TestInfo::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2684 (ProxyClientStream_test+0x123fd6)
    #8 testing::TestSuite::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2816 (ProxyClientStream_test+0x124a88)
    #9 testing::internal::UnitTestImpl::RunAllTests() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:5338 (ProxyClientStream_test+0x132d28)
    #10 bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2433 (ProxyClientStream_test+0x15e571)
    #11 bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) <null> (ProxyClientStream_test+0x15344b)
    #12 testing::UnitTest::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:4925 (ProxyClientStream_test+0x130e8f)
    #13 RUN_ALL_TESTS() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/include/gtest/gtest.h:2473 (ProxyClientStream_test+0x179df2)
    #14 main /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googlemock/src/gmock_main.cc:63 (ProxyClientStream_test+0x179ced)

  Previous write of size 8 at 0x7b2400004ec0 by thread T11 (mutexes: write M1597):
    #0 operator new(unsigned long) ../../../../src/libsanitizer/tsan/tsan_new_delete.cpp:64 (libtsan.so.0+0x8c032)
    #1 operator new(unsigned long, BloombergLP::bslma::Allocator&) /opt/bb/include/bslma_allocator.h:632 (ProxyClientStream_test+0x1be4cc)
    #2 void bsl::shared_ptr<BloombergLP::ntci::Invoker<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> >::createInplace<bsl::function<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> const&, BloombergLP::bslma::Allocator*&>(BloombergLP::bslma::Allocator*, bsl::function<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> const&, BloombergLP::bslma::Allocator*&) /opt/bb/include/bslstl_sharedptr.h:4944 (ProxyClientStream_test+0x1cbca8)
    #3 BloombergLP::ntci::Callback<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)>::Callback(bsl::function<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> const&, bsl::shared_ptr<BloombergLP::ntci::Strand> const&, BloombergLP::bslma::Allocator*) /home/mhenning/code/ntf-core/groups/ntc/ntci/ntci_callback.h:940 (ProxyClientStream_test+0x3141a0)
    #4 BloombergLP::ntci::TimerCallbackFactory::createTimerCallback(bsl::function<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> const&, BloombergLP::bslma::Allocator*) /home/mhenning/code/ntf-core/groups/ntc/ntci/ntci_timercallbackfactory.h:97 (ProxyClientStream_test+0x3141a0)
    #5 BloombergLP::ntcr::StreamSocket::receive(BloombergLP::ntca::ReceiveOptions const&, BloombergLP::ntci::Callback<void (bsl::shared_ptr<BloombergLP::ntci::Receiver> const&, bsl::shared_ptr<BloombergLP::bdlbb::Blob> const&, BloombergLP::ntca::ReceiveEvent const&)> const&) /home/mhenning/code/ntf-core/groups/ntc/ntcr/ntcr_streamsocket.cpp:4923 (ProxyClientStream_test+0x3141a0)
    #6 operator() /home/mhenning/code/cpp/ts-gateway/src/ProxyClientStream.cc:1133 (ProxyClientStream_test+0x1ae284)
    #7 invoke /opt/bb/include/bslstl_function_invokerutil.h:927 (ProxyClientStream_test+0x1b78ef)
    #8 BloombergLP::bslstl::Function_Variadic<void ()>::operator()() const /opt/bb/include/bslstl_function.h:1518 (ProxyClientStream_test+0x3e6fd9)
    #9 BloombergLP::ntcs::Chronology::announce() /home/mhenning/code/ntf-core/groups/ntc/ntcs/ntcs_chronology.cpp:1157 (ProxyClientStream_test+0x3e6fd9)

  As if synchronized via sleep:
    #0 sleep ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:348 (libtsan.so.0+0x6314b)
    #1 ProxyClientStreamTest_connect_after_send_Test::TestBody() /home/mhenning/code/cpp/ts-gateway/test/ProxyClientStream/ProxyClientStream_test.cc:203 (ProxyClientStream_test+0x8f14e)
    #2 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null> (ProxyClientStream_test+0x15c619)
    #3 void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null> (ProxyClientStream_test+0x151809)
    #4 testing::Test::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2508 (ProxyClientStream_test+0x12331b)
    #5 testing::TestInfo::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2684 (ProxyClientStream_test+0x123fd6)
    #6 testing::TestSuite::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2816 (ProxyClientStream_test+0x124a88)
    #7 testing::internal::UnitTestImpl::RunAllTests() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:5338 (ProxyClientStream_test+0x132d28)
    #8 bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2433 (ProxyClientStream_test+0x15e571)
    #9 bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) <null> (ProxyClientStream_test+0x15344b)
    #10 testing::UnitTest::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:4925 (ProxyClientStream_test+0x130e8f)
    #11 RUN_ALL_TESTS() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/include/gtest/gtest.h:2473 (ProxyClientStream_test+0x179df2)
    #12 main /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googlemock/src/gmock_main.cc:63 (ProxyClientStream_test+0x179ced)
@mattrm456
Copy link
Contributor

Are you able to post the complete code for your client? Stripped down if possible, but something that still reproduces this data race report from tsan.

@michihenning
Copy link
Author

michihenning commented Apr 27, 2023

I've distilled this down as much as I can. Attached is the source for a trivial program. It implements a very simple server and client.
To avoid any potential confusion, the server is implemented using system calls only, not using NTF. It binds to port 10001, listens, accepts a single incoming connection, and then goes into a read loop that runs until the corresponding file descriptor is closed. (This is just to make the server thread block without doing anything.)

The client is implemented using NTF. It initializes the interface and connects to the server, which works fine. The main() function instantiates the server and the client and sleeps for five seconds. Then it shuts down the NTF run time. At this point, I'm getting complaints from thread sanitizer (attached).

The code is compiled with --std=c++17 -g -Wall -fsanitize=thread -fsanitize-recover=address -fno-omit-frame-pointer

If port 10001 is in use on your machine, change it to something unused at the top of the source. To keep things simple, I hardwired it. I also used assert() to bail out if there is an error, rather than confusing the issue with irrelevant error-handling noise.
tsan.txt
ntf_test.cc.txt

@michihenning
Copy link
Author

Apologies, I closed this accidentally.

@rdilipk
Copy link

rdilipk commented May 23, 2023

@michihenning I played around with this and the problem seems to be with the way the shutdown sequence is initiated in the Client destructor. You have:

Client::~Client()
{
    interface_->closeAll(); // <-- this one should go below everything
    interface_->shutdown();
    interface_->linger();
}

It looks like you want to call closeAll() as the very last thing to do after shutdown() and linger() complete. Once I did that, the tsan complaints went away in the dozen times I tried. Of course, this is just experimentation. I don't know if it's actually true but if it is, maybe it should be called out in the docs explicitly.

@michihenning
Copy link
Author

It never occurred to me to do this. I find this highly surprising. After all, I want to tell it to tear everything down and then wait until it’s all finished. Seems really counter-intuitive.

In particular, if I call closeAll() as the last thing, how do I know when it’s done? I don’t want to exit with threads still running, and I might want to shut down something else (not NTF-related) after NTF has finished shutting down.

I had a look at InterfaceStopGuard. It calls shutdown() and linger(), but not closeAll(). The doc is really lacking here. There is no mention of what actually happens when I call closeAll() or shutdown(). Does shutdown() imply closeAll()? When would I call closeAll() instead of shutdown()? When I call shutdown(), will the completion handlers for everything get invoked with a special error? I would hope so, because might have to do some cleanup of my own.

@rdilipk
Copy link

rdilipk commented May 24, 2023

To clarify, I was just experimenting with various ways of shutting down. It may have gotten rid of the tsan complaints but certainly is not necessarily the right thing to do. Let's wait for the official word.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants