
Investigate data races during concurrent chunk inserts #2454

Open · mweisgut opened this issue Apr 25, 2022 · 3 comments

@mweisgut (Collaborator)
While working on #2453, I added a stress test that concurrently appends chunks to a table. Executing the test with ThreadSanitizer enabled reveals data races.
The data races occurred with both the tbb::zero_allocator and the ZeroAllocator, so the issue was not introduced in #2453.

Stress test:

```cpp
TEST_F(StressTest, ChunkInserts) {
  auto column_definitions =
      TableColumnDefinitions({{"column_1", DataType::Int, false}, {"column_2", DataType::String, true}});
  auto table = std::make_shared<Table>(column_definitions, TableType::Data, ChunkOffset{2});

  const auto iterations_per_thread = 1000;
  const auto run = [&]() {
    for (auto iteration = 0; iteration < iterations_per_thread; ++iteration) {
      auto vs_int = std::make_shared<ValueSegment<int>>();
      auto vs_str = std::make_shared<ValueSegment<pmr_string>>();
      vs_int->append(5);
      vs_str->append("five");
      table->append_chunk({vs_int, vs_str});
    }
  };

  // Create the async objects and spawn them asynchronously (i.e., as their own threads).
  const auto thread_count = 50u;
  std::vector<std::future<void>> thread_futures;
  thread_futures.reserve(thread_count);
  for (auto thread_num = 0u; thread_num < thread_count; ++thread_num) {
    // We want a future to the running thread so that we can stop it after future.wait(timeout);
    // otherwise, the test would freeze.
    thread_futures.emplace_back(std::async(std::launch::async, run));
  }

  // Wait for completion or timeout (should not occur).
  for (auto& thread_future : thread_futures) {
    // We give this a lot of time, not because we usually need that long for 50 threads to finish, but because
    // sanitizers and other tools like valgrind sometimes bring a high overhead.
    if (thread_future.wait_for(std::chrono::seconds(600)) == std::future_status::timeout) {
      ASSERT_TRUE(false) << "At least one thread got stuck and did not commit.";
    }
    // Retrieve the future so that exceptions stored in its state are thrown.
    thread_future.get();
  }

  const auto chunk_count = table->chunk_count();
  EXPECT_EQ(chunk_count, iterations_per_thread * thread_count);
  for (auto chunk_id = ChunkID{0}; chunk_id < chunk_count; ++chunk_id) {
    const auto chunk = table->get_chunk(chunk_id);
    EXPECT_NE(chunk, nullptr);
    EXPECT_EQ(chunk->size(), 1);
  }
}
```

@julianmenzler (Member)
We have the CI stage clangRelWithDebInfoThreadSanitizer, which runs hyriseSystemTest, and thus StressTest.*.

Why does/did the CI not bring up this issue?

@mweisgut (Collaborator, Author) commented Apr 29, 2022

> We have the CI stage clangRelWithDebInfoThreadSanitizer, which runs hyriseSystemTest, and thus StressTest.*.
>
> Why does/did the CI not bring up this issue?

Because the referenced stress test does not exist on the master branch.
Once it is added, the CI catches the races:
https://hyrise-ci.epic-hpi.de/blue/organizations/jenkins/hyrise%2Fhyrise/detail/PR-2453/5/pipeline

@julianmenzler (Member)

True. I did not read thoroughly enough, sorry.
