hintedhandoff_additional_test.TestHintedHandoff.test_hintedhandoff_sync_point_api
failure: sync_point DONE instead of IN_PROGRESS; and hang during shutdown
#18733
Failure spotted in dtest-debug job
https://jenkins.scylladb.com/job/scylla-master/job/dtest-debug/295/testReport/junit/hintedhandoff_additional_test/TestHintedHandoff/Run_Dtest_Parallel_Cloud_Machines___FullDtest___full_split006___test_hintedhandoff_sync_point_api/
https://jenkins.scylladb.com/job/scylla-master/job/dtest-debug/295/testReport/junit/hintedhandoff_additional_test/TestHintedHandoff/Run_Dtest_Parallel_Cloud_Machines___FullDtest___full_split006___test_hintedhandoff_sync_point_api_2/
The two Jenkins pages refer to the same test but report different causes of failure.
The first failure was an assertion failure: the sync point was reported as DONE instead of IN_PROGRESS.
The second failure was because the dtest framework found an unexpected coredump from node1 on shutdown.
The logs of node1 at the end look like this:
And dtest-gw0.log says this:
So what happened is that the shutdown of the node hung (somewhere inside `hints_manager`), the dtest framework timed out, told the process to generate a coredump, and then killed it. The test then failed because a coredump was found. (The failure report is not perfect: the dtest framework should say that it timed out waiting for node shutdown, but I digress.)

GitHub doesn't allow me to upload the coredump because it's over 25MB (even after compression). It could be useful for debugging the issue, though, to understand where the process hung, so I recommend that whoever sees this issue download the coredump locally before the artifacts get cleared.
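For readers unfamiliar with this failure mode, here is a minimal sketch of the stop sequence described above: ask the process to stop, wait with a timeout, request a core dump, then kill. This is not the dtest framework's actual code. The function name `stop_with_timeout`, the choice of `SIGABRT` to trigger the dump, and the returned outcome strings are all assumptions made for illustration.

```python
import signal
import subprocess
import sys


def stop_with_timeout(proc, timeout_s, core_grace_s=5.0):
    """Stop a child process and report how the shutdown went.

    Hypothetical helper, not part of dtest. Returns one of:
      "clean"            - the process exited within timeout_s after SIGTERM
      "timeout_aborted"  - shutdown hung; the process died after SIGABRT
      "timeout_killed"   - shutdown hung and SIGABRT was not enough; SIGKILL sent
    """
    proc.terminate()  # SIGTERM: request a graceful shutdown
    try:
        proc.wait(timeout=timeout_s)
        return "clean"
    except subprocess.TimeoutExpired:
        pass  # shutdown is hanging; fall through to the forced path

    # Assumption: SIGABRT is used to make the process dump core
    # (whether a core file is written depends on the system's core limits).
    proc.send_signal(signal.SIGABRT)
    try:
        proc.wait(timeout=core_grace_s)
        return "timeout_aborted"
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL as a last resort
        proc.wait()
        return "timeout_killed"
```

Returning a distinct outcome for the timeout path lets the caller report "timed out waiting for node shutdown" directly, rather than failing later only because a coredump was found.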
Anyway uploading logs:
dtest-gw0.log
node1.log
node2.log
Could it be related to the recent huge change in hints? 64ba620