Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🧑‍🌾 test_rosbag2_play_end_to_end flaky on Windows #1437

Open
Crola1702 opened this issue Aug 8, 2023 · 5 comments
Open

🧑‍🌾 test_rosbag2_play_end_to_end flaky on Windows #1437

Crola1702 opened this issue Aug 8, 2023 · 5 comments
Assignees
Labels
bug Something isn't working

Comments

@Crola1702
Copy link
Contributor

Crola1702 commented Aug 8, 2023

Description

Flaky test test_rosbag2_play_end_to_end in windows CI (debug, release and repeated)

Test regressions:

Expected Behavior

Test should pass

Actual Behavior

Test failing because of a timeout

To Reproduce

  1. Run a build in nightly_win_deb, nightly_win_rel or nightly_win_rep
  2. See projectroot.test_rosbag2_play_end_to_end fail

System (please complete the following information)

  • OS: Windows
  • ROS 2 Distro: Rolling
  • Install Method: Source
  • Version: Rolling latest

Additional context

Reference build: https://ci.ros2.org/view/nightly/job/nightly_win_deb/2813/

Test regressions:

Test is failing because of a timeout:

Log output:

[WARN] [1691491268.193064100] [subscription_manager_1691491266065392000]: SubscriptionManager::continue_spinning(..) finished by timeout
C:\ci\ws\src\ros2\rosbag2\rosbag2_tests\test\rosbag2_tests\test_rosbag2_play_end_to_end.cpp(80): error: Expected equality of these values:
  future.wait_for(service_call_timeout_)
    Which is: 4-byte object <01-00 00-00>
  std::future_status::ready
    Which is: 4-byte object <00-00 00-00>
 1/11 Test  #2: test_rosbag2_play_end_to_end .....***Timeout  60.00 sec

Test gets stuck when the error pops up (normally takes 15 seconds to run)


Flakiness ratio (last 15 days)

job_name last_time first_time build_count failure_count failure_percentage
nightly_win_rep 2023-08-16 2023-08-02 15 13 86.67
nightly_win_rel 2023-08-16 2023-08-02 15 14 93.33
nightly_win_deb 2023-08-16 2023-08-02 15 13 86.67

Updated 17-08-2023

Running a diff between ros2 repos in nightly_win_deb#2806 and nightly_win_deb#2807:

@@ -36,7 +36,7 @@ repositories:
   eProsima\Fast-DDS:
     type: git
     url: https://github.com/eProsima/Fast-DDS.git
-    version: 7cf43a62cabc3124721258f02c9257f451dd1971
+    version: 9ae27f174c1a33bf539c589ce2f1d6630052d1b0
   eProsima\foonathan_memory_vendor:
     type: git
     url: https://github.com/eProsima/foonathan_memory_vendor.git
@@ -208,7 +208,7 @@ repositories:
   ros2\mimick_vendor:
     type: git
     url: https://github.com/ros2/mimick_vendor.git
-    version: 6fcd465251c1e62b1dddabf6607712da8a141a2c
+    version: 24f0be689e525dbcf18cce910ac12bae26c07e3c
   ros2\orocos_kdl_vendor:
     type: git
     url: https://github.com/ros2/orocos_kdl_vendor.git
@@ -244,7 +244,7 @@ repositories:
   ros2\rclpy:
     type: git
     url: https://github.com/ros2/rclpy.git
-    version: 913afa019b0d60e7255bd84247dbeab4735d96ea
+    version: 5367703c9812680a563fb904b8c9d187da310bd3
   ros2\rcpputils:
     type: git
     url: https://github.com/ros2/rcpputils.git
@@ -264,7 +264,7 @@ repositories:
   ros2\rmw_connextdds:
     type: git
     url: https://github.com/ros2/rmw_connextdds.git
-    version: 1206113b6eff6d2a6557911866548ff4415c6852
+    version: b57d032cad10a68f0cb0d349a715b9e3c4475cf4
   ros2\rmw_cyclonedds:
     type: git
     url: https://github.com/ros2/rmw_cyclonedds.git
@@ -300,7 +300,7 @@ repositories:
   ros2\rosbag2:
     type: git
     url: https://github.com/ros2/rosbag2.git
-    version: ba199d05954d6e51975c47acd3725cac6267778f
+    version: fda5aea264b4183d94352a7c7caf266ae104aa52

First time this issue was seen:

ros2/rmw_connextdds#26 (comment)

@Crola1702 Crola1702 added the bug Something isn't working label Aug 8, 2023
@Crola1702
Copy link
Contributor Author

@clalancette, this is happening almost consistently in Windows builds. Do you think it's a good time to disable this test while #1342 is resolved? (it seems it would be a long time until we have updates on the fix PR)

@MichaelOrlov
Copy link
Contributor

@Crola1702 @clalancette I guess it hasn't been fully fixed for Windows after my overhaul in player tests per #1297.
I will try to take a look at it this week.
If I will not find a solution we can disable it for Windows again.

@MichaelOrlov
Copy link
Contributor

BTW. I have come across with this failure in one of my recent PR #1423 (comment).
My preliminary analysis was:

Failed PlayEndToEndTestFixture.play_filters_by_topic by timeout since was not able to receive confirmation about service call for player resume in 60 seconds.

@Crola1702
Copy link
Contributor Author

BTW. I have come across with this failure in one of my recent PR #1423 (comment).

I haven't seen this failure in the buildfarm 🤔

@Crola1702
Copy link
Contributor Author

I'm disabling the test in: #1452

@MichaelOrlov MichaelOrlov self-assigned this Aug 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants