Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

some special msg onto /initialpose could lead to heap-buffer-overflow bug #4307

Open
GoesM opened this issue May 7, 2024 · 7 comments
Open

Comments

@GoesM
Copy link
Contributor

GoesM commented May 7, 2024

Bug report

Required Info:

  • Operating System:
    • Ubuntu22.04
  • ROS2 Version:
    • humble
  • Version or commit hash:
    • the latest
  • DDS implementation:
    • defaulted

Steps to reproduce issue

Launch the navigation2 normally, as following steps:

#!/bin/bash
export ASAN_OPTIONS=halt_on_error=0:new_delete_type_mismatch=0:detect_leaks=0:log_pah=asan
source install/setup.bash
export TURTLEBOT3_MODEL=waffle
export GAZEBO_MODEL_PATH=$GAZEBO_MODEL_PATH:/opt/ros/humble/share/turtlebot3_gazebo/models
ros2 launch nav2_bringup tb3_simulation_launch.py headless:=True use_rviz:=False use_composition:=False 

Curious about how nav2 face to topic-interception, i sent validate /PoseWithCovarianceStamped msg onto topic /initialpose, which is like this:

ros2 topic pub /initialpose geometry_msgs/PoseWithCovarianceStamped " 
header:
  frame_id: map
  stamp:
    nanosec: 834647291
    sec: 1707737088
pose:
  covariance:
    - 0.25
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.25
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.0
    - 0.06853891909122467
  pose:
    orientation:
      w: 0.0
      x: 0.0
      y: 0.0
      z: 1.0
    position:
      x: -751613824.000000
      y: 0.37812575697898865
      z: 0.0" -1

[notice] the value of x of the position is -751613824.000000, which leads to the heap-buffer-overflow.

Expected behavior

Actual behavior

The ASAN reporting a stack-buffer-overflow bug to me as following, and the nav2_amcl stop its work.

=================================================================
==424016==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x76a4a2e36758 at pc 0x76a4abfe3c61 bp 0x7ffcfd208690 sp 0x7ffcfd208688
READ of size 8 at 0x76a4a2e36758 thread T0
    #0 0x76a4abfe3c60 in pf_cluster_stats (/home/***/nav2_humble/install/nav2_amcl/lib/libpf_lib.so+0x7c60) (BuildId: 92f6f6da07c69e4e8fdb674a2fb2b39a58de9a00)
    #1 0x76a4abfe21f5 in pf_init (/home/***/nav2_humble/install/nav2_amcl/lib/libpf_lib.so+0x61f5) (BuildId: 92f6f6da07c69e4e8fdb674a2fb2b39a58de9a00)
    #2 0x76a4ac546df2 in nav2_amcl::AmclNode::handleInitialPose(geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> >&) (/home/***/nav2_humble/install/nav2_amcl/lib/libamcl_core.so+0x346df2) (BuildId: ef8b35bb0836c058bcbcd5d12c39479c3c6c37be)
    #3 0x76a4ac5446c6 in nav2_amcl::AmclNode::initialPoseReceived(std::shared_ptr<geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> > >) (/home/***/nav2_humble/install/nav2_amcl/lib/libamcl_core.so+0x3446c6) (BuildId: ef8b35bb0836c058bcbcd5d12c39479c3c6c37be)
    #4 0x76a4ac726d57 in void std::__invoke_impl<void, void (nav2_amcl::AmclNode::*&)(std::shared_ptr<geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> > >), nav2_amcl::AmclNode*&, std::shared_ptr<geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> > > >(std::__invoke_memfun_deref, void (nav2_amcl::AmclNode::*&)(std::shared_ptr<geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> > >), nav2_amcl::AmclNode*&, std::shared_ptr<geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> > >&&) (/home/***/nav2_humble/install/nav2_amcl/lib/libamcl_core.so+0x526d57) (BuildId: ef8b35bb0836c058bcbcd5d12c39479c3c6c37be)
    #5 0x76a4ac7561cb in auto rclcpp::AnySubscriptionCallback<geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> >, std::allocator<void> >::dispatch(std::shared_ptr<geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> > >, rclcpp::MessageInfo const&)::'lambda'(auto&&)::operator()<std::function<void (std::shared_ptr<geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> > >)>&>(auto&&) const (/home/***/nav2_humble/install/nav2_amcl/lib/libamcl_core.so+0x5561cb) (BuildId: ef8b35bb0836c058bcbcd5d12c39479c3c6c37be)
    #6 0x76a4ac753061 in rclcpp::AnySubscriptionCallback<geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> >, std::allocator<void> >::dispatch(std::shared_ptr<geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> > >, rclcpp::MessageInfo const&) (/home/***/nav2_humble/install/nav2_amcl/lib/libamcl_core.so+0x553061) (BuildId: ef8b35bb0836c058bcbcd5d12c39479c3c6c37be)
    #7 0x76a4ac72fc34 in rclcpp::Subscription<geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> >, std::allocator<void>, geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> >, geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> >, rclcpp::message_memory_strategy::MessageMemoryStrategy<geometry_msgs::msg::PoseWithCovarianceStamped_<std::allocator<void> >, std::allocator<void> > >::handle_message(std::shared_ptr<void>&, rclcpp::MessageInfo const&) (/home/***/nav2_humble/install/nav2_amcl/lib/libamcl_core.so+0x52fc34) (BuildId: ef8b35bb0836c058bcbcd5d12c39479c3c6c37be)
    #8 0x76a4ad6257bb in rclcpp::Executor::execute_subscription(std::shared_ptr<rclcpp::SubscriptionBase>) (/opt/ros/humble/lib/librclcpp.so+0xe77bb) (BuildId: 4cca8a387f3c93d38a0567a8efc7cba9106f5d9a)
    #9 0x76a4ad625fbe in rclcpp::Executor::execute_any_executable(rclcpp::AnyExecutable&) (/opt/ros/humble/lib/librclcpp.so+0xe7fbe) (BuildId: 4cca8a387f3c93d38a0567a8efc7cba9106f5d9a)
    #10 0x76a4ad62d8af in rclcpp::executors::SingleThreadedExecutor::spin() (/opt/ros/humble/lib/librclcpp.so+0xef8af) (BuildId: 4cca8a387f3c93d38a0567a8efc7cba9106f5d9a)
    #11 0x76a4ad62dac4 in rclcpp::spin(std::shared_ptr<rclcpp::node_interfaces::NodeBaseInterface>) (/opt/ros/humble/lib/librclcpp.so+0xefac4) (BuildId: 4cca8a387f3c93d38a0567a8efc7cba9106f5d9a)
    #12 0x64f0866154cd in main (/home/***/nav2_humble/install/nav2_amcl/lib/nav2_amcl/amcl+0xe84cd) (BuildId: 6374af6c6a02284720c5116aa2ef067ebdd75367)
    #13 0x76a4ab829d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #14 0x76a4ab829e3f in __libc_start_main csu/../csu/libc-start.c:392:3
    #15 0x64f0865545e4 in _start (/home/***/nav2_humble/install/nav2_amcl/lib/nav2_amcl/amcl+0x275e4) (BuildId: 6374af6c6a02284720c5116aa2ef067ebdd75367)

0x76a4a2e36758 is located 168 bytes to the left of 352000-byte region [0x76a4a2e36800,0x76a4a2e8c700)
allocated by thread T0 here:
    #0 0x64f0865d7618 in __interceptor_calloc (/home/***/nav2_humble/install/nav2_amcl/lib/nav2_amcl/amcl+0xaa618) (BuildId: 6374af6c6a02284720c5116aa2ef067ebdd75367)
    #1 0x76a4abfe1759 in pf_alloc (/home/***/nav2_humble/install/nav2_amcl/lib/libpf_lib.so+0x5759) (BuildId: 92f6f6da07c69e4e8fdb674a2fb2b39a58de9a00)
    #2 0x76a4ac532d69 in nav2_amcl::AmclNode::on_configure(rclcpp_lifecycle::State const&) (/home/***/nav2_humble/install/nav2_amcl/lib/libamcl_core.so+0x332d69) (BuildId: ef8b35bb0836c058bcbcd5d12c39479c3c6c37be)
    #3 0x76a4ad523b8c  (/opt/ros/humble/lib/librclcpp_lifecycle.so+0x28b8c) (BuildId: e9b8e454bf87aaab775667b79aefcab12c018de9)

SUMMARY: AddressSanitizer: heap-buffer-overflow (/home/***/nav2_humble/install/nav2_amcl/lib/libpf_lib.so+0x7c60) (BuildId: 92f6f6da07c69e4e8fdb674a2fb2b39a58de9a00) in pf_cluster_stats
Shadow bytes around the buggy address:
  0x0ed5145bec90: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ed5145beca0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ed5145becb0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ed5145becc0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ed5145becd0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0ed5145bece0: fa fa fa fa fa fa fa fa fa fa fa[fa]fa fa fa fa
  0x0ed5145becf0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ed5145bed00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ed5145bed10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ed5145bed20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ed5145bed30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==424016==ABORTING

Additional information


it seems like that a high value of x could lead to such bug...

@SteveMacenski
Copy link
Member

You know what to do 😉

@GoesM
Copy link
Contributor Author

GoesM commented May 8, 2024

Ah... I think the situation here might be a bit different and not suitable for validation through validageMsg().

In my understanding, nav2_util::validateMsg() is a generic check mainly for nan, inf, and similar requirements that all messages of the same msg-type should meet. However, the error here is caused by a large value of x in a specific calculation process of nav2_amcl.

And I believe it's more appropriate to perform the check in the callback process of /initialpose in nav2_amcl, rather than adding conditions to nav2_util::validateMsg(). There are two main reasons for this:

  • The requirement for the "extreme value" seems to be specific to nav2_amcl and may not apply to all messages of this msg-type.
  • Exactly setting an appropriate value for the "extreme value" doesn't have a precise answer. Because it might depend on factors such as the size of the map where nav2 operates and the specific circumstances of other packages that use the message of PoseWithCovarianceStamped type.

Overall, I think agreeing on an upper limit for x and implementing it in the callback process of nav2_amcl would be more appropriate.

However, I'm still checking the specific line of code where the overflow occurs and haven't confirmed it yet. I think here would be a better method to fix it after a further check.

SO, I'd further confirm the specific line of code causing the buffer overflow and investigate why JUST a large value (still in the range of integer) here could lead to a heap-buffer-overflow.

@SteveMacenski
Copy link
Member

And I believe it's more appropriate to perform the check in the callback process of /initialpose in nav2_amcl

Ok! I trust your opinion 😄 Let me know what you think is best!

@SteveMacenski
Copy link
Member

@GoesM any update?

@GoesM
Copy link
Contributor Author

GoesM commented May 30, 2024

I'm so sorry for my late work about this issue because I'm struggled by other works. T_T

After the updating humble-branch, validateMsg() is deployed so the command to reproduce this issue needs some little change, as I'd edited it.

Additional Information:

**which code-line met to heap-buffer-overflow

I insert code for log as following:

void goes_debug(int cnt){
  fprintf(stderr, "[GOES|DEBUG]:---------- %d \n", cnt);
}
// Re-compute the cluster statistics for a sample set
void pf_cluster_stats(pf_t * pf, pf_sample_set_t * set)
{
...
...
  goes_debug(3);
  // Compute cluster stats
  for (i = 0; i < set->sample_count; i++) {
    ...
    ...
    goes_debug(4);
    cluster = set->clusters + cidx;
    goes_debug(5);
    cluster->weight += sample->weight;
    goes_debug(6);
    weight += sample->weight;
    goes_debug(7);
    ...
    ...
    
  }
...
}

And I catched the log during the crash:

[amcl-7] [INFO] [1717073213.332957908] [amcl]: Setting pose (5.019000): -751613824.000 0.378 3.142
[amcl-7] [GOES|DEBUG]:---------- 1 
[amcl-7] [GOES|DEBUG]:---------- 2 
[amcl-7] [GOES|DEBUG]:---------- 3 
[amcl-7] [GOES|DEBUG]:---------- 4 
[amcl-7] [GOES|DEBUG]:---------- 5 
[amcl-7] [GOES|DEBUG]:---------- 6 
[amcl-7] [GOES|DEBUG]:---------- 7 
[amcl-7] [GOES|DEBUG]:---------- 4 
[amcl-7] [GOES|DEBUG]:---------- 5 
[ERROR] [amcl-7]: process has died [pid 111426, exit code 1, cmd '/home/goes/ROS_Workstation/humble_fork/install/nav2_amcl/lib/nav2_amcl/amcl --ros-args --log-level info --ros-args -r __node:=amcl --params-file /tmp/launch_params_t5mx02k0 -r /tf:=tf -r /tf_static:=tf_static'].

It proves that the bug occurs in line:

cluster->weight += sample->weight;

And it seems like cluster = set->clusters +cidx would let the cluster became an invalidate pointer.

@GoesM
Copy link
Contributor Author

GoesM commented May 30, 2024

However, I am not an expert about the pf-calculation , and not clear whether such buffer-overflow is related to the size of map or not ?

If related to , I think we could add a check to ensure that the received pose is within the map range ?

@SteveMacenski
Copy link
Member

Sure! That seems sensible

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants