
tf_remap very CPU-intensive #175

Open
peci1 opened this issue Sep 5, 2018 · 4 comments
peci1 commented Sep 5, 2018

I was wondering what's eating my CPU when replaying bagfiles, and I found out it's tf_remap. In the bagfile, I have TF messages at about 1-2 kHz, and the code I'm running spawns like 10 TF listeners. This is enough to saturate 2 CPU cores on my machine just by running the tf_remap node.

I verified that the remapping itself is not the bottleneck; the bottleneck is really rospy.Publisher.publish(), which apparently cannot keep up publishing at such a high rate from Python to that many subscribers.
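To make that concrete, here is a minimal sketch of the per-message work the node conceptually does. This is plain Python with illustrative names (`remap_frames`, plain dicts instead of geometry_msgs types), not the actual tf/tf_remap source; the point is that the lookup itself is cheap, which is consistent with the publish call being the expensive part:

```python
def remap_frames(transforms, mappings):
    """Apply old->new frame-id mappings to a list of transforms.

    Each transform is modeled as a plain dict with 'frame_id' and
    'child_frame_id' keys (a stand-in for the real message type).
    """
    for t in transforms:
        t['frame_id'] = mappings.get(t['frame_id'], t['frame_id'])
        t['child_frame_id'] = mappings.get(t['child_frame_id'],
                                           t['child_frame_id'])
    return transforms

# In the real node this runs inside the /tf_old subscriber callback and
# the result is handed to rospy.Publisher.publish() -- which is where
# the time goes, not in these dictionary lookups.
```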

So I've written an experimental C++ tf_remap, and the CPU load is much lower (about 30-40% of one CPU core). This package also solves a few more pain points of tf_remap, like the inability to work with tf_static, or the inability to remove TF frames.
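Purely to illustrate the frame-removal feature, the same mapping idea can be extended so that a frame mapped to an empty name is dropped rather than renamed. This is an assumption about the interface made only for the sketch, not the package's actual code:

```python
def remap_or_drop(transforms, mappings):
    """Rename frames per mappings; drop transforms touching a frame
    that maps to '' (hypothetical removal semantics for illustration)."""
    out = []
    for t in transforms:
        parent = mappings.get(t['frame_id'], t['frame_id'])
        child = mappings.get(t['child_frame_id'], t['child_frame_id'])
        if parent == '' or child == '':
            continue  # frame removed -> skip this transform entirely
        out.append({'frame_id': parent, 'child_frame_id': child})
    return out
```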

Now the question is: do we want to merge this package as a tool into geometry2 (not into geometry, since it's TF2-native)? Or should I rather release it as a standalone package? I assume that completely substituting the current Python node with the C++ one is not an option, for many reasons...

A follow-up question, once the previous one is settled: should a warning be added to tf/tf_remap that it is known to be inefficient, with a link to the C++ node for users who observe performance problems?


Here's the code that can easily achieve the high-load situation I'm describing (note that tf/static_transform_publisher does not use /tf_static, but it periodically publishes to /tf, in this case at 1 kHz):

#!/bin/bash

trap "trap - SIGTERM && kill -- -$$" SIGINT SIGTERM EXIT

roscore -p $(echo $ROS_MASTER_URI | cut -d':' -f3) &

sleep 5

for i in $(seq 1 10); do
	rosrun tf tf_monitor 2>/dev/null 1>/dev/null &
done

rosrun tf static_transform_publisher 0 0 0 0 0 0 a b 1 /tf:=/tf_old &
rosrun tf static_transform_publisher 0 0 0 0 0 0 d e 1 /tf:=/tf_old &

rosrun tf tf_remap _mappings:='[{"old": "b", "new": "c"}]' &

htop -p $(pgrep -n -f tf_remap)


For completeness, I attach profiler outputs from profiling the original Python node and the C++ node: profilers.zip. In Python, thread.lock.acquire accounts for almost all of the self time, followed by write_data at 1.5%. In C++, pthread_mutex_lock is also the most time-consuming call, but it takes only 17% of the self time.

@chadrockey
Member

This sounds promising. I'll wait for others to chime in, but if there are tests or some other method to prove that the C++ node performs equivalently to the Python node under the same usage, I would support adding a C++ target with the name tf_remap and moving the Python script to tf_remap_legacy or similar. We could add a rosout warning for one distro to make users aware of the change. The only breaking use case I can think of off the top of my head is a user launching the script with python /opt/ros/.../tf_remap; rosrun, roslaunch, and ./ should all work.

I would prefer bringing the C++ version upstream for wider distribution in either case. I should be available to help this process.

@tfoote
Member

tfoote commented Sep 8, 2018

Certainly if we can get better performance from the C++ version it's definitely worth considering switching.

@peci1
Author

peci1 commented Mar 10, 2019

As this kind of timed out, I released the package into Melodic separately. However, I'm still open to the possibility of adding it to ros/geometry.

@peci1
Author

peci1 commented Apr 29, 2019

Should I still start a PR replacing the Python version?
