Description
In CI the test_node gtest target of autoware_ground_filter times out at the
60 s CTest limit. The suite gets through 24 of 25 cases successfully, then
hangs on GroundFilterComponentTest.TestParameterUpdateBranches. The last
log line emitted before the timeout is the well-known rosout warning:
2: [ RUN ] GroundFilterComponentTest.TestParameterUpdateBranches
2: [WARN] [...] [rcl.logging_rosout]: Publisher already registered for node name: 'GroundFilter'.
2/7 Test #2: test_node ........................***Timeout 60.04 sec
Failing CI run
Root-cause analysis
The warning is the same one that appears in many other passing cases in this
suite. It is rclcpp telling us that a node named GroundFilter already exists
in the process and a second one is being constructed.
That is exactly what TestParameterUpdateBranches does:
GroundFilterComponentTest::SetUp() (test/test_node.cpp:147) calls
rclcpp::init and constructs node_, a GroundFilterComponent named
GroundFilter. It also creates result_sub_, a subscription on node_'s
output topic.
- The body of
TestParameterUpdateBranches (test/test_node.cpp:1181) then
constructs a second GroundFilterComponent with the same name
GroundFilter, while the first one is still alive.
- Each
GroundFilterComponent constructor:
- declares ~30 parameters,
- creates publishers / subscribers,
- calls
setupTF() which builds a tf2_ros::TransformListener with the
default spin_thread = true, spawning a dedicated executor thread that
subscribes to /tf and /tf_static.
- The test then calls
node->set_parameter(...) six times in a row. With two
GroundFilter-named nodes plus their TF-listener helper threads coexisting
in one process, one of these set_parameter calls never returns and the
test hangs to the 60 s timeout.
onParameter (src/node.cpp:659) takes mutex_ and calls get_parameter(...)
unconditionally for several names. If parameter machinery interacts with
duplicate-name node state (e.g. via rosout / parameter services), this is a
plausible deadlock surface.
The smoking-gun evidence that this is the trigger: every other test case that
emits the same Publisher already registered for node name: 'GroundFilter'
warning is one that also constructs a second GroundFilterComponent. The
unique thing about TestParameterUpdateBranches is that it then aggressively
mutates parameters on that second instance.
Suggested fix
Don't run a second GroundFilterComponent while the fixture's node_ is
alive. Either:
- Call
node_.reset() at the top of TestParameterUpdateBranches before
constructing the local one, or
- Reuse the fixture's
node_ and call node_->set_parameter(...) on it (this
mirrors TestParameterUpdate at test/test_node.cpp:345, which does pass).
Longer-term, two GroundFilterComponent instances should be safe to
co-exist (just give them distinct names via NodeOptions), but the immediate
fix is to stop creating duplicates in the test.
Reproduction
colcon test --packages-select autoware_ground_filter \
--ctest-args -R '^test_node$' --output-on-failure
The hang is timing-sensitive (CI hits it; local runs may not), but the
duplicate-node pattern is deterministic in the source.
Description
In CI the
test_nodegtest target ofautoware_ground_filtertimes out at the60 s CTest limit. The suite gets through 24 of 25 cases successfully, then
hangs on
GroundFilterComponentTest.TestParameterUpdateBranches. The lastlog line emitted before the timeout is the well-known rosout warning:
Failing CI run
build-and-testonmain, commita5c9fcdbnt (jazzy, false, false) / build-and-test-jazzy, stepTest, line 6863Root-cause analysis
The warning is the same one that appears in many other passing cases in this
suite. It is rclcpp telling us that a node named
GroundFilteralready existsin the process and a second one is being constructed.
That is exactly what
TestParameterUpdateBranchesdoes:GroundFilterComponentTest::SetUp()(test/test_node.cpp:147) callsrclcpp::initand constructsnode_, aGroundFilterComponentnamedGroundFilter. It also createsresult_sub_, a subscription onnode_'soutputtopic.TestParameterUpdateBranches(test/test_node.cpp:1181) thenconstructs a second
GroundFilterComponentwith the same nameGroundFilter, while the first one is still alive.GroundFilterComponentconstructor:setupTF()which builds atf2_ros::TransformListenerwith thedefault
spin_thread = true, spawning a dedicated executor thread thatsubscribes to
/tfand/tf_static.node->set_parameter(...)six times in a row. With twoGroundFilter-named nodes plus their TF-listener helper threads coexistingin one process, one of these
set_parametercalls never returns and thetest hangs to the 60 s timeout.
onParameter(src/node.cpp:659) takesmutex_and callsget_parameter(...)unconditionally for several names. If parameter machinery interacts with
duplicate-name node state (e.g. via rosout / parameter services), this is a
plausible deadlock surface.
The smoking-gun evidence that this is the trigger: every other test case that
emits the same
Publisher already registered for node name: 'GroundFilter'warning is one that also constructs a second
GroundFilterComponent. Theunique thing about
TestParameterUpdateBranchesis that it then aggressivelymutates parameters on that second instance.
Suggested fix
Don't run a second
GroundFilterComponentwhile the fixture'snode_isalive. Either:
node_.reset()at the top ofTestParameterUpdateBranchesbeforeconstructing the local one, or
node_and callnode_->set_parameter(...)on it (thismirrors
TestParameterUpdateattest/test_node.cpp:345, which does pass).Longer-term, two
GroundFilterComponentinstances should be safe toco-exist (just give them distinct names via
NodeOptions), but the immediatefix is to stop creating duplicates in the test.
Reproduction
The hang is timing-sensitive (CI hits it; local runs may not), but the
duplicate-node pattern is deterministic in the source.