Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BT action client of type nav2_msgs::action::NavigateToPose sends ActionNodeErrorCode SEND_GOAL_TIMEOUT #76

Open
AnthonyBreton opened this issue Jun 6, 2024 · 9 comments

Comments

@AnthonyBreton
Copy link

Running ROS 2 Humble on Ubuntu 22.04

The problem

I'm using a behavior tree that has a BT action node to send a goal to the Nav2 navigate_to_pose action server. My problem is that if the robot navigates for more than 10 seconds (which is the default server_timeout of a BT::RosActionNode), the node's onFailure method is triggered and returns the error code SEND_GOAL_TIMEOUT. From what I've read of the bt_action_node.hpp file, this means that the client did not receive a response regarding whether the goal was accepted or not from the server within 10 seconds.

The thing is that even if I haven't received a goal_response, the Nav2 action server runs normally and the robot navigates (since async_send_goal is used I'm guessing). I think the problem lies in the way that bt_action_node.hpp (line 459 and after) checks the future to receive the goal_handle. It seems to never return rclcpp::FutureReturnCode::SUCCESS in the spin_until_future_complete with no delay.

What's weird is that if the navigation time is under 10 seconds, then the BT action node does not fail, but I can see in the prints that the goal_response_callback only gets called after the robot reaches its navigation goal. This means that the response about whether the action goal is accepted or not is only received after the robot is done navigating. I also don't receive any feedback from the action server during the navigation, probably since I have no goal_handle.

I also have another BT action node for docking in my behavior tree (where the action server was custom-made), and I don't seem to have a problem with it since I can see in the debug prints that the goal was accepted and I receive feedback.
Because of this, I'm not sure if it's a Nav2 problem with the way that the navigate_to_pose action server was written or an edge case that BehaviorTree.ROS2 did not consider.

Maybe a fix

I found a way to make it work for the navigate_to_pose action by setting the nodelay in bt_action_node.hpp (line 452) to 10ms instead of 0ms:
auto nodelay = std::chrono::milliseconds(0); -> auto nodelay = std::chrono::milliseconds(10);
I don't have a clear understanding of the reason why the spin_until_future_complete has a timeout of 0ms. I guess it's to have a non-blocking behavior.

I've recreated the same issue with a simple action client node that calls the navigate_to_pose action server and handles the goal accepted response with the same logic that bt_action_node.hpp has. I'll leave the .hpp and .cpp code if you want to test it.

.hpp

#ifndef NAVIGATE_TO_POSE_CLIENT_HPP
#define NAVIGATE_TO_POSE_CLIENT_HPP

#include <rclcpp/rclcpp.hpp>
#include <rclcpp_action/rclcpp_action.hpp>
#include <geometry_msgs/msg/pose_stamped.hpp>
#include <nav2_msgs/action/navigate_to_pose.hpp>

class NavigateToPoseClient : public rclcpp::Node
{
public:
    using NavigateToPose = nav2_msgs::action::NavigateToPose;
    using GoalHandleNavigateToPose = rclcpp_action::ClientGoalHandle<NavigateToPose>;

    explicit NavigateToPoseClient(const rclcpp::NodeOptions & options = rclcpp::NodeOptions());

    void send_goal(const geometry_msgs::msg::PoseStamped & goal_pose);
    bool spin_until_goal_received();

private:
    void goal_response_callback(const rclcpp_action::ClientGoalHandle<NavigateToPose>::SharedPtr & goal);
    void feedback_callback(GoalHandleNavigateToPose::SharedPtr, const std::shared_ptr<const NavigateToPose::Feedback> feedback);
    void result_callback(const GoalHandleNavigateToPose::WrappedResult & result);

    rclcpp_action::Client<NavigateToPose>::SharedPtr action_client_;
    std::shared_future<GoalHandleNavigateToPose::SharedPtr> future_goal_handle_;
    GoalHandleNavigateToPose::SharedPtr goal_handle_;
    rclcpp::Time time_goal_sent_;
    bool goal_received_ = false;
};

#endif // NAVIGATE_TO_POSE_CLIENT_HPP

.cpp

#include "osc_nav/nodes/navigate_to_pose_client.hpp"

NavigateToPoseClient::NavigateToPoseClient(const rclcpp::NodeOptions & options)
    : Node("navigate_to_pose_client", options)
{
    action_client_ = rclcpp_action::create_client<NavigateToPose>(this, "navigate_to_pose");

    // Wait for the action server to be available
    while (!action_client_->wait_for_action_server(std::chrono::seconds(10))) {
        RCLCPP_INFO(this->get_logger(), "Waiting for action server to be available...");
    }
}

void NavigateToPoseClient::send_goal(const geometry_msgs::msg::PoseStamped & goal_pose)
{
    if (!action_client_) {
        RCLCPP_ERROR(this->get_logger(), "Action client not initialized");
        return;
    }

    auto goal_msg = NavigateToPose::Goal();
    goal_msg.pose = goal_pose;

    RCLCPP_INFO(this->get_logger(), "Sending goal");

    auto send_goal_options = rclcpp_action::Client<NavigateToPose>::SendGoalOptions();
    send_goal_options.goal_response_callback = std::bind(&NavigateToPoseClient::goal_response_callback, this, std::placeholders::_1);
    send_goal_options.feedback_callback = std::bind(&NavigateToPoseClient::feedback_callback, this, std::placeholders::_1, std::placeholders::_2);
    send_goal_options.result_callback = std::bind(&NavigateToPoseClient::result_callback, this, std::placeholders::_1);

    future_goal_handle_ = action_client_->async_send_goal(goal_msg, send_goal_options);
    time_goal_sent_ = this->now();
}

bool NavigateToPoseClient::spin_until_goal_received()
{
    auto nodelay = std::chrono::milliseconds(0); // 0ms creates the bug where the goal accpeted is received only at the end. 10ms solves the issue
    auto timeout = rclcpp::Duration::from_seconds(double(10));
    RCLCPP_WARN(this->get_logger(), "Timeout: %f", timeout.seconds());

    auto ret = rclcpp::spin_until_future_complete(this->get_node_base_interface(), future_goal_handle_, nodelay);
    if(ret != rclcpp::FutureReturnCode::SUCCESS)
    {
        if((this->now() - time_goal_sent_) > timeout)
        {
            RCLCPP_ERROR(this->get_logger(), "Goal timed out");
            return false;
        }
        else
        {
            return false;
        }
    }
    else
    {
        goal_received_ = true;
        goal_handle_ = future_goal_handle_.get();
        future_goal_handle_ = {};

        if(!goal_handle_)
        {
            RCLCPP_ERROR(this->get_logger(), "Goal rejected by server");
            return false;
        }
        return true;
    }
}

void NavigateToPoseClient::goal_response_callback(const rclcpp_action::ClientGoalHandle<NavigateToPose>::SharedPtr & goal)
{
    auto goal_handle = goal.get();
    if (!goal_handle) {
        RCLCPP_ERROR(this->get_logger(), "Goal was rejected by the action server");
    } else {
        RCLCPP_INFO(this->get_logger(), "Goal accepted by the action server, waiting for result");
    }
}

void NavigateToPoseClient::feedback_callback(GoalHandleNavigateToPose::SharedPtr, const std::shared_ptr<const NavigateToPose::Feedback> feedback)
{
    RCLCPP_INFO(this->get_logger(), "Received feedback: distance remaining = %f", feedback->distance_remaining);
}

void NavigateToPoseClient::result_callback(const GoalHandleNavigateToPose::WrappedResult & result)
{
    switch (result.code) {
        case rclcpp_action::ResultCode::SUCCEEDED:
            RCLCPP_INFO(this->get_logger(), "Goal succeeded");
            break;
        case rclcpp_action::ResultCode::ABORTED:
            RCLCPP_ERROR(this->get_logger(), "Goal was aborted");
            break;
        case rclcpp_action::ResultCode::CANCELED:
            RCLCPP_ERROR(this->get_logger(), "Goal was canceled");
            break;
        default:
            RCLCPP_ERROR(this->get_logger(), "Unknown result code");
            break;
    }
}

int main(int argc, char ** argv)
{
    rclcpp::init(argc, argv);
    auto node = std::make_shared<NavigateToPoseClient>();

    // Example goal pose, replace with your desired goal
    geometry_msgs::msg::PoseStamped goal_pose;
    goal_pose.header.frame_id = "map";
    goal_pose.header.stamp = node->now();
    goal_pose.pose.position.x = 0.0;
    goal_pose.pose.position.y = -1.0;
    goal_pose.pose.orientation.w = 1.0;

    node->send_goal(goal_pose);

    while (rclcpp::ok())
    {
        if (node->spin_until_goal_received())
        {
            break;
        }
        rclcpp::sleep_for(std::chrono::milliseconds(100)); // Small sleep to prevent busy-waiting
    }

    rclcpp::spin(node);
    rclcpp::shutdown();
    return 0;
}
@marj3220
Copy link
Contributor

marj3220 commented Jun 7, 2024

I think this explains better issue #48

@facontidavide
Copy link
Collaborator

please try branch issue_76 and let me know if problem persists

@AnthonyBreton
Copy link
Author

@facontidavide Unfortunately, it does not seem to solve the issue. I still receive a SEND_GOAL_TIMEOUT error after 10 seconds. I changed the log level of the tree_executor to debug. Here's the traceback, maybe it can help you:

image

... 10s later

image

And my overridden onFailure method of my BT::RosActionNode<Nav2>

image

I still have the same behavior where my other BT action node for docking works properly but not the nav2 one. I also tried adding a max_duration to the spin_some like I did previously with the timeout of the spin_until_future_complete, but it did not work like last time:

client_instance_->callback_executor.spin_some(std::chrono::nanoseconds(10000000));

@facontidavide
Copy link
Collaborator

facontidavide commented Jun 12, 2024

I need a way to reproduce this. Is there any dummy server and client that can be used to reproduce the error?

It seems to me like a potential error in the server. Also, which version of ROS are you using?

@AnthonyBreton
Copy link
Author

I'm running ROS2 Humble on Ubuntu 22.04

I joined a package with a custom BT with a simple sequence where a condition waits for geometry_msgs::msg::Pose on a topic to be received then set it to the blackboard where a Nav2_action_client can take it and send an action goal to the Nav2 action server /navigate_to_pose.

dummy_behavior_tree.zip

All you will need is:

  • A workspace with a pre-setup robot running nav2. I tried it with Turtlebot3 running example of Nav2 and it works I can reproduce the bug https://docs.nav2.org/getting_started/index.html#running-the-example
  • Run the tree_executor node in a separate terminal with this launch cmd ros2 launch dummy_behavior_tree bt.launch.py you can use the verbose parameter to activate or deactivate the StdCoutLogger (verbose:=false)
  • Send in another terminal a goal pose of your choice on this topic /dummy_goal_pose with this cmd line ros2 topic pub /dummy_goal_pose geometry_msgs/msg/Pose "position: x: 0.0 y: 0.0 z: 0.0 orientation: x: 0.0 y: 0.0 z: 0.0 w: 1.0 " -1
  • Make sur the ETA of the navigation is greater than the server_timeout default to 10s to get the problem. If it's lesser than 10s, then you will see that the callback saying that the goal is accepted by the server triggers only once the robot reaches it's goal.

Here's the behavior tree :

image

I hope I didn't leave any bugs in the dummy_behavior_tree package. If I did, let me know, and I'll try to fix them. I can also spend some time helping you solve this bug if you don't have the time. However, I have limited knowledge of the behaviortree.ros2 package, so it might take me longer to find and properly resolve the issue.

@AnthonyBreton
Copy link
Author

@facontidavide, were you able to reproduce the error? I also realized that downloading a zip from a GitHub comment might not be ideal. If you prefer, I can create a public repository with the code instead of the zip file.

@marj3220
Copy link
Contributor

@facontidavide @AnthonyBreton the branch issue_76 doesn't seem to fix the bugs on my side either.

@marj3220
Copy link
Contributor

marj3220 commented Aug 9, 2024

I'm working on a fix on this fork: https://github.com/Oscar-Robotics/BehaviorTree.ROS2/tree/fix_issue_76

It is heavily inspired by Nav2's bt_action_node logic. The timeout for the spin_until_future_complete corresponds to half the time between ticks, so it does not block the execution of the rest of the tree. Right now, the value is hard coded.

The only thing missing before a PR is having a way to know the tick frequency of the tree, probably through the blackboard as Nav2 does:
image

If anyone knows a simple way of doing this, I would love the feedback. @facontidavide @MarqRazz

@pucciland95
Copy link

Hi all,
I confirm that the problem of @AnthonyBreton is also present in my case.
The ROS2 action client correctly triggers the ROS2 action server in my behaviour tree node. Still, the client cannot get acceptance from the server, causing problems like the impossibility of cancelling the action while it is running or the necessity to set server_timeout_ to be bigger than the time the Action Server takes to finish the action.

Setting:
auto nodelay = std::chrono::milliseconds(10);
instead of:
auto nodelay = std::chrono::milliseconds(0);

fixes the problem for me.
Checking the source code of how spin_until_future_complete there is the following line (in humble branch at least):

  // TODO(wjwwood): does not work recursively; can't call spin_node_until_future_complete
  // inside a callback executed by an executor.

Could adding such a slight delay avoid recursion and fix the problem?

Cheers,

Niccolo

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants