You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
#include<chrono>
#include<cstring>
#include<memory>
#include<string>
#include<utility>
#include"rclcpp/rclcpp.hpp"
#include"sensor_msgs/msg/image.hpp"
#include<sensor_msgs/image_encodings.hpp>
#include<cv_bridge/cv_bridge.h>
#include<opencv2/opencv.hpp>usingnamespacestd::chrono_literals;usingnamespacecv_bridge;classTalker : publicrclcpp::Node {
private:using Topic = sensor_msgs::msg::Image;
public:explicitTalker(const rclcpp::NodeOptions &options)
: Node("image_talker", options) {
m_input_cvimage->image = cv::imread("./res/img/205_182.png");
// m_input_cvimage->image = cv::imread("./res/img/1024_768.jpeg");// m_input_cvimage->image = cv::imread("./res/img/1920_1080.jpg");
m_input_cvimage->header.frame_id = "camera_link";
m_input_cvimage->encoding = "bgr8";
// cv::imshow("input image", m_input_cvimage->image);// cv::waitKey(0);auto publishMessage = [this]() -> void {
m_input_cvimage->header.stamp = now();
auto msg = std::make_shared<Topic>();
// Note that msg.data is a std::array generated by the IDL compiler
m_input_cvimage->toImageMsg(*msg);
RCLCPP_INFO_ONCE(this->get_logger(), "Publishing with ts: %u.%u", m_input_cvimage->header.stamp.sec, m_input_cvimage->header.stamp.nanosec);
m_start_ts = now();
m_publisher->publish(std::move(*msg));
auto time_offset_ms = (now() - m_start_ts).nanoseconds() / 1000000.0F;
// RCLCPP_INFO(this->get_logger(), "publish timeout with %.3f ms", time_offset_ms);RCLCPP_INFO_EXPRESSION(this->get_logger(), time_offset_ms > 1.0, "publish timeout with %.3f ms", time_offset_ms);
// We gave up ownership and loanedMsg is not supposed to be accessed// anymore
m_count++;
};
// rclcpp::QoS qos(rclcpp::KeepLast(10));// rclcpp::QoS qos(rclcpp::SensorDataQoS());
rclcpp::QoS custom_qos_profile = rclcpp::QoS(rclcpp::QoSInitialization::from_rmw(rmw_qos_profile_default))
.history(rmw_qos_history_policy_t::RMW_QOS_POLICY_HISTORY_KEEP_LAST)
.keep_last(5)
.reliability(rmw_qos_reliability_policy_t::RMW_QOS_POLICY_RELIABILITY_BEST_EFFORT)
.durability(rmw_qos_durability_policy_t::RMW_QOS_POLICY_DURABILITY_VOLATILE)
.avoid_ros_namespace_conventions(false);
m_publisher = this->create_publisher<Topic>("image", custom_qos_profile);
// Use a timer to schedule periodic message publishing.
m_timer = this->create_wall_timer(10ms, publishMessage);
}
private:uint64_t m_count = 1;
rclcpp::Publisher<Topic>::SharedPtr m_publisher;
rclcpp::TimerBase::SharedPtr m_timer;
std::shared_ptr<cv_bridge::CvImage> m_input_cvimage{std::make_shared<cv_bridge::CvImage>()};
rclcpp::Time m_start_ts{0, 0, RCL_ROS_TIME};
};
intmain(int argc, char *argv[]) {
rclcpp::init(argc, argv);
rclcpp::NodeOptions options;
rclcpp::spin(std::make_shared<Talker>(options));
rclcpp::shutdown();
return0;
}
Expected behavior
the publish timeout log should never show, because the typical publish block time is 0.023ms.
RCLCPP_INFO_EXPRESSION(this->get_logger(), time_offset_ms > 1.0, "publish timeout with %.3f ms", time_offset_ms);
Actual behavior
with a 1min test experiment, there show many publish blocking log.
[INFO] [1712730413.187229684] [image_talker]: Publishing with ts: 1712730413.187221484
[INFO] [1712730455.772330021] [image_talker]: publish timeout with 35.392 ms
[INFO] [1712730458.092792171] [image_talker]: publish timeout with 5.929 ms
[INFO] [1712730461.802849221] [image_talker]: publish timeout with 9.390 ms
[INFO] [1712730528.413606018] [image_talker]: publish timeout with 4.516 ms
[INFO] [1712730554.978534268] [image_talker]: publish timeout with 4.702 ms
[INFO] [1712730568.429477942] [image_talker]: publish timeout with 1.766 ms
[INFO] [1712730579.609595818] [image_talker]: publish timeout with 2.998 ms
[INFO] [1712730579.637526415] [image_talker]: publish timeout with 1.362 ms
[INFO] [1712730705.313449333] [image_talker]: publish timeout with 8.309 ms
[INFO] [1712730705.337222862] [image_talker]: publish timeout with 1.668 ms
[INFO] [1712730706.387663607] [image_talker]: publish timeout with 2.402 ms
[INFO] [1712730712.672717914] [image_talker]: publish timeout with 7.183 ms
[INFO] [1712730760.985228138] [image_talker]: publish timeout with 5.793 ms
[INFO] [1712730785.781393856] [image_talker]: publish timeout with 26.846 ms
[INFO] [1712730798.727746541] [image_talker]: publish timeout with 2.861 ms
[INFO] [1712730806.782316055] [image_talker]: publish timeout with 27.989 ms
[INFO] [1712730864.781304626] [image_talker]: publish timeout with 5.097 ms
[INFO] [1712730865.357020724] [image_talker]: publish timeout with 4.860 ms
[INFO] [1712730887.712930304] [image_talker]: publish timeout with 5.525 ms
[INFO] [1712730899.765927433] [image_talker]: publish timeout with 1.542 ms
[INFO] [1712731161.108500411] [rclcpp]: signal_handler(signum=2)
Additional information
Blocking behavior https://www.eprosima.com/index.php/resources-all/performance/dds-asynchronous-vs-synchronous-publishing
The DDS standard states that the write operation may block in some specific situations, especially when the History is completely filled. Therefore, it is important to correctly configure the History management related Quality of Service (QoS) policies to prevent this from happening. This QoS policies are: History QoS Policy: how many samples are kept within the history. It could be set to KEEP ALL the samples (within the available resource limits explained next) or to KEEP LAST which keeps only a given number of samples (also known as History Depth). Resource Limits QoS Policy: this policy controls the maximum resources available. Durability QoS Policy: how long the samples are kept within the History in case that a late joiner wants to access those samples. Reliability QoS Policy: this policy specifies the delivery of the samples. If set to BEST EFFORT, the sample is sent only once and no confirmation of reception is expected whereas if this QoS is set to RELIABLE, a confirmation for all samples is expected. This latter configuration may be troublesome when sending large samples that need fragmentation because the sample’s lost fragments would have to be delivered again (with the associated network load). Without receiving the acknowledgement, the RELIABLE History is not allowed to dispose of samples.
It is more common to fill the History when the publication mode is ASYNCHRONOUS, as the data is being sent in a separate thread. For example, if the user wants to send large samples and has selected the RELIABLE Reliability QoS Policy but the network is lossy, there is a high probability that the application will block for some time (the write operation keeps adding samples to the History which is not being emptied because the data is being lost in the network). In this case, it is important to know the limits of your network and set up a proper flow controller, and to tune correctly the resource limits for your application. Therefore, configuring the Quality of Service (QoS) of your DataWriters and DataReaders is paramount to prevent this blocking behavior.
Finally, the SYNCHRONOUS publication mode also blocks during callbacks executions. This is important because it can cause a thread deadlock. This is the reason why it is not advisable to call any of the Fast DDS APIs within the callback methods (except for the read and take operations), because it is an undefined behavior due to race conditions and can easily cause a deadlock.
@MiguelCompany@EduPonz Could you please handle/triage this issue?
The 35ms delay looks like an unreasonably long delay for a simple publish() call with the
Bug report
Required Info:
ros2 env is using fastrtps with local only communication
fastdds xml is using async publishing with datasharing.
Steps to reproduce issue
the code is such a typical image publisher
Expected behavior
the publish timeout log should never show, because the typical publish block time is 0.023ms.
Actual behavior
with a 1min test experiment, there show many publish blocking log.
Additional information
this issue can be related #338
QSTs
The text was updated successfully, but these errors were encountered: