Replies: 5 comments
-
Hi @MMarcus95,

We have replicated the setup you describe and have not observed any added latency. Based on your description, the latency you see is likely caused by CPU resource contention rather than by DDS itself. The workload simulation inside the DataWriter's loop can monopolize the CPU core it runs on, which may delay other Fast DDS middleware threads if they share that core.

To help us understand the cause, please provide additional details about your system’s configuration, including the number of CPU cores, and observe the per-core CPU usage during your tests.

Thank you!
-
Hi @EugenioCollado, thanks for the feedback.

I'm running the test I mentioned inside a Docker image with Ubuntu 22.04. The CPU is a 13th Gen Intel i7-13700H. Here is a screenshot of the lscpu command:

[Screenshot: lscpu output]

This is the CPU core usage with the same workload I mentioned before (loop condition of 10000000):

[Screenshot: CPU core usage, loop condition 10000000]

In this other example I reduce the workload, using a loop condition of 100000:

[Screenshot: CPU core usage, loop condition 100000]

In both cases the sender thread is indeed using around 100% of a CPU. However, the message latency is much lower in the second case, as you can see in this plot:

[Plot: message latency for the two workloads]

Please let me know if you need any other information. Thanks!
-
Hi @MMarcus95,

After further investigation, we have replicated the behavior you described. The issue appears not to be directly related to CPU workload but rather to the publication frequency itself. In your case, the workload in the loop was consuming time, effectively reducing the publication frequency. This can be checked by replacing the workload simulation with sleep calls and seeing that the issue persists.

We believe the observed "latency" is a result of how the kernel and its scheduler manage processes. Specifically, the process might be removed from the core's cache and subsequently reloaded, introducing delays. Our tests show that isolating a core, forcing the process to run on that core, increasing its priority, and similar mechanisms that prevent the CPU from reloading the process when the publisher is called significantly improve the situation.

We would appreciate it if you could test these actions on your side and share the results. This would help confirm our conclusions and ensure that these approaches effectively mitigate the issue in your environment.

That said, since this behavior stems from kernel-level process management and not from DDS itself, changes addressing it in Fast DDS are not currently on our roadmap. However, you are welcome to reach out to us directly, and we can explore potential solutions tailored to your specific use case.

Thank you!
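
As a rough illustration of two of the mitigations mentioned above (running the publisher on a chosen core and raising its priority), here is a minimal Linux-only sketch using the POSIX threads API. The helper names are hypothetical and are not part of Fast DDS. Isolating a core is a separate system-level step that is not shown, and the priority value a given system needs is an assumption left to the caller.

```cpp
#include <pthread.h>
#include <sched.h>

// Hypothetical helper: returns the lowest-numbered core the calling thread
// is currently allowed to run on, or -1 on failure.
int first_allowed_core()
{
    cpu_set_t set;
    CPU_ZERO(&set);
    if (pthread_getaffinity_np(pthread_self(), sizeof(cpu_set_t), &set) != 0)
    {
        return -1;
    }
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
    {
        if (CPU_ISSET(cpu, &set))
        {
            return cpu;
        }
    }
    return -1;
}

// Hypothetical helper: restricts the calling thread to a single core.
// Returns 0 on success, otherwise an errno-style error code.
int pin_current_thread_to_core(int core_id)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core_id, &set);
    return pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &set);
}

// Hypothetical helper: requests the SCHED_FIFO real-time policy for the
// calling thread. Returns 0 on success, otherwise an errno-style error code
// (commonly EPERM when the process lacks the required privileges).
int set_current_thread_fifo(int priority)
{
    sched_param param{};
    param.sched_priority = priority;
    return pthread_setschedparam(pthread_self(), SCHED_FIFO, &param);
}
```

A publisher thread could call both helpers once before entering its publishing loop. The SCHED_FIFO request usually needs root or the CAP_SYS_NICE capability, so inside a container it may fail unless that capability has been granted.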
-
Hi @EugenioCollado, thanks for the useful information. I have done similar tests, and these are the results under the "workload condition" 10000000: Let me explain the legend:
In these tests I was also using a PREEMPT-RT patch for the Linux kernel. I also tried to set the FIFO scheduling policy on the dds.asyn and dds.udp threads setting respectively
-
Hi @MMarcus95,

Thank you for providing such detailed feedback and for conducting thorough tests. We appreciate your effort in exploring different configurations and sharing the results. It's great to see that the latency has been reduced with those actions.

Given the relevance of this subject and the potential interest it may hold for other users, I will move this issue to the discussions section.

Thank you again for your feedback!
-
Is there an already existing issue for this?
Expected behavior
The time between message publication and message reception does not depend on the workload of the process owning the data writer.
Current behavior
The time between message publication and message reception depends on the workload of the process owning the data writer.
Steps to reproduce
I'm testing the message latency between a data writer and a data reader. They belong to separate processes, and I'm using the discovery server, launching the server in a third process. To publish the message I'm using the following while loop
The for loop simulates a workload. RoundTripTimeMsg is instead a custom message defined as follows
The data reader is instead receiving the message and printing the elapsed time between the current time and the time saved in the timestamp field of the received message. Its callback does basically the following
Both the data writer and the data reader use their default QoS.
I have noticed that if I reduce the workload inside the while loop (using a lower number in the loop condition), the data reader receives each message sooner after it was published. If I increase the workload, the message is received later. What is happening?
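
For readers who want to reason about the measurement itself, here is a minimal sketch of the pattern described in the steps above: a busy loop as simulated workload, a timestamp taken when the sample is created, and an elapsed-time computation on the receiving side. This is not the original code from this report and it makes no Fast DDS calls. `StampedSample` and all function names are hypothetical stand-ins, and the use of `std::chrono::steady_clock` assumes that writer and reader run on the same host so that both read the same clock.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for a message carrying a send timestamp
// (not the RoundTripTimeMsg definition from the report).
struct StampedSample
{
    std::int64_t timestamp_ns;
};

// Current monotonic clock reading in nanoseconds.
std::int64_t now_ns()
{
    return std::chrono::duration_cast<std::chrono::nanoseconds>(
        std::chrono::steady_clock::now().time_since_epoch()).count();
}

// Busy loop standing in for application work. The volatile accumulator
// keeps the compiler from optimizing the loop away.
void simulate_workload(std::uint64_t iterations)
{
    volatile std::uint64_t sink = 0;
    for (std::uint64_t i = 0; i < iterations; ++i)
    {
        sink = sink + i;
    }
}

// One publisher iteration: run the workload, then stamp the sample right
// before it would be handed to the data writer.
StampedSample make_sample_after_workload(std::uint64_t iterations)
{
    simulate_workload(iterations);
    return StampedSample{now_ns()};
}

// Reader side: time elapsed between the stamp and "now".
std::int64_t elapsed_since_send_ns(const StampedSample& sample)
{
    return now_ns() - sample.timestamp_ns;
}
```

Because the stamp is taken after the workload, the busy loop does not count toward the measured time. The iteration count only changes how often samples are produced, which is why comparing it against a sleep of similar duration is a useful check.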
Fast DDS version/commit
v3.1.0
Platform/Architecture
Other. Please specify in Additional context section.
Transport layer
UDPv4
Additional context
Platform: Ubuntu Jammy Jellyfish 22.04 amd64
XML configuration file
No response
Relevant log output
No response
Network traffic capture
No response