Fix exclusive hardware control mode switching on controller failed activation #1522
@firesurfer This PR has the fix that should solve your issue. It would be great if you could test it on your hardware and let us know.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #1522 +/- ##
==========================================
- Coverage 88.01% 87.96% -0.05%
==========================================
Files 121 124 +3
Lines 12412 12535 +123
Branches 1109 1117 +8
==========================================
+ Hits 10924 11027 +103
- Misses 1083 1099 +16
- Partials 405 409 +4
@saikishor Thanks a lot. Just for my understanding: if a controller fails activation now, a command mode switch is issued so that the hardware interface can release internally locked hardware? I will try to test this patch with an Iron-based setup.
@saikishor I just tested your PR and for me it didn't work so far. It could be that I did something wrong. What I did was:
When I activate the controller, it fails as it should. But when I then try to load another controller, I run into the same issue as before: the interfaces haven't been stopped.
@firesurfer Yes, you are right! The hardware will be told to stop the interfaces that were started for the failing controller. This should let the hardware component release its internally locked control modes.
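For readers following along, here is a minimal sketch of the hardware-side behavior described above. It is not the ros2_control API: `ExclusiveModeHardware`, `perform_mode_switch`, and the `joint1/...` interface names are hypothetical stand-ins. The sketch assumes the hardware locks a single control mode and frees it only when it is told to stop the matching interfaces.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a hardware component that supports only one
// active control mode at a time. This is NOT the ros2_control API.
class ExclusiveModeHardware
{
public:
  // Apply a mode switch: release the stopped interfaces first, then claim the
  // started ones. Returns false if a start request conflicts with a mode that
  // is still locked.
  bool perform_mode_switch(
    const std::vector<std::string> & start_interfaces,
    const std::vector<std::string> & stop_interfaces)
  {
    for (const auto & itf : stop_interfaces) {
      if (locked_mode_ && *locked_mode_ == mode_of(itf)) {
        locked_mode_.reset();  // release the internally locked mode
      }
    }
    for (const auto & itf : start_interfaces) {
      const std::string mode = mode_of(itf);
      if (locked_mode_ && *locked_mode_ != mode) {
        return false;  // another mode still holds the lock
      }
      locked_mode_ = mode;
    }
    return true;
  }

  bool is_locked() const { return locked_mode_.has_value(); }

private:
  // Extract the mode from an interface name, e.g. "joint1/position" -> "position".
  static std::string mode_of(const std::string & interface_name)
  {
    const auto pos = interface_name.rfind('/');
    return pos == std::string::npos ? interface_name : interface_name.substr(pos + 1);
  }

  std::optional<std::string> locked_mode_;
};

// Walk through the scenario from this thread: a controller claims a mode,
// fails activation, and a second controller then needs a different mode.
bool release_after_failed_activation_demo()
{
  ExclusiveModeHardware hw;
  assert(!hw.is_locked());
  const bool started = hw.perform_mode_switch({"joint1/position"}, {});
  // Without a stop request, the other mode is rejected.
  const bool blocked = !hw.perform_mode_switch({"joint1/velocity"}, {});
  // The stop request sent after the failed activation frees the lock.
  const bool released = hw.perform_mode_switch({}, {"joint1/position"}) && !hw.is_locked();
  const bool second = hw.perform_mode_switch({"joint1/velocity"}, {});
  return started && blocked && released && second;
}
```

Without the stop request, the second controller's mode switch stays rejected, which matches the symptom reported earlier in this thread.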
@firesurfer Can you share the logs here? Can you print something in your hardware interfaces to see whether you receive the stop interfaces? Basically, if your controller fails activation, the hardware should immediately receive a request to release the interfaces, even before a new controller is activated. If you go over the tests, I am testing exactly this case.
@firesurfer For initial testing, removing the installed dependencies might help. For me, it worked after removing them.
So I removed the installed packages: Log while failing activation:
Log during activation of a
I added a log message in the I am not sure if perhaps rebasing on Iron broke something. |
Hello @firesurfer! It seems like you are not using the changes, because If you use the changes of this branch, your error should be like the following
ros2_control/controller_manager/src/controller_manager.cpp Lines 1509 to 1515 in 4a4d780
Your error log does not mention Releasing interfaces, so please cross-check your setup. Thank you!
@saikishor I made a mistake while rebasing onto Iron. When I do it properly, I run into a lot of conflicts, and when I try to resolve them it doesn't compile anymore. Do you have any recommendations on how to easily test this with an Iron setup?
You should be able to compile the rolling stack (also this PR) on your iron distro directly. override should also work, no need to uninstall your binary install. (check with |
Thank you @christophfroehlich. Yes, you can do this. In case you continue to have issues, then try to just cherry-pick the last 2 commits of this PR (77e932e and 4a4d780) onto your iron branch |
@saikishor Thanks. Cherry-picking worked :) When the controller now fails, I run into an endless loop of ros2_control trying to activate the controller, failing, stopping the interfaces, starting them again, failing again...
But I can confirm: the interfaces are stopped!
@firesurfer I'm glad that the fix worked for you. If that's the case, do you mind reviewing this PR and approving it? |
@saikishor Shouldn't the issue of running into an endless loop of activation -> failing... be addressed first?
@firesurfer I don't really see how there can be an endless loop. The CM never retries an activation that has failed. Maybe the print you added shows up many times because you have this component per joint and the Resource Manager iterates over the HW components to do the prepare and perform switch. That behavior lives inside the Resource Manager, and I'm not touching that part. EDIT: How many times does it print the
@firesurfer I can confirm after testing it again on our setup that the fix works and there is no looping activation from the CM side. Thank you |
As I am seeing the controller's activation failure message again and again, I would say this is not just because of the printout in the hardware_interface. I had it running for about 10-20 s before killing the process. I am wondering whether that could be because I cherry-picked it onto the Iron branch.
That would be strange to have this in the iron branch. Can you share some logs? EDIT: Can you also share how you are trying to activate this controller? |
@saikishor I just tested in a clean dockerized environment instead of my host installation and it seems to work fine. I will review and approve the PR then. |
I see your point. However, whether a user controller fails or not is not in our hands, and the LifeCycleNode allows this kind of behavior. So I think it is reasonable to have this fix. We can warn users that if this situation arises, realtimeness is no longer assured.
This pull request is in conflict. Could you fix it @saikishor? |
The reported issue seems reasonable to me to fix, and the PR LGTM except for some copy-paste remnants in the comments.
Co-authored-by: Christoph Fröhlich <[email protected]>
…f switch_controllers
Co-authored-by: Bence Magyar <[email protected]>
If the hardware supports only one active control mode at a time and a controller fails activation, ros2_control does not propagate this information to the hardware, so the hardware might keep the resource locked internally. This PR resolves such situations, as explained in #1487 and #1486.
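The intended flow can be sketched roughly as follows. This is a simplified illustration, not the controller_manager code: `Controller`, `ModeSwitchLog`, and `activate_with_release` are hypothetical names, and the ordering (hardware is told to start interfaces before the controller is activated) is an assumption made for the sketch.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical, simplified types for illustration; NOT the ros2_control API.
struct ModeSwitchLog
{
  std::vector<std::string> started;  // interfaces the hardware was told to start
  std::vector<std::string> stopped;  // interfaces the hardware was told to stop
};

struct Controller
{
  std::string name;
  std::vector<std::string> command_interfaces;
  std::function<bool()> on_activate;  // returns false when activation fails
};

// Assumed flow: tell the hardware to start the controller's interfaces, try to
// activate the controller, and on failure tell the hardware to stop the same
// interfaces so that it can release any internally locked control mode.
bool activate_with_release(const Controller & controller, ModeSwitchLog & hardware)
{
  hardware.started.insert(
    hardware.started.end(), controller.command_interfaces.begin(),
    controller.command_interfaces.end());

  if (controller.on_activate()) {
    return true;
  }

  // Activation failed: propagate the release to the hardware.
  hardware.stopped.insert(
    hardware.stopped.end(), controller.command_interfaces.begin(),
    controller.command_interfaces.end());
  return false;
}
```

With a controller whose `on_activate` returns `false`, the `stopped` list ends up mirroring the `started` list, which is the information the hardware was missing before this change.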
Fixes #1487
Closes #1492