Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add test to check for execution order of entities in various executors #2536

Closed

Conversation

wjwwood
Copy link
Member

@wjwwood wjwwood commented May 23, 2024

This is a regression test for the issue reported in #2532, it's meant to be a notice when this changes, but the order its testing for is not guaranteed by the Executor API. See the above issue for more information.

I expect this new test to fail on rolling and jazzy and probably pass on Iron, but I still need test that, I've only been working on rolling. I have another forthcoming pr to propose a fix for jazzy, but I wanted the new test separate so it could be back ported to jazzy and/or iron separately.

@wjwwood
Copy link
Member Author

wjwwood commented May 23, 2024

See #2537 for a proposed fix that makes this pass on rolling.

@jmachowinski
Copy link
Contributor

Nitpick, the tests only covers waitables. timers, subscriptions etc are not covered.

@wjwwood
Copy link
Member Author

wjwwood commented May 23, 2024

That's true, but I do not believe it matters, they are handled the same way. I can look at adding some use of the other entities, but they are less convenient to work with in tests.

@wjwwood
Copy link
Member Author

wjwwood commented May 28, 2024

I added a test for subscriptions, which revealed that the EventsExecutor orders them oddly, perhaps as a race or perhaps in some other order (it seems consistent to me), but I haven't looked into why yet.

@wjwwood
Copy link
Member Author

wjwwood commented May 30, 2024

@jmachowinski I also just pushed changes to test timers, which was a good call to test as they actually always go in "call order" (to which I interpreted expiration order), and that means that the expected execution order is always the call order/expiration order, not the registration order, which is different from waitables.

For subscriptions, it is the registration order, but not in the case of the events executor, which does it based on the events order which is driven by the rmw implementation.

I'll also add service clients and servers, but I hypothesize they will behave like subscriptions.

@jmachowinski
Copy link
Contributor

@wjwwood FYI We talked about this bug / PR in the client library working group meeting. The consensus was, that the order being implicit the registration order is too implicit. It would be more desirable, to add an extra priority parameter, and use this one to determine the execution order for the special case that multiple entities are ready. We think this would be the best way forward for rolling.

@alsora @fujitatomoya Did I forget anything ?

@wjwwood
Copy link
Member Author

wjwwood commented Jun 13, 2024

I added an Iron version of this which, as expected, passes without changes to rclcpp: #2561

I think with that, this test is at least ready to be merged to the iron branch. We will not be able to merge this pr (it was for discussion purposes), but instead it will have to be combined with a fix to make it pass, like in #2537, or some other solution if we find a better one.


FYI We talked about this bug / PR in the client library working group meeting. The consensus was, that the order being implicit the registration order is too implicit. It would be more desirable, to add an extra priority parameter, and use this one to determine the execution order for the special case that multiple entities are ready. We think this would be the best way forward for rolling.

Well then this parameter would either be required for every entity the user creates (which I think would be too annoying to be a requirement) or we'd still need a fallback tie-breaker when they opt-out of that option, for which I think registration order is probably as good of a heuristic as any.

I'd actually prefer to see us expose the scheduling function, so that the user can do arbitrary logic to decide which order to execute things in when multiple things are ready simultaneously. Then they could do them in any order they like, and we could provide a few obvious default options like registration order or user-defined priorities.

@jmachowinski
Copy link
Contributor

Well then this parameter would either be required for every entity the user creates (which I think would be too annoying to be a requirement) or we'd still need a fallback tie-breaker when they opt-out of that option, for which I think registration order is probably as good of a heuristic as any.

My thoughts were in the direction of putting the priority in the the rclcpp::PublisherOptions etc and have it default to zero.
I would then also state in the documentation, that in case of equal priorities, there is no guaranteed order. Anybody who really cares about this corner case can than assign priorities, and for the 99% who probably won't care there would be no change.

@wjwwood
Copy link
Member Author

wjwwood commented Jun 13, 2024

That's fair, explicitly not maintaining the order is what we're doing now (at least from an API level), but this test/fix is more to avoid being unnecessarily disruptive.

But I do agree that letting the executor implementations have room to order them differently (when given equal priority) may be a necessary bit of UB to allow them to optimize and innovate.

@wjwwood
Copy link
Member Author

wjwwood commented Jun 13, 2024

In summary, I'm in favor of the idea, but in addition to this test and my suggested fix.

@jmachowinski
Copy link
Contributor

In summary, I'm in favor of the idea, but in addition to this test and my suggested fix

How hard is it to remove the test later on, if we go for the favorable solution ? I want to avoid, that we lock us in now...

For jazzy we all agreed, that we should try to reproduce the 'old' behavior / add the test.

@wjwwood
Copy link
Member Author

wjwwood commented Jun 18, 2024

How hard is it to remove the test later on, if we go for the favorable solution ? I want to avoid, that we lock us in now...

Easy, in fact the test explicitly describes itself as testing UB and is there just to let us know when we change it, perhaps unintentionally:

// The purpose of this test is to check that the order of callbacks happen
// in some relation to the order of events and the order in which the callbacks
// were registered.
// This is not a guarantee of executor API, but it is a bit of UB that some
// have come to depend on, see:
//
// https://github.com/ros2/rclcpp/issues/2532
//
// It should not be changed unless there's a good reason for it (users find it
// the least surprising out come even if it is not guaranteed), but if there
// is a good reason for changing it, then the executors effected can be skipped,
// or the test can be removed.
// The purpose of this test is to catch this regressions and let the authors of
// the change read up on the above context and act accordingly.

So if in the future this behavior changes on purpose, we can remove or modify this test at will, without concern to breaking behavior. At least in my opinion.


For jazzy we all agreed, that we should try to reproduce the 'old' behavior / add the test.

Ok then I will push for #2537 to get merged and close this issue out for now. If anyone wants to work on adding explicit priorities as an option after that, I think that would be great.

@wjwwood
Copy link
Member Author

wjwwood commented Jun 18, 2024

I'm closing this in favor of #2537 (which includes this pr plus a fix) and #2561 (which adds this test in Iron, showing it used to pass before Jazzy, without modification).

@wjwwood wjwwood closed this Jun 18, 2024
@wjwwood wjwwood deleted the wjwwood/deterministic_execution_order_ub_test branch June 18, 2024 23:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants