feat(api): Do not enqueue json commands on protocol load #14759
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             edge   #14759      +/-  ##
=========================================
+ Coverage   67.17%   67.19%   +0.02%
=========================================
  Files        2495     2495
  Lines       71483    71405      -78
  Branches     9020     8992      -28
=========================================
- Hits        48020    47984      -36
+ Misses      21341    21305      -36
+ Partials     2122     2116       -6
```

Flags with carried forward coverage won't be shown.
Looks great to me! Nice work.
robot-server/tests/integration/http_api/runs/test_json_v6_protocol_run.tavern.yaml
```python
async def _add_command_and_execute(self) -> None:
    for command in self._queued_commands:
        result = await self._protocol_engine.add_and_execute_command(command)
        if result.error:
            raise ProtocolCommandFailedError(
                original_error=result.error,
                message=f"{result.error.errorType}: {result.error.detail}",
            )
```
Ignoring queued commands for a moment, this doesn't look like it matches the existing implementation based on `QueueWorker` + `get_next_to_execute()`. This does not have:

- Special handling of `RunStoppedError` to `break` instead of raising an error
- Yielding to the event loop on each command
- Pausing on recoverable errors (added in "feat(api): Pause run after recoverable errors" #14646)
  - You can see "feat(api): Pause when `pick_up_tip()` errors in a Python protocol" #14753 for how I'm doing this for Python protocols. If we're making JSON behave more like Python now, maybe this should work the same way.
- Prioritization of `"intent": "setup"` commands
  - But according to the tests, this is still working, somehow?

Do we need to handle those things, or are they already handled elsewhere in some way that I'm missing?
> Prioritization of `"intent": "setup"` commands
>
> But according to the tests, this is still working, somehow?

Setup commands are added to the queue in PE, but they are being executed after the initial home commands instead of at the end of the run. Adding prioritization will happen in a follow-up PR.

> Special handling of `RunStoppedError` to `break` instead of raising an error
>
> Pausing on recoverable errors (added in #14646)
>
> You can see #14753 for how I'm doing this for Python protocols. If we're making JSON behave more like Python now, maybe this should work the same way.

Will add logic in this PR.

> Yielding to the event loop on each command

I do not think we need this because `add_and_execute_command` is async and `add_command` is sync, but I will test this to make sure we are not blocking anything.
> I do not think we need this because `add_and_execute_command` is async and `add_command` is sync, but I will test this to make sure we are not blocking anything.
We do need it, unfortunately, for the reasons described in this comment in the old implementation:
opentrons/api/src/opentrons/protocol_engine/execution/queue_worker.py
Lines 80 to 83 in 9322657
```python
await self._command_executor.execute(command_id=command_id)
# Yield to the event loop in case we're executing a long sequence of commands
# that never yields internally. For example, a long sequence of comment commands.
await asyncio.sleep(0)
```
Unlike JavaScript, Python's `await` isn't guaranteed to yield to the event loop. It will only do so when it calls something that goes back to the event loop, like network I/O or an `asyncio.sleep()`.
Other than that, that all makes sense, thanks!
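To illustrate that point, here is a minimal standalone sketch (not Opentrons code; all names are illustrative). Awaiting a plain coroutine that never reaches the event loop does not let other tasks run; the explicit `asyncio.sleep(0)` is what creates the yield point, which is exactly what the `QueueWorker` comment above describes:

```python
import asyncio

async def double(i):
    # A plain coroutine with no I/O: awaiting it never
    # reaches the event loop.
    return i * 2

async def worker(log):
    for i in range(3):
        await double(i)          # does NOT yield to the event loop
        log.append(f"worker {i}")
        await asyncio.sleep(0)   # explicit yield, like the QueueWorker's

async def watcher(log):
    for i in range(3):
        log.append(f"watcher {i}")
        await asyncio.sleep(0)

async def main():
    log = []
    await asyncio.gather(worker(log), watcher(log))
    return log

log = asyncio.run(main())
# With the sleep(0) calls, the two tasks interleave. Remove them
# from worker() and all the "worker" entries come first.
print(log)
```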
I see. I tested this by adding a background task that prints, then `time.sleep(3)`, and I was able to see the printing sequence while executing the commands. So is this test not sufficient? @SyntaxColoring @sfoster1
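One caution about that kind of test (a hypothetical sketch, not the actual test code): a background task can only print at yield points, and `time.sleep()` blocks the whole event loop, so what the test observes depends entirely on where the yields happen to be:

```python
import asyncio
import time

async def background(log):
    log.append("background ran")

async def blocking_main(log):
    asyncio.get_running_loop().create_task(background(log))
    # time.sleep() blocks the whole event loop: the background task
    # cannot run during these 0.2 seconds because we never yield.
    time.sleep(0.2)
    log.append("blocking work done")
    # Only after this yield does the background task get a turn.
    await asyncio.sleep(0)

log = []
asyncio.run(blocking_main(log))
print(log)  # ['blocking work done', 'background ran']
```

So seeing the background task print only proves that *some* yield point exists along the path, not that every command boundary yields.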
I don't think we need the await because, crucially, what we're talking about here is in a different async stack than the lines you linked to, @SyntaxColoring. This PR does not replace or even touch the execution queue worker.

The way the engine works, just so we agree, is that there are fundamentally at least three async stacks: the execution stack, which is controlled by the execution queue worker linked above, is owned by the engine, and `await`s command implementations; the runner task-queue stack, which is what is changed here, is owned by the runner, and (now) dispatches commands via `add_and_execute_command`; and the stack that calls the runner's primary interface, which is, I think, owned by the server's run data manager.

The execution stack, as you mention, definitely needs an explicit yield in case it's running a bunch of commands that have no yield points. So it does: it's in what you linked above, which is still what's used and which this PR does not touch.

The server stack does not need an explicit yield as long as the runner uses the task queue, since if the runner uses a task queue it `await`s `self._task_queue.join()`, which awaits a future, which is a yield point.

The task-queue stack (which is what's changed here) does not need an explicit yield if it's using something that eventually uses `wait_for`, because `wait_for` `await`s an `asyncio.Event`, which is a yield point.
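The last point can be seen in a small standalone sketch (illustrative names, not Opentrons code): parking on an `asyncio.Event` via `asyncio.wait_for` awaits a future under the hood, so the loop stays free for other tasks without any explicit `asyncio.sleep(0)`:

```python
import asyncio

async def task_queue_worker(event: asyncio.Event, log: list):
    # asyncio.wait_for ultimately awaits a Future tied to the Event,
    # which is a guaranteed yield point -- no explicit sleep(0) needed.
    await asyncio.wait_for(event.wait(), timeout=1.0)
    log.append("worker resumed")

async def main():
    event, log = asyncio.Event(), []
    worker = asyncio.create_task(task_queue_worker(event, log))
    await asyncio.sleep(0)  # let the worker park on the Event
    log.append("loop was free while worker waited")
    event.set()             # wake the worker
    await worker
    return log

log = asyncio.run(main())
print(log)  # ['loop was free while worker waited', 'worker resumed']
```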
Ah, that makes sense, thanks. I missed that this was still going through the `QueueWorker` under the hood.
Which means I was also wrong about this:

> Special handling of `RunStoppedError` to `break` instead of raising an error

Per the documentation of `ProtocolEngine.add_and_execute()`, if the run is stopped, it actually doesn't raise a `RunStoppedError`; it returns the command still `queued`.
> Which means I was also wrong about this:

@SyntaxColoring I had a feeling, but I wanted to prove that through a test I am trying to add :-)
@SyntaxColoring I added e2e tests to prove that python and json protocols now behave the same when stopping a protocol mid run
Disregard my terrifying comments; I tested, and this does work. The App & ODD don't appear to be affected either!
Looks excellent!
This looks good to me to merge, thanks. Here are some comments on the tests that we've talked about in Slack.
robot-server/tests/integration/http_api/runs/test_play_stop_papi.tavern.yaml
Nice, TY!
Oh, and I renamed this from
…pi.tavern.yaml Co-authored-by: Max Marrone <[email protected]>
# Overview

Closes https://opentrons.atlassian.net/browse/EXEC-352.

First step towards fixit commands: do not enqueue JSON protocol commands on protocol load.

# Test Plan

Tested with JSON protocols and Postman:

- Make sure loading a protocol and executing it happen in order.
- Make sure GET run `/commands` returns the list properly with successful commands.
- Make sure loading a failing protocol fails the run and fails the command.
- Make sure GET run `/commands` for failed runs returns the list of commands, with the last command being the failed command.
- Fixed e2e tests to comply with these changes.

# Changelog

- Do not enqueue commands in PE for JSON commands upon load.
- Execute commands one by one when the run gets started, the same way we do for Python protocols.

# Review requests

Do the changes make sense? GET run `/commands` will not return the full list of commands if the run did not start. It's a change we are making so JSON protocols run like Python protocols; are we OK with this?

# Risk assessment

Medium. Need to do smoke tests for JSON protocols and make sure these changes do not affect anything.

---------

Co-authored-by: Max Marrone <[email protected]>