Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(api): readjust hardware controller end state in engine cleanup #13431

Conversation

sanni-t
Copy link
Member

@sanni-t sanni-t commented Aug 31, 2023

Overview

While handling a bug in engine cleanup in #13339, we rearranged the sequence of motor disengage, re-engage and home, which undid the work done in to not end a run with motors disengaged. This PR re-rearranges the sequence again, with additional refactors for clarity and upcoming support for disengaging the 96-channel at the end of runs.

See this document for detailed notes on requirements for motor disengaging.

Also separates out the concerns of post-run-tip-drop & post-run-homing-and-disengaging.

Test Plan

  • On a Flex as well as OT2:
    1. Start a protocol run and while a motor is moving, cancel the run abruptly. Check that:
      • the motion is halted immediately.
      • The motor halt is smooth (i.e., not jerky & loud) and is followed by a tip drop and home
    2. Start a maintenance run via HTTP requests (not through the app), load a pipette and verify that a moveTo command succeeds without any MustHomeErrors or other errors indicating that motors are disengaged. (This checks for starting state of maintenance run after a protocol run)
    3. Delete the maintenance run and verify that the gantry does not home.
    4. Restart another maintenance run. Verify that moveTo command still works correctly as in (2), without requiring a Home. (This checks for starting state of maintenance run after a previous maintenance run)
    5. During the maintenance run, send a move command and immediately send a DEL request to delete the maintenance run. Verify that the deletion request fails with EngineConflictError. Then reattempt deletion after the movement is over and it should delete the run successfully and keep the gantry in place.

Changelog

  • Added a PostRunHardwareState enum that clearly defines the allowed end states for hardware after a run is finished.
  • Updated engine.finish() by splitting the drop_tips_and_home argument into:
    • drop_tips_after_run: whether to drop tips during the current run's cleanup
    • post_run_hardware_state: whether to home and/or disengage motors after the run cleanup
  • Hardware control changes:
    • updated api.halt() to optionally disengage motors before performing an abrupt halt/stop
    • made task cancellation and execution manager state update a part of api.stop() so that halt only cares about halting hardware motion while stop does all the cleanup.
  • Updated tests to cover all scenarios of tip drops and post-run-hardware-state.

Review requests

  • Does the code add more clarity after the refactor?
  • Is it fine to have hardware end states in the enum that aren't used yet? I've added them in preparation for 96-channel support.
  • any comments about hardware controller changes?

Risk assessment

Medium. Re-implements the intentions of #13185 so that's a state change sequence we had been using and testing with before the run cancellation bug fix. Thorough testing will make sure we aren't introducing any regressions.

@codecov
Copy link

codecov bot commented Aug 31, 2023

Codecov Report

Merging #13431 (1d2c4ff) into chore_release-7.0.0 (33e89eb) will increase coverage by 0.00%.
Report is 20 commits behind head on chore_release-7.0.0.
The diff coverage is n/a.

Additional details and impacted files

Impacted file tree graph

@@                 Coverage Diff                  @@
##           chore_release-7.0.0   #13431   +/-   ##
====================================================
  Coverage                71.67%   71.67%           
====================================================
  Files                     2431     2427    -4     
  Lines                    67664    67597   -67     
  Branches                  7812     7784   -28     
====================================================
- Hits                     48499    48452   -47     
+ Misses                   17325    17319    -6     
+ Partials                  1840     1826   -14     
Flag Coverage Δ
g-code-testing 96.44% <ø> (ø)
notify-server 89.13% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
api/src/opentrons/execute.py 54.83% <ø> (ø)
api/src/opentrons/hardware_control/api.py 82.51% <ø> (ø)
api/src/opentrons/hardware_control/ot3api.py 79.37% <ø> (+0.05%) ⬆️
...ns/hardware_control/protocols/motion_controller.py 53.33% <ø> (-1.51%) ⬇️
...pentrons/protocol_engine/create_protocol_engine.py 100.00% <ø> (ø)
...rons/protocol_engine/execution/hardware_stopper.py 100.00% <ø> (ø)
...i/src/opentrons/protocol_engine/protocol_engine.py 100.00% <ø> (ø)
api/src/opentrons/protocol_engine/types.py 97.56% <ø> (ø)
...i/src/opentrons/protocol_runner/protocol_runner.py 100.00% <ø> (ø)
robot-server/robot_server/runs/engine_store.py 95.91% <ø> (ø)

... and 30 files with indirect coverage changes

Comment on lines +650 to +652
- for 96-channel:
- HOME_THEN_DISENGAGE after protocol runs
- DISENGAGE_IN_PLACE after maintenance runs
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anticipated. Not finalized yet and not used anywhere either.

@sanni-t sanni-t marked this pull request as ready for review August 31, 2023 16:16
@sanni-t sanni-t requested a review from a team as a code owner August 31, 2023 16:16
@sanni-t
Copy link
Member Author

sanni-t commented Aug 31, 2023

Tested on an OT-2:

Start a protocol run and while a motor is moving, cancel the run abruptly. Check that:

  • the motion is halted immediately.
  • The motor halt is smooth (i.e., not jerky & loud) and is followed by a tip drop and home
  • Delete the maintenance run and verify that the gantry does not home.
  • Restart another maintenance run. Verify that moveTo command still works correctly as in (2), without requiring a Home. (This checks for starting state of maintenance run after a previous maintenance run)

Copy link
Member

@sfoster1 sfoster1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good, I like the enum approach here.

Comment on lines +500 to +501
await self._backend.hard_halt()
await self._backend.halt()
Copy link
Member Author

@sanni-t sanni-t Aug 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not fully sure about this change. I've made this change assuming that backend.hard_halt() is equivalent to disengage and backend.halt() is equivalent to stop_motors().



class PostRunHardwareState(Enum):
"""State of robot gantry & motors after a stop is performed and the hardware API is reset.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""State of robot gantry & motors after a stop is performed and the hardware API is reset.
"""State of robot gantry & motors after the run is finished and the hardware API is reset.

@jbleon95
Copy link
Contributor

Tested the following on a Flex:

Start a protocol run and while a motor is moving, cancel the run abruptly. Check that:

  • the motion is halted immediately and the motor halt is smooth (i.e., not jerky & loud) and is followed by a tip drop and home
  • Starting a maintenance run and issuing a moveTo command moves the gantry and does not raise a MustHomeError or motor error
  • Delete the maintenance run and verify that the gantry does not home.
  • Restart another maintenance run. Verify that moveTo command still works correctly without requiring a Home. (This checks for starting state of maintenance run after a previous maintenance run)
  • Deleting the maintenance run mid-move did not cancel the run and raised a RunNotIdle error
  • Deleting the maintenance run after the move had completed deleted the run successfully and did not cause the gantry to move

@SyntaxColoring
Copy link
Contributor

We think this is going to help with RQA-1450. Per discussion in that ticket, we might want to tweak behavior slightly—for example, never home after a protocol failure? We're going to merge this now, and then discuss any changes that might need to be done on top of this.

Thank you again for getting to the bottom of this!

@SyntaxColoring SyntaxColoring merged commit 2a502b1 into chore_release-7.0.0 Sep 1, 2023
27 of 32 checks passed
@SyntaxColoring SyntaxColoring deleted the api-readjust-hardware-controller-cleanup-in-engine-cleanup branch September 1, 2023 14:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants