fix(robot-server): maintain correct order of protocol analyses #14762
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@ Coverage Diff @@
##             edge   #14762      +/-   ##
==========================================
+ Coverage   67.17%   67.19%   +0.02%
==========================================
  Files        2495     2495
  Lines       71483    71405      -78
  Branches     9020     8992      -28
==========================================
- Hits        48020    47984      -36
+ Misses      21341    21305      -36
+ Partials     2122     2116       -6

Flags with carried forward coverage won't be shown.
Yay!
# We only check for matching RTP values if the given protocol ID
# (& hence its summary) exists. So this assertion should never be false unless
# somehow a protocol resource is created without an analysis record.
assert len(analyses) > 0, "This protocol has no analysis associated with it."
This can happen, can't it?
- Upload a protocol. This starts an analysis in the background.
- Shut off the robot before the analysis completes.
- Power on the robot. The protocol will exist without any analyses.
That looks right. Previously, that would result in the responses providing an empty list of analyses, while now it would error out. I feel like raising an error is slightly better than just returning an empty list, but both approaches make the protocol resource unusable without a way to force reanalysis.
Hmm, it might be better to return `False` here so that it triggers a new analysis. There shouldn't be any risk of corrupting the database if we do that, ya?
Yeah, returning `False` and triggering a new analysis seems like a good idea. No, I don't see it causing any database problems.
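Something like this, perhaps (a sketch only; the function and attribute names are illustrative, not the actual robot-server code):

```python
def rtp_values_match_last_analysis(analyses, new_rtp_values) -> bool:
    """Do the requested run-time parameter values match the last analysis?

    Returning False makes the caller start a fresh analysis, so treating
    "no analyses at all" as False gracefully covers the power-loss case above
    instead of tripping an assert.
    """
    if not analyses:
        return False
    last_analysis = analyses[-1]
    # Attribute name is illustrative; the real analysis model may differ.
    return getattr(last_analysis, "run_time_parameter_values", None) == new_rtp_values
```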
assert (
    last_analysis.status != AnalysisStatus.PENDING
), "Protocol must not already have a pending analysis."
This can also trigger, I think?
- Upload a protocol. This starts an analysis in the background.
- Before analysis completes, quickly upload the same protocol.
[Edit: Oh, yeah, and this is causing this.]
OK so correct me if I'm wrong, but as far as I can tell, the background is this: You don't want to start a redundant analysis. To determine if an analysis is redundant, you need the overrides supplied to the last analysis, and the protocol's default values when no overrides are supplied. To get the protocol's default values, you need some analysis to have completed.
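For concreteness, here is a hedged sketch of that redundancy check (function and argument names are made up for illustration):

```python
def resolve_rtp_values(defaults: dict, overrides: dict) -> dict:
    """Start from the protocol's default parameter values and apply any
    client-supplied overrides on top."""
    return {**defaults, **overrides}


def is_redundant(last_analysis_rtp_values: dict, defaults: dict, new_overrides: dict) -> bool:
    """A new analysis is redundant if it would run with the same effective
    values as the last analysis. Note that `defaults` is only knowable from a
    completed analysis, which is the crux of the problem described above."""
    return resolve_rtp_values(defaults, new_overrides) == last_analysis_rtp_values
```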
If that's right, then it seems like a solvable problem.
What we want, ideally, is for the server to be able to get a protocol's available parameters and their default values quickly—say, within 1–2 seconds—without having to do a full analysis with `aspirate()`s and `dispense()`s and everything.
So when a client does `POST /protocols` (sketched in pseudocode after this list):
- If there are no analyses for that protocol yet:
  - Start one.
- If the last analysis is completed:
  - And it has the same runtime parameters as this `POST` request:
    - Keep that one.
  - And it has different runtime parameters from this `POST` request:
    - Start a new one.
- If the last analysis is pending:
  - Wait until we have details about the protocol's runtime parameters, which should only take 1–2 seconds. Then, handle as above.
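In rough pseudocode (an illustrative sketch of the flow above, with hypothetical store/analyzer objects rather than actual robot-server code):

```python
def handle_protocol_post(protocol_id, new_rtp_values, analysis_store, analyzer):
    """Illustrative sketch of the POST /protocols decision flow described above."""
    analyses = analysis_store.get_summaries(protocol_id)

    if not analyses:
        analyzer.start_analysis(protocol_id, new_rtp_values)
        return

    last = analyses[-1]
    if last.status == "pending":
        # Hypothetical: block briefly (~1-2 s) until just the parameter info is
        # known, without waiting for the full analysis to finish.
        last = analysis_store.wait_for_parameter_info(protocol_id)

    if last.run_time_parameter_values == new_rtp_values:
        return  # Same parameters as before; keep the existing analysis.
    analyzer.start_analysis(protocol_id, new_rtp_values)
```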
This all seems readily possible to me. The run-time parameters Python Protocol API was designed for it, with its isolated `add_parameters()` function, which we can call without calling `run()`.
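For reference, this is roughly what that separation looks like in a protocol, assuming the API 2.18-style run-time parameters interface (the specific parameter here is made up):

```python
from opentrons import protocol_api

requirements = {"robotType": "Flex", "apiLevel": "2.18"}


def add_parameters(parameters):
    # Parameter declarations live here, separate from run().
    parameters.add_int(
        variable_name="sample_count",
        display_name="Sample count",
        default=24,
        minimum=1,
        maximum=96,
    )


def run(ctx: protocol_api.ProtocolContext):
    # The full run() body (with its aspirates and dispenses) isn't needed just
    # to learn the parameter names and defaults declared above.
    sample_count = ctx.params.sample_count
```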
Does that seem doable?
That's a good idea. Maybe we can aim for it in the long term (or even medium term), but it's not doable in the current architecture of the protocol runner and analyzer. Even though `add_parameters()` is separate from `run()`, for a few reasons discussed previously we decided to make parsing and setting the actual parameters part of protocol execution (inside the simulating runner, in the case of analysis) rather than handling it in an intermediate layer like the protocol reader. Which means that the RTP information is not available until the analysis is complete.
Even if we refactor the runner to make the protocol's RTP values accessible before the analysis is completed, the protocol analysis and analysis store still work on a binary concept of 'pending' vs. 'completed' analysis, where a protocol's info is only available in the CompletedAnalysisStore. For a 'pending, with RTP values parsed' state, we would need to either add new functionality to the pending store or rework the binary-state design of the analysis store.
Even once we make that change so that the server can successfully cross-check against the most recent RTP values, we will have to decide what happens if the new request used different RTP values than the ones in a pending request: do you start another analysis right away and keep the last one running too, or do you cancel the previous analysis?
So there are a lot of changes we would need to make in order to implement the behavior you're describing.
I agree that it's a better behavior though. I'm going to make a ticket for it. But until then, I think it's totally fine to say that you will need to wait for the current analysis to complete before reuploading the protocol. I believe the app makes it nearly impossible to re-upload a protocol anyway because of how runs are created. So I don't think it's a case you would run into easily.
OK, your call!
In that case, though:
- If you think this is going to be the HTTP-client-facing behavior for the medium term, I would try to get it to return an intentional 503 busy response instead of a 500 internal server error.
- In any case, I would change this to not be an `assert`, since that implies this condition will never happen except by `robot-server` bugs, whereas it really depends on client behavior.
  - If you're switching the response to HTTP 503 and settling in for this to be the HTTP API, a tri-state "yes"/"no"/"busy" return value seems appropriate (see the sketch below).
  - If you're keeping the response as an HTTP 500 and calling this a known bug, I actually think a `NotImplementedError` would be correct?
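A minimal sketch of that tri-state return value (the names are hypothetical, not existing robot-server code):

```python
import enum


class RtpCheckResult(enum.Enum):
    """Hypothetical tri-state result of comparing a request's RTP values
    against the protocol's most recent analysis."""

    MATCHES = "yes"    # Last completed analysis already covers these values; reuse it.
    NEEDS_NEW = "no"   # Values differ (or no analysis exists); start a new analysis.
    BUSY = "busy"      # Last analysis is still pending; map this to an HTTP 503.
```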
I like that. I'll go with the 503 busy response.
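A rough sketch of what that could look like at the route level, assuming the FastAPI stack robot-server is built on (everything except `AnalysisStatus.PENDING` is illustrative):

```python
from fastapi import HTTPException, status


def raise_if_analysis_pending(last_analysis) -> None:
    """Sketch of replacing the assert with an intentional 503 response.

    `last_analysis` and `AnalysisStatus` refer to the existing objects shown in
    the quoted snippet above; this wrapper itself is hypothetical.
    """
    if last_analysis.status == AnalysisStatus.PENDING:
        raise HTTPException(
            status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
            detail="A previous analysis of this protocol is still pending. "
            "Wait for it to complete before uploading the protocol again.",
        )
```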
robot-server/tests/integration/http_api/protocols/test_analyses.tavern.yaml (outdated review thread, resolved)
A question on another `assert` and on a Tavern test. Otherwise, this looks good to me. Thanks!
yeet
Closes AUTH-229
Overview
Updates the `/protocols` endpoints to always keep the list of analyses ordered with the most recently started analysis last, making sure to verify whether a new analysis needs to be triggered because of new run-time-parameter values for a previously uploaded protocol. To achieve that, this PR does the following things:
Test Plan
(Can use test protocol from this PR)
Review requests
Risk assessment
Medium. Performs a database update and fixes the analysis order that was broken by #14688.

Co-authored-by: Max Marrone <[email protected]>