
DM-45489: Add more test coverage for RemoteButler queries #1042

Merged: 6 commits into main from tickets/DM-45489, Jul 31, 2024
Conversation

@dhirving (Contributor) commented Jul 30, 2024

Fixed an issue where the "Hybrid" query results object for queryDataIds was testing the DirectButler implementation in some cases where it should have been testing the RemoteButler implementation. This revealed some small missing exception handling and minor discrepancies between the two implementations, which are now fixed.

Checklist

  • ran Jenkins
  • added a release note for user-visible changes to doc/changes
  • (if changing dimensions.yaml) make a copy of dimensions.yaml in configs/old_dimensions

Commits

  • Match the DirectButler behavior in RemoteButler queryDataIds by throwing an ArgumentError if collections is specified without datasets.
  • The query shims for RemoteButler cannot determine whether there are any errors until they go to resolve the query, so resolve the query in a test to allow this instance of MissingDatasetTypeError to be raised lazily.
  • Matching the DirectButler behavior, return a more user-friendly error when an empty string is passed to order_by.
  • Update unit tests related to interpreting identifiers in queries to account for differences between the old and new query systems.
  • Tweak the registry query unit tests to handle minor differences between the old and new query systems.
  • Fix an issue where the "Hybrid" query results object for queryDataIds was testing the DirectButler implementation in some cases where it should have been testing the RemoteButler implementation.
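The first commit's argument check can be illustrated with a minimal, self-contained sketch. Note that `validate_query_args` and this `ArgumentError` class are hypothetical stand-ins, not the real butler API; the actual check lives inside RemoteButler's queryDataIds shim.

```python
class ArgumentError(Exception):
    """Hypothetical stand-in for the butler registry's ArgumentError."""


def validate_query_args(datasets=None, collections=None):
    # Matching DirectButler: collections only make sense as a constraint on
    # a dataset search, so reject them when no datasets were requested.
    if collections is not None and datasets is None:
        raise ArgumentError(
            "'collections' cannot be used without 'datasets' in queryDataIds"
        )
```

The same pattern (validate eagerly, fail with a specific exception type) is what lets both implementations behave identically from the caller's point of view.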

codecov bot commented Jul 30, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.38%. Comparing base (88f4f2d) to head (66e3efc).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1042   +/-   ##
=======================================
  Coverage   89.37%   89.38%           
=======================================
  Files         359      359           
  Lines       45630    45642   +12     
  Branches     9349     9357    +8     
=======================================
+ Hits        40783    40795   +12     
  Misses       3521     3521           
  Partials     1326     1326           


@dhirving dhirving marked this pull request as ready for review July 30, 2024 22:40
Comment on lines +2676 to +2679
with self.assertRaises(NotImplementedError):
    registry.queryDataIds(
        ["instrument", "detector", "exposure"], datasets="bias", collections=coll_list
    ).count()
dhirving (Contributor, Author):

In the new query system this is returning zero rows, which I don't think is right; it should probably be returning the data IDs for all the bias datasets instead?

Member:

I think the behavior is correct, and the comment above this test is wrong: I don't see anything that inserts exposure dimension records, so the query is going to return no rows because it's got a join to that empty table. Note that the dimensions of bias are just {instrument, detector}; the match to exposure is going to be a temporal join between the CALIBRATION collection's validity range and exposure.timespan.

So I suspect there's supposed to be an exposure record inserted with a timespan that overlaps the validity range of exactly one bias for one or two detectors in this test, as that'd make the test much more interesting for the new system (and it wouldn't affect the behavior of the old query system, which is probably why it wasn't done). And if we do that, the new query system should return rows for that exposure and whatever detectors have a bias with the matching timespan. But it's quite possible that behavior is already covered in other tests of the new query system, and hence there's no need to re-check it here.

Note that we don't really care about the case where the exposure's timespan overlaps the validity ranges of multiple biases; this query might still be sound, but it'd make any find-first search for the bias fail, and hence it represents a practically useless validity range.
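The temporal-join semantics described above can be sketched with a toy model. This `Timespan` is an illustrative stand-in, not the butler's real Timespan class: an exposure joins to a bias when its timespan overlaps that bias's validity range, and overlapping more than one validity range would break any find-first search.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timespan:
    """Toy half-open [begin, end) interval; the real butler uses astropy times."""

    begin: int
    end: int

    def overlaps(self, other: "Timespan") -> bool:
        # Two half-open intervals overlap iff each begins before the other ends.
        return self.begin < other.end and other.begin < self.end


# An exposure whose timespan overlaps exactly one bias validity range joins
# to a single bias, which is the well-behaved case for find-first lookups.
exposure = Timespan(10, 14)
bias_validity_ranges = [Timespan(0, 15), Timespan(15, 30)]
matches = [t for t in bias_validity_ranges if exposure.overlaps(t)]
```

Here `matches` contains only the first validity range; shifting the exposure to straddle the boundary at 15 would produce two matches, the "practically useless validity range" case the comment warns about.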

@dhirving dhirving merged commit c108883 into main Jul 31, 2024
18 checks passed
@dhirving dhirving deleted the tickets/DM-45489 branch July 31, 2024 17:35