-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-45119: Update logic for checking overlap dimensions in relations #1043
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1043 +/- ##
==========================================
- Coverage 89.38% 89.37% -0.01%
==========================================
Files 359 359
Lines 45642 45647 +5
Branches 9357 9360 +3
==========================================
+ Hits 40795 40799 +4
Misses 3521 3521
- Partials 1326 1327 +1 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made a suggestion for how to make this a bit more general, but overall I'm very much in favor of doing just this much and no more on the old query system.
) | ||
if overlap_relationship in existing_relationships: | ||
if any(overlap_relationship.issubset(rel) for rel in existing_relationships): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can make this a little safer by keeping the old check and the new one, and combinations between them:
group1 = self.universe[element1].minimal_group
group2 = self.universe[element2].minimal_group
overlap_relationships = {
frozenset(a | b) for a, b in itertools.product(
[group1.names, group1.required],
[group2.names, group2.required],
)
}
if not overlap_relationships.isdisjoint(existing_relationships):
return context.preferred_engine.make_join_identity_relation(), False
I'm particularly interested in keeping the old check, because while the required dimensions are the ones brought in by a dataset join, the full dimensions are the ones brought in by the temp tables created by materialize, and I'm not sure we have enough test coverage of QG generation edge cases to spot any regressions this change might introduce there otherwise.
When QueryBuilder.joinDatset returns False there is no point in trying to build and execuite the query.
New logic only checks for required elements instead of full set of dimensions. This fixes special case of `healpix11` dataset types in dc2 repo. There is no unit test for this special case, but I verified that query-datasets and query-data-ids now work correctly.
e625f9d
to
2111e21
Compare
New logic only checks for required elements instead of full set of
dimensions. This fixes special case of
healpix11
dataset typesin dc2 repo. There is no unit test for this special case, but I verified
that query-datasets and query-data-ids now work correctly.
Checklist
doc/changes
configs/old_dimensions