Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-34340: implement RFC-834 deprecations #903

Merged
merged 16 commits into from
Nov 22, 2023
Merged

DM-34340: implement RFC-834 deprecations #903

merged 16 commits into from
Nov 22, 2023

Conversation

TallJimbo
Copy link
Member

@TallJimbo TallJimbo commented Nov 6, 2023

Checklist

  • ran Jenkins
  • added a release note for user-visible changes to doc/changes

@TallJimbo TallJimbo force-pushed the tickets/DM-34340 branch 4 times, most recently from b5a773a to d34e126 Compare November 10, 2023 20:25
Copy link

codecov bot commented Nov 10, 2023

Codecov Report

Attention: 198 lines in your changes are missing coverage. Please review.

Comparison is base (f22234c) 87.72% compared to head (4d005f7) 87.45%.

Files Patch % Lines
...daf/butler/dimensions/_data_coordinate_iterable.py 44.04% 31 Missing and 16 partials ⚠️
python/lsst/daf/butler/dimensions/_graph.py 66.03% 34 Missing and 2 partials ⚠️
python/lsst/daf/butler/dimensions/_coordinate.py 87.11% 21 Missing and 8 partials ⚠️
python/lsst/daf/butler/dimensions/_group.py 90.50% 13 Missing and 4 partials ⚠️
python/lsst/daf/butler/_dataset_type.py 48.00% 9 Missing and 4 partials ⚠️
python/lsst/daf/butler/dimensions/_universe.py 87.50% 6 Missing and 1 partial ⚠️
python/lsst/daf/butler/registry/sql_registry.py 66.66% 5 Missing and 1 partial ⚠️
...ython/lsst/daf/butler/registry/queries/_structs.py 76.19% 3 Missing and 2 partials ⚠️
python/lsst/daf/butler/_named.py 75.00% 4 Missing ⚠️
python/lsst/daf/butler/dimensions/_packer.py 63.63% 3 Missing and 1 partial ⚠️
... and 12 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #903      +/-   ##
==========================================
- Coverage   87.72%   87.45%   -0.27%     
==========================================
  Files         285      286       +1     
  Lines       36760    37144     +384     
  Branches     7698     7828     +130     
==========================================
+ Hits        32246    32485     +239     
- Misses       3341     3474     +133     
- Partials     1173     1185      +12     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@TallJimbo TallJimbo force-pushed the tickets/DM-34340 branch 3 times, most recently from 3e6b773 to 95299b5 Compare November 16, 2023 15:10
Copy link
Member

@timj timj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. DimensionGroup is much better name than DimensionGraph and I see that many isinstance checks have been removed.

doc/changes/DM-34340.api.md Outdated Show resolved Hide resolved
doc/changes/DM-34340.api.md Outdated Show resolved Hide resolved
@@ -355,7 +362,7 @@ def dimensions(self) -> DimensionGraph:
The dimensions label and relate instances of this
`DatasetType` (`DimensionGraph`).
"""
return self._dimensions
return self._dimensions._as_graph()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This property will change to DimensionGroup at some point? Also typo in doc string.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This property will change to DimensionGroup at some point?

Yes, when DimensionGraph is removed this will become DimensionGroup, and I think it should be transparently to code that isn't doing deprecated things with it.

Also typo in doc string.

Rewrote it completely, since Dimension instances are no longer in play and the parenthetical type somehow had gotten bumped down to the second paragraph.

@@ -661,7 +668,7 @@ def to_simple(self, minimal: bool = False) -> SerializedDatasetType:
"name": self.name,
"storageClass": self._storageClassName,
"isCalibration": self._isCalibration,
"dimensions": self.dimensions.to_simple(),
"dimensions": list(self.dimensions.names),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not self._dimensions ?

functions for private concrete implementations that should be sufficient
for most purposes. `standardize` is the most flexible and safe of these;
the others (`makeEmpty`, `fromRequiredValues`, and `fromFullValues`) are
more specialized and perform little or no checking of inputs.
the others (`makeEmpty`, `from_required_values`, and `from_full_values`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now the makeEmpty looks incongruous in that it's not make_empty...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I add make_empty, but leave makeEmpty un-deprecated (for now)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could start aliasing things so that makeEmpty redirects to make_empty. Then we can use the consistent form internally at least.

}
@property
@cached_getter
def dimensions(self) -> NamedValueAbstractSet[Dimension]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now a deprecated getter?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Will deprecate and fix any fallout.

return super().__eq__(other)

@classmethod
def _from_iterable(cls, iterable: Iterable[str]) -> set[str]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What calls this private method?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

~collections.abc.Set, in mixin methods like __and__. Will add a comment.

return self.intersection(other)

@property
def data_coordinate_keys(self) -> Set[str]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems odd that this is talking about returning an ordered entity that is a Set but this is really a KeysView.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

KeysView implements Set, and Set does not specify order one way or the other, so annotating this as Set is a sort of encapsulation: the user should not care that data_coordinate_keys is implemented as the keys of a dict.

In fact, I actually think it's a huge shame that KeysView, ValuesView, and ItemsView are defined collections.abc at all, let alone included in the type annotation for Mapping, as they are concrete classes that add nothing in terms of interfaces. Mapping.keys() should have been annotated to return Set, and the other two should have been annotated to return Collection.

dimensions = list(refInfo.datasetRef.dataId.full.keys())
dimensions = [
refInfo.datasetRef.dataId.universe.dimensions[k]
for k in refInfo.datasetRef.dataId.dimensions.data_coordinate_keys
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems a bit more convoluted than the original. I'm trying to understand why the data_coordinate_keys aren't enough given that they are converted to strings in the next line.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The list of actual Dimension instances is passed to sortAstropyTable, down below.

This adds a lot of warnings; DimensionGraph usage will be migrated in
later commits.

Due to those warnings, one test in test_obscore.py has been
temporarily disabled, because it counts warnings, and I couldn't
figure out how to make it only count the kind of warning it's actually
looking for.
The confusingly-named DimensionUniverse.getStatic* methods are no
longer needed after this, but while I think the middleware team has
informally agreed we should drop them and go back to attributes, I
don't think that's been RFC'd, so they stay for now.

The new skypix attributes help with the other goal here: using
set-membership tests rather than isinstance checks in places where
we care about different types of dimensions.  This will work better
when DimensionElement objects appear more rarely, in favor of str
names.
This includes replacing '.graph' with '.dimensions' (to go along with
replacing DimensionGraph with DimensionGroup), adding '.mapping' and
'.required' to replace '.full', and deprecating Dimension-instance
lookup and iteration (which effectively deprecates the Mapping
interface).
DatasetType.dimensions needs to continue to return DimensionGraph
during the deprecation period, since that's the name we want to use
long-term (unlike e.g. DataCoordinate.graph).  That means we rely on
the fact that we've also deprecated all of the things DimensionGraph
can do that DimensionGroup can't do (like iteration), and we rely on
those warnings instead of making DatasetType.dimensions itself warn.
That should let a lot of usage just blindly pass
DatasetType.dimensions to something else and not care about what type
it is throughout the deprecation period.
Only usage was in DimensionGraph.
The old DataCoordinate.makeEmpty is not being deprecated because that
wasn't included on RFC-834 and we don't have a good reason to get rid
it, but we've now got a consistent suite of DataCoordinate factory
methods with from_full_values and from_required_values (whose camelCase
forms _did_ have a good reason to be retired, as this aided with the
transition from DimensionGraph to DimensionGroup).
@TallJimbo TallJimbo merged commit 8ca025c into main Nov 22, 2023
15 of 17 checks passed
@TallJimbo TallJimbo deleted the tickets/DM-34340 branch November 22, 2023 04:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants