Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: remove enum unions from use in ServiceInfo #1391

Conversation

james-garner-canonical
Copy link
Contributor

Working towards issue #1287, this draft PR removes the use of str and enum member unions connected to the ops.pebble.ServiceInfo class.

In this PR I try to remove all such unions to simplify types. However, I wonder if it would be better to continue to allow the use of both str and enum members as arguments in some cases (constructors) where removing the union might result in a breaking api change.


Previously a str argument that an ops enum doesn't know about would result in the use of the str, while a known argument would use the corresponding enum member. Instead, we now use an UNKNOWN enum member in this case.

This is a behaviour change, but shouldn't be a breaking change, because the only time you should be using such values would be if you were developing for a more recent Pebble version than ops.pebble knows about. Since ops.pebble is presumable up to date currently, this will only be the case on older ops versions, which this PR does not apply to. However, it will be worth mentioning this behaviour change in the appropriate release notes.

An example of the core behaviour change, in the ServiceInfo.from_dict method:

 try:
     startup = ServiceStartup(d['startup'])
 except ValueError:
-    startup = d['startup']
+    warnings.warn(f'Unknown ServiceStartup value {d["startup"]!r}')
+    startup = ServiceStartup.UNKNOWN
...
 return cls(
     name=d['name'],
     startup=startup,
     current=current,
 )

This leads us to the __init__ method, where we also have unions to remove. I want to highlight this change because it's potentially breaking.

     def __init__(
         self,
         name: str,
-        startup: Union[ServiceStartup, str],
-        current: Union[ServiceStatus, str],
+        startup: ServiceStartup,
+        current: ServiceStatus,
     ):
         self.name = name
         self.startup = startup
         self.current = current

This is a breaking change, as the pebble module is public (explicitly exported in ops.__init__.__all__), and ServiceInfo is a public member of that module. Previously a user could expect to be able to call ServiceInfo("foo", startup="enabled", current="active") if they wanted to. However:

  1. It never worked correctly with string arguments. Note how startup and current are assigned in __init__ without any runtime validation. This leads to unexpected behaviour here:
  def is_running(self) -> bool:
      """Return True if this service is running (in the active state)."""
      return self.current == ServiceStatus.ACTIVE

This may have been written with the assumption that ServiceStatus.ACTIVE == 'active', the way it would with a normal class (is this how enums in other languages most commonly work?). If that were the case, then you'd expect self.current being a union of str and enum members to just work. Instead, ServiceStatus.ACTIVE is an object of type <enum 'ServiceStatus'> and ServiceStatus.ACTIVE.value == "hello", so this will return False if self.current == 'active'.

  1. Constructing a ServiceInfo from arbitrary arguments isn't really the intended use case, as you want information about actually running services. You're more likely to call ops.pebble.get_services, or a bit lower level, call ops.pebble.ServiceStatus.from_dict with a dict representation of the services.

This leads me to think this breaking change may be ok. Currently the change will only result in a type checking error, which if ignored will lead to the same (broken) runtime behaviour as previously.

We could avoid the breaking change and handle things better at runtime (fixing the issue with is_running) by continuing to accept unions for the __init__ args and turning them into enum members before assigning them. The from_dict logic could be moved to __init__ to handle this.


The final instance of str and enum member unions related to this class is _ServiceInfoDict. Removing this union leads to another potentially breaking change.

_ServiceInfoDict is part of the signature of the public ServiceInfo.from_dict method, and also used to annotate a value returned from an external Pebble call, which could never actually contain Python enum members. Without introducing more type variables, the correct change when removing these unions should therefore be:

 _ServiceInfoDict = TypedDict(
     '_ServiceInfoDict',
-    {'startup': Union['ServiceStartup', str], 'current': Union['ServiceStatus', str], 'name': str},
+    {'startup': str, 'current': str, 'name': str},
 )

But again this is technically a breaking change, as previously a user could expect to validly call:

ServiceInfo.from_dict({"name": "foo", "startup": ServiceStartup.ENABLED, "current": ServiceStatus.ACTIVE})

This still works at runtime, because ServiceStartup(ServiceStartup.ENABLED) and ServiceStartup('enabled') both produce ServiceStartup.ENABLED, but potentially introduces a new type checking error for users. Again though, using a hand crafted dictionary rather than one that ultimately comes from Pebble itself is less likely here (what about unit tests?).

We could avoid a breaking change here by continuing to accept the original _ServiceInfoDict type in from_dict.


Do we want to remove all the uses of str and enum member unions as a goal in itself to simplify typing? If so, this may lead to some breaking changes.

Alternatively, if the goal is just to switch to using an UNKNOWN enum member instead of passing along unknown string values, then I think we'll need to keep str andenum member unions in some places.

@benhoyt
Copy link
Collaborator

benhoyt commented Sep 25, 2024

I lean towards simplifying them all -- changing them all to just the enum (or just the str in the case of the dict). People shouldn't be doing this, and as you show, in the cases where we do allow a union things like is_running are half broken anyway. I suspect the _ServiceInfoDict allowing enum types at all was a copy-n-paste issue; it seems like a bug.

Let's at least try that and see what it looks like, and run super-tox over the diff to see if there are any failures.

@@ -352,6 +352,18 @@ class _NotProvidedFlag:
_not_provided = _NotProvidedFlag()


class _Unknown:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should avoid this and just use 'unknown' in the enum definition. If Pebble does have "unknown" (FileType already has an "unknown" value), nothing bad happens -- it's just that the caller can't distinguish between the two, which doesn't seem like it would be an issue in practice. There's also the warning log for debugging.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the warning log won't be emitted if Pebble has "unknown" and "unknown" is in the enum. If that's ok, then I'll remove the sentinel class. I hadn't really thought about what I'd do in the case of FileType.UNKNOWN, so having it always be the string "unknown" makes that easier.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants