Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: unify k8s+vm upgrade pattern #83

Closed
wants to merge 12 commits into from

Conversation

marcoppenheimer
Copy link
Contributor

@marcoppenheimer marcoppenheimer commented Aug 9, 2023

TODO

  • Test on live charm when K8s examples exist
  • Add + update tests
  • Add + update full usage docstring

Changes Made

refactor: remove upgrade-finished event + handler, use relation-changed

  • Moving the logic to relation-changed to be coordinated by the Juju leader
  • NOTE - Does mean that charms need to recurse on leader. Possibly not as have added a call on set_unit_completed. Needs testing

feat: add resume-strategy to pre-upgrade-check

  • auto - all units upgrade automatically
    • Default on VM
  • safe - upgrade pauses on first unit
    • Default on K8s

refactor: add unified _update_stack()

  • Leader-only
  • On K8s, shifts partition
  • Updates relation-data stack
  • Called during resume-upgrade action, and if completed top of stack + first unit + 'safe'

@marcoppenheimer marcoppenheimer marked this pull request as draft August 9, 2023 15:27
@marcoppenheimer marcoppenheimer requested review from marceloneppel and paulomach and removed request for marceloneppel August 9, 2023 15:27
@marcoppenheimer marcoppenheimer changed the title chore: bump mccabe complex refactor: unify k8s+vm upgrade pattern Aug 9, 2023
Comment on lines +797 to +800
if self.substrate == "k8s":
return "safe"

return self.peer_relation.data[self.charm.app].get("resume-strategy", "auto")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can allow users to also select the auto resume strategy on k8s.


# if unit top of stack and all units ready (i.e stack), emit granted event
# only runs on VM, as K8s units will be 'idle' due to not setting 'ready' in upgrade-charm
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

L1081? Or do you mean something like as not upgraded k8s units will be 'idle' due to upgrade-charm event have not being fired yet?

Suggested change
# only runs on VM, as K8s units will be 'idle' due to not setting 'ready' in upgrade-charm
# only runs on VM, as K8s units will be 'upgrading' due to not setting 'ready' in upgrade-charm

logger.info(
f"Please evaluate unit {self.charm.app}/{self.upgrade_stack[-1]}, and run `resume-upgrade` action to continue upgrade"
)
self.charm.unit.status = BlockedStatus("awaiting resume-upgrade action...")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good addition!

if not self.peer_relation or not self.upgrade_stack:
return True

return len(self.upgrade_stack) == len(self.peer_relation.units)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following change is needed as the upgrade stack has not changed when the resume upgrade action is called on k8s.

Suggested change
return len(self.upgrade_stack) == len(self.peer_relation.units)
return len(self.upgrade_stack) == len(self.peer_relation.units) + 1

However, it makes the upgrade process not respect the resume strategy (making it not wait for the call the the resume-upgrade action), so I think we need different comparisons for those two cases.

@@ -1056,32 +1070,24 @@ def _on_upgrade_charm(self, event: UpgradeCharmEvent) -> None:
logger.error("Cluster upgrade failed, ensure pre-upgrade checks are ran first.")
return

# for VM, run version checks on leader only
# for K8s, run version checks on first unit only
if (self.charm.unit.is_leader() and self.substrate == "vm") or (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on VM does it really matters where the upgrade_supported_check is done, since every unit will have a new charm code.
I think we could have same approach for both, i.e. check on the first

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're probably right now that I think about it. For VM, the unit that does the check is the last unit to mark itself as 'ready'. I suppose it could be the first!

@marcoppenheimer
Copy link
Contributor Author

Closing as no longer relevant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants