Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add oidcIssuerProfile to AzureManagedControlPlane #3973

Merged
merged 1 commit into from
Sep 14, 2023

Conversation

nojnhuh
Copy link
Contributor

@nojnhuh nojnhuh commented Sep 8, 2023

What type of PR is this?
/kind feature
/area managedclusters

What this PR does / why we need it:

This PR exposes the AKS oidcIssuerProfile config from AzureManagedControlPlane at spec.oidcIssuerProfile. It also adds status.oidcIssuerProfile which includes the current actual state of enabled and the issuerURL.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #2498

Special notes for your reviewer:

  • cherry-pick candidate

TODOs:

  • squashed commits
  • includes documentation
  • adds unit tests

Release note:

Added `spec.oidcIssuerProfile` to AzureManagedControlPlane

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. release-note Denotes a PR that will be considered when it comes time to generate release notes. area/managedclusters Issues related to managed AKS clusters created through the CAPZ ManagedCluster Type cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Sep 8, 2023
@codecov
Copy link

codecov bot commented Sep 8, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.54% 🎉

Comparison is base (b082981) 56.26% compared to head (9ecdf98) 56.80%.
Report is 50 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3973      +/-   ##
==========================================
+ Coverage   56.26%   56.80%   +0.54%     
==========================================
  Files         190      188       -2     
  Lines       19447    19242     -205     
==========================================
- Hits        10941    10931      -10     
+ Misses       7877     7677     -200     
- Partials      629      634       +5     
Files Changed Coverage Δ
api/v1beta1/azuremanagedcontrolplane_types.go 20.00% <ø> (ø)
api/v1beta1/azuremanagedcontrolplane_default.go 96.93% <100.00%> (+0.23%) ⬆️
api/v1beta1/azuremanagedcontrolplane_webhook.go 88.03% <100.00%> (+0.43%) ⬆️
azure/scope/managedcontrolplane.go 22.35% <100.00%> (-1.49%) ⬇️
azure/services/managedclusters/managedclusters.go 75.34% <100.00%> (+2.20%) ⬆️
azure/services/managedclusters/spec.go 52.95% <100.00%> (+1.72%) ⬆️

... and 50 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@maciaszczykm maciaszczykm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice to have that! I hope it can be included in the incoming release.

@CecileRobertMichon
Copy link
Contributor

/milestone v1.11

@maciaszczykm yes that's the plan

@k8s-ci-robot k8s-ci-robot added this to the v1.11 milestone Sep 8, 2023
type OIDCIssuerProfileStatus struct {
// Enabled is whether the OIDC issuer is enabled.
// +optional
Enabled *bool `json:"enabled,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is interesting

It differs a bit from existing patterns we have for setting "read-only" fields (eg. kubelet identity). This is arguably better and more aligned with k8s pattern (spec is desired state and status is read-only actual state) but inconsistent with other fields we already have. I'm not sure about duplicating the enabled field in both spec and status, that strikes me as redundant.

What was the reasoning in including the status field?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this was done in the spirit of #3855. ASO's ManagedCluster type is also structured the same way. One concrete case that having enabled in the status helps with is the webhook validation to return an error when a user tries to disable the OIDC issuer, which is not allowed by AKS.

The field itself is passed through verbatim from CAPZ to the AKS API, where nil means "don't change" and non-nil true and false set it explicitly. That means when enabled is nil in the AzureManagedControlPlane spec, that by itself is not enough to know whether or not the OIDC issuer is actually enabled at that moment. Since AKS (always, AFAICT) populates enabled in its API when the user sends nil, propagating that to the status in AzureManagedControlPlane lets the webhook definitively know the actual state of enabled when validating an update.

Copy link
Contributor

@CecileRobertMichon CecileRobertMichon Sep 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the advantage of letting the user set the spec field to "nil"? Would the same enforcement be covered by a simpler immutable check (once it's set to true, cannot be unset or set to false)?

overall I'm +1 on making the IssuerURL part of status but I don't love that we're designing a new pattern for enabled/disabled and immutability on the fly that is inconsistent with other API fields which currently have the exact same behavior from an AKS perspective.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allowing nil means users that have already set this on their clusters via other means won't have this change out from under them when CAPZ upgrades. I believe @mtougeron expressed this kind of concern a while ago around the autoscaler settings.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a valid reason for doing it this way. However, if we're going to support that scenario ("users that have already set this on their clusters via other means") we should do this consistently across all fields we support (that's what ASO does) https://github.com/Azure/azure-service-operator/blob/main/v2/api/containerservice/v1api20230201/managed_cluster_types_gen.go#L10167

Would be worth doing a quick audit to see how many fields are not following this pattern currently but could be, and then having an issue to convert them to use the same pattern. It might be the case that we end up wrapping around the ASO types before we get to it but would still be good to understand where the gaps are.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CecileRobertMichon Are you overall more +1 to leaving this field as a plain non-pointer bool in the spec and letting it default to false in all cases?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And if spec.oidcIssuerProfile is null in the spec, should that always imply spec.oidcIssuerProfile.enabled is false?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nojnhuh, no the danger with using a plain non-pointer is that the default could change and then we wouldn't know if a user is intentionally disabling it or not setting the field.

To summarize there are two options/patterns possible:

A) The spec field is an optional pointer bool and defaults the value to a sane default when nil. We don't support users setting things outside of the spec, the CAPZ spec is the only source of truth for the object. This is what we do for most fields if not all currently.
B) The spec field is an optional pointer bool, we don't default the value and let "nil" be a valid value to pass to the AKS API. We use a status field to track the "actual" value in Azure. This allows users to set this value outside of the CAPZ spec and not get overwritten, as long as AKS doesn't default/require a value. This is what this PR is doing currently, and what ASO does.

IMO both are valid. I tend to prefer consistency overall because that is less likely to lead users to make mistakes. This is not a hill I'm willing to die on (and I really want to see this PR get merged so I don't want to be standing in the way of that) but my personal preference would be to keep A) for now for consistency with the agreement that B) is what we're trying to evolve towards with a v2 of the API that wraps around ASO types (#3629).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my personal preference would be to keep A) for now for consistency with the agreement that B) is what we're trying to evolve towards with a v2 of the API that wraps around ASO types

Agree this seems like the best path for now. I'll update this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CecileRobertMichon This should work the way you described now. Can we still get this in for the v1.11 release?

@nojnhuh
Copy link
Contributor Author

nojnhuh commented Sep 13, 2023

Updated this so it should be more consistent with other fields.

/hold for squash

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 13, 2023
@@ -839,15 +839,6 @@ var _ = Describe("Workload cluster creation", func() {
}
})
})

By("modifying OIDC issuer configuration", func() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't keep the "actual" state in the status, then we have to wait to add this test back until we can use SDK v2 clients in the tests.

@nojnhuh
Copy link
Contributor Author

nojnhuh commented Sep 14, 2023

/assign @CecileRobertMichon

Copy link
Contributor

@CecileRobertMichon CecileRobertMichon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 14, 2023
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: e85602b848fd98ff2bf0047d57d54757826077e0

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: CecileRobertMichon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 14, 2023

func (m *AzureManagedControlPlane) setDefaultOIDCIssuerProfile() {
if m.Spec.OIDCIssuerProfile == nil {
m.Spec.OIDCIssuerProfile = &OIDCIssuerProfile{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this thread covers this: #3973 (review)

But, just to clarify, the reason we're doing it this way instead of the way that autoscaler does it above (if user doesn't declare a profile config we omit it from the spec entirely) is because this more explicit "is disabled" implementation removes edge cases from upgrade scenarios?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jackfrancis this comment summarizes it #3973 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think because autoscaler is a little different from other fields in that we want to send nil over the wire to AKS when not defined in the CAPZ spec. OIDC issuer is more like other fields where we always send a non-nil value.

@nojnhuh
Copy link
Contributor Author

nojnhuh commented Sep 14, 2023

squashed!
/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 14, 2023
@jackfrancis
Copy link
Contributor

/lgtm

@nojnhuh
Copy link
Contributor Author

nojnhuh commented Sep 14, 2023

#3970
/retest

@k8s-ci-robot
Copy link
Contributor

@nojnhuh: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-cluster-api-provider-azure-e2e-aks 9ecdf98 link unknown /test pull-cluster-api-provider-azure-e2e-aks
pull-cluster-api-provider-azure-apidiff 9ecdf98 link false /test pull-cluster-api-provider-azure-apidiff

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot merged commit 09c96a7 into kubernetes-sigs:main Sep 14, 2023
19 of 21 checks passed
@nojnhuh nojnhuh deleted the aks-oidc branch September 14, 2023 22:38
@nojnhuh nojnhuh restored the aks-oidc branch September 15, 2023 05:32
@nojnhuh nojnhuh deleted the aks-oidc branch September 15, 2023 16:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/managedclusters Issues related to managed AKS clusters created through the CAPZ ManagedCluster Type cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Add OIDCIssuerProfile configuration option
6 participants