Motivation
Kubernetes supports webhooks, RBAC, built-in CEL policies, and other mechanisms for permission control. However, using webhooks for resource management raises more complex permission-management requirements. For example, we want to manage the permissions of a Kubernetes webhook so that it can only mutate/validate specific resources. Kubernetes itself provides no permission mechanism to constrain a webhook's scope, so a webhook may affect resources it was never intended to touch and thereby disturb cluster behavior.
User Story
As a Kubernetes cluster administrator, I want fine-grained permission control over Kubernetes webhooks, so that only specific webhooks can operate on specific resources, with granularity down to an individual mutation or validation. I need a way to define and enforce these fine-grained permission rules so that webhooks cannot accidentally affect resources they should not touch and cause abnormal cluster behavior.
Goals
The problem mainly consists of the two scenarios below, and the goal is to address both.
Goal 1: Panic in webhook
Kubernetes webhooks can either admit or reject API requests. One property makes them especially dangerous: with the default failurePolicy, a failure of the admission request itself also results in rejection. That is a serious problem.
Goal 2: Webhook works fine, but there are bugs in its logic
The kube-system namespace deserves its own section, because a configuration mistake there can easily lead to complete cluster failure. The most common mistake is a missing label on the kube-system namespace object that would exclude it from request matching. A single webhook request failure can prevent Kubernetes components from starting, leading to a ripple effect that causes the whole cluster to fail. Bottom line: always exclude kube-system from mutations/validations unless you have a very good reason not to.
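With today's admission API, this exclusion can be sketched as a namespaceSelector on the webhook registration. The sketch below also shows failurePolicy, which controls the rejection-on-failure behavior from Goal 1; all names are illustrative, not part of this proposal:

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: example-validation        # illustrative name
webhooks:
- name: validate.example.dev      # illustrative name
  clientConfig:
    service:
      name: webhook-svc           # illustrative service
      namespace: webhook-system
      path: /validate
  rules:
  - apiGroups: ["*"]
    apiVersions: ["*"]
    operations: ["CREATE", "UPDATE"]
    resources: ["*"]
  failurePolicy: Ignore           # the default is Fail, which rejects requests when the webhook errors
  namespaceSelector:
    matchExpressions:
    - key: kubernetes.io/metadata.name
      operator: NotIn
      values: ["kube-system"]     # keep kube-system out of request matching
  sideEffects: None
  admissionReviewVersions: ["v1"]
```

The kubernetes.io/metadata.name label is set automatically on every namespace, so this selector needs no extra labeling step.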
Proposal
Goal 1: Panic in webhook
The behavior after a webhook failure depends on the specific requirements. Webhook resources should provide post-failure recovery policies that users can choose freely according to their usage scenario; further, users can also write recovery policies themselves:
```yaml
apiVersion: krm.kcl.dev/v1alpha1
kind: KCLRun
metadata:
  name: conditionally-add-annotations
spec:
  params:
    toMatch:
      config.kubernetes.io/local-config: "true"
    toAdd:
      configmanagement.gke.io/managed: disabled
  failureAction: "abort" # or "warn", "skip", or a function for more actions based on needs
  source: <kcl code>
```
Goal 2: Webhook works fine, but there are bugs in its logic
RBAC permissions are inherited: the mutation and validation resources that an account applies inherit that account's RBAC permissions.
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: restricted-namespace # the specific namespace
  name: restricted-role
rules:
- apiGroups: [""]
  resources: ["pods", "services"] # the specific resources
  verbs: ["get", "list", "watch", "create", "update", "delete"] # the specific actions
```
On top of RBAC, label selectors further narrow the scope in which a webhook takes effect:
RBAC verbs correspond to validation/mutation webhooks, or custom verbs can be defined.
An appropriate selector helps the webhook select resources accurately.
An appropriate selector helps a resource select its webhooks accurately.
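As a sketch of how label-selector scoping looks with today's admission API (all names here are illustrative), a webhook registration can combine resource rules with an objectSelector so that only labelled objects are matched:

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: annotate-pods             # illustrative name
webhooks:
- name: annotate-pods.example.dev # illustrative name
  clientConfig:
    service:
      name: webhook-svc           # illustrative service
      namespace: webhook-system
      path: /mutate
  rules:
  - apiGroups: [""]
    apiVersions: ["v1"]
    operations: ["CREATE"]
    resources: ["pods"]
  objectSelector:
    matchLabels:
      webhook.example.dev/managed: "true" # only objects carrying this label are mutated
  sideEffects: None
  admissionReviewVersions: ["v1"]
```

This proposal goes further than objectSelector by tying the webhook's reach to the RBAC permissions of the account that applied it.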
[WIP] Design Details
Write Webhooks through KCL
```kcl
params = option("params") or {} # injected by the orchestrator; hidden from the user

set_func = lambda params {
    # collect the annotations passed in via params
    annotations: {str:str} = {k: v for k, v in params.annotations or {}}
    # patch every input resource with those annotations
    items = [item | {
        metadata.annotations: annotations
    } for item in option("items")]
}

items = set_func(params) # collected by the orchestrator; hidden from the user
```
```
req --> KCL-operator --> KCLRun --> KCL --> kclplugin --> go/py/rust...
            |
            +--- filters the list of resources that the webhook
                 can access, based on RBAC
```
The webhook resource describes the objects it wants to operate on; each resource, in turn, describes which webhooks are allowed to act on it. Further detailed design is required, e.g. providing print or a builtin log for debugging.
Community Tech
https://www.likakuli.com/posts/kinitiras-all/
Kubernetes Webhook
Kubernetes webhooks support scoping (via rules, namespaceSelector, and objectSelector) when the service is registered.
Kubernetes CEL Policy
Kubernetes CEL policies (ValidatingAdmissionPolicy) specify the objects for which a rule takes effect.
A Binding (ValidatingAdmissionPolicyBinding) can restrict the policy to specific namespaces.
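For reference, a minimal sketch of a CEL policy plus a binding, following the upstream API shapes (the names and the expression are illustrative):

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: replica-limit             # illustrative name
spec:
  matchConstraints:
    resourceRules:
    - apiGroups: ["apps"]
      apiVersions: ["v1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["deployments"]
  validations:
  - expression: "object.spec.replicas <= 5" # illustrative rule
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: replica-limit-binding     # illustrative name
spec:
  policyName: replica-limit
  validationActions: ["Deny"]
  matchResources:
    namespaceSelector:            # the binding narrows the policy to labelled namespaces
      matchLabels:
        environment: test
```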
OPA Gatekeeper
OPA can create rules to prevent users from accessing the namespace
https://support.tools/post/opa-gatekeeper-require-labels/
https://stackoverflow.com/questions/71547292/opa-rego-policy-to-block-access-to-kubernetes-namespace
Kyverno
Kyverno can create rules to prevent users from accessing the namespace
Chainsaw
chainsaw: An end-to-end, declarative testing tool anyone can use to test Kubernetes operators.
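The passage below refers to a Chainsaw assertion; such an assertion is simply a partial resource definition, e.g.:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coredns
  namespace: kube-system
spec:
  replicas: 2   # asserted field; the other fields of the live resource are ignored
```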
When asking Chainsaw to execute the assertion above, it will look for a deployment named coredns in the kube-system namespace and will compare the existing resource with the (partial) resource definition contained in the assertion.
In this specific case, if the field spec.replicas is set to 2 in the existing resource, the assertion will be considered valid. If it is not equal to 2 the assertion will be considered failed.
[WIP] FluxCD Multi Tenancy
Flux defers to Kubernetes’ native RBAC to specify which operations are authorised when processing its custom resources. By default, this means operations are constrained by the service account under which the controllers run, which has the cluster-admin role bound to it. This is convenient for a deployment in which all users are trusted.
In a multi-tenant deployment, each tenant needs to be restricted in the operations that can be done on their behalf. Since tenants control Flux via its API objects, this becomes a matter of attaching RBAC rules to Flux API objects.
To give users control over the authorisation, the Flux controllers can impersonate (assume the identity of) a service account mentioned in the apply specification (e.g., the field .spec.serviceAccountName in a Kustomization object or in a HelmRelease object) for both accessing resources and applying configuration. This lets a user constrain the operations performed by the Flux controllers with RBAC.
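A sketch of that impersonation mechanism in a Flux Kustomization (names are illustrative):

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: tenant-app                # illustrative name
  namespace: tenant-a             # illustrative tenant namespace
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: tenant-app              # illustrative source
  path: ./deploy
  prune: true
  serviceAccountName: tenant-a    # Flux impersonates this account, so tenant RBAC bounds every apply
```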
https://fluxcd.io/flux/components/helm/helmreleases/#role-based-access-control
https://fluxcd.io/flux/installation/configuration/multitenancy/
KusionStack Controller Mesh
Reference