
Karpenter fails to schedule a pending pod with a preferred affinity #1204

Closed
wmgroot opened this issue Apr 23, 2024 · 13 comments
Labels: kind/bug, lifecycle/rotten, needs-triage

@wmgroot (Contributor) commented Apr 23, 2024

Description

Observed Behavior:
We have a pod stuck in Pending indefinitely, and Karpenter does not take action to add a new node that would allow the pod to schedule.

$ kubectl get pod -n capa-system
NAME                                      READY   STATUS    RESTARTS   AGE
capa-controller-manager-7c6f4fbf6-2wxxr   0/1     Pending   0          4d5h
capa-controller-manager-7c6f4fbf6-9lsd6   1/1     Running   0          4d5h

The pod has a soft (preferred) node affinity for control plane nodes. Since this is an EKS cluster, the pod can never schedule on a control plane node.

    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - preference:
              matchExpressions:
              - key: node-role.kubernetes.io/control-plane
                operator: Exists
            weight: 10

Expected Behavior:
Karpenter creates a node so the pod can schedule, even though the pod has a soft affinity preference that cannot be satisfied. Not scheduling the pod can result in prolonged outages, blocked PDBs, and other undesirable behavior that requires manual intervention, which is worse than an unsatisfied soft affinity.

Reproduction Steps (Please include YAML):
I believe this should be reproducible with a pod that uses a nodeSelector/toleration targeting an isolated NodePool for easier testing (see the sketch after the list below).

  • Any unsatisfiable preferred constraint in an affinity should trigger the observed behavior (such as a label that will never exist on nodes in the NodePool).
  • Note that the pod must not be able to schedule without Karpenter taking action; otherwise Kubernetes will schedule it successfully without satisfying the soft affinity constraint. Using a NodePool that can scale up from 0 is an effective way to test this.
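
To make the reproduction concrete, here is a minimal sketch of such a pod. The NodePool name (isolated-pool), the taint key (example.com/isolated), and the container image are assumptions for illustration only; the preferred term matches the one from the original report.

apiVersion: v1
kind: Pod
metadata:
  name: repro-preferred-affinity
spec:
  nodeSelector:
    karpenter.sh/nodepool: isolated-pool        # assumed name of the isolated NodePool
  tolerations:
  - key: example.com/isolated                   # assumed taint on the isolated NodePool
    operator: Exists
    effect: NoSchedule
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - preference:
          matchExpressions:
          - key: node-role.kubernetes.io/control-plane   # restricted domain, never satisfiable on EKS
            operator: Exists
        weight: 10
  containers:
  - name: pause
    image: registry.k8s.io/pause:3.9
    resources:
      requests:
        cpu: 100m
        memory: 64Mi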

Upon removing the affinity spec from the example above, Karpenter added a node immediately to allow the pod to schedule.

$ kubectl get pod -n capa-system
NAME                                      READY   STATUS    RESTARTS   AGE
capa-controller-manager-7c6f4fbf6-4wc6b   1/1     Running   0          12m
capa-controller-manager-7c6f4fbf6-hqtjb   1/1     Running   0          12m

Versions:

  • Chart Version: 0.35.0
  • Kubernetes Version (kubectl version):
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.1", GitCommit:"e4d4e1ab7cf1bf15273ef97303551b279f0920a9", GitTreeState:"clean", BuildDate:"2022-09-14T19:49:27Z", GoVersion:"go1.19.1", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"26+", GitVersion:"v1.26.14-eks-b9c9ed7", GitCommit:"7c3f2be51edd9fa5727b6ecc2c3fc3c578aa02ca", GitTreeState:"clean", BuildDate:"2024-03-02T03:46:35Z", GoVersion:"go1.21.7", Compiler:"gc", Platform:"linux/amd64"}

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@wmgroot wmgroot added kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Apr 23, 2024
@tzneal (Contributor) commented Apr 23, 2024

It's not just that the preferred node affinity isn't satisfiable that causes this. It's because the label belongs to a restricted domain (node-role.kubernetes.io/control-plane). If you modify the label to be something else, Karpenter will launch capacity for the pod.

karpenter-5bb56f6d9b-l8v4x controller {"level":"DEBUG","time":"2024-04-23T20:03:55.814Z","logger":"controller.disruption","message":"ignoring pod, label node-role.kubernetes.io/control-plane is restricted; specify a well known label: [karpenter.k8s.aws/instance-accelerator-count karpenter.k8s.aws/instance-accelerator-manufacturer karpenter.k8s.aws/instance-accelerator-name karpenter.k8s.aws/instance-category karpenter.k8s.aws/instance-cpu karpenter.k8s.aws/instance-encryption-in-transit-supported karpenter.k8s.aws/instance-family karpenter.k8s.aws/instance-generation karpenter.k8s.aws/instance-gpu-count karpenter.k8s.aws/instance-gpu-manufacturer karpenter.k8s.aws/instance-gpu-memory karpenter.k8s.aws/instance-gpu-name karpenter.k8s.aws/instance-hypervisor karpenter.k8s.aws/instance-local-nvme karpenter.k8s.aws/instance-memory karpenter.k8s.aws/instance-network-bandwidth karpenter.k8s.aws/instance-size karpenter.sh/capacity-type karpenter.sh/nodepool kubernetes.io/arch kubernetes.io/os node.kubernetes.io/instance-type node.kubernetes.io/windows-build topology.kubernetes.io/region topology.kubernetes.io/zone], or a custom label that does not use a restricted domain: [k8s.io karpenter.k8s.aws karpenter.sh kubernetes.io]","commit":"8b2d1d7","pod":"default/test-pod"}

It may not be necessary to validate preferred terms, however, since they can be ignored.
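
For instance, a minimal sketch of the same preference keyed on a custom label outside the restricted domains (example.com/node-role is a hypothetical label, not from this report); based on the behavior described above, Karpenter should launch capacity for a pod carrying this term:

affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - preference:
        matchExpressions:
        - key: example.com/node-role   # custom label, not in a restricted domain
          operator: Exists
      weight: 10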

@tzneal tzneal removed the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Apr 23, 2024
@wmgroot (Contributor, Author) commented Apr 23, 2024

Would you mind linking me to the code for this special handling? I've searched for any mention of a "node-role" or "control-plane" label and haven't found anything. I don't think it's safe to simply ignore any kubernetes.io label, since some of those are reasonable to use for node selection.

@billrayburn

/assign @jmdeal

@cnmcavoy commented May 10, 2024

@wmgroot I believe what is happening is that the affinity causes Karpenter to compute a NodeClaim with the restricted domain as one of its labels. Karpenter then validates the NodeClaim, detects the restricted label, and determines that the NodeClaim is unsatisfiable and cannot be created. So Karpenter does not scale up a node.
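
In other words (a hypothetical illustration of that description, not actual Karpenter output), the candidate NodeClaim would carry a requirement like the one below, and the validation step would then reject it because the key uses a restricted domain; the NodePool name is assumed for illustration:

apiVersion: karpenter.sh/v1beta1
kind: NodeClaim
spec:
  requirements:
  - key: karpenter.sh/nodepool
    operator: In
    values: ["default"]
  - key: node-role.kubernetes.io/control-plane   # restricted domain -> NodeClaim fails validation
    operator: Exists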

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 8, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Sep 7, 2024
@jwcesign (Contributor) commented Sep 8, 2024

Related PR: #1608

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

@k8s-ci-robot (Contributor)

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to the /close not-planned comment above.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot closed this as not planned Won't fix, can't repro, duplicate, stale Oct 8, 2024
@njtran njtran reopened this Oct 10, 2024
@k8s-ci-robot (Contributor)

This issue is currently awaiting triage.

If Karpenter contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Oct 10, 2024
@njtran (Contributor) commented Oct 10, 2024

Hey, re-opening since this fell through the cracks and went stale. I see this was opened against v0.35; can we try to reproduce on v1 and see if it's fixed?

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

@k8s-ci-robot k8s-ci-robot closed this as not planned Won't fix, can't repro, duplicate, stale Nov 9, 2024
@k8s-ci-robot (Contributor)

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to the /close not-planned comment above.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
