
Karpenter fails to schedule a pending pod with a preferred affinity #1204

Closed
wmgroot opened this issue Apr 23, 2024 · 13 comments
Labels: kind/bug, lifecycle/rotten, needs-triage

@wmgroot (Contributor) commented Apr 23, 2024

Description

Observed Behavior:
We have a pod stuck in Pending indefinitely, and Karpenter does not take action to add a new node that would allow the pod to schedule.

$ kubectl get pod -n capa-system
NAME                                      READY   STATUS    RESTARTS   AGE
capa-controller-manager-7c6f4fbf6-2wxxr   0/1     Pending   0          4d5h
capa-controller-manager-7c6f4fbf6-9lsd6   1/1     Running   0          4d5h

The pod has a soft (preferred) node affinity for control plane nodes. Since this is an EKS cluster, the pod can never schedule on a control plane node.

    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - preference:
              matchExpressions:
              - key: node-role.kubernetes.io/control-plane
                operator: Exists
            weight: 10

Expected Behavior:
Karpenter creates a node so the pod can schedule, even though the pod has a soft affinity preference that cannot be satisfied. Not scheduling the pod can result in prolonged outages, blocked PDBs, and other undesirable behavior that requires manual intervention, which is worse than an unsatisfied soft affinity.

Reproduction Steps (Please include YAML):
I believe this should be reproducible with a pod that uses a nodeSelector/toleration targeting an isolated NodePool for easier testing (see the sketch after the list below).

  • Any unsatisfiable preferred constraint in an affinity should trigger the observed behavior (such as a label that will never exist on nodes in the NodePool).
  • Note that the pod must not be able to schedule without Karpenter taking action; otherwise Kubernetes will schedule it successfully without satisfying the soft affinity constraint. Using a NodePool that can scale up from 0 is an effective way to test this.
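
To make the reproduction concrete, here is a minimal sketch of such a pod. The NodePool name (isolated-pool), the taint key (example.com/isolated), and the container image are assumptions for illustration only; the preferred term matches the one from the original report.

apiVersion: v1
kind: Pod
metadata:
  name: repro-preferred-affinity
spec:
  nodeSelector:
    karpenter.sh/nodepool: isolated-pool        # assumed name of the isolated NodePool
  tolerations:
  - key: example.com/isolated                   # assumed taint on the isolated NodePool
    operator: Exists
    effect: NoSchedule
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - preference:
          matchExpressions:
          - key: node-role.kubernetes.io/control-plane   # restricted domain, never satisfiable on EKS
            operator: Exists
        weight: 10
  containers:
  - name: pause
    image: registry.k8s.io/pause:3.9
    resources:
      requests:
        cpu: 100m
        memory: 64Mi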

Upon removing the affinity spec from the example above, Karpenter added a node immediately to allow the pod to schedule.

$ kubectl get pod -n capa-system
NAME                                      READY   STATUS    RESTARTS   AGE
capa-controller-manager-7c6f4fbf6-4wc6b   1/1     Running   0          12m
capa-controller-manager-7c6f4fbf6-hqtjb   1/1     Running   0          12m

Versions:

  • Chart Version: 0.35.0
  • Kubernetes Version (kubectl version):
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.1", GitCommit:"e4d4e1ab7cf1bf15273ef97303551b279f0920a9", GitTreeState:"clean", BuildDate:"2022-09-14T19:49:27Z", GoVersion:"go1.19.1", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"26+", GitVersion:"v1.26.14-eks-b9c9ed7", GitCommit:"7c3f2be51edd9fa5727b6ecc2c3fc3c578aa02ca", GitTreeState:"clean", BuildDate:"2024-03-02T03:46:35Z", GoVersion:"go1.21.7", Compiler:"gc", Platform:"linux/amd64"}

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@wmgroot wmgroot added kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Apr 23, 2024
@tzneal (Contributor) commented Apr 23, 2024

It's not just that the preferred node affinity isn't satisfiable that causes this. It's because the label belongs to a restricted domain (node-role.kubernetes.io/control-plane). If you modify the label to be something else, Karpenter will launch capacity for the pod.

karpenter-5bb56f6d9b-l8v4x controller {"level":"DEBUG","time":"2024-04-23T20:03:55.814Z","logger":"controller.disruption","message":"ignoring pod, label node-role.kubernetes.io/control-plane is restricted; specify a well known label: [karpenter.k8s.aws/instance-accelerator-count karpenter.k8s.aws/instance-accelerator-manufacturer karpenter.k8s.aws/instance-accelerator-name karpenter.k8s.aws/instance-category karpenter.k8s.aws/instance-cpu karpenter.k8s.aws/instance-encryption-in-transit-supported karpenter.k8s.aws/instance-family karpenter.k8s.aws/instance-generation karpenter.k8s.aws/instance-gpu-count karpenter.k8s.aws/instance-gpu-manufacturer karpenter.k8s.aws/instance-gpu-memory karpenter.k8s.aws/instance-gpu-name karpenter.k8s.aws/instance-hypervisor karpenter.k8s.aws/instance-local-nvme karpenter.k8s.aws/instance-memory karpenter.k8s.aws/instance-network-bandwidth karpenter.k8s.aws/instance-size karpenter.sh/capacity-type karpenter.sh/nodepool kubernetes.io/arch kubernetes.io/os node.kubernetes.io/instance-type node.kubernetes.io/windows-build topology.kubernetes.io/region topology.kubernetes.io/zone], or a custom label that does not use a restricted domain: [k8s.io karpenter.k8s.aws karpenter.sh kubernetes.io]","commit":"8b2d1d7","pod":"default/test-pod"}

It may not be necessary to validate preferred terms, however, since they can be ignored.
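
For instance, a minimal sketch of the same preference keyed on a custom label outside the restricted domains (example.com/node-role is a hypothetical label, not from this report); based on the behavior described above, Karpenter should launch capacity for a pod carrying this term:

affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - preference:
        matchExpressions:
        - key: example.com/node-role   # custom label, not in a restricted domain
          operator: Exists
      weight: 10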

@tzneal tzneal removed the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Apr 23, 2024
@wmgroot (Contributor, Author) commented Apr 23, 2024

Would you mind linking me to the code for this special handling? I've searched for any mention of a "node-role" or "control-plane" label and haven't found anything. I don't think it's safe to simply ignore any kubernetes.io label, since some of those are reasonable to use for node selection.

@billrayburn

/assign @jmdeal

@cnmcavoy commented May 10, 2024

@wmgroot I believe what is happening is that the affinity causes Karpenter to compute a NodeClaim with the restricted domain as one of its labels. Karpenter then validates the NodeClaim, detects the restricted label, and determines that the NodeClaim is unsatisfiable and cannot be created. So Karpenter does not scale up a node.
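
In other words (a hypothetical illustration of that description, not actual Karpenter output), the candidate NodeClaim would carry a requirement like the one below, and the validation step would then reject it because the key uses a restricted domain; the NodePool name is assumed for illustration:

apiVersion: karpenter.sh/v1beta1
kind: NodeClaim
spec:
  requirements:
  - key: karpenter.sh/nodepool
    operator: In
    values: ["default"]
  - key: node-role.kubernetes.io/control-plane   # restricted domain -> NodeClaim fails validation
    operator: Exists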

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 8, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Sep 7, 2024
@jwcesign (Contributor) commented Sep 8, 2024

Related PR: #1608

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

@k8s-ci-robot (Contributor)

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to the /close not-planned comment above.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot closed this as not planned Won't fix, can't repro, duplicate, stale Oct 8, 2024
@njtran njtran reopened this Oct 10, 2024
@k8s-ci-robot (Contributor)

This issue is currently awaiting triage.

If Karpenter contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Oct 10, 2024
@njtran (Contributor) commented Oct 10, 2024

Hey, re-opening since this fell through the cracks and went stale. I see this was opened against v0.35; can we try to reproduce on v1 and see if it's fixed?

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

@k8s-ci-robot k8s-ci-robot closed this as not planned Won't fix, can't repro, duplicate, stale Nov 9, 2024
@k8s-ci-robot (Contributor)

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to the /close not-planned comment above.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
