Karpenter fails to schedule a pending pod with a preferred affinity #1204
Comments
It's not just an unsatisfiable preferred node affinity that is causing this. It's due to the label being for a restricted domain (
However, it may not be necessary to validate preferred terms, since they can be ignored. |
Would you mind linking the code for this special handling to me? I'm searching for any mention of "node-role" or "control-plane" label and not finding anything. I don't think it's safe to just ignore any |
/assign @jmdeal |
@wmgroot I believe what is happening is that the affinity causes Karpenter to compute a nodeclaim with the restricted domain as one of its labels. Karpenter then validates the nodeclaim, detects the restricted label, and concludes that the nodeclaim is unsatisfiable and cannot be created. As a result, Karpenter does not scale up a node. |
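To make the sequence described above concrete, here is a minimal sketch in Python. It is a toy model, not Karpenter's code: the `Pod` class, the function names, the `example.com/pool` label, and the contents of `RESTRICTED_DOMAINS` are all assumptions for illustration, and the model ignores any exceptions a real validator may make for well-known labels. It contrasts the reported behaviour (a preferred term is folded into the nodeclaim labels and the claim is rejected) with the behaviour the issue asks for (drop the preferred term and retry).

```python
from dataclasses import dataclass, field

# Assumption: a label key is "restricted" when its domain equals, or is a
# subdomain of, one of these. The real list may differ.
RESTRICTED_DOMAINS = {"kubernetes.io", "k8s.io", "karpenter.sh"}


def label_domain(key: str) -> str:
    """Return the prefix (domain) of a label key, or '' if it has none."""
    return key.split("/", 1)[0] if "/" in key else ""


def is_restricted(key: str) -> bool:
    domain = label_domain(key)
    return any(domain == d or domain.endswith("." + d) for d in RESTRICTED_DOMAINS)


@dataclass
class Pod:
    # Hypothetical, flattened view of a pod's node requirements.
    required: dict = field(default_factory=dict)   # hard requirements
    preferred: list = field(default_factory=list)  # soft terms, one dict each


def build_claim_labels(pod: Pod, include_preferred: bool) -> dict:
    """Compute the labels a new node would need to carry for this pod."""
    labels = dict(pod.required)
    if include_preferred:
        for term in pod.preferred:
            labels.update(term)
    return labels


def restricted_labels(labels: dict) -> list:
    """Return the label keys that validation would reject."""
    return [key for key in labels if is_restricted(key)]


def schedule_strict(pod: Pod):
    """Reported behaviour: preferred terms are kept, so the claim is rejected."""
    labels = build_claim_labels(pod, include_preferred=True)
    return None if restricted_labels(labels) else labels


def schedule_with_relaxation(pod: Pod):
    """Requested behaviour: retry without preferred terms before giving up."""
    labels = build_claim_labels(pod, include_preferred=True)
    if restricted_labels(labels):
        labels = build_claim_labels(pod, include_preferred=False)
    return None if restricted_labels(labels) else labels
```

In this model, a pod with a hard requirement on `example.com/pool` and a soft preference for `node-role.kubernetes.io/control-plane` gets no node from `schedule_strict`, while `schedule_with_relaxation` returns labels containing only the hard requirement.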
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten |
related PR: #1608 |
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /close not-planned |
@k8s-triage-robot: Closing this issue, marking it as "Not Planned". In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
This issue is currently awaiting triage. If Karpenter contributors determines this is a relevant issue, they will accept it by applying the The Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Hey, re-opening since this fell through the cracks and went stale. I see this was opened for v0.35; can we try to reproduce on v1 and see if it's fixed? |
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /close not-planned |
@k8s-triage-robot: Closing this issue, marking it as "Not Planned". In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Description
Observed Behavior:
We have a pod stuck in Pending indefinitely, and Karpenter does not take action to add a new node to allow the pod to schedule.
The pod has a soft affinity that prefers control-plane nodes. Given this is an EKS cluster, where the control plane is managed, this pod can never schedule on a control-plane node.
Expected Behavior:
Karpenter creates a node to allow the pod to schedule even though the pod has a soft affinity preference that cannot be satisfied. Not scheduling the pod can result in prolonged outages, blocked PDBs and other undesirable behavior that requires manual intervention and is worse than an unsatisfied soft affinity.
Reproduction Steps (Please include YAML):
I believe this should be reproducible with a pod that uses a nodeSelector and toleration targeting an isolated NodePool, which makes testing easier.
Upon removing the affinity spec from the example above, Karpenter added a node immediately to allow the pod to schedule.
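For anyone trying to reproduce this, the following is a hypothetical manifest of the kind described, generated from Python so it can be piped to `kubectl`. It is not the reporter's original manifest. The pod name, the `example.com/pool` nodeSelector and toleration key, the pause image tag, and the exact affinity label key are assumptions; adjust them to match your own NodePool.

```python
import json


def pending_pod_manifest(name: str = "soft-affinity-repro") -> dict:
    """Build a Pod with only a *preferred* node affinity for control-plane nodes."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Hypothetical selector and toleration for an isolated NodePool.
            "nodeSelector": {"example.com/pool": "isolated"},
            "tolerations": [
                {"key": "example.com/pool", "operator": "Exists", "effect": "NoSchedule"}
            ],
            "affinity": {
                "nodeAffinity": {
                    # Soft preference only; there is no required term.
                    "preferredDuringSchedulingIgnoredDuringExecution": [
                        {
                            "weight": 100,
                            "preference": {
                                "matchExpressions": [
                                    {
                                        # Assumed label key for control-plane nodes.
                                        "key": "node-role.kubernetes.io/control-plane",
                                        "operator": "Exists",
                                    }
                                ]
                            },
                        }
                    ]
                }
            },
            "containers": [
                {
                    "name": "pause",
                    "image": "registry.k8s.io/pause:3.9",
                    "resources": {"requests": {"cpu": "100m"}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON as well as YAML.
    print(json.dumps(pending_pod_manifest(), indent=2))
```

Saved as, say, `repro.py`, this can be applied with `python repro.py | kubectl apply -f -`. Deleting the `affinity` key from the returned dict corresponds to the affinity-free variant mentioned above.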
Versions:
kubectl version