
karpenter disruption taint isn't correct for karpenter v1 #2158

Open
willthames opened this issue Sep 24, 2024 · 3 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@willthames

/kind bug

What happened?

The EBS CSI driver does not clean up after itself when a node is terminating due to Karpenter disruption, after upgrading to Karpenter v1.0.1.

What you expected to happen?

The EBS CSI driver removes all of its volume attachments before the node terminates.

How to reproduce it (as minimally and precisely as possible)?

Have an EBS volume attached, and don't configure any tolerations on the EBS CSI driver (so it gets terminated before the node terminates)

Anything else we need to know?:

The taint during karpenter disruption is:

  taints:
  - effect: NoSchedule
    key: karpenter.sh/disrupted

But the IsDisruptedTaint check looks for disrupting rather than disrupted, because it uses the v1beta1 API rather than the v1 API.

https://github.com/kubernetes-sigs/karpenter/blob/b69e975128ac9a511542f9f1d245a6d4c3f91112/pkg/apis/v1beta1/taints.go#L28

vs

https://github.com/kubernetes-sigs/karpenter/blob/b69e975128ac9a511542f9f1d245a6d4c3f91112/pkg/apis/v1/taints.go#L28
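To illustrate the mismatch, here is a minimal sketch (hypothetical names, not the driver's actual code) of a check that would recognize the disruption taint from either API version, based on the two keys linked above:

```go
package main

import "fmt"

// Minimal stand-in for corev1.Taint; the real driver uses k8s.io/api/core/v1.
type Taint struct {
	Key    string
	Value  string
	Effect string
}

const (
	// v1beta1 taint: karpenter.sh/disruption=disrupting:NoSchedule
	v1beta1DisruptionKey   = "karpenter.sh/disruption"
	v1beta1DisruptionValue = "disrupting"
	// v1 taint: karpenter.sh/disrupted:NoSchedule
	v1DisruptedKey = "karpenter.sh/disrupted"
)

// isKarpenterDisruptionTaint returns true for the disruption taint of
// either API version, so cleanup logic keyed on it fires on both.
func isKarpenterDisruptionTaint(t Taint) bool {
	switch t.Key {
	case v1DisruptedKey:
		return true
	case v1beta1DisruptionKey:
		return t.Value == v1beta1DisruptionValue
	}
	return false
}

func main() {
	oldTaint := Taint{Key: "karpenter.sh/disruption", Value: "disrupting", Effect: "NoSchedule"}
	newTaint := Taint{Key: "karpenter.sh/disrupted", Effect: "NoSchedule"}
	fmt.Println(isKarpenterDisruptionTaint(oldTaint), isKarpenterDisruptionTaint(newTaint))
	// → true true
}
```

A check that only matches the v1beta1 key (as the driver did at the time of this report) returns false for the v1 taint, so the pre-stop hook never triggers on Karpenter v1 nodes.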

Environment

  • Kubernetes version (use kubectl version):
Client Version: v1.31.1
Kustomize Version: v5.4.2
Server Version: v1.30.3-eks-a18cd3a
  • Driver version: v1.35.0
@AndrewSirenko
Contributor

AndrewSirenko commented Sep 24, 2024

disrupting rather than disrupted because it's using the v1beta1 API rather than the v1 API.

Wow, this is a very subtle breaking change. Thank you so much @willthames for getting back to me and finding an even deeper root cause. We'll take care of this before our next driver release, and try to find some mechanism to make sure this does not happen a third time.

@torredil
Member

CSI drivers are not responsible for deleting VolumeAttachment objects. This is handled by Kubernetes itself, specifically the Attach/Detach controller in the kube-controller-manager (KCM).

In Karpenter v1.0.0 and beyond, Karpenter ensures that EBS volumes are properly unmounted/detached (VolumeAttachment objects deleted) before the node is terminated. That functionality relies on either:

a) The CSI node tolerateAllTaints parameter being enabled (which is the default setting currently)
b) Manually configuring the toleration on the node pod to prevent Karpenter from evicting it prematurely.
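For option (b), the toleration on the node pod spec might look like the following sketch, using the v1 taint key reported above:

```yaml
tolerations:
- key: karpenter.sh/disrupted
  operator: Exists
  effect: NoSchedule
```

With this toleration in place, Karpenter's disruption taint does not evict the node pod, so it survives long enough for volumes to be detached cleanly.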

For users who have disabled tolerateAllTaints and haven't added the necessary toleration to the node pod spec, the pre-stop lifecycle hook in the driver can still be helpful (once we start checking for the presence of that new taint). However, this should be viewed as a fallback, best-effort mechanism rather than the primary method of ensuring clean volume detachment.

@AndrewSirenko
Contributor

AndrewSirenko commented Sep 24, 2024

Manually configuring the toleration on the node pod to prevent Karpenter from evicting it prematurely.

I think a nice improvement here would be maintaining the list of common tolerations in our helm chart; that way, the pre-stop hook wouldn't need to be relied upon as a fallback.
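As a sketch of that idea (assuming the chart exposes a node-level tolerations list alongside the tolerateAllTaints flag mentioned above; treat the exact value names as an assumption), the chart defaults could ship with the Karpenter taint pre-tolerated:

```yaml
node:
  tolerateAllTaints: false
  tolerations:
  # Assumed default shipped by the chart so the node pod survives
  # Karpenter v1 disruption long enough to detach volumes.
  - key: karpenter.sh/disrupted
    operator: Exists
    effect: NoSchedule
```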

I don't think I would have caught this regression in driver cleanup due to a Karpenter upgrade myself either (it was a subtle line in the migration guide), which means we can alleviate some customer pain by making the driver 'just work' upon add-on creation.

This would also help us migrate off of tolerateAllTaints in the future if we decide to change that default as some maintainers have suggested.

That functionality relies on...

To be fair, we could do a better job here by documenting these assumptions, because volumes not unmounting during node drains might surprise non-subject-matter-experts in Kubernetes Storage. I'll add an FAQ item in the fix PR, but we should also consider adding a note in our install guide.


Either way, thank you @willthames for reporting this pain-point.
