Does ebs-csi-driver have an impact on the AZ chosen when EKS replaces a worker node? #2133

Open
ensean opened this issue Sep 4, 2024 · 4 comments

Comments

@ensean

ensean commented Sep 4, 2024

Hello, we are planning to deploy a stateful application (ClickHouse) to EKS, using ebs-csi-driver for storage.

We are concerned that when a worker node is replaced (for example, after a hardware failure), the replacement instance might be launched in a different AZ, causing the EBS volume mount to fail. Is that possible?

/triage support

@k8s-ci-robot
Contributor

@ensean: The label(s) triage/support cannot be applied, because the repository doesn't have them.


@ConnorJC3
Contributor

Hi @ensean - EBS volumes are a zonal resource, so any instances using a volume must be in the same zone as the volume (for more info see the AWS docs for EBS volumes). The EBS CSI Driver will automatically populate a CSI Topology label with the key topology.kubernetes.io/zone containing the AZ of the associated volume/node.

Most node scalers, such as cluster-autoscaler and Karpenter, are either automatically aware of CSI topology labels or can be configured to respect them. For example, here is the Karpenter documentation about PV-aware scheduling.
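To make the zonal constraint concrete, here is a minimal Python sketch of the kind of check a topology-aware scheduler or scaler performs. It is not the driver's or Karpenter's actual code. It assumes the PV carries a required node affinity on the `topology.kubernetes.io/zone` key mentioned above (the exact key can differ between driver versions), and the function names `pv_allowed_zones` and `node_can_attach` are hypothetical.

```python
# Zone label key taken from the comment above; other keys may be in use
# depending on the driver version (assumption).
ZONE_KEY = "topology.kubernetes.io/zone"


def pv_allowed_zones(pv: dict) -> set:
    """Collect the zones listed in a PV's required node affinity."""
    terms = (
        pv.get("spec", {})
        .get("nodeAffinity", {})
        .get("required", {})
        .get("nodeSelectorTerms", [])
    )
    zones = set()
    for term in terms:
        for expr in term.get("matchExpressions", []):
            if expr.get("key") == ZONE_KEY and expr.get("operator") == "In":
                zones.update(expr.get("values", []))
    return zones


def node_can_attach(node: dict, pv: dict) -> bool:
    """Return True if the node's zone label satisfies the PV's zone constraint."""
    zones = pv_allowed_zones(pv)
    node_zone = node.get("metadata", {}).get("labels", {}).get(ZONE_KEY)
    # A PV without a zone constraint is treated as attachable anywhere here.
    return not zones or node_zone in zones
```

A replacement node in a different AZ fails this check, so a pod bound to the PV stays pending until a node in the volume's zone is available.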

@ensean
Author

ensean commented Sep 5, 2024

Hi @ConnorJC3, thanks a lot for your explanation.

Suppose the AZ that the PV belongs to fails. Will ebs-csi-driver try to copy the PV to another AZ through a snapshot?

@AndrewSirenko
Contributor

Suppose the AZ that the PV belongs to fails. Will ebs-csi-driver try to copy the PV to another AZ through a snapshot?

No, the ebs-csi-driver will not try to copy a failed PV to another AZ through a snapshot.

You may be able to write your own Kubernetes Operator which would do this.
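As a rough illustration of what the core step of such an operator might look like, the Python sketch below restores the most recent completed snapshot of a volume into a different AZ. It is only a sketch under several assumptions: snapshots were taken before the failure, `ec2` is a boto3 EC2 client (or any object exposing the same `describe_snapshots` and `create_volume` calls), pagination is ignored, and the function name `restore_volume_in_zone` is hypothetical. Creating a new PV/PVC for the restored volume and rebinding the workload are not covered.

```python
def restore_volume_in_zone(ec2, volume_id: str, target_zone: str) -> str:
    """Create a new EBS volume in target_zone from the latest completed
    snapshot of volume_id and return the new volume's ID."""
    resp = ec2.describe_snapshots(
        OwnerIds=["self"],
        Filters=[
            {"Name": "volume-id", "Values": [volume_id]},
            {"Name": "status", "Values": ["completed"]},
        ],
    )
    snapshots = resp.get("Snapshots", [])
    if not snapshots:
        # Nothing to restore from: snapshots must exist before the AZ fails.
        raise LookupError(f"no completed snapshot found for {volume_id}")

    # Pick the most recently started snapshot.
    latest = max(snapshots, key=lambda s: s["StartTime"])

    created = ec2.create_volume(
        SnapshotId=latest["SnapshotId"],
        AvailabilityZone=target_zone,
    )
    return created["VolumeId"]
```

Passing the client in as a parameter keeps the function easy to exercise with a stub, for example `restore_volume_in_zone(boto3.client("ec2"), volume_id, zone)` in real use, where the volume ID and zone are supplied by the caller.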
