We sometimes struggle to update clusters because demand can change mid-update and require the cluster to scale up the number of nodes. If cluster-autoscaler weren't shut down, the cluster could scale up to meet increased demand during the process.
I'm not sure of the best way to handle this, but it'd be fantastic if we could.
I think the primary issue with leaving the autoscaler on is that it will prefer to shut down nodes if nothing is scheduled on them. This means that the nodes spun up before rotation would be shut down prematurely. To combat this, we could annotate those nodes with `"cluster-autoscaler.kubernetes.io/scale-down-disabled": "true"`. It might be hard to determine which nodes are "new", but disabling scale-down on all nodes that match the new launch configuration should be a good heuristic. Any new nodes that join mid-rotation would also need to be annotated, however. I believe it's also possible to disable scale-down entirely, but that would require modifying the autoscaler deployment, so it's less attractive for that reason.
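To illustrate the heuristic, here is a minimal sketch. It assumes nodes are available as plain dicts with a `launch_config` field (how the launch configuration is actually exposed, e.g. via a label or the AWS API, would depend on the implementation; these field names are hypothetical):

```python
SCALE_DOWN_DISABLED = "cluster-autoscaler.kubernetes.io/scale-down-disabled"

def nodes_to_protect(nodes, new_launch_config):
    """Select nodes matching the new launch configuration, so they can be
    annotated before the rotation begins."""
    return [n for n in nodes if n.get("launch_config") == new_launch_config]

def annotate_scale_down_disabled(node):
    # In practice this would be a PATCH against the Kubernetes API
    # (roughly: kubectl annotate node <name> <key>=true); here we just
    # set the key on the local dict to show the intent.
    node.setdefault("annotations", {})[SCALE_DOWN_DISABLED] = "true"

nodes = [
    {"name": "ip-10-0-1-5", "launch_config": "lc-new"},
    {"name": "ip-10-0-2-7", "launch_config": "lc-old"},
]
for n in nodes_to_protect(nodes, "lc-new"):
    annotate_scale_down_disabled(n)
```

The same function would need to be re-run (or hooked into a watch) whenever new nodes join during the rotation, as noted above.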
The next issue is the desired count increasing while nodes are being rotated. This would confuse `eks-rolling-update`, as the actual count would have diverged from the original count it recorded. `eks-rolling-update` could perhaps tolerate increases to this number and update the ASG tags to match. If the number went down unexpectedly, that should still cause the tool to abort.
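A sketch of that tolerance rule (the function and exception names are my own, not part of `eks-rolling-update`):

```python
class DesiredCountDecreased(Exception):
    """Raised when the ASG desired count dropped mid-rotation."""

def reconcile_desired_count(original: int, current: int) -> int:
    """Return the baseline desired count to record (and write back to the
    ASG tags). Increases are tolerated as autoscaler-driven scale-ups;
    a decrease is unexpected and should abort the rotation."""
    if current < original:
        raise DesiredCountDecreased(
            f"desired count fell from {original} to {current}; aborting"
        )
    # An increase means the autoscaler added capacity mid-rotation:
    # adopt the new count as the baseline instead of aborting.
    return current
```

For example, `reconcile_desired_count(3, 5)` would adopt `5` as the new baseline, while `reconcile_desired_count(3, 2)` would raise and abort, matching today's behaviour.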
I searched for related issues, but didn't see anything, so I apologize if this has already been noted.
We also ran into that issue and would be happy if someone has the time to contribute the feature. Meanwhile, we are using pods with a low priority (-1) that get terminated when another pod needs the node. We have a Deployment for each ASG which starts these pods. Before running the eks-rolling-update script we scale out these Deployments to the number of nodes we think we need, and afterwards we scale them back in. Maybe that's a solution for you as well.
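This is the cluster-autoscaler "overprovisioning" pattern. A sketch of the manifests, with illustrative names, replica count, and resource requests (one such Deployment per ASG, as described above):

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: overprovisioning        # hypothetical name
value: -1                       # lower than any real workload
globalDefault: false
description: "Placeholder pods preempted whenever a real pod needs the node"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: overprovisioning-asg-a  # one Deployment per ASG
spec:
  replicas: 0                   # scaled out before eks-rolling-update, back in after
  selector:
    matchLabels:
      app: overprovisioning-asg-a
  template:
    metadata:
      labels:
        app: overprovisioning-asg-a
    spec:
      priorityClassName: overprovisioning
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9
          resources:
            requests:
              cpu: "1"          # sized so one replica roughly fills one node
              memory: 1Gi
```

Scaling the Deployment out forces the autoscaler to add nodes ahead of the rotation; any real pod with default priority preempts the pause pods.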