Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Align the skip-drain label for the operator with the common upgrade lib #497

Merged
merged 1 commit into from
May 3, 2023

Conversation

almaslennikov
Copy link
Collaborator

Fix for #496

When the upgrade functionality was extracted into a separate upgrade library, the pod label for skipping drain was changed (https://github.com/NVIDIA/k8s-operator-libs/blob/89451d4d7524278f985a7e813d654084c9e5d146/pkg/upgrade/consts.go#L23). The operator's spec was not aligned to use this new label.

This label is required for the operator, when deployed on a single-node cluster, where operator runs on the same node as the ofed driver container. Without this label, the operator pod would drain itself, thus breaking the complete network-operator deployment.

With this label, the pod is omitted during the drain (https://github.com/NVIDIA/k8s-operator-libs/blob/89451d4d7524278f985a7e813d654084c9e5d146/pkg/upgrade/upgrade_state.go#L603)

When the upgrade functionality was extracted into a separate upgrade
library, the pod label for skipping drain was changed. The operator's
spec was not aligned to use this new label.

This label is required for the operator, when deployed
on a single-node cluster, where operator runs on the same node as
the ofed driver container. Without this label, the operator pod would
drain itself, thus breaking the complete network-operator deployment.

Signed-off-by: amaslennikov <[email protected]>
@adrianchiris
Copy link
Collaborator

adrianchiris commented May 2, 2023

now that we have moved the logic to a common place, technically each user (netop/gpuop) can provide their on drain labels
i dont think its needed to be handled by the common lib

imo a better solution is:

  1. remove the logic from operator-lib (it just handles drain spec as is)
  2. add logic to add their own custom labels it to each operator. so they have full control on whats being drained

Thoughts ?

that said, its not a blocker for merging this one for now.

@almaslennikov
Copy link
Collaborator Author

now that we have moved the logic to a common place, technically each user (netop/gpuop) can provide their on drain labels i dont think its needed to be handled by the common lib

imo a better solution is:

  1. remove the logic from operator-lib (it just handles drain spec as is)
  2. add logic to add their own custom labels it to each operator. so they have full control on whats being drained

Thoughts ?

that said, its not a blocker for merging this one for now.

now that we have moved the logic to a common place, technically each user (netop/gpuop) can provide their on drain labels i dont think its needed to be handled by the common lib

imo a better solution is:

  1. remove the logic from operator-lib (it just handles drain spec as is)
  2. add logic to add their own custom labels it to each operator. so they have full control on whats being drained

Thoughts ?

that said, its not a blocker for merging this one for now.

That makes sense, created a new issue for that: #500

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants