Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow custom NVIDIA CTK path #72

Merged
merged 1 commit into from
Feb 20, 2024
Merged

Allow custom NVIDIA CTK path #72

merged 1 commit into from
Feb 20, 2024

Conversation

empovit
Copy link
Contributor

@empovit empovit commented Feb 12, 2024

On OpenShift, the DTK and driver root paths must be overridden as follows:

nvidiaDriverRoot: /run/nvidia/driver
nvidiaCtkPath: /var/usrlocal/nvidia/toolkit/nvidia-ctk

Comment on lines +79 to +80
- name: NVIDIA_CTK_PATH
value: "{{ .Values.nvidiaCtkPath }}"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this also require us to set a default in values.yaml?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is there a reason that this should be set at the root level of the values.yaml rather than specific to the kubelet plugin (same goes for .Values.nvidiaDriverRoot for that matter)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On:

Does this also require us to set a default in values.yaml?
The nvcdi package will default to /usr/bin/nvidia-ctk if this is empty, but I would assume that an explicit entry in values.yaml would be better.

Also, is there a reason that this should be set at the root level of the values.yaml rather than specific to the kubelet plugin (same goes for .Values.nvidiaDriverRoot for that matter)?

I think this depends on the API that the values file represents. I suppose from my perspective, the current options in the kubeletplugin section seem related to that pod specifically, whereas the driver root and this path are not specific to the plugin. Yes, they're currently only used there, but if another component is added here that requires these values where would be specify them then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this also require us to set a default in values.yaml?

Good point. Fixed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is there a reason that this should be set at the root level of the values.yaml rather than specific to the kubelet plugin (same goes for .Values.nvidiaDriverRoot for that matter)?

Hmm, I can fix that. Where do they should go, in your opinion?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@empovit I think this is probably more of a question for us internally than a serious issue with the PR.

@klueska do you want to block this PR on defining some other top-level structure to define these kinds of properties (driver, host?) or are you ok with leaving this as a top-level value for the time being?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I think it's fine. Just an observation (but one that has implications on the user-facing API, so I want us to be conscious of what we are doing).

Signed-off-by: Vitaliy Emporopulo <[email protected]>
@elezar elezar merged commit 203be1d into NVIDIA:main Feb 20, 2024
5 checks passed
@empovit empovit deleted the openshift branch September 5, 2024 09:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants