cleanup: Documentation on NFD #115

Open
wants to merge 1 commit into
base: master
4 changes: 4 additions & 0 deletions README.md
@@ -10,6 +10,7 @@ Use OpenOnload® or EnterpriseOnload® to accelerate your workloads in Kubernete
* [AMD Solarflare](https://www.solarflare.com) hardware (`sfc`)
* OpenShift Container Platform (OCP) 4.10+ with
* [Kernel Module Management (KMM) Operator](https://kmm.sigs.k8s.io/) 1.1 ([OpenShift documentation](https://docs.openshift.com/container-platform/4.14/hardware_enablement/kmm-kernel-module-management.html))
* [Node Feature Discovery (NFD)](docs/nfd.md) Operator (optional)
* Both restricted network or internet-connected clusters

Deployment can also be performed on Kubernetes 1.23+ but full implementation details are not currently provided.
@@ -164,6 +165,9 @@ this recommended overlay further, see the variant steps below.
The above overlay configures KMM to `modprobe onload` but `modprobe sfc` is also required.
Please see [Out-of-tree `sfc` module](#out-of-tree-sfc-kernel-module) for options.

The above overlay selects **all `worker` role nodes** in the cluster. To filter based on node hardware, you may wish
to use the [recommended Node Feature Discovery configuration](docs/nfd.md).

> [!IMPORTANT]
> Due to Kubernetes limitations on label lengths, the combined length of the Name and Namespace of the Onload CR must be less than 32 characters.
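
The length constraint above can be checked before applying a CR. The following is a minimal sketch only: the function name `check_onload_cr_length` and the example names are hypothetical, and it assumes the limit applies to the plain sum of the two string lengths with no separator counted.

```sh
# Hypothetical helper: succeeds only if the Onload CR Name and Namespace
# together are shorter than 32 characters.
# Assumption: the limit is on the simple sum of both lengths.
check_onload_cr_length() {
    name="$1"
    namespace="$2"
    total=$(( ${#name} + ${#namespace} ))
    if [ "$total" -lt 32 ]; then
        return 0
    fi
    echo "Name + Namespace is $total characters; must be less than 32" >&2
    return 1
}
```

For example, `check_onload_cr_length onload onload-runtime` succeeds because the two illustrative names total 20 characters.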

54 changes: 54 additions & 0 deletions docs/nfd.md
@@ -0,0 +1,54 @@

# Selecting Nodes with AMD Solarflare hardware using Node Feature Discovery (NFD)

## Cluster configuration

[Node Feature Discovery (NFD)](https://kubernetes-sigs.github.io/node-feature-discovery)
([Red Hat documentation](https://docs.openshift.com/container-platform/4.14/hardware_enablement/psap-node-feature-discovery-operator.html#create-cd-cli_node-feature-discovery-operator))
enables the selection of nodes based on hardware features and system configuration.
NFD-Worker runs on each node to detect changes which are then used to label the node.

A `NodeFeatureDiscovery` CR enables the detections you require. A full example is provided in the above documentation
if you do not already have one configured.

To enable detection of AMD Solarflare cards, identified by the PCIe Subsystem Vendor ID '1924',
add the following configuration to your CR's `configData` section:

```yaml
kind: NodeFeatureDiscovery
...
spec:
  ...
  workerConfig:
    configData: |
      sources:
        pci:
          deviceClassWhitelist:
            - "1924"
          deviceLabelFields:
            - "subsystem_vendor"
```

After NFD is deployed, configured, and its daemons have performed detections, verify with:

```sh
kubectl get nodes -l feature.node.kubernetes.io/pci-1924.present=true
```
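
If a node you expected is missing from that list, one way to cross-check is to read the PCI subsystem vendor IDs from sysfs on the node itself. This is a minimal sketch rather than part of NFD: the function name `list_pci_by_subsystem_vendor` and the `SYSFS_PCI` override are hypothetical, and it assumes the standard Linux sysfs layout under `/sys/bus/pci/devices`.

```sh
# Hypothetical helper: print the PCI addresses of devices whose
# subsystem vendor ID matches the given hex ID (default: 1924).
# SYSFS_PCI is an illustrative override for testing; leave it unset on a real node.
list_pci_by_subsystem_vendor() {
    vendor="${1:-1924}"
    root="${SYSFS_PCI:-/sys/bus/pci/devices}"
    for dev in "$root"/*; do
        # Skip entries without a readable subsystem_vendor file.
        [ -r "$dev/subsystem_vendor" ] || continue
        # sysfs reports the ID with a 0x prefix, for example 0x1924.
        if [ "$(cat "$dev/subsystem_vendor")" = "0x$vendor" ]; then
            basename "$dev"
        fi
    done
}
```

Running `list_pci_by_subsystem_vendor 1924` on the node (for example from a debug shell) prints one PCI address per matching device, or nothing if no such hardware is visible.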

## Onload Custom Resource (CR) & workload configuration

With the above configured, the out-of-tree `sfc` driver can be built and loaded automatically on all nodes
with AMD Solarflare hardware by adding the following node label selector to
your Onload CR and/or workloads:

```yaml
selector:
  feature.node.kubernetes.io/pci-1924.present: "true"
```
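
Before switching to this selector, it can help to see which `worker` nodes it would exclude. The sketch below is illustrative only: the function name `workers_without_solarflare` and the `KUBECTL` override are hypothetical, and it assumes worker nodes carry the `node-role.kubernetes.io/worker` label, as is usual on OpenShift.

```sh
# Hypothetical helper: list worker nodes that do NOT carry the
# pci-1924.present label and so would not match the selector.
# KUBECTL is an illustrative override (for example, set it to "oc").
workers_without_solarflare() {
    ${KUBECTL:-kubectl} get nodes \
        -l 'node-role.kubernetes.io/worker,!feature.node.kubernetes.io/pci-1924.present' \
        -o name
}
```

An empty result suggests every worker node was labelled; any names printed are workers that the selector would skip.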

## Footnotes

```yaml
SPDX-License-Identifier: MIT
SPDX-FileCopyrightText: (c) Copyright 2023 Advanced Micro Devices, Inc.
```