Request support for E823-C NICs which are used in Kontron ME1310 servers #501

Closed
murali509 opened this issue Sep 7, 2023 · 5 comments

Comments

@murali509

murali509 commented Sep 7, 2023

We are using Kontron ME1310 servers to deploy GDCv clusters. We are unable to create VFs because E823-C NICs are not listed in the supported-devices configmap: https://github.com/k8snetworkplumbingwg/sriov-network-operator/blob/master/deploy/configmap.yaml

Please add support for E823-C NICs.

@SchSeba

SchSeba commented Sep 11, 2023

Hi @murali509,
I don't think anyone in the community has access to that type of hardware.

If you need to, you can update the configmap when you deploy the operator.
But if you want the NIC to be fully supported in the community, please start by opening a PR that adds the entry to the configmap and the required info to https://github.com/k8snetworkplumbingwg/sriov-network-operator/blob/master/doc/vendor-support.md

In that PR, please show a case where a pod is able to use the netdevice, and also a user-space case with a DPDK application.
Support for hw offload is optional.

@murali509

I tried updating the configmap by appending the line below, then restarted the SR-IOV related pods:

Intel_ice_Columbiapark_E823C: "8086 188a"

The sriov-network-config-daemon-qw9s4 pod then fails to start, with the error below in its logs:

kubectl logs sriov-network-config-daemon-qw9s4 -n gke-operators
I1008 19:36:59.263709 2463936 start.go:133] starting node writer
I1008 19:36:59.269851 2463936 start.go:153] Running on platform: Baremetal
I1008 19:36:59.269867 2463936 writer.go:44] Run(): start writer
I1008 19:36:59.269872 2463936 writer.go:47] Run(): once
I1008 19:36:59.313443 2463936 utils.go:598] getLinkType(): Device 0000:05:00.0
I1008 19:36:59.314128 2463936 utils.go:598] getLinkType(): Device 0000:16:00.0
I1008 19:36:59.315500 2463936 utils.go:598] getLinkType(): Device 0000:16:00.1
I1008 19:36:59.316903 2463936 utils.go:598] getLinkType(): Device 0000:89:00.0
I1008 19:36:59.317552 2463936 utils.go:598] getLinkType(): Device 0000:89:00.1
I1008 19:36:59.318243 2463936 utils.go:598] getLinkType(): Device 0000:89:00.2
I1008 19:36:59.318870 2463936 utils.go:598] getLinkType(): Device 0000:89:00.3
I1008 19:36:59.319867 2463936 utils.go:598] getLinkType(): Device 0000:91:00.0
I1008 19:36:59.321267 2463936 utils.go:598] getLinkType(): Device 0000:91:00.1
I1008 19:36:59.329100 2463936 writer.go:132] setNodeStateStatus(): syncStatus: , lastSyncError:
I1008 19:36:59.338049 2463936 writer.go:170] writeCheckpointFile(): try to decode the checkpoint file
I1008 19:36:59.338371 2463936 start.go:159] Starting SriovNetworkConfigDaemon
I1008 19:36:59.338415 2463936 writer.go:44] Run(): start writer
I1008 19:36:59.338427 2463936 daemon.go:228] Run(): start daemon
E1008 19:36:59.338720 2463936 daemon.go:987] tryEnableRdma(): fail to enable rdma fork/exec /bin/bash: no such file or directory:
I1008 19:36:59.338982 2463936 utils.go:523] LoadKernelModule(): try to load kernel module tun
E1008 19:36:59.354826 2463936 runtime.go:78] Observed a panic: runtime.boundsError{x:2, y:2, signed:true, code:0x0} (runtime error: index out of range [2] with length 2)
goroutine 1 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x1f83e80?, 0xc0005a8d98})
/go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:74 +0x99
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0x10?})
/go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x75
panic({0x1f83e80, 0xc0005a8d98})
/usr/local/go/src/runtime/panic.go:884 +0x212
github.com/k8snetworkplumbingwg/sriov-network-operator/api/v1.GetSupportedVfIds()
/go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/api/v1/helper.go:170 +0x2a5
github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon.tryCreateNMUdevRule()
/go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon/daemon.go:1068 +0x8a
github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon.(*Daemon).tryCreateUdevRuleWrapper(0xc00137e340)
/go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon/daemon.go:218 +0x190
github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon.(*Daemon).Run(0xc00137e340, 0xc00010e180, 0xc00010e6c0)
/go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon/daemon.go:240 +0x165
main.runStartCmd(0x2eb53c0?, {0x20672ae?, 0x0?, 0x0?})
/go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/cmd/sriov-network-config-daemon/start.go:170 +0xb31
github.com/spf13/cobra.(*Command).execute(0x2eb53c0, {0x2f715e0, 0x0, 0x0})
/go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/github.com/spf13/cobra/command.go:854 +0x663
github.com/spf13/cobra.(*Command).ExecuteC(0x2eb5120)
/go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/github.com/spf13/cobra/command.go:958 +0x39d
github.com/spf13/cobra.(*Command).Execute(...)
/go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/github.com/spf13/cobra/command.go:895
main.main()
/go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/cmd/sriov-network-config-daemon/main.go:27 +0x25
I1008 19:36:59.354930 2463936 writer.go:61] Run(): refresh trigger
panic: runtime error: index out of range [2] with length 2 [recovered]
panic: runtime error: index out of range [2] with length 2

goroutine 1 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0x10?})
/go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:55 +0xd7
panic({0x1f83e80, 0xc0005a8d98})
/usr/local/go/src/runtime/panic.go:884 +0x212
github.com/k8snetworkplumbingwg/sriov-network-operator/api/v1.GetSupportedVfIds()
/go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/api/v1/helper.go:170 +0x2a5
github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon.tryCreateNMUdevRule()
/go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon/daemon.go:1068 +0x8a
github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon.(*Daemon).tryCreateUdevRuleWrapper(0xc00137e340)
/go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon/daemon.go:218 +0x190
github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon.(*Daemon).Run(0xc00137e340, 0xc00010e180, 0xc00010e6c0)
/go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon/daemon.go:240 +0x165
main.runStartCmd(0x2eb53c0?, {0x20672ae?, 0x0?, 0x0?})
/go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/cmd/sriov-network-config-daemon/start.go:170 +0xb31
github.com/spf13/cobra.(*Command).execute(0x2eb53c0, {0x2f715e0, 0x0, 0x0})
/go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/github.com/spf13/cobra/command.go:854 +0x663
github.com/spf13/cobra.(*Command).ExecuteC(0x2eb5120)
/go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/github.com/spf13/cobra/command.go:958 +0x39d
github.com/spf13/cobra.(*Command).Execute(...)
/go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/github.com/spf13/cobra/command.go:895
main.main()
/go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/cmd/sriov-network-config-daemon/main.go:27 +0x25

Any help fixing this issue would be highly appreciated. Thank you.

@rollandf

rollandf commented Oct 9, 2023

Hi,
Your entry is missing the third value, the VF device ID, as described here:
https://github.com/k8snetworkplumbingwg/sriov-network-operator/blob/master/doc/supported-hardware.md
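
The panic in the log above (index out of range [2] with length 2, raised in GetSupportedVfIds) is consistent with a two-field entry being split on whitespace and its third field read unconditionally. Below is a minimal sketch of a stricter parser, assuming the value format is "<vendor ID> <PF device ID> <VF device ID>". parseDeviceEntry is a hypothetical helper for illustration, not the operator's own code, and 1889 is the VF device ID that the node state later in this thread reports for these VFs.

```go
package main

import (
	"fmt"
	"strings"
)

// parseDeviceEntry splits a configmap value into vendor ID, PF device ID and
// VF device ID. Hypothetical helper: it assumes the three fields are separated
// by whitespace, and it returns an error instead of indexing past the end of
// the slice when a field is missing.
func parseDeviceEntry(value string) (vendor, pf, vf string, err error) {
	fields := strings.Fields(value)
	if len(fields) != 3 {
		return "", "", "", fmt.Errorf(
			"expected 3 fields (vendor ID, PF device ID, VF device ID), got %d in %q",
			len(fields), value)
	}
	return fields[0], fields[1], fields[2], nil
}

func main() {
	entries := []string{
		"8086 188a",      // the entry from this issue: VF device ID missing
		"8086 188a 1889", // the same entry with a VF device ID appended
	}
	for _, e := range entries {
		vendor, pf, vf, err := parseDeviceEntry(e)
		if err != nil {
			fmt.Println("invalid entry:", err)
			continue
		}
		fmt.Printf("vendor=%s pf=%s vf=%s\n", vendor, pf, vf)
	}
}
```

With a check like this, the incomplete entry is reported as a configuration error and the complete one parses into its three IDs.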

@murali509

Oh yes, I forgot to provide the third value in the configmap. Thanks for the quick fix.

@murali509

The issue is resolved after adding the E823-C NIC to the configmap, as done in PR #520.

I have tested binding the VFs to a DPDK driver, as shown in the output below for the eno2 NIC (E823-C), whose VF group uses device type vfio-pci:

kubectl -n gke-operators get SriovNetworkNodeState dauk-mrl-k-gdcv-host02.denseair.net -o yaml
apiVersion: sriovnetwork.k8s.cni.cncf.io/v1
kind: SriovNetworkNodeState
metadata:
  creationTimestamp: "2023-10-13T16:10:13Z"
  generation: 5
  name: dauk-mrl-k-gdcv-host02.denseair.net
  namespace: gke-operators
  ownerReferences:
  - apiVersion: sriovnetwork.k8s.cni.cncf.io/v1
    blockOwnerDeletion: true
    controller: true
    kind: SriovNetworkNodePolicy
    name: default
    uid: b52c7683-e98b-4020-8fd8-b86037bdabb9
  resourceVersion: "4059558"
  uid: c4cfd65d-70f3-4578-914c-f2d43380fccb
spec:
  dpConfigVersion: "4059478"
  interfaces:
  - mtu: 1500
    name: eno2
    numVfs: 6
    pciAddress: 0000:89:00.2
    vfGroups:
    - deviceType: vfio-pci
      mtu: 1500
      policyName: sriov-network-node-policy-dpdk-ngu
      resourceName: intel_sriov_dpdk_ngu
      vfRange: 0-1
  - mtu: 1500
    name: eno1
    numVfs: 2
    pciAddress: 0000:89:00.3
    vfGroups:
    - deviceType: netdevice
      mtu: 1500
      policyName: sriov-network-node-policy-eno1
      resourceName: eno1
      vfRange: 0-1
  - mtu: 1500
    name: enp145s0f0
    numVfs: 2
    pciAddress: 0000:91:00.0
    vfGroups:
    - deviceType: netdevice
      mtu: 1500
      policyName: sriov-network-node-policy-enp145s0f0
      resourceName: enp145s0f0
      vfRange: 0-1
  - mtu: 1500
    name: enp145s0f1
    numVfs: 2
    pciAddress: 0000:91:00.1
    vfGroups:
    - deviceType: netdevice
      mtu: 1500
      policyName: sriov-network-node-policy-enp145s0f1
      resourceName: enp145s0f1
      vfRange: 0-1
  - mtu: 1500
    name: enp22s0f0
    numVfs: 2
    pciAddress: "0000:16:00.0"
    vfGroups:
    - deviceType: netdevice
      mtu: 1500
      policyName: sriov-network-node-policy-enp22s0f0
      resourceName: enp22s0f0
      vfRange: 0-1
  - mtu: 1500
    name: enp22s0f1
    numVfs: 2
    pciAddress: "0000:16:00.1"
    vfGroups:
    - deviceType: netdevice
      mtu: 1500
      policyName: sriov-network-node-policy-enp22s0f1
      resourceName: enp22s0f1
      vfRange: 0-1
status:
  interfaces:
  - deviceID: "1533"
    driver: igb
    linkSpeed: 1000 Mb/s
    linkType: ETH
    mac: 00:a0:a5:e3:e3:8d
    mtu: 1500
    name: eno5
    pciAddress: "0000:05:00.0"
    vendor: "8086"
  - Vfs:
    - deviceID: "1889"
      driver: iavf
      mac: fa:49:be:21:5e:2f
      mtu: 1500
      name: enp22s0f0v0
      pciAddress: "0000:16:01.0"
      vendor: "8086"
      vfID: 0
    - deviceID: "1889"
      driver: iavf
      mac: 3e:8f:a4:ad:75:24
      mtu: 1500
      name: enp22s0f0v1
      pciAddress: "0000:16:01.1"
      vendor: "8086"
      vfID: 1
    deviceID: 159b
    driver: ice
    eSwitchMode: legacy
    linkSpeed: 10000 Mb/s
    linkType: ETH
    mac: b4:83:51:06:b1:a8
    mtu: 1500
    name: enp22s0f0
    numVfs: 2
    pciAddress: "0000:16:00.0"
    totalvfs: 128
    vendor: "8086"
  - Vfs:
    - deviceID: "1889"
      driver: iavf
      mac: 12:9e:9d:b6:ff:c4
      mtu: 1500
      name: enp22s0f1v0
      pciAddress: "0000:16:11.0"
      vendor: "8086"
      vfID: 0
    - deviceID: "1889"
      driver: iavf
      mac: 82:31:82:f2:b8:bd
      mtu: 1500
      name: enp22s0f1v1
      pciAddress: "0000:16:11.1"
      vendor: "8086"
      vfID: 1
    deviceID: 159b
    driver: ice
    eSwitchMode: legacy
    linkSpeed: 10000 Mb/s
    linkType: ETH
    mac: b4:83:51:06:b1:a9
    mtu: 1500
    name: enp22s0f1
    numVfs: 2
    pciAddress: "0000:16:00.1"
    totalvfs: 128
    vendor: "8086"
  - deviceID: 188a
    driver: ice
    eSwitchMode: legacy
    linkSpeed: 10000 Mb/s
    linkType: ETH
    mac: 00:a0:a5:e3:e3:8c
    mtu: 1500
    name: eno4
    pciAddress: 0000:89:00.0
    totalvfs: 64
    vendor: "8086"
  - deviceID: 188a
    driver: ice
    eSwitchMode: legacy
    linkSpeed: 10000 Mb/s
    linkType: ETH
    mac: 00:a0:a5:e3:e3:8b
    mtu: 1500
    name: eno3
    pciAddress: 0000:89:00.1
    totalvfs: 64
    vendor: "8086"
  - Vfs:
    - deviceID: "1889"
      driver: vfio-pci
      pciAddress: 0000:89:11.0
      vendor: "8086"
      vfID: 0
    - deviceID: "1889"
      driver: vfio-pci
      pciAddress: 0000:89:11.1
      vendor: "8086"
      vfID: 1
    - deviceID: "1889"
      driver: iavf
      mac: d6:3d:7a:42:37:d7
      mtu: 1500
      name: eno2v2
      pciAddress: 0000:89:11.2
      vendor: "8086"
      vfID: 2
    - deviceID: "1889"
      driver: iavf
      mac: 52:d8:0f:97:d6:88
      mtu: 1500
      name: eno2v3
      pciAddress: 0000:89:11.3
      vendor: "8086"
      vfID: 3
    - deviceID: "1889"
      driver: iavf
      mac: 52:b1:51:e9:8a:e5
      mtu: 1500
      name: eno2v4
      pciAddress: 0000:89:11.4
      vendor: "8086"
      vfID: 4
    - deviceID: "1889"
      driver: iavf
      mac: 52:85:8b:89:88:8c
      mtu: 1500
      name: eno2v5
      pciAddress: 0000:89:11.5
      vendor: "8086"
      vfID: 5
    deviceID: 188a
    driver: ice
    eSwitchMode: legacy
    linkSpeed: 10000 Mb/s
    linkType: ETH
    mac: 00:a0:a5:e3:e3:8a
    mtu: 1500
    name: eno2
    numVfs: 6
    pciAddress: 0000:89:00.2
    totalvfs: 64
    vendor: "8086"
  - Vfs:
    - deviceID: "1889"
      driver: iavf
      mac: 96:91:da:da:4f:76
      mtu: 1500
      name: eno1v0
      pciAddress: 0000:89:19.0
      vendor: "8086"
      vfID: 0
    - deviceID: "1889"
      driver: iavf
      mac: b2:0f:94:f0:82:99
      mtu: 1500
      name: eno1v1
      pciAddress: 0000:89:19.1
      vendor: "8086"
      vfID: 1
    deviceID: 188a
    driver: ice
    eSwitchMode: legacy
    linkSpeed: 10000 Mb/s
    linkType: ETH
    mac: 00:a0:a5:e3:e3:89
    mtu: 1500
    name: eno1
    numVfs: 2
    pciAddress: 0000:89:00.3
    totalvfs: 64
    vendor: "8086"
  - Vfs:
    - deviceID: "1889"
      driver: iavf
      mac: ae:ea:e8:f7:19:dd
      mtu: 1500
      name: enp145s0f0v0
      pciAddress: 0000:91:01.0
      vendor: "8086"
      vfID: 0
    - deviceID: "1889"
      driver: iavf
      mac: 36:42:6d:a5:f5:ee
      mtu: 1500
      name: enp145s0f0v1
      pciAddress: 0000:91:01.1
      vendor: "8086"
      vfID: 1
    deviceID: 159b
    driver: ice
    eSwitchMode: legacy
    linkSpeed: 10000 Mb/s
    linkType: ETH
    mac: b4:83:51:06:b2:18
    mtu: 1500
    name: enp145s0f0
    numVfs: 2
    pciAddress: 0000:91:00.0
    totalvfs: 128
    vendor: "8086"
  - Vfs:
    - deviceID: "1889"
      driver: iavf
      mac: 4a:fe:56:ef:68:f4
      mtu: 1500
      name: enp145s0f1v0
      pciAddress: 0000:91:11.0
      vendor: "8086"
      vfID: 0
    - deviceID: "1889"
      driver: iavf
      mac: d2:85:3e:db:c6:1c
      mtu: 1500
      name: enp145s0f1v1
      pciAddress: 0000:91:11.1
      vendor: "8086"
      vfID: 1
    deviceID: 159b
    driver: ice
    eSwitchMode: legacy
    linkSpeed: 10000 Mb/s
    linkType: ETH
    mac: b4:83:51:06:b2:19
    mtu: 1500
    name: enp145s0f1
    numVfs: 2
    pciAddress: 0000:91:00.1
    totalvfs: 128
    vendor: "8086"
  syncStatus: Succeeded
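
To read the eno2 part of the node state above: the spec asks for numVfs: 6 with a single vfio-pci group on vfRange: 0-1, and the status shows VFs 0 and 1 on vfio-pci while VFs 2 to 5 stay on iavf. The sketch below reproduces that mapping. parseVfRange and expectedDrivers are hypothetical helpers, and treating iavf as the driver for VFs outside the group is an assumption taken from this output, not from the operator's code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVfRange turns a "first-last" string such as "0-1" into an inclusive
// pair of VF indices. Hypothetical helper for illustration.
func parseVfRange(s string) (first, last int, err error) {
	lo, hi, ok := strings.Cut(s, "-")
	if !ok {
		return 0, 0, fmt.Errorf("range %q is not of the form first-last", s)
	}
	if first, err = strconv.Atoi(lo); err != nil {
		return 0, 0, err
	}
	if last, err = strconv.Atoi(hi); err != nil {
		return 0, 0, err
	}
	if first > last {
		return 0, 0, fmt.Errorf("range %q starts after it ends", s)
	}
	return first, last, nil
}

// expectedDrivers returns the driver expected for each VF index of one PF:
// groupDriver inside vfRange, defaultDriver everywhere else (an assumption
// based on the output in this thread).
func expectedDrivers(numVfs int, vfRange, groupDriver, defaultDriver string) ([]string, error) {
	first, last, err := parseVfRange(vfRange)
	if err != nil {
		return nil, err
	}
	if last >= numVfs {
		return nil, fmt.Errorf("range %q exceeds numVfs=%d", vfRange, numVfs)
	}
	drivers := make([]string, numVfs)
	for i := range drivers {
		if i >= first && i <= last {
			drivers[i] = groupDriver
		} else {
			drivers[i] = defaultDriver
		}
	}
	return drivers, nil
}

func main() {
	// Values taken from the eno2 entry: numVfs 6, vfRange 0-1, vfio-pci.
	drivers, err := expectedDrivers(6, "0-1", "vfio-pci", "iavf")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for i, d := range drivers {
		fmt.Printf("VF %d: %s\n", i, d)
	}
}
```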
