-
Notifications
You must be signed in to change notification settings - Fork 114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support Externally Created virtual functions #436
Support Externally Created virtual functions #436
Conversation
Thanks for your PR,
To skip the vendors CIs use one of:
|
Pull Request Test Coverage Report for Build 5753402663
💛 - Coveralls |
Thanks for your PR,
To skip the vendors CIs use one of:
|
Thanks for your PR,
To skip the vendors CIs use one of:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, PR looks good to me. I added a few comments error messages
df85137
to
5cf1a83
Compare
Thanks for your PR,
To skip the vendors CIs use one of:
|
LGTM |
5cf1a83
to
aed3c7b
Compare
Thanks for your PR,
To skip the vendors CIs use one of:
|
aed3c7b
to
a2c3ae9
Compare
Thanks for your PR,
To skip the vendors CIs use one of:
|
a2c3ae9
to
294054c
Compare
Thanks for your PR,
To skip the vendors CIs use one of:
|
@@ -59,6 +59,8 @@ type SriovNetworkNodePolicySpec struct { | |||
VdpaType string `json:"vdpaType,omitempty"` | |||
// Exclude device's NUMA node when advertising this resource by SRIOV network device plugin. Default to false. | |||
ExcludeTopology bool `json:"excludeTopology,omitempty"` | |||
// don't create the virtual function only allocated them to the device plugin. Defaults to false. | |||
ExternallyCreated bool `json:"externallyCreated,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how does ExternallyCreated relate to other parameters here ?
we need to have this documented so users will know what to expect when using it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I add documentation for it let me know if you need anything else :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a reason to do it per interface and not global?
Do you have use-case when some SR-IOV will be create by SR-IOV opeartor and some not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read it now in the description. You want to use it for CNI primary network and for secondary VF will be managed by sriov operator
294054c
to
4e5a5e0
Compare
Thanks for your PR,
To skip the vendors CIs use one of:
|
1 similar comment
Thanks for your PR,
To skip the vendors CIs use one of:
|
Thanks for your PR,
To skip the vendors CIs use one of:
|
pkg/webhook/validate.go
Outdated
|
||
// Externally created: we don't support ExternallyCreated + EswitchMode | ||
//TODO: if needed we will need to add this in the future as today EswitchMode is for HWOFFLOAD | ||
if cr.Spec.ExternallyCreated && cr.Spec.EswitchMode == sriovnetworkv1.ESwithModeSwitchDev { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do you want to block it ?
if its externally created and device is already in switchdev mode then its OK no ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should block right now because all the systemd and bash script we do for this will be hard to manage.
and right know I am not aware of any request for this.
pkg/utils/host.go
Outdated
_, err = os.Stat(PfAppliedConfig) | ||
if err != nil { | ||
if os.IsNotExist(err) { | ||
err = os.Mkdir(PfAppliedConfig, os.ModeDir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can call MkdirAll[1] on PfAppliedConfg
[1[ https://pkg.go.dev/os#MkdirAll
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
starting to wonder if we need a design doc for this one.
it feels that we need to get an agreement on what exactly ExternallyManaged means
then outline the changes in the different flows of sriov-network-operator.
that way:
- we agree on what we want to achieve
- make sure this PR is doing what we want
pkg/utils/host.go
Outdated
return nil | ||
} | ||
|
||
func ClearPCIAddressFolder() error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: add docsting
pkg/utils/host.go
Outdated
return nil | ||
} | ||
|
||
func CreatePfAppliedStatusFromSpec(p *sriovnetworkv1.Interface) *PfStatus { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: add docsting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not save/load sriovnetworkv1.Interface
?
i just wonder if we really need this intermediate struct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also it would be nice if all these methods are bunched together in an interface.
essentially you are storing and loading information of a PF.
this "Store" once created can initialize its storage path.
pkg/utils/host.go
Outdated
// returns false if the file doesn't exist. | ||
func LoadPfsStatus(pciAddress string, chroot bool) (*PfStatus, bool, error) { | ||
if chroot { | ||
exit, err := Chroot("/host") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need to chroot ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A: Because SyncNodeState is called with chroot
return err | ||
} | ||
// we only update the status to avoid any race conditions where a new policy can be applied | ||
latestState.Status = updatedState.Status |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is all starting to get complex.
can you move all of this "prep" logic to its own function and call it here ?
please keep comments so we understand why things are there .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree but I will like to do it in a followup PR if possible
@@ -452,6 +458,15 @@ func (dn *Daemon) nodeStateSyncHandler() error { | |||
syncStatus: syncStatusInProgress, | |||
lastSyncError: "", | |||
} | |||
// wait for writer to refresh status then pull again the latest node state | |||
<-dn.syncCh | |||
updatedState, err := dn.client.SriovnetworkV1().SriovNetworkNodeStates(namespace).Get(context.Background(), dn.name, metav1.GetOptions{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not just overwrite latestState here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because we may get a new spec that we don't expect.
pkg/daemon/daemon.go
Outdated
glog.Warningf("nodeStateSyncHandler(): Failed to fetch node state %s: %v", dn.name, err) | ||
return err | ||
} | ||
// we only update the status to avoid any race conditions where a new policy can be applied |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if generation changed from when we first got it. just fail WDYT ?
im not a fan of starting to patch the object
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure we can go with that
@@ -270,6 +319,12 @@ func NeedUpdate(iface *sriovnetworkv1.Interface, ifaceStatus *sriovnetworkv1.Int | |||
glog.V(2).Infof("NeedUpdate(): VF %d MTU needs update, desired=%d, current=%d", vf.VfID, group.Mtu, vf.Mtu) | |||
return true | |||
} | |||
|
|||
// this is needed to be sure the admin mac address is configured as expected |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why can the admin mac be "externally configured" ? that should be part of sriov configuration performed by the external service.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need this to be sure we maintain the same capability as before.
this means for nics that don't configure the admin mac address the operator configures them when externally manage is not requested.
so we need to have the same for externally manage true
return err | ||
} | ||
|
||
if !exist { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if entry was not found, does this nessecarily mean that VFs were not created by sriov-operator ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes because we always going to create the PF info
also UT need to be added where possible. |
13835b6
to
b6f321d
Compare
Thanks for your PR,
To skip the vendors CIs use one of:
|
Thanks for your PR,
To skip the vendors CIs use one of:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some nit. It would be great if we could make some unit test around daemon/generic-plugin logic
Signed-off-by: Sebastian Sch <[email protected]>
Signed-off-by: Sebastian Sch <[email protected]>
Signed-off-by: Sebastian Sch <[email protected]>
45bb99c
to
e80a4e1
Compare
Thanks for your PR,
To skip the vendors CIs use one of:
|
Signed-off-by: Sebastian Sch <[email protected]>
Signed-off-by: Sebastian Sch <[email protected]>
Signed-off-by: Sebastian Sch <[email protected]>
Signed-off-by: Sebastian Sch <[email protected]> WIP Signed-off-by: Sebastian Sch <[email protected]>
Signed-off-by: Sebastian Sch <[email protected]>
Signed-off-by: Sebastian Sch <[email protected]>
Signed-off-by: Sebastian Sch <[email protected]>
Signed-off-by: Sebastian Sch <[email protected]>
Signed-off-by: Sebastian Sch <[email protected]>
e80a4e1
to
835f5ac
Compare
Thanks for your PR,
To skip the vendors CIs use one of:
|
I am working on a separate PR after we move the drain to a controller a refactor in the daemon that will use more interfaces so we can mock easily |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few minor comments. Overall shape looks good to me.
Thanks for your PR,
To skip the vendors CIs use one of:
|
Hi @zeeke let me know if the changes are good and I will rebase the commit messages |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks for addressing my comments
Signed-off-by: Sebastian Sch <[email protected]>
0399543
to
53cfd3d
Compare
Thanks for your PR,
To skip the vendors CIs use one of:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for addressing my comments
Thanks folks! Merging after 2 maintainers approval |
this PR adds the new API to allow the operator to use virtual functions
that the user configures manually on the system
this will allow new use-cases like using a VF as the SDN primary nic.
the nic will be created before the operator starts by the user, for
example using nmstate or NetworkManager and the operator will allocate
all the remaining vfs to pods.