✨ Add reference to HostUpdatePolicy in Servicing. #1969
LGTM, thanks for working on it @rhjanders
Signed-off-by: Dmitry Tantsur <[email protected]>
Servicing only runs when a host is powered off (either completely or by rebooting it). Signed-off-by: Dmitry Tantsur <[email protected]>
Signed-off-by: Jacob Anders <[email protected]> Removed unused ServicingData fields.
…ageHostPower. Signed-off-by: Jacob Anders <[email protected]>
LGTM
if provResult.Dirty {
result := actionContinue{provResult.RequeueAfter}
if dirty {
The actual thing we want to check here is whether we need to write the BMH. Writes occur on line 1406 and line 1420, but `dirty` could be true if `hfsDirty` is true even if nothing is actually updated.
// update didn't actually happen. This is deemed an acceptable risk for the moment since it is only
// going to impact a small subset of Firmware Settings implementations.
currentError := info.host.Status.ErrorType
if clearErrorWithStatus(info.host, metal3api.OperationalStatusServicing) {
I think this is too early to clear an error. We should do it after the check for `provResult.ErrorMessage != ""` around line 1418.
It's also arguably too late for an attempt to set the status to Servicing in the non-error case, because we'll still only write it once servicing has already started.
I changed the order as per the first line of the comment. Unsure about the second part - will check in with Dmitry.
Remember that writes to the k8s API only happen after we return from this function. So if you want to put it into `OperationalStatusServicing` before servicing starts then you'd need to do something like:
if info.host.Status.OperationalStatus != metal3api.OperationalStatusServicing {
	info.host.Status.OperationalStatus = metal3api.OperationalStatusServicing
	return actionUpdate{}
}
Just changing the order has no effect in that respect.
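To illustrate the point about writes only happening after the function returns, here is a minimal, self-contained sketch of the two-pass pattern. Every name in it (`host`, `action`, `handleServicing`, the status constants) is a hypothetical stand-in and not the actual BMO code; it only assumes that the caller persists the host whenever the handler asks for an update.

```go
package main

import "fmt"

// All types and names below are hypothetical stand-ins for illustration.
type opStatus string

const (
	statusOK        opStatus = "OK"
	statusServicing opStatus = "servicing"
)

// host is a reduced stand-in for the BareMetalHost object.
type host struct {
	OperationalStatus opStatus
	ServiceCalls      int // counts how often servicing was actually triggered
}

// action tells the caller what to do once the handler has returned.
type action int

const (
	actionUpdate   action = iota // persist the host, then reconcile again
	actionContinue               // operation in progress, requeue later
)

// handleServicing first makes sure the servicing status is recorded.
// Because persistence happens only after returning, it returns early on the
// first pass and triggers the real work on the next one.
func handleServicing(h *host) action {
	if h.OperationalStatus != statusServicing {
		h.OperationalStatus = statusServicing
		return actionUpdate
	}
	h.ServiceCalls++ // stand-in for the call that starts servicing
	return actionContinue
}

func main() {
	h := &host{OperationalStatus: statusOK}
	for pass := 1; pass <= 2; pass++ {
		act := handleServicing(h)
		fmt.Printf("pass %d: action=%d status=%s serviceCalls=%d\n",
			pass, act, h.OperationalStatus, h.ServiceCalls)
	}
}
```

In this sketch the first pass only changes the status and requests a write; servicing itself is triggered on the second pass, once the status could already have been persisted.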
If we do it this way, we'll lose the error information, and it will never be passed to `prov.Service`.
I stand corrected. If we don't call `clearErrorWithStatus`, we'll have `ErrorType` still set. We just need to remember to unset it after the successful call.
Maybe we need to return `clearErrorWithStatus` to where it was before, but also set the status to servicing explicitly in the way that Zane suggested?
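The ordering concern discussed in this thread can be sketched as follows: keep the recorded error until the servicing call has succeeded, so that the call can still see that the previous attempt failed. This is a rough illustration under assumptions, not the BMO implementation; `host`, `servicingError`, `errBoom`, and the `service` callback (a stand-in for the provisioner's servicing call) are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// servicingError is a hypothetical error type value used only in this sketch.
const servicingError = "servicing error"

// errBoom is a demo error used to simulate a failed servicing attempt.
var errBoom = errors.New("boom")

// host is a reduced, hypothetical stand-in for the host status.
type host struct {
	ErrorType         string
	OperationalStatus string
}

// handleServicing passes the previous failure on to the servicing call and
// clears the recorded error only after that call has succeeded.
func handleServicing(h *host, service func(previousFailed bool) error) {
	previousFailed := h.ErrorType == servicingError
	if err := service(previousFailed); err != nil {
		h.ErrorType = servicingError
		h.OperationalStatus = "error"
		return
	}
	// Success: only now is it safe to drop the error information.
	h.ErrorType = ""
	h.OperationalStatus = "servicing"
}

func main() {
	h := &host{OperationalStatus: "OK"}

	// First attempt fails, so the error is recorded on the host.
	handleServicing(h, func(bool) error { return errBoom })
	fmt.Printf("after failure: errorType=%q status=%s\n", h.ErrorType, h.OperationalStatus)

	// The retry still sees the earlier failure, then clears it on success.
	handleServicing(h, func(previousFailed bool) error {
		fmt.Println("previous attempt failed:", previousFailed)
		return nil
	})
	fmt.Printf("after success: errorType=%q status=%s\n", h.ErrorType, h.OperationalStatus)
}
```

Clearing the error before the call would make `previousFailed` always false in this sketch, which mirrors the lost error information raised above.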
Signed-off-by: Jacob Anders <[email protected]>
Signed-off-by: Jacob Anders <[email protected]>
// succeed before leaving this state (e.g. by deprovisioning) we lose the signal that the
// update didn't actually happen. This is deemed an acceptable risk for the moment since it is only
// going to impact a small subset of Firmware Settings implementations.
currentError := info.host.Status.ErrorType
nit: you no longer need to store this because you no longer clear ErrorType before calling Service.
}

if started && fwDirty {
	info.host.Status.Provisioning.Firmware = info.host.Spec.Firmware.DeepCopy()
You probably need `dirty = true` here
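The review comments about `dirty` boil down to one idea: the flag that triggers a write should be set exactly when the host object is modified, not whenever some input differs. A minimal sketch of that idea follows; `host`, `recordServicingStart`, and the field names are hypothetical and much simpler than the real controller code.

```go
package main

import "fmt"

// host is a hypothetical, heavily reduced stand-in for the BareMetalHost.
type host struct {
	SpecFirmware   string // desired firmware configuration
	StatusFirmware string // configuration last recorded in the status
}

// recordServicingStart copies the desired firmware configuration into the
// status once servicing has started. It returns true only when the host
// object was modified and therefore has to be written back.
func recordServicingStart(h *host, started, fwDirty, hfsDirty bool) (needsWrite bool) {
	if started && fwDirty {
		h.StatusFirmware = h.SpecFirmware
		needsWrite = true
	}
	// hfsDirty means the firmware settings differ from what is desired, but
	// in this sketch nothing on the host object changes because of it, so it
	// does not request a write on its own.
	return needsWrite
}

func main() {
	h := &host{SpecFirmware: "new", StatusFirmware: "old"}

	// Only the settings are dirty: nothing is modified, no write is needed.
	fmt.Println(recordServicingStart(h, true, false, true), h.StatusFirmware)

	// The firmware section is dirty: the status changes and must be written.
	fmt.Println(recordServicingStart(h, true, true, false), h.StatusFirmware)
}
```

Tying the return value to the assignment itself keeps the two from drifting apart when more fields are added later.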
/close Superseded by #2041
@dtantsur: Closed this PR.
What this PR does / why we need it:
This PR enables BMO to run Ironic servicing operations on already provisioned nodes, such as applying firmware settings changes (and, in the future, firmware updates). Servicing is an opt-in feature: it is enabled by creating a HostUpdatePolicy for a node, with attributes indicating that changes to firmware configuration should be applied onReboot.
This is a partial implementation of https://github.com/metal3-io/metal3-docs/blob/main/design/baremetal-operator/host-live-updates.md (please note that only firmware settings changes are currently supported; firmware update support will be added next).