-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
support bluegreen release: rollout controller #232
base: master
Are you sure you want to change the base?
Conversation
Signed-off-by: yunbo <[email protected]>
bb51872
to
dcc116c
Compare
tr.RecheckDuration = time.Duration(trafficrouting.GetGraceSeconds(c.Rollout.Spec.Strategy.GetTrafficRouting(), defaultGracePeriodSeconds)) * time.Second | ||
} | ||
expectedTime := time.Now().Add(tr.RecheckDuration) | ||
expectedTime := time.Now().Add(time.Duration(defaultGracePeriodSeconds) * time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why change the code here ? if one set TrafficRoutingRef.GracePeriodSeconds, we should respect the user setting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How the tr.RecheckDuration works:
if we do some traffic-related operations (eg. some functions in manager.go), the operations will update the tr.RecheckDuration automatically, if operation A need to recheck 300ms later, operations B need to recheck 500ms later, then tr.RecheckDuration will be the max of 300ms and 500ms. However, the initial value of tr.RecheckDuration is 0, if we doesn't do any traffic-related operations, the tr.RecheckDuration will keep the initial value. When RecheckTime is 0, requeue won't happen. For function DoTrafficRouting, it won't always update tr.RecheckDuration, thus the tr.RecheckDuration may be 0. That's why i added a check (the removed lines) before. However, I realize that
- the check won't save significant time due to the complex logic in the DoTrafficRouting function
- this check is a little hard to understand for others, that's why i removed these lines and use the original defaultGracePeriodSeconds again (ie. the removed are lines added by me in a recent commit, and the line to add is the original logic)
@@ -506,7 +505,10 @@ func (r *RolloutReconciler) recalculateCanaryStep(c *RolloutContext) (int32, err | |||
if c.NewStatus != nil { | |||
currentIndex = c.NewStatus.GetSubStatus().CurrentStepIndex - 1 | |||
} | |||
steps := append([]int{}, int(currentIndex)) | |||
steps := make([]int, 0) | |||
if ci := int(currentIndex); ci >= 0 && ci < len(c.Rollout.Spec.Strategy.GetSteps()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
plz comment when ci will be less than zero ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, it's unlikely to be less than 0, just to make sure the slice doesn't go out of index
@@ -94,6 +94,13 @@ func (r *ControllerFinder) GetWorkloadForRef(rollout *rolloutv1beta1.Rollout) (* | |||
return workload, err | |||
} | |||
} | |||
} else if rollout.Spec.Strategy.GetRollingStyle() == rolloutv1beta1.BlueGreenRollingStyle { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
plz add ut for GetWorkloadForRef
if workload != nil || err != nil { | ||
return workload, err | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider extract the following code block as a common code
for _, finder := range finders {
workload, err := finder(rollout.Namespace, &workloadRef)
if workload != nil || err != nil {
return workload, err
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
blueGreenStatus.CurrentStepState = v1beta1.CanaryStepStateUpgrade | ||
klog.Infof("rollout(%s/%s) step(%d) state from(%s) -> to(%s)", c.Rollout.Namespace, c.Rollout.Name, | ||
blueGreenStatus.CurrentStepIndex, v1beta1.CanaryStepStateInit, blueGreenStatus.CurrentStepState) | ||
return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider using a ’labeled’ fallthrough to avoid unnecessary requeue , e.g
if currentStep.Traffic == nil && len(currentStep.Matches) == 0 {
...
goto nextCase
}
....
nextCase:
fallthrough
case: v1beta1.CanaryStepStateUpgrade:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
} | ||
//TODO - consider istio subsets | ||
if blueGreenStatus.CurrentStepIndex == 1 { | ||
klog.Infof("special case detected: rollout(%s/%s) patch stable Service", c.Rollout.Namespace, c.Rollout.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is not special case, we should always patch stable service here. consider revise or removing to logging to avoid confusion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
} else { | ||
klog.Infof("rollout(%s/%s) bluegreen run all steps, and completed", c.Rollout.Namespace, c.Rollout.Name) | ||
blueGreenStatus.LastUpdateTime = &metav1.Time{Time: time.Now()} | ||
blueGreenStatus.CurrentStepState = v1beta1.CanaryStepStateCompleted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we fallthough to the case CanaryStepStateCompleted to avoid unnecessary reenqueue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
} else { | ||
bluegreenStatus.LastUpdateTime = &metav1.Time{Time: time.Now()} | ||
bluegreenStatus.CurrentStepState = v1beta1.CanaryStepStateInit | ||
klog.Infof("rollout(%s/%s) step(%d) state from(%s) -> to(%s)", c.Rollout.Namespace, c.Rollout.Name, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
plz move the following lines out of if clause since they can be shared by every if conditions
bluegreenStatus.LastUpdateTime = &metav1.Time{Time: time.Now()}
klog.Infof("rollout(%s/%s) step(%d) state from(%s) -> to(%s)", c.Rollout.Namespace, c.Rollout.Name,
bluegreenStatus.CurrentStepIndex, currentStepStateBackup, bluegreenStatus.CurrentStepState)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's a little hard to do so, here is why
- only if a step is done, can we run
bluegreenStatus.LastUpdateTime = &metav1.Time{Time: time.Now()}
, if not done, we cannot update thebluegreenStatus.LastUpdateTime
. - klog Infof is reasonable to not shared by every if conditions either, it can help us to realize the fallthrough, what's more, if we use a new variable currentStepStateBackup additionally, the total lines seems keep the same....
case v1beta1.FinalisingStepTypeRemoveCanaryService: | ||
retry, err = m.trafficRoutingManager.RemoveCanaryService(tr) | ||
// route all traffic to new version | ||
case v1beta1.FinalisingStepTypeRouteAllTraffic: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider rename these finalizing steps so that the name can be more differentiating. now it is very hard to tell the difference between FinalisingStepTypeRouteAllTraffic and FinalisingStepTypeGateway, and it is hard to guess the meaning of step FinalisingStepTypeBatchRelease and FinalisingStepTypeGateway, consider rename them to
FinalisingStepTypeRestoreWorkload and FinalisingStepTypeRestoreGateway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok!
but CurrentStepIndex remains 2, which could cause out of range. | ||
*/ | ||
resetCurrentIndex := false | ||
if int(bluegreenStatus.CurrentStepIndex) > len(c.Rollout.Spec.Strategy.BlueGreen.Steps) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it possible to forbid adding or removing steps during rollout so that the code can be more simpler?
// if current step is empty, set it with the first step | ||
// if current step is end, we just return | ||
if len(blueGreenStatus.FinalisingStep) == 0 { | ||
blueGreenStatus.FinalisingStep = nextStep |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it possible to move finalizing status to rollotu.status.conditions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the finished finalizing steps in saved in rollout.status.conditions, it is still possible to determine the current step in nextBlueGreenTask by iterating over taskSequence and find the last recorded step
Ⅰ. Describe what this PR does
the first patch is about rollout controller
Ⅱ. Does this pull request fix one issue?
Ⅲ. Special notes for reviews