Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-36018][autoscaler] Support lazy scale down to avoid frequent rescaling #875

Merged
merged 2 commits into from
Sep 6, 2024

Conversation

1996fanrui
Copy link
Member

@1996fanrui 1996fanrui commented Aug 28, 2024

What is the purpose of the change

When the source traffic decreases, the autoscaler scales down job in time to save resources. It causes that each job rescales more than 20 times a day.

Brief change log

  • [FLINK-36018][autoscaler] Support lazy scale down to avoid frequent rescaling
  • Configuration Option:
    • Introduce job.autoscaler.scale-down.interval, the default value could be 1 hour.
    • Replace job.autoscaler.scale-up.grace-period with job.autoscaler.scale-down.interval
  • Core code change:
    • Introducing the DelayedScaleDown to store the firstTriggerTime for all vertices.
      • The scale down can be triggered when the (now - firstTriggerTime > scale-down.interval).
    • Updating the return value type of JobVertexScaler#computeScaleTargetParallelism from int to ParallelismResult
      • ParallelismResult introduced the required field, it means the parallelism change of current vertex is required or optional.
      • In general, scale up the required, and scale down is optional if it's within scale-down.interval.
      • If all parallelism changes are optional, we could ignore rescaling.

Verifying this change

  • Added some unit tests for JobVertexScalerTest and ScalingExecutorTest.
  • Improved the test of AbstractAutoScalerStateStoreTest

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changes to the CustomResourceDescriptors: yes, update the autoscaler option.
  • Core observer or reconciler logic that is regularly executed: no

Documentation

  • Does this pull request introduce a new feature? yes
  • If yes, how is the feature documented? Added the option description.

@1996fanrui 1996fanrui force-pushed the 36018/delay-scale-down branch 4 times, most recently from ee50f77 to 301742a Compare August 29, 2024 06:36
@1996fanrui 1996fanrui marked this pull request as ready for review August 29, 2024 10:34
Copy link
Member Author

@1996fanrui 1996fanrui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @gyfora @mxm , this PR is ready to review. Would you mind reviewing it in your free time? thank you in advance.

Copy link
Contributor

@gyfora gyfora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, my only concern is related to the behaviour after a scale up. I feel like we should reset the DelayedScaleDown. firstTriggerTimes to the scale up time to avoid scaling down after a scale up


var firstTriggerTime = delayedScaleDown.getFirstTriggerTimeForVertex(vertex);
if (firstTriggerTime.isEmpty()) {
LOG.info("The scale down request is delayed for {}", vertex);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we maybe log how long it is delayed?

Comment on lines 186 to 273
if (lastScalingTs.plus(gracePeriod).isAfter(clock.instant())) {
LOG.info(
"Skipping immediate scale down after scale up within grace period for {}",
vertex);
return true;
if (clock.instant().isBefore(firstTriggerTime.get().plus(scaleDownInterval))) {
LOG.debug("Try to skip immediate scale down within scale-down interval for {}", vertex);
return ParallelismResult.optional(newParallelism);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it may make sense to keep the previous logic. If you just had a scale up (you need to catch up) it's not good to scale down even if the scale down request was triggered a long time ago.

I think we can simply use the old logic here and basically a scale up would "reset" the scale down first trigger time. By actually resetting it we may even be able to get rid of this method completely?

// Clear delayed scale down request if the new parallelism is equal to
// currentParallelism.
delayedScaleDown.clearVertex(vertex);
return ParallelismResult.optional(currentParallelism);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should have something like ParallelismResult.noChange() because this looks a bit weird :D

* that if all vertices' ParallelismResult is optional, rescaling will be ignored.
*/
@Getter
public static class ParallelismResult {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we call this ParallelismChange instead?

@@ -156,6 +159,9 @@ public boolean scaleResource(

autoScalerStateStore.storeConfigChanges(context, configOverrides);

// Try to clear all delayed scale down requests after scaling.
delayedScaleDown.clearAll();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @gyfora , reply your comment[1] here.

IIUC, the previous logic still exists, if the rescale happens, delayedScaleDown.clearAll will be called (rescale includes scale up). It means the first trigger time is reset.

Not sure whether your question is answered.

[1] #875 (comment)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic isn't same with the previous logic.

One demo of previous logic

  • 9:00 scale up (be executed)
  • 9:30 scale down (ignore)
  • 9:59 scale down (ignore)
  • 10:00 scale down (can be executed)

New logic

  • 9:00 scale up (be executed, clear all first trigger time)
  • 9:30 scale down (update the first trigger time to 9:30)
  • 10:00 scale down (within interval from 9:30 to 10:30)
  • 10:29 scale down (within interval from 9:30 to 10:30)
  • 10:30 scale down (can be executed)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if there was a scale down before 9:00 and that set the first trigger time, that doesn't matter as the scale up resets it subsequently.

In that case we should probably rename the method detectImmediateScaleDown to something like applyScaleDownInterval

@1996fanrui
Copy link
Member Author

Hey @gyfora , thank you for the quick review, all comments are addressed. ❤️

Copy link
Contributor

@gyfora gyfora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good, let's give @mxm 1-2 days to comment on this :)

@gyfora
Copy link
Contributor

gyfora commented Sep 6, 2024

@1996fanrui I think we can go ahead with this :)

@1996fanrui
Copy link
Member Author

@1996fanrui I think we can go ahead with this :)

Thank you, let me merge it

@1996fanrui 1996fanrui merged commit 968a578 into apache:main Sep 6, 2024
229 checks passed
@1996fanrui 1996fanrui deleted the 36018/delay-scale-down branch September 6, 2024 08:55
Copy link
Contributor

@mxm mxm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @1996fanrui and @gyfora for this improvement! The changes look good to me. Thank you also for waiting for any comments from my side.

.durationType()
.defaultValue(Duration.ofHours(1))
.withFallbackKeys(oldOperatorConfigKey("scale-up.grace-period"))
.withDeprecatedKeys(autoScalerConfigKey("scale-up.grace-period"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this as a deprecated key for existing users!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants