[FLINK-36018][autoscaler] Support lazy scale down to avoid frequent rescaling #875
Conversation
Looks good, my only concern is related to the behaviour after a scale up. I feel like we should reset the DelayedScaleDown.firstTriggerTimes to the scale up time to avoid scaling down right after a scale up.
var firstTriggerTime = delayedScaleDown.getFirstTriggerTimeForVertex(vertex);
if (firstTriggerTime.isEmpty()) {
    LOG.info("The scale down request is delayed for {}", vertex);
should we maybe log how long it is delayed?
if (lastScalingTs.plus(gracePeriod).isAfter(clock.instant())) {
    LOG.info(
            "Skipping immediate scale down after scale up within grace period for {}",
            vertex);
    return true;

if (clock.instant().isBefore(firstTriggerTime.get().plus(scaleDownInterval))) {
    LOG.debug("Try to skip immediate scale down within scale-down interval for {}", vertex);
    return ParallelismResult.optional(newParallelism);
I think it may make sense to keep the previous logic. If you just had a scale up (you need to catch up) it's not good to scale down even if the scale down request was triggered a long time ago.
I think we can simply use the old logic here and basically a scale up would "reset" the scale down first trigger time. By actually resetting it we may even be able to get rid of this method completely?
// Clear delayed scale down request if the new parallelism is equal to
// currentParallelism.
delayedScaleDown.clearVertex(vertex);
return ParallelismResult.optional(currentParallelism);
Maybe we should have something like ParallelismResult.noChange() because this looks a bit weird :D
 * that if all vertices' ParallelismResult is optional, rescaling will be ignored.
 */
@Getter
public static class ParallelismResult {
Should we call this ParallelismChange instead?
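For illustration, here is a minimal sketch of how the two suggestions above (a noChange() factory and the ParallelismChange name) could fit together. The field names and the -1 sentinel are assumptions for the sketch, not the code in this PR:

```java
import lombok.Getter;

/** Hypothetical sketch combining the review suggestions; not the actual PR code. */
@Getter
public class ParallelismChange {

    private final int newParallelism;
    private final boolean required;
    private final boolean noChange;

    private ParallelismChange(int newParallelism, boolean required, boolean noChange) {
        this.newParallelism = newParallelism;
        this.required = required;
        this.noChange = noChange;
    }

    /** A change that must be applied, e.g. a scale up. */
    public static ParallelismChange required(int newParallelism) {
        return new ParallelismChange(newParallelism, true, false);
    }

    /** A change that is only applied if some other vertex requires a rescale. */
    public static ParallelismChange optional(int newParallelism) {
        return new ParallelismChange(newParallelism, false, false);
    }

    /** Explicitly keep the current parallelism, instead of optional(currentParallelism). */
    public static ParallelismChange noChange() {
        return new ParallelismChange(-1, false, true); // -1 is an assumed "unused" sentinel
    }
}
```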
@@ -156,6 +159,9 @@ public boolean scaleResource(

        autoScalerStateStore.storeConfigChanges(context, configOverrides);

        // Try to clear all delayed scale down requests after scaling.
        delayedScaleDown.clearAll();
Hi @gyfora, replying to your comment [1] here.
IIUC, the previous logic still exists: if a rescale happens, delayedScaleDown.clearAll will be called (and a rescale includes a scale up). That means the first trigger time is reset.
Not sure whether that answers your question.
[1] #875 (comment)
This logic isn't the same as the previous logic.
A demo of the previous logic:
- 9:00 scale up (executed)
- 9:30 scale down (ignored)
- 9:59 scale down (ignored)
- 10:00 scale down (can be executed)
New logic:
- 9:00 scale up (executed, clears all first trigger times)
- 9:30 scale down (updates the first trigger time to 9:30)
- 10:00 scale down (within the interval from 9:30 to 10:30)
- 10:29 scale down (within the interval from 9:30 to 10:30)
- 10:30 scale down (can be executed)
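To make the new behaviour concrete, here is a small self-contained sketch of the timeline above. The class and method names are illustrative, not the actual code in this PR:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

import org.apache.flink.runtime.jobgraph.JobVertexID;

/**
 * Illustrative sketch of the timeline above: a scale down is only allowed once the
 * first pending request for the vertex is older than scale-down.interval, and any
 * executed rescale clears the pending requests.
 */
class LazyScaleDownSketch {

    private final Map<JobVertexID, Instant> firstTriggerTimes = new HashMap<>();
    private final Duration scaleDownInterval = Duration.ofHours(1);

    /** Returns true if the scale down may be executed now, false if it must wait. */
    boolean canScaleDown(JobVertexID vertex, Instant now) {
        Instant first = firstTriggerTimes.putIfAbsent(vertex, now);
        if (first == null) {
            // 9:30 in the example: first request after the last rescale, start waiting.
            return false;
        }
        // 10:00 / 10:29 are still within the interval; 10:30 and later may scale down.
        return !now.isBefore(first.plus(scaleDownInterval));
    }

    /** 9:00 in the example: any executed rescale resets the pending requests. */
    void onRescale() {
        firstTriggerTimes.clear();
    }
}
```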
So if there was a scale down before 9:00 and that set the first trigger time, that doesn't matter as the scale up resets it subsequently.
In that case we should probably rename the method detectImmediateScaleDown to something like applyScaleDownInterval.
Hey @gyfora, thank you for the quick review, all comments are addressed. ❤️
Looks pretty good, let's give @mxm 1-2 days to comment on this :)
@1996fanrui I think we can go ahead with this :)
Thank you, let me merge it.
Thanks @1996fanrui and @gyfora for this improvement! The changes look good to me. Thank you also for waiting for any comments from my side.
        .durationType()
        .defaultValue(Duration.ofHours(1))
        .withFallbackKeys(oldOperatorConfigKey("scale-up.grace-period"))
        .withDeprecatedKeys(autoScalerConfigKey("scale-up.grace-period"))
Thanks for adding this as a deprecated key for existing users!
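For context, here is a sketch of how the full option declaration might look. The option key, the concrete old-key strings, and the description are assumptions based on the quoted diff and the PR title, not copied from the PR:

```java
import java.time.Duration;

import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.ConfigOptions;

/** Sketch of the new option declaration; key and description are assumed. */
class ScaleDownOptionSketch {

    // Existing users keep working: the former scale-up.grace-period keys are wired
    // in as fallback/deprecated keys, as in the quoted diff above.
    static final ConfigOption<Duration> SCALE_DOWN_INTERVAL =
            ConfigOptions.key("job.autoscaler.scale-down.interval")
                    .durationType()
                    .defaultValue(Duration.ofHours(1))
                    .withFallbackKeys("kubernetes.operator.job.autoscaler.scale-up.grace-period")
                    .withDeprecatedKeys("job.autoscaler.scale-up.grace-period")
                    .withDescription(
                            "How long a scale down is delayed after it is first triggered.");
}
```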
What is the purpose of the change
When the source traffic decreases, the autoscaler scales the job down promptly to save resources. This causes each job to rescale more than 20 times a day.
Brief change log
- Introduced DelayedScaleDown to store the firstTriggerTime for all vertices.
- Changed the return type of JobVertexScaler#computeScaleTargetParallelism from int to ParallelismResult.
- ParallelismResult introduces the required field, which indicates whether the parallelism change of the current vertex is required or optional.
- Introduced the scale-down.interval option.
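As a reading aid, here is a minimal sketch of what DelayedScaleDown could look like, based only on the methods referenced in this PR (getFirstTriggerTimeForVertex, clearVertex, clearAll); updateTriggerTime and the field layout are assumptions, not the actual class:

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

import org.apache.flink.runtime.jobgraph.JobVertexID;

/** Minimal sketch: tracks when a scale down was first requested for each vertex. */
public class DelayedScaleDown {

    // First time a scale down was requested for each vertex since the last rescale.
    private final Map<JobVertexID, Instant> firstTriggerTimes = new HashMap<>();

    /** Returns the first trigger time for the vertex, if a scale down is already pending. */
    public Optional<Instant> getFirstTriggerTimeForVertex(JobVertexID vertex) {
        return Optional.ofNullable(firstTriggerTimes.get(vertex));
    }

    /** Records the first time a scale down was requested for the vertex. */
    public void updateTriggerTime(JobVertexID vertex, Instant instant) {
        firstTriggerTimes.putIfAbsent(vertex, instant);
    }

    /** Clears the pending scale down request for a single vertex. */
    public void clearVertex(JobVertexID vertex) {
        firstTriggerTimes.remove(vertex);
    }

    /** Clears all pending scale down requests, e.g. right after any rescale. */
    public void clearAll() {
        firstTriggerTimes.clear();
    }
}
```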
Verifying this change
- JobVertexScalerTest and ScalingExecutorTest.
- AbstractAutoScalerStateStoreTest
Does this pull request potentially affect one of the following parts:
- CustomResourceDescriptors: yes, update the autoscaler option.
- Documentation