Descheduler Framework Proposal #753

damemi · 2022-03-02T20:18:20Z

Hi all,

I'm creating this issue to gauge interest in our longstanding idea of creating a "Descheduler Framework". The general idea is to abstract the Descheduler internals into a library that third-party developers can use to create their own custom Deschedulers.

I think this is relevant now because we have a lot of big, ongoing tasks that I think would fit well under the umbrella of this effort. For example, each of these features presents a potential "extension point" for future custom descheduler developers (so, if we choose to take on a framework, it would be worth considering the implementation of these features with that extensibility in mind):

Supporting out-of-tree strategies (#679, #557)
Allowing custom options for pod evictions / custom PodEvictor implementations (#690, #725)
Defining an implementable interface for Strategies (#586, #706)
Refactoring the API to be more flexible and extensible (#587, #486)
Allowing event-triggered strategy runs (#488, #696)

In addition, the Descheduler has seen a lot of growth and requests for unique features to support custom use cases. Enabling users to build their own Deschedulers in a preferred way will free up our maintainers to focus on larger tasks like those above, while also steering the Descheduler toward a stable project.

I would like to propose an outline for this effort, but first I want to know if this is something people think is worth tackling as a project. I think this also will be a good chance for a lot of contributors to make a significant impact on the project as a whole.

So the main question is: do folks think it would be worth the effort to re-focus some of the above tasks under the umbrella goal of the Descheduler Framework?

Update: Based on feedback I put together some more organized thoughts into a doc: https://docs.google.com/document/d/1fr5tfSaCkqGZVXUMJA9pUkIy6GGRwJumPN5MeZ8oewo/edit?usp=sharing

Please comment there or here to help flesh out the design goals for if/when we pursue this

The text was updated successfully, but these errors were encountered:

JaneLiuL · 2022-03-03T00:17:01Z

This is a great proposal ～～ i think it worth~~

binacs · 2022-03-07T09:51:10Z

/cc

damemi · 2022-03-09T18:13:39Z

Based on the reactions it sounds like this is something people are interested in. I'd like to bring it up at tomorrow's sig-scheduling meeting if anyone would like to discuss it there.

I also put together some more detail about the goals of this into a doc: https://docs.google.com/document/d/1fr5tfSaCkqGZVXUMJA9pUkIy6GGRwJumPN5MeZ8oewo/edit?usp=sharing

Please comment there (or in this issue), specifically about the "open questions" at the bottom of the doc. If there are any other questions you think should be brought up, those are welcome too.

ingvagabund · 2022-05-03T12:29:23Z

The first version of the design proposal can be found at https://docs.google.com/document/d/1tWpXoZ6YF3ksnmueM2POwASdxoiLW8y7TXuWFxT8NSo/edit?usp=sharing. I have just shared the link at https://groups.google.com/g/kubernetes-sig-scheduling/c/VdwizKUMNgM as well to include broader audience.

Any feedback welcome.

ingvagabund · 2022-05-25T08:02:29Z

Thanks everyone for sharing thoughts. Given there have not been any new comments in the proposal, I think we can start discussing the implementation. We could split the implementation into several steps:

define framework types:
- descheduler configuration with profiles (internal/v1alpha2) with conversion/validation/defaulting
- handle for injecting clientset, evictor, shared informer factory, etc.
- plugin extension point interfaces
- example strategy arguments + defaulting/validation
framework internals: creating plugin registry, plugin initialization from configuration, framework initialization, collecting metrics, descheduling loop, running fleet of extension points (e.g. RunAllDeschedulePlugins, RunAllBalancePlugins), ...
evictor plugin: migrate the PodEvictor's Evictable into a plugin to separate the pod eviction mechanism (framework internals) from the "is eligible for eviction" logic (evictor plugin), with examples of how to configure it (e.g. evicting system critical pods, evicting pods with a local storage, etc.)
migrate strategies: migration of all existing strategies into plugins, versioned strategy arguments+defaulting
leftovers: anything left

Every step needs to be properly tested. Before the descheduler code base is flipped into the framework, some refactoring will be required. E.g. to run the plugins alongside the strategies while transitioning. New documentation with diagrams will be needed. Including a guide for creating a new plugin.

ingvagabund · 2022-05-25T08:34:35Z

@damemi @eminaktas @jklaw90 @JaneLiuL @Dentrax @pravarag and others, any volunteers? This will take some time to implement. The reviews will help to crystallize the design further. Feel free to ask any questions.

Those 4 steps can be further broken up into more steps if needed. Also feel free to suggest other/new steps if you have some in mind.

pravarag · 2022-05-25T08:50:22Z

@damemi @eminaktas @jklaw90 @JaneLiuL @Dentrax @pravarag and others, any volunteers? This will take some time to implement. The reviews will help to crystallize the design further. Feel free to ask any questions.

Those 4 steps can be further broken up into more steps if needed. Also feel free to suggest other/new steps if you have some in mind.

I would love to work on it but I wanted to check, will this be having some sub-tasks for starters (I'm guessing the 4 steps could be formed into 4 different sub issues)? or this needs to be approached as one whole issue?

binacs · 2022-05-25T13:12:57Z

Hi, everyone! I also hope I can contribute to this ^ ^

damemi · 2022-05-25T13:55:36Z

or this needs to be approached as one whole issue?

We don't want to do this as one whole issue, but rather with atomic changes. Additionally, each change should be seamless in that it doesn't change any current functionality or break due to relying on another change. This is how the Scheduler framework was implemented, so we want to take a similar approach piece-by-piece. There should be plenty of work to go around

ingvagabund · 2022-05-25T14:45:25Z

As @damemi says. We might open individual issues where we go more into the details of each step. The steps provided are high-level steps. I'd like to wait for more opinions so we can join the effort and make sure we cover as much of the important aspects as we can. I did some preliminary implementation in #781 to see how complex the implementation can get. It's not a referential implementation. Though, it can serve as an idea of what to keep in mind. I'd like to start over so we can discuss individual pieces one by one while still heading in the right direction.

ingvagabund · 2022-06-06T13:12:33Z

The first issue can be found at #837. Let's move the discussion about the framework data types there.

pravarag · 2022-06-08T16:25:57Z

@ingvagabund @damemi thanks for sharing the detailed info about framework data types. I wanted to check do we have any plan to introduce exposing more metrics as part of Framework in any of the above mentioned steps? I was thinking if this pending work #503 can be included as part of Framework level change as well 🤔

ingvagabund · 2022-06-16T11:31:36Z

Adding #793 for awareness for defaulting the plugin arguments

ingvagabund · 2022-08-04T15:22:09Z

Hi everyone,

we have currently the following strategies already migrated into the plugins:

Nothing left to migrate.

Thank you for the nice work of transitioning into plugins.

@pravarag @Dentrax @BinacsLee @jklaw90 @eminaktas would you be interested in converting some of the strategies into plugins?

pravarag · 2022-08-04T15:26:57Z

LowNodeUtilization + HighNodeUtilization

@pravarag @Dentrax @BinacsLee @jklaw90 @eminaktas would you be interested in converting some of the strategies into plugins?

Hi @ingvagabund, I'm already working on migrating LowNodeUtilization + HighNodeUtilization. Will share the PR soon, apologies for the delay.

Dentrax · 2022-08-04T15:40:16Z

Hey! Thanks for pinging us! I'm not making any promises (due to limited free time), but maybe we (@eminaktas) can get into PodLifeTime to convert into plugin.

jklaw90 · 2022-08-04T15:50:29Z

Yep, i can finish off RemovePodsViolatingTopologySpreadConstraint. i just need to see what the latest changes are to the framework and update what i got.

binacs · 2022-08-04T16:25:40Z

Hi there, thanks for ping us. I will take up the migration of the RemovePodsHavingTooManyRestarts plugin. :)

ingvagabund · 2022-08-17T13:30:41Z

All strategies have been migrated!!! Thank you everyone involved in the code changes and reviews. I very much appreciate all the discussions we have during the reviews. Some of them helped to discover weak spots and improve the code quality. Good work everyone. Thank you.

ingvagabund · 2022-08-25T15:29:09Z

Time to move forward :) I am proposing the following next steps:

providing a plugin example + readme with instructions: Descheduling plugin examples with guides #923
converting current evictor filter into a plugin + introducing new preEvictFilter extension point for filtering right before a pod eviction is performed (e.g. for LowNodeUtilization plugin): Convert current evictor filter into a plugin #924
plugin arguments defaulting + moving plugin arguments to each corresponding plugin: Plugin arguments defaulting + moving plugin arguments to each corresponding plugin #925
introduction of new v1alpha2 configuration + configuration conversion from v1alpha1 to v1alpha2 (the fact a plugin arguments are mapped to a specific configuration version will be determined by importing a plugin arguments in corresponding conversion code): New v1alpha2 configuration + conversion #926
introduce sort and preFilterSort extension points: Introduce sort and preEvictionSort extension points #927

Step 1 is independent and can be a good exercise for everyone. There can be more examples. Step 2 is mostly self-standing and can be performed right way. Benefit of moving the evictor filter under a plugin will allow framework users to build their own eviction plugins and either replace or extend the default filter with new filters. Step 3 is prerequisite for step 4 as all the plugin arguments need to be defaulted before the conversion to v1alpha2 takes place. Argument defaulting will gives chance to discuss which values are the most suitable candidates for setting by default. One PR for each plugin arguments. We can discuss arguments renaming as well during the reviews. Lastly, step 4 will allow the framework to convert both v1alpha1 and v1alpha2 into an internal configuration on which the framework data types and mechanics will be built. Step 5 will extend ability to change sorting algorithm of processing (sort) and evicting (preEvictSort) pods (e.g. by oldest or by priority class). Worth making some measurements or discussing which default sorting algorithms the framework provide. Integration of sorting algorithms is another self-standing task.

Once all bits are in place we can start extending the framework.

@damemi @a7i step 3 is aligned with the discussion we had about making the plugins more modular

damemi · 2022-08-25T15:33:55Z

@ingvagabund sounds good! thanks for working out the next steps.

For step 4, the v1alpha2 configuration should be backward-compatible with v1alpha1 configs, so no plugins should be dependent on v1alpha1. Instead, we can do the conversion in the framework from any v1alpha1->v1alpha2 configs.

Also, what's the plan for getting rid of the migration wrapper?

Thanks for laying all of this out and keeping track of the plan

ingvagabund · 2022-08-25T15:41:17Z

For step 4, the v1alpha2 configuration should be backward-compatible with v1alpha1 configs, so no plugins should be dependent on v1alpha1

Conversion from v1alpha2 to v1alpha1 will be possible only when there's a single profile. There's no unambiguous mapping otherwise. I wonder whether it makes sense to converse from v1alpha2 to v1alpha1. Can you think of any use cases where this might be needed?

Also, what's the plan for getting rid of the migration wrapper?

After the conversion from v1alpha1 to v1alpha2 is implemented. We will still need to implement the plugin registry so all the plugins can be initialized systematically while injecting defaulted plugin arguments in the constructors.

damemi · 2022-08-25T16:00:44Z

Conversion from v1alpha2 to v1alpha1 will be possible only when there's a single profile

the framework code only needs to convert v1alpha1 -> v1alpha2, such as when the user is still using a v1alpha1 config. if the config is already given in v1alpha2, then we can just use it as is.

this way, all plugins can be written assuming v1alpha2, and let the framework do the conversion work. I don't want to deal with versioned plugins 😬

ingvagabund · 2022-08-25T16:17:51Z

So I just misunderstood :) I would like to avoid any conversion from v1alpha2 to v1alpha1.

damemi · 2022-08-25T16:36:26Z

Yeah, there's no need to convert v1alpha2 to v1alpha1 (and like you said, it's probably not even possible). Just 1->2 to handle legacy configs, which should be possible

JaneLiuL · 2022-09-27T01:08:03Z

is there any task need to be take for this proposal now?

ingvagabund · 2022-09-28T09:34:15Z

@JaneLiuL any issue in #753 (comment) that has not been assigned yet

ingvagabund · 2022-09-28T11:14:55Z

#926 is a pre-requisite for implementing a plugin registry. Once v1alpha1 -> v1alpha2 conversion (with plugin argument defaulting and validation) and plugin registry are in place, the descheduler will get updated to initialize plugins based on each profile configuration (plugin configuration with list of enabled/disabled extension points). Followed by implementing the framework methods for running a batch of extension points (i.e. runDescheduleExtensionPoints and runBalanceExtensionPoints).

Right now it is important to complete the following issues:

New v1alpha2 configuration + conversion #926 (defining new data type and conversion)
Plugin arguments defaulting + moving plugin arguments to each corresponding plugin #925 (partially done, we still need to discuss individual plugin arguments whether we keep the keywords or change them into more suitable ones)

rentseen · 2022-10-18T08:52:36Z

Is this excellent work released in v0.25.0?

ingvagabund · 2022-10-18T12:59:28Z

We do not target any specific release. Though, the goal is to finish the framework as soon as possible.

k8s-triage-robot · 2023-01-16T13:20:41Z

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue or PR as fresh with /remove-lifecycle stale
Mark this issue or PR as rotten with /lifecycle rotten
Close this issue or PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

ingvagabund · 2023-02-05T14:33:32Z

The conversion from v1alpha1 -> v1alpha2 is implemented. Thank you everyone who participated.

The next item on the list is to discuss the descheduling framework in #979. Once implemented, all the remaining descheduling framework types and mechanism can be put in place.

k8s-triage-robot · 2023-05-06T14:39:00Z

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle stale
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

ingvagabund · 2023-07-11T16:34:44Z

Remaining work left listed in #1187

k8s-triage-robot · 2024-01-24T06:57:04Z

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle stale
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot · 2024-02-23T07:05:04Z

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle rotten
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

damemi · 2024-04-23T12:56:23Z

Closing this as we are merging a formal descheduler KEP in #1372

damemi added the kind/feature Categorizes issue or PR as related to a new feature. label Mar 2, 2022

damemi mentioned this issue Mar 9, 2022

Proposal: Long-lived deployment: Triggering Descheduler with Events (SharedIndexInformer rules, endpoints, etc) #696

Open

ingvagabund mentioned this issue Apr 6, 2022

WIP: Descheduling framework PoC #781

Closed

This was referenced May 9, 2022

Extend descheduler to support out-of-tree strategies #679

Open

Proposal: Fail Fast Validation #701

Closed

ingvagabund mentioned this issue Jun 6, 2022

Descheduling framework: define framework types #837

Closed

12 tasks

Dentrax mentioned this issue Jun 16, 2022

Improve podEvictor statistics #503

Closed

damemi pinned this issue Sep 15, 2022

knelasevero mentioned this issue Sep 20, 2022

add PreEvictionFilter extension to DefaultEvictor Plugin #955

Merged

k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 16, 2023

ingvagabund removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 16, 2023

k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 6, 2023

ingvagabund removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 9, 2023

k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 24, 2024

k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 23, 2024

ingvagabund removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Feb 23, 2024

ingvagabund mentioned this issue Apr 8, 2024

753 - Descheduling framework #1372

Merged

damemi unpinned this issue Apr 23, 2024

damemi closed this as completed Apr 23, 2024

Descheduler Framework Proposal #753

Descheduler Framework Proposal #753

Comments

damemi commented Mar 2, 2022 • edited Loading

JaneLiuL commented Mar 3, 2022 • edited Loading

binacs commented Mar 7, 2022

damemi commented Mar 9, 2022

ingvagabund commented May 3, 2022

ingvagabund commented May 25, 2022

ingvagabund commented May 25, 2022 • edited Loading

pravarag commented May 25, 2022

binacs commented May 25, 2022

damemi commented May 25, 2022

ingvagabund commented May 25, 2022

ingvagabund commented Jun 6, 2022

pravarag commented Jun 8, 2022

ingvagabund commented Jun 16, 2022

ingvagabund commented Aug 4, 2022 • edited Loading

pravarag commented Aug 4, 2022

Dentrax commented Aug 4, 2022

jklaw90 commented Aug 4, 2022

binacs commented Aug 4, 2022 • edited Loading

ingvagabund commented Aug 17, 2022 • edited Loading

ingvagabund commented Aug 25, 2022 • edited Loading

damemi commented Aug 25, 2022

ingvagabund commented Aug 25, 2022 • edited Loading

damemi commented Aug 25, 2022

ingvagabund commented Aug 25, 2022

damemi commented Aug 25, 2022

JaneLiuL commented Sep 27, 2022

ingvagabund commented Sep 28, 2022

ingvagabund commented Sep 28, 2022 • edited Loading

rentseen commented Oct 18, 2022

ingvagabund commented Oct 18, 2022

k8s-triage-robot commented Jan 16, 2023

ingvagabund commented Feb 5, 2023

k8s-triage-robot commented May 6, 2023

ingvagabund commented Jul 11, 2023

k8s-triage-robot commented Jan 24, 2024

k8s-triage-robot commented Feb 23, 2024

damemi commented Apr 23, 2024

damemi commented Mar 2, 2022 •

edited

Loading

JaneLiuL commented Mar 3, 2022 •

edited

Loading

ingvagabund commented May 25, 2022 •

edited

Loading

ingvagabund commented Aug 4, 2022 •

edited

Loading

binacs commented Aug 4, 2022 •

edited

Loading

ingvagabund commented Aug 17, 2022 •

edited

Loading

ingvagabund commented Aug 25, 2022 •

edited

Loading

ingvagabund commented Aug 25, 2022 •

edited

Loading

ingvagabund commented Sep 28, 2022 •

edited

Loading