Node utilization util snapshot #1533
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
Hello! Due to a busy schedule at my company, I previously only managed to complete half of the related KEP. I'm glad to see that you're working on this. It looks like you're aiming to reuse the current node utilization logic. I have a few suggestions: it should support different data sources, similar to PayPal's load-watcher. Hope these suggestions help!
Hello sir :) Thank you for taking part in composing the out-of-tree descheduling plugin KEP.
You are on the right track here. I'd like to get in touch with the load-watcher maintainers and extend that codebase to provide a generic interface for accessing pod utilization metrics as well. Currently, only actual node utilization gets collected. In the meantime, I am shaping the code here to integrate better with other utilization sources such as metrics.
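For illustration only, a minimal Go sketch of what such a generic usage client interface could look like; `UsageClient` and its method names are hypothetical, not the actual code in this PR:

```go
// Hypothetical sketch of a generic usage client interface; names and
// signatures are illustrative only, not the PR's actual code.
package usageclients

import (
	v1 "k8s.io/api/core/v1"
)

// UsageClient abstracts where utilization numbers come from, so a
// requested-resources client, an actual-usage (metrics.k8s.io) client,
// or a Prometheus-backed client can be swapped in behind one interface.
type UsageClient interface {
	// Sync captures a snapshot of utilization for the given nodes so
	// that subsequent calls observe a consistent view.
	Sync(nodes []*v1.Node) error
	// NodeUtilization returns per-resource usage for a node from the
	// last snapshot.
	NodeUtilization(node string) (v1.ResourceList, error)
	// PodUsage returns per-resource usage for a pod from the last
	// snapshot.
	PodUsage(pod *v1.Pod) (v1.ResourceList, error)
}
```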
This is where we can debate more. Thank you for sharing the specifics. There's an open issue for the pod autoscaler that suggests introducing EMA: kubernetes/kubernetes#62235. Are you aware of a similar issue or discussion for the cluster autoscaler? I'd love to learn more about how it's implemented there. Ultimately, the current plugin just needs to know which pod, when evicted, will improve the overall node/workload utilization once properly re-scheduled. I can see the utilization snapshot being produced through various methods.
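For context, an exponential moving average smooths a utilization series by weighting recent samples more heavily: u_t = alpha*x_t + (1-alpha)*u_{t-1}. A minimal Go sketch (a hypothetical helper, not part of this PR or the autoscaler code):

```go
package main

import "fmt"

// ema smooths a series of utilization samples with an exponential
// moving average: u_t = alpha*x_t + (1-alpha)*u_{t-1}. A larger alpha
// weights recent samples more heavily.
func ema(samples []float64, alpha float64) float64 {
	if len(samples) == 0 {
		return 0
	}
	u := samples[0]
	for _, x := range samples[1:] {
		u = alpha*x + (1-alpha)*u
	}
	return u
}

func main() {
	// CPU utilization samples as fractions of node capacity; a single
	// spike (0.91) is damped rather than dominating the result.
	samples := []float64{0.42, 0.55, 0.91, 0.60, 0.58}
	fmt.Printf("smoothed utilization: %.3f\n", ema(samples, 0.3))
}
```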
I can see how evicting hotspot pods relates to consuming metrics/real-time node utilization. In the current plugin context this is more suitable for a new/different plugin. I can also see how …
kubernetes/kubernetes#128663 to address the discrepancy in the fake metrics client node/pod metricses resource name.
…xtraction into a dedicated usage client
Turning the usage client into an interface makes it possible to implement other kinds of usage clients, such as actual-usage or Prometheus-based resource collection.
/test pull-descheduler-verify-master
Built on top of #1532.
Prep work for load-aware descheduling: #225.
Later on, `usageSnapshot` can be promoted to an interface with an `actualUsageSnapshot` implementation pulling the usage from metrics.

Notes (towards integrating plugins with metrics), with a client sketch after this list:
- `kubectl get podmetrics`
- `kubectl get nodemetrics`
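The same data those commands show can be read programmatically through the metrics.k8s.io API via the standard `k8s.io/metrics` client. A hedged sketch (kubeconfig loading simplified, error handling minimal):

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/tools/clientcmd"
	metricsclient "k8s.io/metrics/pkg/client/clientset/versioned"
)

func main() {
	// Load the local kubeconfig (~/.kube/config).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := metricsclient.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// Equivalent of `kubectl get nodemetrics`.
	nodeMetrics, err := clientset.MetricsV1beta1().NodeMetricses().List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	for _, nm := range nodeMetrics.Items {
		fmt.Printf("node %s: cpu=%s memory=%s\n", nm.Name, nm.Usage.Cpu(), nm.Usage.Memory())
	}

	// Equivalent of `kubectl get podmetrics` across all namespaces.
	podMetrics, err := clientset.MetricsV1beta1().PodMetricses(metav1.NamespaceAll).List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	for _, pm := range podMetrics.Items {
		fmt.Printf("pod %s/%s: containers=%d\n", pm.Namespace, pm.Name, len(pm.Containers))
	}
}
```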
TODO:
For testing purposes: