docs(refactor): updates the cruise control rebalance concepts #10810

Open
wants to merge 10 commits into
base: main

PaulRMellor
Contributor

@PaulRMellor PaulRMellor commented Nov 6, 2024

Documentation

Refactor and refresh of Cruise Control concepts

  • Adds new optimization proposal process flow description and diagram, including description of partition reassignment commands
  • Less verbose and more direct intro and concepts
  • Removes three overview files to create a single "components and features" file for goals and proposals concepts
  • Consolidates related conceptual information (goals, proposals) into single sections
  • Retitled sections to provide more direction to readers from ToC

Checklist

Please go through this checklist and make sure all applicable tasks have been done

  • Write tests
  • Make sure all tests pass
  • Update documentation
  • Check RBAC rights for Kubernetes / OpenShift roles
  • Try your changes from Pod inside your Kubernetes and OpenShift cluster, not just locally
  • Reference relevant issue(s) and close them after merging
  • Update CHANGELOG.md
  • Supply screenshots for visual changes, such as Grafana dashboards

@PaulRMellor PaulRMellor added this to the 0.45.0 milestone Nov 6, 2024
@PaulRMellor PaulRMellor requested review from kyguy, fvaleri and a team November 6, 2024 16:32
@PaulRMellor PaulRMellor self-assigned this Nov 6, 2024
Contributor

@fvaleri fvaleri left a comment

Hi @PaulRMellor, thanks. I must say that our CC doc is quite good :)

I left some comments for your consideration.

@PaulRMellor
Contributor Author

Thanks for the reviews @fvaleri and @kyguy
I've addressed all the comments except the suggestion to change "user-defined goals" to something else.
I agree it's not clear, but I'm not sure changing to "KafkaRebalance goals" is the way:
#10810 (comment)

Contributor

@fvaleri fvaleri left a comment

LGTM. Thanks.

Some main goals are preset as hard goals.

To simplify configuration, use the inherited main goals unless you need to exclude specific goals from `KafkaRebalance` resources.
You can adjust the priority order in the default optimization goals configuration.
Member

Maybe it's just my lack of knowledge here, but can we really change the priority order? @kyguy?

Member

@kyguy kyguy Nov 11, 2024

To my understanding, this is true for the Kafka resource `default.goals` list and the KafkaRebalance `goals` list, but not for the Kafka `goals` list.

Member

Ok. So I think we should clarify here (and everywhere goals priority is mentioned) that the priority is based on the order. Not sure if "priority order" is used for this purpose but it's not clear to me.

Contributor

@fvaleri fvaleri Nov 12, 2024

We have to admit that CC naming doesn't help much here.

IMO, this is the default inter-broker goals priority list:

https://github.com/linkedin/cruise-control/blob/main/cruise-control/src/main/java/com/linkedin/kafka/cruisecontrol/config/constants/AnalyzerConfig.java#L312-L327

To change it, the user can configure Kafka default.goals or KafkaRebalance goals with a different order. Kafka goals priority is irrelevant, but only goals listed there can be used in the previous configurations (DEFAULT_DEFAULT_GOALS is a subset of DEFAULT_GOALS).

  • Do you agree?
  • Are we exposing Kafka goals because we intend to eventually support custom goals?
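
A minimal sketch of the rule discussed in this thread, assuming that priority is simply a goal's position in the chosen list and that `default.goals` and `KafkaRebalance` goals must both be drawn from the Kafka `goals` list. The function name `resolve_goals` and its parameters are hypothetical and used only for illustration; this is not Cruise Control or Strimzi code.

```python
from typing import Optional, Sequence


def resolve_goals(
    supported: Sequence[str],
    default_goals: Sequence[str],
    rebalance_goals: Optional[Sequence[str]] = None,
) -> list[str]:
    """Return the goals used for a proposal, highest priority first.

    Assumption: priority is the position in the chosen list, and the
    order of `supported` itself is irrelevant.
    """
    # Proposal-specific goals win; otherwise fall back to the defaults.
    chosen = list(rebalance_goals) if rebalance_goals else list(default_goals)
    unknown = [g for g in chosen if g not in supported]
    if unknown:
        raise ValueError(f"goals not in the supported list: {unknown}")
    return chosen


# A small sample of goal names, for illustration only.
supported = ["RackAwareGoal", "DiskCapacityGoal", "ReplicaDistributionGoal"]
default_goals = ["RackAwareGoal", "ReplicaDistributionGoal"]

# No proposal-specific goals: the default order applies.
print(resolve_goals(supported, default_goals))
# Proposal-specific goals listed in a different order change the priority.
print(resolve_goals(supported, default_goals, ["DiskCapacityGoal", "RackAwareGoal"]))
```

Under these assumptions, reordering the supported list has no effect on the result, which matches the point that only the `default.goals` and `KafkaRebalance` goal orders matter.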

@PaulRMellor
Contributor Author

Thanks for the latest comments @kyguy and @ppatierno
I've updated the diagram and addressed all comments apart from this #10810 (comment)
Looking for guidance

@PaulRMellor
Contributor Author

Thanks for the latest @kyguy and @ppatierno
I've changed "main" to "supported" goals. (Much better)
Diagram now shows "Rebalance" as flow towards CC

Member

@ppatierno ppatierno left a comment

LGTM.

@PaulRMellor
Contributor Author

Thanks for latest comments @kyguy
I've addressed them all as suggested.

Regarding the summary of Cruise Control components: previously, we mentioned some, but not all, of them without any context. I thought it would be useful to at least mention them all in a summary and show how they operate within a Strimzi context, how requests are made, how proposals are generated, and so on. I think the diagram is useful (and good for SEO and user experience).


* *Default goals* refer to the goals used by default when generating proposals.
They match the supported goals unless specifically set by the user.
* *Proposal-specific goals* are a subset of default goals configured for specific proposals.
Member

Suggested change
* *Proposal-specific goals* are a subset of default goals configured for specific proposals.
* *Proposal-specific goals* are a subset of supported goals configured for specific proposals.

=== Default goals

Cruise Control uses default goals to generate an optimization proposal.
You can override default goals by setting proposal-specific optimization goals in a `KafkaRebalance` resource.
Member

I would move this sentence/information to the Proposal-specific goals section, saying that if proposal-specific goals are not set in the KafkaRebalance resource then default goals are used

Optimization proposals comprise a list of partition reassignment mappings.
When you approve a proposal, the Cruise Control server applies these partition reassignments to the Kafka cluster.

A partition reassignment command consists of either of the following types of operations:
Member

Suggested change
A partition reassignment command consists of either of the following types of operations:
A partition reassignment consists of either of the following types of operations:


* Leadership movement: Involves switching the leader of the partition's replicas.

Cruise Control issues partition reassignment commands to the Kafka cluster in batches.
Member

Suggested change
Cruise Control issues partition reassignment commands to the Kafka cluster in batches.
Cruise Control issues partition reassignments to the Kafka cluster in batches.

* Leadership movement: Involves switching the leader of the partition's replicas.

Cruise Control issues partition reassignment commands to the Kafka cluster in batches.
The performance of the cluster during the rebalance is affected by the number of each type of movement contained in each batch.
Member

Suggested change
The performance of the cluster during the rebalance is affected by the number of each type of movement contained in each batch.
The performance of the cluster during the rebalance is affected by the number and magnitude of each type of movement contained in each batch.

The score is calculated by subtracting the sum of the `BalancednessScore` of each violated soft goal from 100. Cruise Control assigns a `BalancednessScore` to every optimization goal based on several factors, including priority--the goal's position in the list of `default.goals` or proposal-specific goals.

The `Before` score is based on the current configuration of the Kafka cluster.
The `After` score is based on the generated optimization proposal.
Member

Suggested change
The `After` score is based on the generated optimization proposal.
The `After` score is based on the predicted workload model after applying the generated optimization proposal.
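
The score calculation quoted in this thread can be sketched as follows. This is an illustration only: the per-goal `BalancednessScore` values are made-up numbers, and the function name `balancedness_score` is hypothetical. How Cruise Control derives each goal's score from its priority and other factors is not modeled here.

```python
def balancedness_score(goal_scores: dict[str, float], violated_soft_goals: set[str]) -> float:
    """Subtract the scores of all violated soft goals from 100."""
    penalty = sum(goal_scores[goal] for goal in violated_soft_goals)
    return 100.0 - penalty


# Hypothetical per-goal scores; real values depend on priority and other factors.
goal_scores = {
    "ReplicaDistributionGoal": 12.5,
    "DiskUsageDistributionGoal": 7.5,
    "LeaderReplicaDistributionGoal": 5.0,
}

# "Before": soft goals violated by the current cluster state.
before = balancedness_score(goal_scores, {"ReplicaDistributionGoal", "DiskUsageDistributionGoal"})
# "After": soft goals still violated in the predicted post-rebalance model.
after = balancedness_score(goal_scores, {"LeaderReplicaDistributionGoal"})
print(before, after)
```

With these sample values, fewer and lower-scored violations after the rebalance give a higher `After` score than `Before` score.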

Broker load data provides insights into current and anticipated usage of resources following a rebalance.
The data is stored in a `ConfigMap` (with the same name as the `KafkaRebalance` resource) as a JSON-formatted string.

When a Kafka rebalance proposal reaches the `ProposalReady` state, Cruise Control generates a `ConfigMap` (named after the `KafkaRebalance` custom resource) containing a JSON string of broker metrics.
Member

Suggested change
When a Kafka rebalance proposal reaches the `ProposalReady` state, Cruise Control generates a `ConfigMap` (named after the `KafkaRebalance` custom resource) containing a JSON string of broker metrics.
When a Kafka rebalance proposal reaches the `ProposalReady` state, Strimzi creates a `ConfigMap` (named after the `KafkaRebalance` custom resource) containing a JSON string of broker metrics generated from Cruise Control.
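
As a rough sketch of how a reader might consume that JSON string, the following parses a sample payload and reports the change per broker for one metric. The payload shape (broker ID, then metric name, then `before` and `after` values), the metric names, and the helper `metric_changes` are assumptions made for illustration; check the actual `ConfigMap` content before relying on any of them.

```python
import json

# Assumed shape: broker ID -> metric name -> before/after values.
broker_load_json = """
{
  "0": {"cpuPercentage": {"before": 40.0, "after": 30.0},
        "replicas": {"before": 120, "after": 100}},
  "1": {"cpuPercentage": {"before": 10.0, "after": 20.0},
        "replicas": {"before": 60, "after": 80}}
}
"""


def metric_changes(raw: str, metric: str) -> dict[str, float]:
    """Return (after - before) for one metric, keyed by broker ID."""
    load = json.loads(raw)
    return {
        broker: metrics[metric]["after"] - metrics[metric]["before"]
        for broker, metrics in load.items()
    }


print(metric_changes(broker_load_json, "cpuPercentage"))
print(metric_changes(broker_load_json, "replicas"))
```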


Cluster rebalance performance is also influenced by the _replica movement strategy_ that is applied to the batches of partition reassignment commands.
By default, Cruise Control uses the `BaseReplicaMovementStrategy`, which simply applies the commands in the order they were generated.
However, if there are some very large partition reassignments early in the proposal, this strategy can slow down the application of the other reassignments.
By default, Cruise Control uses the `BaseReplicaMovementStrategy`, which applies the commands in the order they were generated.
Member

Movements or reassignments might be better here

Suggested change
By default, Cruise Control uses the `BaseReplicaMovementStrategy`, which applies the commands in the order they were generated.
By default, Cruise Control uses the `BaseReplicaMovementStrategy`, which applies the reassignments in the order they were generated.

By default, Cruise Control uses the `BaseReplicaMovementStrategy`, which simply applies the commands in the order they were generated.
However, if there are some very large partition reassignments early in the proposal, this strategy can slow down the application of the other reassignments.
By default, Cruise Control uses the `BaseReplicaMovementStrategy`, which applies the commands in the order they were generated.
However, if large partition reassignments are handled early, this strategy may delay other reassignments.
Member

Suggested change
However, if large partition reassignments are handled early, this strategy may delay other reassignments.
However, this strategy could lead to the delay of other partition reassignments if some large partition reassignments are generated then ordered first.
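
To make the ordering effect concrete, here is a simplified sketch that compares completion times when reassignments run in generated order versus smallest first. It assumes reassignments run strictly one after another at a fixed transfer rate, which is a simplification of how Cruise Control executes batches. The sizes, the rate, and the function name `completion_times` are hypothetical.

```python
from itertools import accumulate


def completion_times(sizes_mb, rate_mb_per_s=100.0):
    """Finish time in seconds of each reassignment when run sequentially."""
    return [total / rate_mb_per_s for total in accumulate(sizes_mb)]


# Hypothetical partition sizes in MB, in the order they were generated.
generated = [50_000, 200, 100, 300]

base_order = completion_times(generated)           # applied as generated
small_first = completion_times(sorted(generated))  # smallest partitions first

print(base_order)
print(small_first)
```

In this model the total time is the same either way, but in generated order every small reassignment waits behind the large one, which is the delay the suggested wording describes.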


The relevant configurations are summarized in the following table.
* Set Cruise Control server configurations in `Kafka.spec.cruiseControl.config` in the `Kafka` resource.
* Set individual rebalances in `KafkaRebalance.spec` in the `KafkaRebalance` resource.
Member

Suggested change
* Set individual rebalances in `KafkaRebalance.spec` in the `KafkaRebalance` resource.
* Set proposal-specific configurations in `KafkaRebalance.spec` in the `KafkaRebalance` resource.

4 participants