Add support to customize spare replicas during VolumeReplace #5666
/run-pull-e2e-kind
/cherry-pick release-1.6
/cherry-pick release-1.5
…5673) Co-authored-by: Anish Shankar <[email protected]>
…5674) Signed-off-by: ti-chi-bot <[email protected]> Co-authored-by: Anish Shankar <[email protected]> Co-authored-by: csuzhangxc <[email protected]>
What problem does this PR solve?
Currently, the VolumeReplace feature always uses a default of 1 spare replica for PD and TiKV.
It is sometimes useful to change this number. Setting a larger number of spare replicas helps speed up TiKV region transfer.
In multi-AZ setups, a single spare replica can also leave the replicas skewed across zones, which causes issues when the spare replica is scaled down.
Example multi-AZ scenario:
3 zones, 6 replicas, initial zone per replica: 1,2,3,1,2,3
After adding 1 spare replica: 1,2,3,1,2,3,1
The replace then starts from the first replica, but the new disks can land in different zones. For example,
after replacing the first disk, the layout could become 2,2,3,1,2,3,1
If the spare replica (the last one) is then scaled down, the layout becomes 2,2,3,1,2,3, which is zone-skewed, and the scale-down can be blocked depending on the configured topology constraints.
In this case, setting spareReplicas to 3 avoids the issue.
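The skew in the scenario above can be checked with a small sketch. This is only an illustration and not code from this PR: `zone_skew` is a hypothetical helper that measures skew as the difference between the most and least populated zones, which is how a topology spread constraint with a max skew of 1 (an assumption here) would evaluate it.

```python
from collections import Counter

def zone_skew(placement, zones=(1, 2, 3)):
    """Return max minus min replica count across the given zones.

    Hypothetical helper for illustration; zones with no replica count as 0.
    """
    counts = Counter(placement)
    per_zone = [counts.get(zone, 0) for zone in zones]
    return max(per_zone) - min(per_zone)

# Layouts taken from the example in the PR description.
initial = [1, 2, 3, 1, 2, 3]
with_spare = initial + [1]               # 1 spare replica added in zone 1
after_replace = [2] + with_spare[1:]     # first disk replaced, lands in zone 2
after_scale_down = after_replace[:-1]    # spare replica (last one) removed

for name, layout in [
    ("initial", initial),
    ("with spare", with_spare),
    ("after replace", after_replace),
    ("after scale down", after_scale_down),
]:
    print(name, layout, "skew =", zone_skew(layout))
```

The final layout has 1, 3 and 2 replicas in zones 1, 2 and 3, so its skew is 2, which a constraint allowing a skew of at most 1 would reject.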
What is changed and how does it work?
Add a config option to the TidbCluster spec, inside TiKVSpec and PDSpec, to configure the number of spare replicas.
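The intended behaviour of the option can be sketched as follows. This is a minimal model in Python, not the operator's Go code: the function and constant names are hypothetical, and the only facts taken from this PR are that the value is optional and that the default is 1.

```python
from typing import Optional

# Default stated in the PR description: 1 spare replica for PD and TiKV.
DEFAULT_SPARE_REPLICAS = 1

def effective_spare_replicas(configured: Optional[int]) -> int:
    """Return the configured spare replica count, or the default if unset.

    Hypothetical helper; the real field name and defaulting code may differ.
    """
    if configured is None:
        return DEFAULT_SPARE_REPLICAS
    return configured

def replicas_during_replace(spec_replicas: int, configured: Optional[int]) -> int:
    """Total replicas while a volume replace is in progress (assumed model)."""
    return spec_replicas + effective_spare_replicas(configured)

print(replicas_during_replace(6, None))  # option unset, default is used
print(replicas_during_replace(6, 3))     # option set to 3
```

With 6 replicas in the spec, the unset case gives 7 replicas during the replace and the configured case gives 9, matching the two manual test runs described under Tests.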
Code changes
Tests
Re-ran the manual test described in #5150:
Ran once without changes and observed that the default of 1 spare replica was used.
Ran once with TiKV spare replicas set to 3 and observed that 3 spare replicas were used.
Side effects
Related changes
Release Notes