Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ZOOKEEPER-2789 Reassign ZXID for solving 32bit overflow problem #2164

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

zichen-gan
Copy link

The original PR link: #262
Since the aforementioned PR does not support rolling hot updates, this PR aims to add rolling upgrade capabilities. The goal of this PR is to change the counter bit length from 32 to 40 and the epoch bit length from 32 to 20 through a rolling upgrade approach.

@anmolnar
Copy link
Contributor

@kezhuw @eolivelli @li4wang @ztzg

I've just come across this one and started to review for 3.9.3, but need more eyeballs. Could you please review?

@anmolnar anmolnar changed the title Zookeeper 2789 ZOOKEEPER-2789 Reassign ZXID for solving 32bit overflow problem Sep 19, 2024
@kezhuw
Copy link
Member

kezhuw commented Sep 23, 2024

This pr will change the server side data format. I think it does not fit a patch release.

If it is 1k/s ops, then as long as $2^{32} / (86400 * 1000) \approx 49.7$ days ZXID will exhausted. #262 (comment)

Thinking about some abnormal situations, maybe 24 bit for epoch and 40 bit for counter is more better choice: M a t h . m i n ( 2 24 / ( 24 ∗ 365 ) , 2 40 / ( 86400 ∗ 1000 ∗ 365 ) ) ≈ M a t h . m i n ( 1915.2 , 34.9 ) = 34.9 years. #262 (comment)

So i offered a better solution is 24-bit epoch in second comment. Even if the frequency of leader election is once by every single hours, we will not experience the epoch overflow until 1915.2 years later. #262 (comment)

Given above, I think it is promising. It promotes rollover rate from 49.7 days to 34.9 years assuming 1k/s ops. The best is that it demands no protocol change at the price of zxid format change.

Before finalizing this path, I may want to taste whether leadership inheritance is feasible.

Copy link
Contributor

@anmolnar anmolnar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. Let's submit it to master (3.10.0) as @kezhuw suggested.

* Here, a new file is created locally to store the current number of epoch bits. This file will be created locally after upgrading the version to ensure a smooth joining of the cluster upon startup.
*/
try {
currentEpochPosition = readLongFromFile(CURRENT_EPOCH_POSITION_FILENAME);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need to store this information in a file?
Leader will set it automatically, learners will use the new version based on leader's version.

} else {
byte[] ver = new byte[4];
ByteBuffer.wrap(ver).putInt(0x10000);
ByteBuffer.wrap(ver).putInt(0x11000);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The patch doesn't alter anything in the protocol, but you still have to increase protocol version number in order to reject old followers. Odd, but seams reasonable.

@anmolnar
Copy link
Contributor

@zichen-gan You need to close / re-open PR or force push to trigger another CI run.

@anmolnar
Copy link
Contributor

Given above, I think it is promising. It promotes rollover rate from 49.7 days to 34.9 years assuming 1k/s ops. The best is that it demands no protocol change at the price of zxid format change.
Before finalizing this path, I may want to taste whether leadership inheritance is feasible.

Sure, I'll wait for your review.
Strange thing is that as you outlined it doesn't require protocol change, but still the patch has to increase protocol version.

@kezhuw
Copy link
Member

kezhuw commented Sep 24, 2024

as you outlined it doesn't require protocol change, but still the patch has to increase protocol version.

My fault! By "no protocol change", I mean we don't need to prove its correctness in ZAB.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants