Check whether to switch to fail when setting the node to pfail in cron #1061

Open · wants to merge 1 commit into unstable
2 changes: 1 addition & 1 deletion src/cluster_legacy.c
@@ -5066,7 +5066,7 @@ void clusterCron(void) {
             if (!(node->flags & (CLUSTER_NODE_PFAIL | CLUSTER_NODE_FAIL))) {
                 node->flags |= CLUSTER_NODE_PFAIL;
                 update_state = 1;
-                if (server.cluster->size == 1 && clusterNodeIsVotingPrimary(myself)) {
+                if (clusterNodeIsVotingPrimary(myself)) {

Member:

This change makes a lot of sense to me. I actually don't fully understand why we special-cased single-shard clusters in b3aaa0a

Member Author:

Maybe it was just to reduce the impact and solve a specific problem (single shard) at the time. @bentotten, would you be able to take a review?

Contributor:

For single-shard clusters, there is no need to wait for a quorum of 1, so we can proceed to directly fail the node here rather than wait for the original gossip logic (for reference). Unfortunately, prior to that, a single-shard cluster would never properly mark a replica as failed.

I believe we still need to wait to achieve a quorum from gossip to mark this node as truly failed, no? If the node uses the information it has from "old gossip" during each cluster cron run, won't this increase the number of calls to this function without increasing the accuracy of the failure report?
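
For reference, here is a minimal, self-contained model of the quorum rule under discussion (hypothetical helper names and a simplified majority formula assumed here; the real check is done by markNodeAsFailingIfNeeded in cluster_legacy.c):

```c
/* Toy model (hypothetical names, not the real cluster_legacy.c code) of the
 * FAIL quorum rule under discussion: a PFAIL node is only promoted to FAIL
 * once a majority of voting primaries consider it failing. */
#include <stdio.h>

/* Assumed rule: the needed quorum is a strict majority of the cluster's
 * voting primaries, counting our own observation when we are one of them. */
static int reaches_fail_quorum(int cluster_size, int reports_from_others, int i_am_voting_primary) {
    int needed_quorum = cluster_size / 2 + 1;
    int failures = reports_from_others + (i_am_voting_primary ? 1 : 0);
    return failures >= needed_quorum;
}

int main(void) {
    /* size == 1: our own observation is already a quorum, which is why the
     * old special case could call markNodeAsFailingIfNeeded directly. */
    printf("size=1, others=0 -> %d\n", reaches_fail_quorum(1, 0, 1));

    /* size == 3: our observation alone is not enough; at least one failure
     * report gossiped from another primary is still required, so removing
     * the size==1 guard does not bypass the quorum. */
    printf("size=3, others=0 -> %d\n", reaches_fail_quorum(3, 0, 1));
    printf("size=3, others=1 -> %d\n", reaches_fail_quorum(3, 1, 1));
    return 0;
}
```

Under this model, a single-shard primary already satisfies the quorum on its own, while in a larger cluster the call from the cron path only promotes the node to FAIL if gossip has already supplied enough failure reports; otherwise it is a no-op.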

Contributor (@bentotten, Sep 20, 2024):

Though the benefit of the node being able to mark the replica as failed as soon as it knows it can't reach that replica in the last-needed-vote scenario also makes sense. @madolson, I know you did the initial investigation into the original bug, in case you have any insights. This was not a scenario I had considered for the original commit, and I don't see any issue.

Member Author:

Right, thanks for the input; that is new to me. clusterNodeFailureReportsCount will clean up old failure reports, so I think (hope) it can avoid the stale-gossip risk. But you do make a good point, so let's see if others have new thoughts.

                     markNodeAsFailingIfNeeded(node);
                 } else {
                     serverLog(LL_NOTICE, "NODE %.40s (%s) possibly failing.", node->name, node->human_nodename);
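
On the last comment's point that clusterNodeFailureReportsCount cleans up old failure reports before counting them, here is a small self-contained sketch of that expiry idea (hypothetical names and window value; I assume the real validity window is derived from the cluster node timeout):

```c
/* Toy model (hypothetical names) of failure-report expiry: only reports
 * newer than a validity window are counted, so repeatedly re-checking the
 * quorum from clusterCron does not act on arbitrarily old gossip. */
#include <stdio.h>

typedef struct {
    long long received_ms; /* when the failure report was received */
} fail_report;

/* Count only reports still inside the validity window, mimicking the
 * cleanup that the real counting function is said to perform first. */
static int count_valid_reports(const fail_report *reports, int n,
                               long long now_ms, long long validity_ms) {
    int valid = 0;
    for (int i = 0; i < n; i++) {
        if (now_ms - reports[i].received_ms <= validity_ms) valid++;
    }
    return valid;
}

int main(void) {
    /* One stale report and one fresh report against a 30s validity window
     * (hypothetical value; the real window scales with cluster-node-timeout). */
    fail_report reports[] = {{1000}, {95000}};
    long long now = 100000;
    long long validity = 30000;
    printf("valid reports: %d\n", count_valid_reports(reports, 2, now, validity));
    return 0;
}
```

With expiry in place, re-evaluating the quorum on every cron pass can only act on reports that are still fresh, which is why the cleanup mitigates the "old gossip" concern raised above.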