Fix possible split brain and consequent crash due to assertion failure #429
Conversation
This test currently fails because the code that sends RequestVote messages uses information about the in-memory log cache, not about the persisted log.

Signed-off-by: Free Ekanayaka <[email protected]>
At the moment we correctly apply pending configurations only once they are persisted. However, we don't do the same when running an election, where we use not-yet-persisted log indexes, leading to inconsistencies and violations of Raft invariants. This commit makes RequestVote messages include the index/term of the last persisted entry, not the one in the in-memory log cache.

Signed-off-by: Free Ekanayaka <[email protected]>
Codecov Report
@@            Coverage Diff             @@
##           master     #429      +/-   ##
==========================================
+ Coverage   76.62%   76.71%   +0.08%
==========================================
  Files          51       51
  Lines        9597     9597
  Branches     2444     2443       -1
==========================================
+ Hits         7354     7362       +8
+ Misses       1171     1161      -10
- Partials     1072     1074       +2
Thanks! I agree that this change is needed, since the formal description of Raft views the log as always being stored persistently, and our in-memory log is just an optimization on top of that which should never affect the core protocol implementation.
/* Server 0 is still leader, since it can contact server 3. */
munit_assert_int(CLUSTER_STATE(0), ==, RAFT_LEADER);
There's a server 4 implicit in all this, right?
Not entirely sure what you mean; the test has 4 servers, indexed from 0 to 3, with IDs from 1 to 4 (the off-by-one difference between indexes and IDs is unfortunate, and I'd like to fix that eventually, but it's the current convention).
I thought clusters with an even number of servers weren't allowed? Okay, I was confused because I'm not used to seeing clusters with an even number of servers; I forgot that one of them is not a voter, so two votes are enough to win an election.
You can have any number of servers, with any role combination you wish. An even number of voting servers is not recommended, though, because having 4 servers (all voters) doesn't make the cluster more "available" than having 3: with 4 servers a majority requires 3 votes, so you can afford to lose at most 1 server, exactly as when you have 3 servers.
Note that the test sets up 4 servers, but only 3 of them are meant to be voting at any given time (the test mimics automatic roles management in that regard, by demoting an offline voter node and promoting a standby to replace it).
Right, the in-memory log is basically a cache (in the sense that it saves a disk round-trip when a leader needs to fetch entries in order to send them to followers) and a buffer (in the sense that it's where we store entries that we receive, waiting for them to be asynchronously persisted on disk).
Fixes #386. Please see the individual commit messages and code comments for a detailed explanation.