Commit
Fix: crmd: Reset stonith failcount to recover transitioner when the node rejoins

The CRMd transitioner could not recover from "Too many failures to fence".

Steps to reproduce:
1. Two-node cluster with stonith, for example using IPMI.
2. Node-1 has a complete power outage for a couple of minutes. The IPMI device is also without power, which causes the fencing to fail.
3. Node-2 tries to fence node-1 several times but fails.
4. Node-2 reports "Too many failures to fence node-1 (11), giving up".
5. The power returns and node-1 boots up normally.
6. Node-1 rejoins the cluster, but resources are not started on it.

Expected result: The stonith failcount for node-1 should be reset and resources should be started on node-1.

Actual result: Node-2 still logs "Too many failures to fence" and resources are not started on node-1.
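
The intended behavior can be illustrated with a minimal sketch. This is not the crmd implementation: the class name, method names, and the threshold of 10 are assumptions made for illustration (the threshold is inferred only from the log message above, which gives up at a count of 11). It shows a per-node fencing failure counter that blocks further attempts once the limit is exceeded, and that is cleared when the node rejoins.

```python
from collections import defaultdict

# Assumed limit: the reported log gives up at count 11, so this sketch
# treats "more than 10 failures" as the give-up condition.
MAX_FENCE_FAILURES = 10


class FenceFailureTracker:
    """Hypothetical per-node fencing failure counter (illustration only)."""

    def __init__(self, limit=MAX_FENCE_FAILURES):
        self.limit = limit
        self.failures = defaultdict(int)

    def record_failure(self, node):
        # Called each time a fencing attempt against `node` fails.
        self.failures[node] += 1
        return self.failures[node]

    def too_many_failures(self, node):
        # True once the node has exceeded the allowed number of failures.
        return self.failures.get(node, 0) > self.limit

    def node_rejoined(self, node):
        # The behavior this commit asks for: forget past fencing failures
        # when the node comes back, so the cluster can use it again.
        self.failures.pop(node, None)


tracker = FenceFailureTracker()
for _ in range(11):
    tracker.record_failure("node-1")

gave_up = tracker.too_many_failures("node-1")        # limit exceeded
tracker.node_rejoined("node-1")                      # node-1 is back
recovered = not tracker.too_many_failures("node-1")  # counter was cleared
```

Without the reset in `node_rejoined`, the counter would stay above the limit indefinitely, which matches the "Actual result" described above.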