Merge pull request #851 from kgaillot/master
Merge Pacemaker-1.1.14-rc3 into master branch
kgaillot committed Dec 14, 2015
2 parents 5b41ae1 + ea4f3a7 commit 89d36bc
Showing 24 changed files with 220 additions and 58 deletions.
3 changes: 2 additions & 1 deletion .travis.yml
@@ -1,5 +1,5 @@
# Control file for the Travis autobuilder
# http://docs.travis-ci.com/user/build-configuration/
# https://docs.travis-ci.com/user/customizing-the-build/

language: c
compiler:
@@ -65,6 +65,7 @@ notifications:
email:
recipients:
- [email protected]

# whitelist
branches:
only:
80 changes: 80 additions & 0 deletions ChangeLog
@@ -1,3 +1,83 @@
* Mon Dec 14 2015 Ken Gaillot <[email protected]> Pacemaker-1.1.14-rc3-1
- Update source tarball to revision: 97ada31
- Changesets: 7
- Diff: 2 files changed, 7 insertions(+), 2 deletions(-)

- Changes since Pacemaker-1.1.14-rc2
+ stonithd: fix issue where deleting a fence device attribute can delete the device

* Tue Dec 08 2015 Ken Gaillot <[email protected]> Pacemaker-1.1.14-rc2-1
- Update source tarball to revision: ce830a7
- Changesets: 1
- Diff: 1 file changed, 2 insertions(+)

- Changes since Pacemaker-1.1.14-rc1
+ crmd: ensure compilation works with built-in notifications disabled

* Tue Dec 08 2015 Ken Gaillot <[email protected]> Pacemaker-1.1.14-rc1-1
- Update source tarball to revision: 7cd6dcf
- Changesets: 656
- Diff: 169 files changed, 13014 insertions(+), 7579 deletions(-)

- Features added since Pacemaker-1.1.13
+ crm_resource: Indicate common reasons why a resource may not start after a cleanup
+ crm_resource: New --force-promote and --force-demote options for debugging
+ fencing: Support targeting fencing topologies by node name pattern or node attribute
+ fencing: Remap sequential topology reboots to all-off-then-all-on
+ pengine: Allow guest remote nodes using containers/vms to be nested in a group resource
+ pengine: Allow resources to start and stop as soon as their state is known on all nodes
+ pengine: Include a list of all and available nodes with clone notifications
+ pengine: Addition of the clone resource clone-min metadata option
+ pengine: Support of multiple-active=block for resource groups
+ remote: reconnect_interval option for remote nodes to delay reconnect after fence

- Changes since Pacemaker-1.1.13
+ fix multiple memory issues (leaks, use-after-free, double free, use-of-NULL) in components and tools
+ cib: Do not terminate due to badly behaving clients
+ cman: handle corosync-invented node names of the form Node{id} for peers not in its node list
+ controld: replace bashism
+ crm_node: Display node state with -l and quorum status with -q, if available
+ crmd: resources would sometimes be restarted when only non-unique parameters changed
+ crmd: fence remote node after connection failure only once
+ crmd: handle resources named the same as cluster nodes
+ crmd: Pre-emptively fail in-flight actions when lrmd connections fail
+ crmd: Record actions in the CIB as failed if we cannot execute them
+ crm_report: Enable password sanitizing by default
+ crm_report: Allow log file discovery to be disabled
+ crm_resource: Allow the resource configuration to be modified for --force-{check,start,..} calls
+ crm_resource: Compensate for -C and -p being called with the child resource for clones
+ crm_resource: Correctly clean up all children for anonymous cloned groups
+ crm_resource: Correctly clean up failcounts for inactive anonymous clones
+ crm_resource: Correctly observe --force when deleting and updating attributes
+ crm_shadow: Fix "crm_shadow --diff"
+ crm_simulate: Prevent segfault on arches with 64bit time_t
+ fencing: ensure "required"/"automatic" only apply to "on" actions
+ fencing: Return a provider for the internal fencing agent "#watchdog" instead of logging an error
+ fencing: ignore stderr output of fence agents (often used for debug messages)
+ libcib: potential user input overflow
+ libcluster: overhaul peer cache management
+ log: make syslog less noisy
+ lrmd: cancel currently pending STONITH op if stonithd connection is lost
+ lrmd: Finalize all pending and recurring operations when cleaning up a resource
+ pengine: Bug cl#5247 - Imply resources running on a container are stopped when the container is stopped
+ pengine: cl#5235 - Prevent graph loops that can be introduced by "load_stopped -> migrate_to" ordering
+ pengine: Correctly bypass fencing for resources that do not require it
+ pengine: do not timeout remote node recurring monitor op failure until after fencing
+ pengine: Ensure recurring monitor operations are cancelled when clone instances are de-allocated
+ pengine: fixes segfault in pengine when fencing remote node
+ pengine: properly handle blocked clone actions
+ pengine: ensure failed actions that occurred in node shutdown are displayed
+ remote: Correctly display the usage of the ocf:pacemaker:remote resource agent
+ remote: do not fail operations because of a migration
+ remote: enable reloads for select remote connection options
+ resources: allow for top output with or without percent sign in HealthCPU
+ resources: Prevent an error message on stopping "Dummy" resource
+ systemd: Prevent segfault when logging failed operations
+ systemd: Reconnect to System DBus if the connection is closed
+ systemd: set systemd resources' timeout values higher than systemd's own default
+ tools: Do not send command lines to syslog
+ upstart: Ensure pending structs are correctly unreferenced


* Wed Jun 24 2015 Andrew Beekhof <[email protected]> Pacemaker-1.1.13-1
- Update source tarball to revision: 2a1847e
4 changes: 2 additions & 2 deletions GNUmakefile
@@ -286,8 +286,8 @@ www: all global doxygen

summary:
@printf "\n* `date +"%a %b %d %Y"` `git config user.name` <`git config user.email`> $(NEXT_RELEASE)-1"
@printf "\n- Update source tarball to revision: `git id`"
@printf "\n- Changesets: `git log --pretty=format:'%h' $(LAST_RELEASE)..HEAD | wc -l`"
@printf "\n- Update source tarball to revision: `git log --pretty=format:%h -n 1`"
@printf "\n- Changesets: `git log --pretty=oneline $(LAST_RELEASE)..HEAD | wc -l`"
@printf "\n- Diff: "
@git diff -r $(LAST_RELEASE)..HEAD --stat include lib mcp pengine/*.c pengine/*.h cib crmd fencing lrmd tools xml | tail -n 1

14 changes: 14 additions & 0 deletions crmd/control.c
@@ -41,6 +41,13 @@
#include <sys/types.h>
#include <sys/stat.h>

/* Enable support for built-in notifications
*
* The interface is expected to change significantly, and this will be defined
* in the upstream master branch only until a new design is finalized.
*/
#define RHEL7_COMPAT

qb_ipcs_service_t *ipcs = NULL;

extern gboolean crm_connect_corosync(crm_cluster_t * cluster);
@@ -893,6 +900,8 @@ pe_cluster_option crmd_opts[] = {
" To ensure these changes take effect, we can optionally poll the cluster's status for changes."
},

#ifdef RHEL7_COMPAT
/* this interface is expected to change but was released in RHEL 7 */
{ "notification-agent", NULL, "string", NULL, "/dev/null", &check_script,
"Notification script or tool to be called after significant cluster events",
"Full path to a script or binary that will be invoked when resources start/stop/fail, fencing occurs or nodes join/leave the cluster.\n"
@@ -902,6 +911,7 @@ pe_cluster_option crmd_opts[] = {
"Destination for notifications (Optional)",
"Where should the supplied script send notifications to. Useful to avoid hard-coding this in the script."
},
#endif

{ "load-threshold", NULL, "percentage", NULL, "80%", &check_utilization,
"The maximum amount of system resources that should be used by nodes in the cluster",
@@ -963,7 +973,9 @@ crmd_pref(GHashTable * options, const char *name)
static void
config_query_callback(xmlNode * msg, int call_id, int rc, xmlNode * output, void *user_data)
{
#ifdef RHEL7_COMPAT
const char *script = NULL;
#endif
const char *value = NULL;
GHashTable *config_hash = NULL;
crm_time_t *now = crm_time_new(NULL);
@@ -992,9 +1004,11 @@ config_query_callback(xmlNode * msg, int call_id, int rc, xmlNode * output, void

verify_crmd_options(config_hash);

#ifdef RHEL7_COMPAT
script = crmd_pref(config_hash, "notification-agent");
value = crmd_pref(config_hash, "notification-recipient");
crmd_enable_notifications(script, value);
#endif

value = crmd_pref(config_hash, XML_CONFIG_ATTR_DC_DEADTIME);
election_trigger->period_ms = crm_get_msec(value);
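
For illustration only: the two options guarded by `RHEL7_COMPAT` above are declared in `crmd_opts[]` like the other cluster options, so they would presumably be set as cluster properties in the `crm_config` section of the CIB. The fragment below is a minimal sketch under that assumption. The script path, the recipient address, and all `id` values are hypothetical placeholders, not values taken from this commit.

```xml
<crm_config>
  <cluster_property_set id="cib-bootstrap-options">
    <!-- Hypothetical path: full path to a script or binary invoked on cluster events -->
    <nvpair id="opt-notification-agent" name="notification-agent"
            value="/usr/local/bin/cluster-notify.sh"/>
    <!-- Optional: where the script should send notifications (hypothetical address) -->
    <nvpair id="opt-notification-recipient" name="notification-recipient"
            value="ops@example.com"/>
  </cluster_property_set>
</crm_config>
```

Because the comment in `control.c` says the interface is expected to change significantly, a configuration like this should be treated as tied to this release series.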
2 changes: 1 addition & 1 deletion doc/Pacemaker_Explained/en-US/Book_Info.xml
@@ -12,7 +12,7 @@
changes (pacemaker), and PUBSNUMBER for
simple textual changes (corrections, translations, etc.).
-->
<edition>5</edition>
<edition>6</edition>
<pubsnumber>0</pubsnumber>
<abstract>
<para>
9 changes: 7 additions & 2 deletions doc/Pacemaker_Explained/en-US/Ch-Advanced-Options.txt
@@ -244,6 +244,10 @@ it again on the same node. However if a resource fails repeatedly,
it is possible that there is an underlying problem on that node, and you
might desire trying a different node in such a case.

indexterm:[migration-threshold]
indexterm:[failure-timeout]
indexterm:[start-failure-is-fatal]

Pacemaker allows you to set your preference via the +migration-threshold+
resource option.
footnote:[
@@ -272,8 +276,9 @@ minute.
There are two exceptions to the migration threshold concept:
when a resource either fails to start or fails to stop.

Start failures cause the failcount to be set to +INFINITY+ and thus always
cause the resource to move immediately.
If the cluster property +start-failure-is-fatal+ is set to +true+ (which is the
default), start failures cause the failcount to be set to +INFINITY+ and thus
always cause the resource to move immediately.
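
A minimal sketch of how these settings might be combined, assuming a hypothetical `Website` resource; the agent, the `id` values, and the numbers are illustrative only. Setting +start-failure-is-fatal+ to +false+ lets start failures count toward +migration-threshold+ instead of forcing an immediate move.

```xml
<configuration>
  <crm_config>
    <cluster_property_set id="cib-bootstrap-options">
      <!-- Cluster-wide: start failures no longer set the failcount to INFINITY -->
      <nvpair id="opt-start-failure-is-fatal" name="start-failure-is-fatal" value="false"/>
    </cluster_property_set>
  </crm_config>
  <resources>
    <!-- Hypothetical resource used only for illustration -->
    <primitive id="Website" class="ocf" provider="heartbeat" type="apache">
      <meta_attributes id="Website-meta">
        <!-- Move the resource away from a node after 3 failures there -->
        <nvpair id="Website-migration-threshold" name="migration-threshold" value="3"/>
        <!-- Let failures expire so the node can become eligible again -->
        <nvpair id="Website-failure-timeout" name="failure-timeout" value="60s"/>
      </meta_attributes>
    </primitive>
  </resources>
</configuration>
```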

Stop failures are slightly different and crucial. If a resource fails
to stop and STONITH is enabled, then the cluster will fence the node
36 changes: 13 additions & 23 deletions doc/Pacemaker_Explained/en-US/Ch-Advanced-Resources.txt
@@ -215,6 +215,13 @@ Options inherited from <<s-resource-options,primitive>> resources:
indexterm:[clone-node-max,Clone Option]
indexterm:[Clone,Option,clone-node-max]

|clone-min
|1
|Require at least this number of clone instances to be runnable before allowing
resources depending on the clone to be runnable '(since 1.1.14)'
indexterm:[clone-min,Clone Option]
indexterm:[Clone,Option,clone-min]

|notify
|true
|When stopping or starting a copy of the clone, tell all the other
@@ -582,12 +589,13 @@ location constraints. These constraints are written no differently from
those for primitive resources except that the master's +id+ is used.

When considering multi-state resources in constraints, for most
purposes it is sufficient to treat them as clones. The exception is
when the +rsc-role+ and/or +with-rsc-role+ fields (for colocation
constraints) and +first-action+ and/or +then-action+ fields (for
ordering constraints) are used.
purposes it is sufficient to treat them as clones. The exception is
that the +first-action+ and/or +then-action+ fields for ordering constraints
may be set to +promote+ or +demote+ to constrain the master role,
and colocation constraints may contain +rsc-role+ and/or +with-rsc-role+
fields.

.Additional constraint options relevant to multi-state resources
.Additional colocation constraint options for multi-state resources
[width="95%",cols="1m,1,3<",options="header",align="center"]
|=========================================================

@@ -611,24 +619,6 @@ ordering constraints) are used.
indexterm:[with-rsc-role,Ordering Constraints]
indexterm:[Constraints,Ordering,with-rsc-role]

|first-action
|start
|An additional attribute of ordering constraints that specifies the
action that the +first+ resource must complete before executing the
specified action for the +then+ resource. Allowed values: +start+,
+stop+, +promote+, +demote+.
indexterm:[first-action,Ordering Constraints]
indexterm:[Constraints,Ordering,first-action]

|then-action
|value of +first-action+
|An additional attribute of ordering constraints that specifies the
action that the +then+ resource can only execute after the
+first-action+ on the +first+ resource has completed. Allowed
values: +start+, +stop+, +promote+, +demote+.
indexterm:[then-action,Ordering Constraints]
indexterm:[Constraints,Ordering,then-action]

|=========================================================

.Constraints involving multi-state resources
39 changes: 29 additions & 10 deletions doc/Pacemaker_Explained/en-US/Ch-Constraints.txt
@@ -238,29 +238,44 @@ indexterm:[Constraints,Ordering,id]

|first
|
|The name of a resource that must be started before the +then+
resource is allowed to.
|Name of the resource that the +then+ resource depends on
indexterm:[first,Ordering Constraints]
indexterm:[Constraints,Ordering,first]

|then
|
|The name of a resource. This resource will start after the +first+ resource.
|Name of the dependent resource
indexterm:[then,Ordering Constraints]
indexterm:[Constraints,Ordering,then]

|first-action
|start
|The action that the +first+ resource must complete before +then-action+
can be initiated for the +then+ resource. Allowed values: +start+,
+stop+, +promote+, +demote+.
indexterm:[first-action,Ordering Constraints]
indexterm:[Constraints,Ordering,first-action]

|then-action
|value of +first-action+
|The action that the +then+ resource can execute only after the
+first-action+ on the +first+ resource has completed. Allowed
values: +start+, +stop+, +promote+, +demote+.
indexterm:[then-action,Ordering Constraints]
indexterm:[Constraints,Ordering,then-action]

|kind
|
|How to enforce the constraint. Allowed values:

* +Optional:+ Just a suggestion. Only applies if both resources are
starting/stopping. Any change in state by the +first+ resource will have no
effect on the +then+ resource.
* +Mandatory:+ Always. If 'first' is stopping or cannot be started,
'then' must be stopped. If 'first' is restarted, 'then' (if running)
will be stopped beforehand and started afterward.
executing the specified actions. Any change in state by the +first+ resource
will have no effect on the +then+ resource.
* +Mandatory:+ Always. If +first+ does not perform +first-action+, +then+ will
not be allowed to perform +then-action+. If +first+ is restarted, +then+
(if running) will be stopped beforehand and started afterward.
* +Serialize:+ Ensure that no two stop/start actions occur concurrently
for the resources. 'First' and 'then' can start in either order,
for the resources. +First+ and +then+ can start in either order,
but one must complete starting before the other can be started. A typical use
case is when resource start-up puts a high load on the host.

@@ -269,12 +284,16 @@ indexterm:[Constraints,Ordering,kind]

|symmetrical
|TRUE
|If true, stop the resources in the reverse order.
|If true, the reverse of the constraint applies for the opposite action (for
example, if B starts after A starts, then B stops before A stops).
indexterm:[symmetrical,Ordering Constraints]
indexterm:[Ordering Constraints,symmetrical]

|=========================================================
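
As a sketch of the +first-action+ and +then-action+ fields described in the table above, the constraint below allows a dependent resource to start only after a multi-state resource has been promoted. The resource name `WebApp` and the constraint `id` are hypothetical, and `Database` is assumed here to be a multi-state resource.

```xml
<constraints>
  <!-- WebApp may start only after Database has completed its promote action -->
  <rsc_order id="order-db-promote-then-webapp"
             first="Database" first-action="promote"
             then="WebApp" then-action="start"
             kind="Mandatory"/>
</constraints>
```

With +symmetrical+ left at its default, the description above implies that the reverse also applies for the opposite actions: `WebApp` would stop before `Database` is demoted.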

+Promote+ and +demote+ apply to the master role of
<<s-resource-multistate,multi-state>> resources.

=== Optional and mandatory ordering ===

Here is an example of ordering constraints where +Database+ 'must' start before
7 changes: 4 additions & 3 deletions doc/Pacemaker_Explained/en-US/Ch-Options.txt
@@ -185,9 +185,10 @@ Should deleted actions be cancelled?
| start-failure-is-fatal | TRUE |
indexterm:[start-failure-is-fatal,Cluster Option]
indexterm:[Cluster,Option,start-failure-is-fatal]
Should a failure to start be treated as fatal for a resource?
If FALSE, the cluster will instead use the resource's
+failcount+ and value for +migration-threshold+ (see <<s-failure-migration>>).
Should a failure to start a resource on a particular node prevent further start
attempts on that node? If FALSE, the cluster will decide whether to try
starting on the same node again based on the resource's current failure count
and +migration-threshold+ (see <<s-failure-migration>>).

| enable-startup-probes | TRUE |
indexterm:[enable-startup-probes,Cluster Option]
23 changes: 21 additions & 2 deletions doc/Pacemaker_Explained/en-US/Ch-Stonith.txt
@@ -550,7 +550,7 @@ Some possible uses of topologies include:
* Initiate a kdump and then poweroff the node

.Properties of Fencing Levels
[width="95%",cols="1m,6<",options="header",align="center"]
[width="95%",cols="1m,3<",options="header",align="center"]
|=========================================================

|Field
@@ -562,10 +562,22 @@ Some possible uses of topologies include:
indexterm:[Fencing,fencing-level,id]

|target
|The node to which this level applies
|The name of a single node to which this level applies
indexterm:[target,fencing-level]
indexterm:[Fencing,fencing-level,target]

|target-pattern
|A regular expression matching the names of nodes to which this level applies
'(since 1.1.14)'
indexterm:[target-pattern,fencing-level]
indexterm:[Fencing,fencing-level,target-pattern]

|target-attribute
|The name of a node attribute that is set for nodes to which this level applies
'(since 1.1.14)'
indexterm:[target-attribute,fencing-level]
indexterm:[Fencing,fencing-level,target-attribute]

|index
|The order in which to attempt the levels.
Levels are attempted in ascending order 'until one succeeds'.
@@ -871,3 +883,10 @@ be logged but ignored.
When a reboot operation is remapped, any action-specific timeout for the
remapped action will be used (for example, +pcmk_off_timeout+ will be used when
executing the +off+ command, not +pcmk_reboot_timeout+).

[NOTE]
====
In Pacemaker versions 1.1.13 and earlier, reboots will not be remapped in the
second case. To achieve the same effect, separate fencing devices for off and
on actions must be configured.
====
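
Returning to the +target-pattern+ and +target-attribute+ fields added to the fencing-level properties earlier in this file, the fragment below is a hedged sketch of how they might be used. The device names, node name pattern, attribute name, and `id` values are hypothetical. The +target-value+ attribute is not shown in the excerpt above; it is assumed here to accompany +target-attribute+ as the value to match.

```xml
<fencing-topology>
  <!-- Applies to every node whose name matches the regular expression -->
  <fencing-level id="fl-prod-1" target-pattern="prod-.*" index="1"
                 devices="pdu-a,pdu-b"/>
  <!-- Applies to nodes that have the node attribute "rack" set to "1" -->
  <fencing-level id="fl-rack1-1" target-attribute="rack" target-value="1" index="1"
                 devices="rack1-ipmi"/>
</fencing-topology>
```

The first level lists two devices, which is the kind of multi-device level to which the reboot remapping described above would be relevant.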