Merge pull request #851 from kgaillot/master

Merge Pacemaker-1.1.14-rc3 into master branch
ClusterLabs · Dec 14, 2015 · 89d36bc
2 parents 5b41ae1 + ea4f3a7
commit 89d36bc

Showing 24 changed files with 220 additions and 58 deletions.
diff --git a/.travis.yml b/.travis.yml
@@ -1,5 +1,5 @@
 # Control file for the Travis autobuilder
-# http://docs.travis-ci.com/user/build-configuration/
+# https://docs.travis-ci.com/user/customizing-the-build/
 
 language: c
 compiler:
@@ -65,6 +65,7 @@ notifications:
   email:
     recipients:
       - [email protected]
+
 # whitelist
 branches:
   only:

diff --git a/ChangeLog b/ChangeLog
@@ -1,3 +1,83 @@
+* Mon Dec 14 2015 Ken Gaillot <[email protected]> Pacemaker-1.1.14-rc3-1
+- Update source tarball to revision: 97ada31
+- Changesets: 7
+- Diff:       2 files changed, 7 insertions(+), 2 deletions(-)
+
+- Changes since Pacemaker-1.1.14-rc2
+  + stonithd: fix issue where deleting a fence device attribute can delete the device
+
+* Tue Dec 08 2015 Ken Gaillot <[email protected]> Pacemaker-1.1.14-rc2-1
+- Update source tarball to revision: ce830a7
+- Changesets: 1
+- Diff:       1 file changed, 2 insertions(+)
+
+- Changes since Pacemaker-1.1.14-rc1
+  + crmd: ensure compilation works with built-in notifications disabled
+
+* Tue Dec 08 2015 Ken Gaillot <[email protected]> Pacemaker-1.1.14-rc1-1
+- Update source tarball to revision: 7cd6dcf
+- Changesets: 656
+- Diff:       169 files changed, 13014 insertions(+), 7579 deletions(-)
+
+- Features added since Pacemaker-1.1.13
+  + crm_resource: Indicate common reasons why a resource may not start after a cleanup
+  + crm_resource: New --force-promote and --force-demote options for debugging
+  + fencing: Support targeting fencing topologies by node name pattern or node attribute
+  + fencing: Remap sequential topology reboots to all-off-then-all-on
+  + pengine: Allow guest remote nodes using containers/vms to be nested in a group resource
+  + pengine: Allow resources to start and stop as soon as their state is known on all nodes
+  + pengine: Include a list of all and available nodes with clone notifications
+  + pengine: Addition of the clone resource clone-min metadata option
+  + pengine: Support of multiple-active=block for resource groups
+  + remote: reconnect_interval option for remote nodes to delay reconnect after fence
+
+- Changes since Pacemaker-1.1.13
+  + fix multiple memory issues (leaks, use-after-free, double free, use-of-NULL) in components and tools
+  + cib: Do not terminate due to badly behaving clients
+  + cman: handle corosync-invented node names of the form Node{id} for peers not in its node list
+  + controld: replace bashism
+  + crm_node: Display node state with -l and quorum status with -q, if available
+  + crmd: resources would sometimes be restarted when only non-unique parameters changed
+  + crmd: fence remote node after connection failure only once
+  + crmd: handle resources named the same as cluster nodes
+  + crmd: Pre-emptively fail in-flight actions when lrmd connections fail
+  + crmd: Record actions in the CIB as failed if we cannot execute them
+  + crm_report: Enable password sanitizing by default
+  + crm_report: Allow log file discovery to be disabled
+  + crm_resource: Allow the resource configuration to be modified for --force-{check,start,..} calls
+  + crm_resource: Compensate for -C and -p being called with the child resource for clones
+  + crm_resource: Correctly clean up all children for anonymous cloned groups
+  + crm_resource: Correctly clean up failcounts for inactive anonymous clones
+  + crm_resource: Correctly observe --force when deleting and updating attributes
+  + crm_shadow: Fix "crm_shadow --diff"
+  + crm_simulate: Prevent segfault on arches with 64bit time_t
+  + fencing: ensure "required"/"automatic" only apply to "on" actions
+  + fencing: Return a provider for the internal fencing agent "#watchdog" instead of logging an error
+  + fencing: ignore stderr output of fence agents (often used for debug messages)
+  + libcib: potential user input overflow
+  + libcluster: overhaul peer cache management
+  + log: make syslog less noisy
+  + lrmd: cancel currently pending STONITH op if stonithd connection is lost
+  + lrmd: Finalize all pending and recurring operations when cleaning up a resource
+  + pengine: Bug cl#5247 - Imply resources running on a container are stopped when the container is stopped
+  + pengine: cl#5235 - Prevent graph loops that can be introduced by "load_stopped -> migrate_to" ordering
+  + pengine: Correctly bypass fencing for resources that do not require it
+  + pengine: do not timeout remote node recurring monitor op failure until after fencing
+  + pengine: Ensure recurring monitor operations are cancelled when clone instances are de-allocated
+  + pengine: fixes segfault in pengine when fencing remote node
+  + pengine: properly handle blocked clone actions
+  + pengine: ensure failed actions that occurred in node shutdown are displayed
+  + remote: Correctly display the usage of the ocf:pacemaker:remote resource agent
+  + remote: do not fail operations because of a migration
+  + remote: enable reloads for select remote connection options
+  + resources: allow for top output with or without percent sign in HealthCPU
+  + resources: Prevent an error message on stopping "Dummy" resource
+  + systemd: Prevent segfault when logging failed operations
+  + systemd: Reconnect to System DBus if the connection is closed
+  + systemd: set systemd resources' timeout values higher than systemd's own default
+  + tools: Do not send command lines to syslog
+  + upstart: Ensure pending structs are correctly unreferenced
+
 
 * Wed Jun 24 2015 Andrew Beekhof <[email protected]> Pacemaker-1.1.13-1
 - Update source tarball to revision: 2a1847e

diff --git a/GNUmakefile b/GNUmakefile
@@ -286,8 +286,8 @@ www:	all global doxygen
 
 summary:
 	@printf "\n* `date +"%a %b %d %Y"` `git config user.name` <`git config user.email`> $(NEXT_RELEASE)-1"
-	@printf "\n- Update source tarball to revision: `git id`"
-	@printf "\n- Changesets: `git log --pretty=format:'%h' $(LAST_RELEASE)..HEAD | wc -l`"
+	@printf "\n- Update source tarball to revision: `git log --pretty=format:%h -n 1`"
+	@printf "\n- Changesets: `git log --pretty=oneline $(LAST_RELEASE)..HEAD | wc -l`"
 	@printf "\n- Diff:      "
 	@git diff -r $(LAST_RELEASE)..HEAD --stat include lib mcp pengine/*.c pengine/*.h  cib crmd fencing lrmd tools xml | tail -n 1
 

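Note on the `Changesets` line changed in the GNUmakefile hunk above: the commit does not state its motivation, but one plausible reason is that `--pretty=format:` separates entries with newlines and leaves the last entry unterminated, so `wc -l` reports one fewer than the number of commits, whereas `--pretty=oneline` terminates every entry. The sketch below imitates both output shapes with `printf` instead of running `git`; the three hashes are made up for illustration.

```shell
# Three hypothetical abbreviated hashes stand in for real git output.

# Separator semantics (as with --pretty=format:...): no newline after the last entry.
sep_count=$(printf 'aaa1111\nbbb2222\nccc3333' | wc -l | tr -d ' ')

# Terminator semantics (as with --pretty=oneline): every entry ends with a newline.
term_count=$(printf 'aaa1111\nbbb2222\nccc3333\n' | wc -l | tr -d ' ')

echo "separator semantics:  $sep_count"
echo "terminator semantics: $term_count"
```

For the same three entries, `sep_count` is 2 and `term_count` is 3, because `wc -l` counts newline characters rather than entries.
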
diff --git a/crmd/control.c b/crmd/control.c
@@ -41,6 +41,13 @@
 #include <sys/types.h>
 #include <sys/stat.h>
 
+/* Enable support for built-in notifications
+ *
+ * The interface is expected to change significantly, and this will be defined
+ * in the upstream master branch only until a new design is finalized.
+ */
+#define RHEL7_COMPAT
+
 qb_ipcs_service_t *ipcs = NULL;
 
 extern gboolean crm_connect_corosync(crm_cluster_t * cluster);
@@ -893,6 +900,8 @@ pe_cluster_option crmd_opts[] = {
 	  "  To ensure these changes take effect, we can optionally poll the cluster's status for changes."
         },
 
+#ifdef RHEL7_COMPAT
+    /* this interface is expected to change but was released in RHEL 7 */
 	{ "notification-agent", NULL, "string", NULL, "/dev/null", &check_script,
           "Notification script or tool to be called after significant cluster events",
           "Full path to a script or binary that will be invoked when resources start/stop/fail, fencing occurs or nodes join/leave the cluster.\n"
@@ -902,6 +911,7 @@ pe_cluster_option crmd_opts[] = {
           "Destination for notifications (Optional)",
           "Where should the supplied script send notifications to.  Useful to avoid hard-coding this in the script."
         },
+#endif
 
 	{ "load-threshold", NULL, "percentage", NULL, "80%", &check_utilization,
 	  "The maximum amount of system resources that should be used by nodes in the cluster",
@@ -963,7 +973,9 @@ crmd_pref(GHashTable * options, const char *name)
 static void
 config_query_callback(xmlNode * msg, int call_id, int rc, xmlNode * output, void *user_data)
 {
+#ifdef RHEL7_COMPAT
     const char *script = NULL;
+#endif
     const char *value = NULL;
     GHashTable *config_hash = NULL;
     crm_time_t *now = crm_time_new(NULL);
@@ -992,9 +1004,11 @@ config_query_callback(xmlNode * msg, int call_id, int rc, xmlNode * output, void
 
     verify_crmd_options(config_hash);
 
+#ifdef RHEL7_COMPAT
     script = crmd_pref(config_hash, "notification-agent");
     value  = crmd_pref(config_hash, "notification-recipient");
     crmd_enable_notifications(script, value);
+#endif
 
     value = crmd_pref(config_hash, XML_CONFIG_ATTR_DC_DEADTIME);
     election_trigger->period_ms = crm_get_msec(value);

diff --git a/doc/Pacemaker_Explained/en-US/Book_Info.xml b/doc/Pacemaker_Explained/en-US/Book_Info.xml
@@ -12,7 +12,7 @@
 	changes (pacemaker), and PUBSNUMBER for
 	simple textual changes (corrections, translations, etc.).
   -->
-  <edition>5</edition>
+  <edition>6</edition>
   <pubsnumber>0</pubsnumber>
   <abstract>
     <para>

diff --git a/doc/Pacemaker_Explained/en-US/Ch-Advanced-Options.txt b/doc/Pacemaker_Explained/en-US/Ch-Advanced-Options.txt
@@ -244,6 +244,10 @@ it again on the same node. However if a resource fails repeatedly,
 it is possible that there is an underlying problem on that node, and you
 might desire trying a different node in such a case.
 
+indexterm:[migration-threshold]
+indexterm:[failure-timeout]
+indexterm:[start-failure-is-fatal]
+
 Pacemaker allows you to set your preference via the +migration-threshold+
 resource option.
 footnote:[
@@ -272,8 +276,9 @@ minute.
 There are two exceptions to the migration threshold concept:
 when a resource either fails to start or fails to stop.
 
-Start failures cause the failcount to be set to +INFINITY+ and thus always
-cause the resource to move immediately.
+If the cluster property +start-failure-is-fatal+ is set to +true+ (which is the
+default), start failures cause the failcount to be set to +INFINITY+ and thus
+always cause the resource to move immediately.
 
 Stop failures are slightly different and crucial.  If a resource fails
 to stop and STONITH is enabled, then the cluster will fence the node

diff --git a/doc/Pacemaker_Explained/en-US/Ch-Advanced-Resources.txt b/doc/Pacemaker_Explained/en-US/Ch-Advanced-Resources.txt
@@ -215,6 +215,13 @@ Options inherited from <<s-resource-options,primitive>> resources:
  indexterm:[clone-node-max,Clone Option]
  indexterm:[Clone,Option,clone-node-max]
 
+|clone-min
+|1
+|Require at least this number of clone instances to be runnable before allowing
+resources depending on the clone to be runnable '(since 1.1.14)'
+ indexterm:[clone-min,Clone Option]
+ indexterm:[Clone,Option,clone-min]
+
 |notify
 |true
 |When stopping or starting a copy of the clone, tell all the other
@@ -582,12 +589,13 @@ location constraints.  These constraints are written no differently from
 those for primitive resources except that the master's +id+ is used.
 
 When considering multi-state resources in constraints, for most
-purposes it is sufficient to treat them as clones.  The exception is
-when the +rsc-role+ and/or +with-rsc-role+ fields (for colocation
-constraints) and +first-action+ and/or +then-action+ fields (for
-ordering constraints) are used.
+purposes it is sufficient to treat them as clones. The exception is
+that the +first-action+ and/or +then-action+ fields for ordering constraints
+may be set to +promote+ or +demote+ to constrain the master role,
+and colocation constraints may contain +rsc-role+ and/or +with-rsc-role+
+fields.
 
-.Additional constraint options relevant to multi-state resources
+.Additional colocation constraint options for multi-state resources
 [width="95%",cols="1m,1,3<",options="header",align="center"]
 |=========================================================
 
@@ -611,24 +619,6 @@ ordering constraints) are used.
  indexterm:[with-rsc-role,Ordering Constraints]
  indexterm:[Constraints,Ordering,with-rsc-role]
 
-|first-action
-|start
-|An additional attribute of ordering constraints that specifies the
- action that the +first+ resource must complete before executing the
- specified action for the +then+ resource.  Allowed values: +start+,
- +stop+, +promote+, +demote+.
- indexterm:[first-action,Ordering Constraints]
- indexterm:[Constraints,Ordering,first-action]
-
-|then-action
-|value of +first-action+
-|An additional attribute of ordering constraints that specifies the
- action that the +then+ resource can only execute after the
- +first-action+ on the +first+ resource has completed.  Allowed
- values: +start+, +stop+, +promote+, +demote+.
- indexterm:[then-action,Ordering Constraints]
- indexterm:[Constraints,Ordering,then-action]
-
 |=========================================================
 
 .Constraints involving multi-state resources       

diff --git a/doc/Pacemaker_Explained/en-US/Ch-Constraints.txt b/doc/Pacemaker_Explained/en-US/Ch-Constraints.txt
@@ -238,29 +238,44 @@ indexterm:[Constraints,Ordering,id]
 
 |first
 |
-|The name of a resource that must be started before the +then+
- resource is allowed to.
+|Name of the resource that the +then+ resource depends on
 indexterm:[first,Ordering Constraints]
 indexterm:[Constraints,Ordering,first]
 
 |then
 |
-|The name of a resource. This resource will start after the +first+ resource.
+|Name of the dependent resource
 indexterm:[then,Ordering Constraints]
 indexterm:[Constraints,Ordering,then]
 
+|first-action
+|start
+|The action that the +first+ resource must complete before +then-action+
+ can be initiated for the +then+ resource.  Allowed values: +start+,
+ +stop+, +promote+, +demote+.
+ indexterm:[first-action,Ordering Constraints]
+ indexterm:[Constraints,Ordering,first-action]
+
+|then-action
+|value of +first-action+
+|The action that the +then+ resource can execute only after the
+ +first-action+ on the +first+ resource has completed.  Allowed
+ values: +start+, +stop+, +promote+, +demote+.
+ indexterm:[then-action,Ordering Constraints]
+ indexterm:[Constraints,Ordering,then-action]
+
 |kind
 |
 |How to enforce the constraint. Allowed values:
 
 * +Optional:+ Just a suggestion. Only applies if both resources are
-  starting/stopping. Any change in state by the +first+ resource will have no
-  effect on the +then+ resource.
-* +Mandatory:+ Always. If 'first' is stopping or cannot be started,
-  'then' must be stopped. If 'first' is restarted, 'then' (if running)
-  will be stopped beforehand and started afterward.
+  executing the specified actions. Any change in state by the +first+ resource
+  will have no effect on the +then+ resource.
+* +Mandatory:+ Always. If +first+ does not perform +first-action+, +then+ will
+  not be allowed to perform +then-action+. If +first+ is restarted, +then+
+  (if running) will be stopped beforehand and started afterward.
 * +Serialize:+ Ensure that no two stop/start actions occur concurrently
-  for the resources. 'First' and 'then' can start in either order,
+  for the resources. +First+ and +then+ can start in either order,
   but one must complete starting before the other can be started. A typical use
   case is when resource start-up puts a high load on the host.
 
@@ -269,12 +284,16 @@ indexterm:[Constraints,Ordering,kind]
 
 |symmetrical
 |TRUE
-|If true, stop the resources in the reverse order.
+|If true, the reverse of the constraint applies for the opposite action (for
+ example, if B starts after A starts, then B stops before A stops).
 indexterm:[symmetrical,Ordering Constraints]
 indexterm:[Ordering Constraints,symmetrical]
 
 |=========================================================
 
++Promote+ and +demote+ apply to the master role of
+<<s-resource-multistate,multi-state>> resources.
+
 === Optional and mandatory ordering ===
 
 Here is an example of ordering constraints where +Database+ 'must' start before

diff --git a/doc/Pacemaker_Explained/en-US/Ch-Options.txt b/doc/Pacemaker_Explained/en-US/Ch-Options.txt
@@ -185,9 +185,10 @@ Should deleted actions be cancelled?
 | start-failure-is-fatal | TRUE |
 indexterm:[start-failure-is-fatal,Cluster Option]
 indexterm:[Cluster,Option,start-failure-is-fatal]
-Should a failure to start be treated as fatal for a resource?
-If FALSE, the cluster will instead use the resource's
-+failcount+ and value for +migration-threshold+ (see <<s-failure-migration>>).
+Should a failure to start a resource on a particular node prevent further start
+attempts on that node? If FALSE, the cluster will decide whether to try
+starting on the same node again based on the resource's current failure count
+and +migration-threshold+ (see <<s-failure-migration>>).
 
 | enable-startup-probes | TRUE |
 indexterm:[enable-startup-probes,Cluster Option]

diff --git a/doc/Pacemaker_Explained/en-US/Ch-Stonith.txt b/doc/Pacemaker_Explained/en-US/Ch-Stonith.txt
@@ -550,7 +550,7 @@ Some possible uses of topologies include:
 * Initiate a kdump and then poweroff the node
 
 .Properties of Fencing Levels
-[width="95%",cols="1m,6<",options="header",align="center"]
+[width="95%",cols="1m,3<",options="header",align="center"]
 |=========================================================
 
 |Field
@@ -562,10 +562,22 @@ Some possible uses of topologies include:
  indexterm:[Fencing,fencing-level,id]
 
 |target
-|The node to which this level applies
+|The name of a single node to which this level applies
  indexterm:[target,fencing-level]
  indexterm:[Fencing,fencing-level,target]
 
+|target-pattern
+|A regular expression matching the names of nodes to which this level applies
+'(since 1.1.14)'
+ indexterm:[target-pattern,fencing-level]
+ indexterm:[Fencing,fencing-level,target-pattern]
+
+|target-attribute
+|The name of a node attribute that is set for nodes to which this level applies
+'(since 1.1.14)'
+ indexterm:[target-attribute,fencing-level]
+ indexterm:[Fencing,fencing-level,target-attribute]
+
 |index
 |The order in which to attempt the levels.
  Levels are attempted in ascending order 'until one succeeds'.
@@ -871,3 +883,10 @@ be logged but ignored.
 When a reboot operation is remapped, any action-specific timeout for the
 remapped action will be used (for example, +pcmk_off_timeout+ will be used when
 executing the +off+ command, not +pcmk_reboot_timeout+).
+
+[NOTE]
+====
+In Pacemaker versions 1.1.13 and earlier, reboots will not be remapped in the
+second case. To achieve the same effect, separate fencing devices for off and
+on actions must be configured.
+====