Releases: ClusterLabs/pacemaker

Pacemaker 1.1.11 - Release Candidate 1

21 Nov 02:58
Pre-release

The most notable changes/fixes since Pacemaker-1.1.10 include:

  • attrd: Implementation of a truly atomic attrd for use with corosync 2.x
  • cib: Allow values to be added/updated and removed in a single update
  • cib: Support XML comments in diffs
  • Core: Allow blackbox logging to be disabled with SIGUSR2
  • crmd: Do not block on proxied calls from pacemaker_remoted
  • crmd: Enable cluster-wide throttling when the cib heavily exceeds its target load
  • crmd: Use the load on our peers to know how many jobs to send them
  • crm_mon: add --hide-headers option to hide all headers
  • crm_report: Collect logs directly from journald if available
  • Fencing: On timeout, clean up the agent's entire process group
  • Fencing: Support agents that need the host to be unfenced at startup
  • ipc: Raise the default buffer size to 128k
  • PE: Add a special attribute for distinguishing between real nodes and containers in constraint rules
  • PE: Allow location constraints to take a regex pattern to match against resource IDs
  • pengine: Distinguish between the agent being missing and something the agent needs being missing
  • remote: Properly version the remote connection protocol
  • services: Detect missing agents and permission errors before forking
  • Bug cl#5171 - pengine: Don't prevent clones from running due to dependent resources
  • Bug cl#5179 - Corosync: Attempt to retrieve a peer's node name if it is not already known
  • Bug cl#5181 - corosync: Ensure node IDs are written to the CIB as unsigned integers

If you are a user of pacemaker_remoted, you should take the time to read about changes to the online wire protocol that are present in this release.

1.1.10 - final

26 Jul 00:24

Details - 1.1.10 - final

Changesets  602
Diff 143 files changed, 8162 insertions(+), 5159 deletions(-)

Highlights

Features added since Pacemaker-1.1.9

  • Core: Convert all exit codes to positive errno values
  • crm_error: Add the ability to list and print error symbols
  • crm_resource: Allow individual resources to be reprobed
  • crm_resource: Allow options to be set recursively
  • crm_resource: Implement --ban for moving resources away from nodes and --clear (replaces --unmove)
  • crm_resource: Support OCF tracing when using --force-(check|start|stop)
  • PE: Allow active nodes in our current membership to be fenced without quorum
  • PE: Suppress meaningless IDs when displaying anonymous clone status
  • Turn off auto-respawning of systemd services when the cluster starts them
  • Bug cl#5128 - pengine: Support maintenance mode for a single node
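
Two of the features above (positive errno-based exit codes, and crm_error listing and printing error symbols) follow a common convention that can be sketched as below. This is an illustration in Python rather than Pacemaker's C code; the helper names `to_exit_code` and `describe` are hypothetical, and the assumption that internal functions return 0 or a negative errno value is labeled in the code.

```python
import errno
import os


def to_exit_code(rc):
    """Map an internal return code to a positive process exit code.

    Assumption for this sketch: internal functions return 0 on success
    and a negative errno value (for example -errno.ENOENT) on failure.
    """
    return abs(rc)


def describe(code):
    """Return (symbol, message) for a positive errno value."""
    symbol = errno.errorcode.get(code, "UNKNOWN")
    return symbol, os.strerror(code)


if __name__ == "__main__":
    code = to_exit_code(-errno.ENOENT)
    symbol, message = describe(code)
    print(code, symbol, message)
```

A tool following this convention can hand the positive value straight to `exit()`, and a separate lookup utility can translate it back into a symbol and message.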

Changes since Pacemaker-1.1.9

  • crmd: cib: stonithd: Memory leaks resolved and improved use of glib reference counting
  • attrd: Fixes deleted attributes during dc election
  • Bug cl#5153 - Correctly display clone failcounts in crm_mon
  • Bug cl#5133 - pengine: Correctly observe on-fail=block for failed demote operation
  • Bug cl#5148 - legacy: Correctly remove a node that used to have a different nodeid
  • Bug cl#5151 - Ensure node names are consistently compared without case
  • Bug cl#5152 - crmd: Correctly clean up fenced nodes during membership changes
  • Bug cl#5154 - Do not expire failures when on-fail=block is present
  • Bug cl#5155 - pengine: Block the stop of resources if any depending resource is unmanaged
  • Bug cl#5157 - Allow migration in the absence of some colocation constraints
  • Bug cl#5161 - crmd: Prevent memory leak in operation cache
  • Bug cl#5164 - crmd: Fixes crash when using pacemaker-remote
  • Bug cl#5164 - pengine: Fixes segfault when calculating transition with remote-nodes.
  • Bug cl#5167 - crm_mon: Only print "stopped" node list for incomplete clone sets
  • Bug cl#5168 - Prevent clones from being bounced around the cluster due to location constraints
  • Bug cl#5170 - Correctly support on-fail=block for clones
  • cib: Correctly read back archived configurations if the primary is corrupted
  • cib: The result is not valid when diffs fail to apply cleanly for CLI tools
  • cib: Restore the ability to embed comments in the configuration
  • cluster: Detect and warn about node names with capitals
  • cman: Do not pretend we know the state of nodes we've never seen
  • cman: Do not unconditionally start cman if it is already running
  • cman: Support non-blocking CPG calls
  • Core: Ensure the blackbox is saved on abnormal program termination
  • corosync: Detect the loss of members for which we only know the nodeid
  • corosync: Do not pretend we know the state of nodes we've never seen
  • corosync: Ensure removed peers are erased from all caches
  • corosync: Nodes that can persist in sending CPG messages must be alive after all
  • crmd: Do not get stuck in S_POLICY_ENGINE if a node we couldn't fence returns
  • crmd: Do not update fail-count and last-failure for old failures
  • crmd: Ensure all membership operations can complete while trying to cancel a transition
  • crmd: Ensure operations for cleaned up resources don't block recovery
  • crmd: Ensure we return to a stable state if there have been too many fencing failures
  • crmd: Initiate node shutdown if another node claims to have successfully fenced us
  • crmd: Prevent messages for remote crmd clients from being relayed to wrong daemons
  • crmd: Properly handle recurring monitor operations for remote-node agent
  • crmd: Store last-run and last-rc-change for all operations
  • crm_mon: Ensure stale pid files are updated when a new process is started
  • crm_report: Correctly collect logs when 'uname -n' reports fully qualified names
  • fencing: Fail the operation once all peers have been exhausted
  • fencing: Restore the ability to manually confirm that fencing completed
  • ipc: Allow unprivileged clients to clean up after server failures
  • ipc: Restore the ability for members of the haclient group to connect to the cluster
  • legacy: Support "crm_node --remove" with a node name for corosync plugin (bnc#805278)
  • lrmd: Default to the upstream location for resource agent scratch directory
  • lrmd: Pass errors from lsb metadata generation back to the caller
  • pengine: Correctly handle resources that recover before we operate on them
  • pengine: Delete the old resource state on every node whenever the resource type is changed
  • pengine: Detect constraints with inappropriate actions (i.e. promote for a clone)
  • pengine: Ensure per-node resource parameters are used during probes
  • pengine: If fencing is unavailable or disabled, block further recovery for resources that fail to stop
  • pengine: Implement the rest of get_timet_now() and rename to get_effective_time
  • pengine: Re-initiate active recurring monitors that previously failed but have timed out
  • remote: Workaround for inconsistent tls handshake behavior between gnutls versions
  • systemd: Ensure we get shut down correctly by systemd
  • systemd: Reload systemd after adding/removing override files for cluster services
  • xml: Check for and replace non-printing characters with their octal equivalent while exporting xml text
  • xml: Prevent lockups by setting a more reliable buffer allocation strategy
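
The xml entry above about replacing non-printing characters with their octal equivalent can be illustrated with a short sketch. The exact escape format Pacemaker emits is not stated in these notes; the backslash followed by three octal digits used here is an assumption, as is the choice to leave tab, newline and carriage return untouched. The function name `escape_nonprinting` is hypothetical.

```python
def escape_nonprinting(text):
    """Replace ASCII control characters with a backslash-octal escape.

    Assumptions for this sketch: the escape form is a backslash plus three
    octal digits, and tab, newline and carriage return are kept as-is.
    """
    out = []
    for ch in text:
        code = ord(ch)
        is_control = (code < 0x20 and ch not in "\t\n\r") or code == 0x7F
        if is_control:
            out.append("\\%03o" % code)
        else:
            out.append(ch)
    return "".join(out)
```

Escaping at export time keeps the resulting text safe to log or parse, since most control characters are not legal in XML 1.0 documents.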

1.1.10 - Release Candidate 7

22 Jul 00:46
Pre-release

Details - 1.1.10-rc7

Changesets  57
Diff 37 files changed, 414 insertions(+), 331 deletions(-)

Features added in Pacemaker-1.1.10-rc7

  • N/A

Changes since Pacemaker-1.1.10-rc6

  • Bug cl#5168 - Prevent clones from being bounced around the cluster due to location constraints
  • Bug cl#5170 - Correctly support on-fail=block for clones
  • Bug cl#5164 - crmd: Fixes crmd crash when using pacemaker-remote
  • cib: The result is not valid when diffs fail to apply cleanly for CLI tools
  • cluster: Correctly construct the header for compressed messages
  • cluster: Detect and warn about node names with capitals
  • Core: Remove the mainloop_trigger instances that are no longer needed
  • corosync: Ensure removed peers are erased from all caches
  • cpg: Correctly free sent messages
  • crmd: Prevent messages for remote crmd clients from being relayed to wrong daemons
  • crmd: Properly handle recurring monitor operations for remote-node agent
  • crm_mon: Bug cl#5167 - Only print "stopped" node list for incomplete clone sets
  • crm_node: Return 0 if --remove passed
  • fencing: Correctly detect existing device entries when registering a new one
  • lrmd: Prevent use-of-NULL in client library
  • pengine: cl5164 - Fixes pengine segfault when calculating transition with remote-nodes.
  • pengine: Do the right thing when admins specify the internal resource instead of the clone
  • pengine: Re-allow ordering constraints with fencing devices now that it is safe to do so

1.1.10 - Release Candidate 6

04 Jul 06:54
Pre-release

Details

Changesets  63
Diff 24 files changed, 356 insertions(+), 133 deletions(-)

Highlights

Features added

  • tools: crm_mon --neg-location drbd-fence-by-handler
  • pengine: cl#5128 - Support maintenance mode for a single node

Other Changes

  • cluster: Correctly remove duplicate peer entries
  • crmd: Ensure operations for cleaned up resources don't block recovery
  • pengine: Bug cl#5157 - Allow migration in the absence of some colocation constraints
  • pengine: Delete the old resource state on every node whenever the resource type is changed
  • pengine: Detect constraints with inappropriate actions (i.e. promote for a clone)
  • pengine: Do the right thing when admins specify the internal resource instead of the clone

1.1.10 - Release Candidate 5

03 Jul 12:58
Pre-release

Details

Changesets  168
Diff 96 files changed, 4983 insertions(+), 3097 deletions(-)

Features added

  • crm_error: Add the ability to list and print error symbols
  • crm_resource: Allow individual resources to be reprobed
  • crm_resource: Implement --ban for moving resources away from nodes and --clear (replaces --unmove)
  • crm_resource: Support OCF tracing when using --force-(check|start|stop)
  • PE: Allow active nodes in our current membership to be fenced without quorum
  • Turn off auto-respawning of systemd services when the cluster starts them

Other Changes

  • Bug cl#5155 - pengine: Block the stop of resources if any depending resource is unmanaged
  • Convert all exit codes to positive errno values
  • Core: Ensure the blackbox is saved on abnormal program termination
  • corosync: Detect the loss of members for which we only know the nodeid
  • corosync: Do not pretend we know the state of nodes we've never seen
  • corosync: Nodes that can persist in sending CPG messages must be alive after all
  • crmd: Do not get stuck in S_POLICY_ENGINE if a node we couldn't fence returns
  • crmd: Ensure all membership operations can complete while trying to cancel a transition
  • crmd: Everyone who gets a fencing notification should mark the node as down
  • crmd: Initiate node shutdown if another node claims to have successfully fenced us
  • crmd: Update the status section with details of nodes for which we only know the nodeid
  • crm_report: Find logs in compressed files
  • logging: If SIGTRAP is sent before tracing is turned on, turn it on
  • pengine: If fencing is unavailable or disabled, block further recovery for resources that fail to stop
  • remote: Workaround for inconsistent tls handshake behavior between gnutls versions
  • systemd: Ensure we get shut down correctly by systemd

1.1.10 - Release Candidate 3

03 Jul 13:04
Pre-release

Details

Changesets  116
Diff 59 files changed, 707 insertions(+), 408 deletions(-)

Highlights

Features added

  • PE: Display a list of nodes on which stopped anonymous clones are not active instead of meaningless clone IDs
  • PE: Suppress meaningless IDs when displaying anonymous clone status

Other Changes

  • Bug cl#5133 - pengine: Correctly observe on-fail=block for failed demote operation
  • Bug cl#5151 - Ensure node names are consistently compared without case
  • Check for and replace non-printing characters with their octal equivalent while exporting xml text
  • cib: CID#1023858 - Explicit null dereferenced
  • cib: CID#1023862 - Improper use of negative value
  • cib: CID#739562 - Improper use of negative value
  • cman: Our daemons have no need to connect to pacemakerd in a cman based cluster
  • crmd: Do not record pending delete operations in the CIB
  • crmd: Ensure pending and lost actions have values for last-run and last-rc-change
  • crmd: Insert async failures so that they appear in the correct order
  • crmd: Store last-run and last-rc-change for fail operations
  • Detect child processes that terminate before our SIGCHLD handler is installed
  • fencing: CID#739461 - Double close
  • fencing: Correctly broadcast manual fencing ACKs
  • fencing: Correctly mark manual confirmations as complete
  • fencing: Do not send duplicate replies for manual confirmation operations
  • fencing: Restore the ability to manually confirm that fencing completed
  • lrmd: CID#1023851 - Truncated stdio return value
  • lrmd: Don't complain when heartbeat invokes us with -r
  • pengine: Correctly handle resources that recover before we operate on them
  • pengine: Re-initiate active recurring monitors that previously failed but have timed out
  • xml: Restore the ability to embed comments in the cib

1.1.10 - Release Candidate 2

03 Jul 13:12
Pre-release

Details - 1.1.10-rc2

Changesets  31
Diff 30 files changed, 687 insertions(+), 138 deletions(-)

Highlights

Features added in Pacemaker-1.1.10-rc2

N/A

Changes since Pacemaker-1.1.10-rc1

  • Bug cl#5152 - Correctly clean up fenced nodes during membership changes
  • Bug cl#5153 - Correctly display clone failcounts in crm_mon
  • Bug cl#5154 - Do not expire failures when on-fail=block is present
  • cman: Skip cman_pre_stop in the init script if fenced is not running
  • Core: Ensure the last field in transition keys is 36 characters
  • crm_mon: Check if a process can be daemonized before forking so the parent can report an error
  • crm_mon: Ensure stale pid files are updated when a new process is started
  • crm_report: Correctly collect logs when 'uname -n' reports fully qualified names
  • crm_resource: Allow --cleanup without a resource name
  • init: Unless specified otherwise, assume cman is in use if cluster.conf exists
  • mcp: inhibit error messages without cman
  • pengine: Ensure per-node resource parameters are used during probes
  • pengine: Implement the rest of get_timet_now() and rename to get_effective_time

1.1.10 - Release Candidate 1

03 Jul 13:12
Pre-release

Details - 1.1.10-rc1

Changesets  143
Diff 104 files changed, 3327 insertions(+), 1186 deletions(-)

Highlights

Features added in Pacemaker-1.1.10

  • crm_resource: Allow individual resources to be reprobed
  • mcp: Alternate Upstart job controlling both pacemaker and corosync
  • mcp: Prevent the cluster from trying to use cman even when it is installed

Changes since Pacemaker-1.1.9

  • Allow programs in the haclient group to use CRM_CORE_DIR
  • cman: Do not unconditionally start cman if it is already running
  • core: Ensure custom error codes are less than 256
  • crmd: Clean up memory at exit
  • crmd: Do not update fail-count and last-failure for old failures
  • crmd: Ensure we return to a stable state if there have been too many fencing failures
  • crmd: Indicate completion of refresh to callers
  • crmd: Indicate completion of re-probe to callers
  • crmd: Only perform a dry run for deletions if built with ACL support
  • crmd: Prevent use-after-free when the blackbox is enabled
  • crmd: Suppress secondary errors when no metadata is found
  • doc: Pacemaker Remote deployment and reference guide
  • fencing: Avoid memory leak in can_fence_host_with_device()
  • fencing: Clean up memory at exit
  • fencing: Correctly filter devices when no nodes are configured yet
  • fencing: Correctly unpack device parameters before using them
  • fencing: Fail the operation once all peers have been exhausted
  • fencing: Fix memory leaks during query phase
  • fencing: Prevent empty call-id during notification processing
  • fencing: Prevent invalid read in parse_host_list()
  • fencing: Prevent memory leak when registering devices
  • crmd: lrmd: stonithd: fixed memory leaks
  • ipc: Allow unprivileged clients to clean up after server failures
  • ipc: Restore the ability for members of the haclient group to connect to the cluster
  • legacy: cl#5148 - Correctly remove a node that used to have a different nodeid
  • legacy: Support "crm_node --remove" with a node name for corosync plugin (bnc#805278)
  • logging: Better checks when determining if file based logging will work
  • Pass errors from lsb metadata generation back to the caller
  • pengine: Do not use functions from the cib library during unpack
  • Prevent use-of-NULL when reading CIB_shadow from the environment
  • Skip WNOHANG when waiting after sending SIGKILL to child processes
  • tools: crm_mon - Print a timing field only if its value is non-zero
  • Use custom OCF_ROOT_DIR if requested
  • xml: Prevent lockups by setting a more reliable buffer allocation strategy
  • xml: Prevent use-after-free in cib_process_xpath()
  • xml: Prevent use-after-free when not processing all xpath query results

1.1.9 - Final

03 Jul 13:05

Release Statistics

Changesets  731 
Diff 1301 files changed, 92909 insertions(+), 57455 deletions(-)

Features added in Pacemaker-1.1.9

  • corosync: Allow cman and corosync 2.0 nodes to use a name other than uname()
  • corosync: Use queues to avoid blocking when sending CPG messages
  • ipc: Compress messages that exceed the configured IPC message limit
  • ipc: Use queues to prevent slow clients from blocking the server
  • ipc: Use shared memory by default
  • lrmd: Support nagios remote monitoring
  • lrmd: Pacemaker Remote Daemon for extending pacemaker functionality outside corosync cluster.
  • pengine: Check for master/slave resources that are not OCF agents
  • pengine: Support a 'requires' resource meta-attribute for controlling whether it needs quorum, fencing or nothing
  • pengine: Support for resource container
  • pengine: Support resources that require unfencing before start
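
The ipc entry above on compressing oversized messages can be sketched as a simple size check before sending. This is an illustration only, not Pacemaker's IPC code: the names `encode_message` and `decode_message` are hypothetical, the use of bzip2 is an assumption, and the 50 KiB default is taken from the "50k" compression threshold listed among the 1.1.9 changes.

```python
import bz2

# Assumption: "50k" is read as 50 KiB for this sketch.
COMPRESSION_THRESHOLD = 50 * 1024


def encode_message(payload, threshold=COMPRESSION_THRESHOLD):
    """Return (is_compressed, data), compressing only oversized payloads."""
    if len(payload) <= threshold:
        return False, payload
    return True, bz2.compress(payload)


def decode_message(is_compressed, data):
    """Reverse encode_message()."""
    return bz2.decompress(data) if is_compressed else data


if __name__ == "__main__":
    big = b"<cib/>" * 20000  # about 117 KiB of repetitive XML-like text
    flag, data = encode_message(big)
    print(flag, len(big), len(data))
```

Small messages skip the compression cost entirely, while large and typically repetitive XML payloads shrink enough to fit through a fixed-size buffer.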

Changes since Pacemaker-1.1.8

  • attrd: Correctly handle deletion of non-existent attributes
  • Bug cl#5135 - Improved detection of the active cluster type
  • Bug rhbz#913093 - Use crm_node instead of uname
  • cib: Avoid use-after-free by correctly supporting cib_no_children for non-xpath queries
  • cib: Correctly process XML diffs involving element removal
  • cib: Performance improvements for non-DC nodes
  • cib: Prevent error message by correctly handling peer replies
  • cib: Prevent ordering changes when applying xml diffs
  • cib: Remove text nodes from cib replace operations
  • cluster: Detect node name collisions in corosync
  • cluster: Preserve corosync membership state when matching node name/id entries
  • cman: Force fenced to terminate on shutdown
  • cman: Ignore qdisk 'nodes'
  • core: Drop per-user core directories
  • corosync: Avoid errors when closing failed connections
  • corosync: Ensure peer state is preserved when matching names to nodeids
  • corosync: Clean up CMAP connections after querying node name
  • corosync: Correctly detect corosync 2.0 clusters even if we don't have permission to access it
  • crmd: Bug cl#5144 - Do not update the expected status of failed nodes
  • crmd: Correctly determine if cluster disconnection was abnormal
  • crmd: Correctly relay messages for remote clients (bnc#805626, bnc#804704)
  • crmd: Correctly stall the FSA when waiting for additional inputs
  • crmd: Detect and recover when we are evicted from CPG
  • crmd: Differentiate between a node that is up and coming up in peer_update_callback()
  • crmd: Have cib operation timeouts scale with node count
  • crmd: Improved continue/wait logic in do_dc_join_finalize()
  • crmd: Prevent election storms caused by getrusage() values being too close
  • crmd: Prevent timeouts when performing pacemaker level membership negotiation
  • crmd: Prevent use-after-free of fsa_message_queue during exit
  • crmd: Store all current actions when stalling the FSA
  • crm_mon: Do not try to render a blank cib and indicate the previous output is now stale
  • crm_mon: Fixes crm_mon crash when using snmp traps.
  • crm_mon: Look for the correct error codes when applying configuration updates
  • crm_report: Ensure policy engine logs are found
  • crm_report: Fix node list detection
  • crm_resource: Have crm_resource generate a valid transition key when sending resource commands to the crmd
  • date/time: Bug cl#5118 - Correctly convert seconds-since-epoch to the current time
  • fencing: Attempt to provide more information than just 'generic error' for failed actions
  • fencing: Correctly record completed but previously unknown fencing operations
  • fencing: Correctly terminate when all device options have been exhausted
  • fencing: cov#739453 - String not null terminated
  • fencing: Do not merge new fencing requests with stale ones from dead nodes
  • fencing: Do not start fencing until entire device topology is found or query results timeout.
  • fencing: Do not wait for the query timeout if all replies have arrived
  • fencing: Fix passing of parameters from CMAN containing '='
  • fencing: Fix non-comparison when sorting devices by priority
  • fencing: On failure, only try a topology device once from the remote level.
  • fencing: Only try peers for non-topology based operations once
  • fencing: Retry stonith device for duration of action's timeout period.
  • heartbeat: Remove incorrect assert during cluster connect
  • ipc: Bug cl#5110 - Prevent 100% CPU usage when looking for synchronous replies
  • ipc: Use 50k as the default compression threshold
  • legacy: Prevent assertion failure on routing ais messages (bnc#805626)
  • legacy: Re-enable logging from the pacemaker plugin
  • legacy: Relax the 'active' check for plugin based clusters to avoid false negatives
  • legacy: Skip peer process check if the process list is empty in crm_is_corosync_peer_active()
  • mcp: Only define HA_DEBUGLOG to avoid agent calls to ocf_log printing everything twice
  • mcp: Re-attach to existing pacemaker components when mcp fails
  • pengine: Any location constraint for the slave role applies to all roles
  • pengine: Avoid leaking memory when cleaning up failcounts and using containers
  • pengine: Bug cl#5101 - Ensure stop order is preserved for partially active groups
  • pengine: Bug cl#5140 - Allow set members to be stopped when the subsequent set has require-all=false
  • pengine: Bug cl#5143 - Prevent shuffling of anonymous master/slave instances
  • pengine: Bug rhbz#880249 - Ensure orphan masters are demoted before being stopped
  • pengine: Bug rhbz#880249 - Teach the PE how to recover masters into primitives
  • pengine: cl#5025 - Automatically clear failcount for start/monitor failures after resource parameters change
  • pengine: cl#5099 - Probe operation uses the timeout value from the minimum interval monitor by default (bnc#776386)
  • pengine: cl#5111 - When clone/master child rsc has on-fail=stop, ensure all children stop on failure.
  • pengine: cl#5142 - Do not delete orphaned children of an anonymous clone
  • pengine: Correctly unpack active anonymous clones
  • pengine: Ensure previous migrations are closed out before attempting another one
  • pengine: Introducing the whitebox container resources feature
  • pengine: Prevent double-free for cloned primitive from template
  • pengine: Process rsc_ticket dependencies earlier for correctly allocating resources (bnc#802307)
  • pengine: Remove special cases for fencing resources
  • pengine: rhbz#902459 - Remove rsc node status for orphan resources
  • systemd: Gracefully handle unexpected DBus return types
  • Replace the use of the insecure mktemp(3) with mkstemp(3)
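
The last entry above swaps mktemp(3) for mkstemp(3). The difference is that mktemp only returns a file name, leaving a window between choosing the name and creating the file, whereas mkstemp creates and opens the file in one step. The sketch below shows the same pattern with Python's `tempfile.mkstemp`; it is an analogue for illustration, not the C change itself, and the helper name `write_private_temp` and the `sketch-` prefix are hypothetical.

```python
import os
import tempfile


def write_private_temp(data, prefix="sketch-"):
    """Write data to a freshly created temporary file and return its path.

    mkstemp() creates and opens the file atomically, so no other process
    can claim the name between selection and creation.
    """
    fd, path = tempfile.mkstemp(prefix=prefix)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
    except Exception:
        os.unlink(path)  # do not leave a partial file behind
        raise
    return path
```

The caller is responsible for removing the file when done, for example with `os.unlink(path)`.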

1.1.8 - Final

03 Jul 13:07

Release Statistics

Details

Changesets  1019 
Diff 2107 files changed, 117258 insertions(+), 73606 deletions(-)

Features in Pacemaker-1.1.8

  • All APIs have been cleaned up and reduced to essentials
  • Pacemaker now includes a replacement lrmd that supports systemd and upstart agents
  • Config and state files (cib.xml, PE inputs and core files) have moved to new locations
  • The crm shell has become a separate project and is no longer included with Pacemaker
  • All daemons/tools now have a unified set of error codes based on errno.h (see crm_error)

Changes since Pacemaker-1.1.7

  • Core: Bug cl#5032 - Rewrite the iso8601 date handling code
  • Core: Correctly extract the version details from a diff
  • Core: Log blackbox contents, if enabled, when an error occurs
  • Core: Only LOG_NOTICE and higher are sent to syslog
  • Core: Replace use of IPC from clplumbing with IPC from libqb
  • Core: SIGUSR1 now enables blackbox logging; SIGTRAP writes it out
  • Core: Support a blackbox for additional logging detail after crashes/errors
  • Promote support for advanced fencing logic to the stable schema
  • Promote support for node starting scores to the stable schema
  • Promote support for service and systemd to the stable schema
  • attrd: Differentiate between updating all our attributes and everybody updating all theirs too
  • attrd: Have single-shot clients wait for an ack before disconnecting
  • cib: cl#5026 - Synced cib updates should not return until the cpg broadcast is complete.
  • corosync: Detect when the first corosync has not yet formed and handle it gracefully
  • corosync: Obtain a full list of configured nodes, including their names, when we connect to the quorum API
  • corosync: Obtain a node name from DNS if one was not already known
  • corosync: Populate the cib nodelist from corosync if available
  • corosync: Use the CFG API and DNS to determine node names if not configured in corosync.conf
  • crmd: Block after 10 failed fencing attempts for a node
  • crmd: cl#5051 - Fixes file leak in pe ipc connection initialization.
  • crmd: cl#5053 - Fixes fail-count not being updated properly.
  • crmd: cl#5057 - Restart sub-systems correctly (bnc#755671)
  • crmd: cl#5068 - Fixes crm_node -R option so it works with corosync 2.0
  • crmd: Correctly re-establish failed attrd connections
  • crmd: Detect when the quorum API isn't configured for corosync 2.0
  • crmd: Do not overwrite any configured node type (eg. quorum node)
  • crmd: Enable use of new lrmd daemon and client library in crmd.
  • crmd: Overhaul the way node state is recorded and updated in the CIB
  • fencing: Bug rhbz#853537 - Prevent use-of-NULL when the cib libraries are not available
  • fencing: cl#5073 - Add 'off' as a valid value for stonith-action option.
  • fencing: cl#5092 - Always timeout stonith operations if timeout period expires.
  • fencing: cl#5093 - Stonith per device timeout option
  • fencing: Clean up if we detect a failed connection
  • fencing: Delegate complex self fencing requests - we won't be around to see it to completion
  • fencing: Ensure all peers are notified of complex fencing op completion
  • fencing: Fix passing of fence_legacy parameters containing '='
  • fencing: Gracefully handle metadata requests for unknown agents
  • fencing: Return cached dynamic target list for busy devices.
  • fencing: rhbz#801355 - Abort transition on DC when external fencing operation is detected
  • fencing: rhbz#801355 - Merge fence requests for identical operations already in progress.
  • fencing: rhbz#801355 - Report fencing operations external to pacemaker to the cib
  • fencing: Specify the action to perform using action= instead of the older option=
  • fencing: Stop building fake metadata for broken agents
  • fencing: Tolerate agents that report empty metadata in the admin tool
  • mcp: Correctly retry the connection to corosync on failure
  • mcp: Do not shut down IPC until the last client exits
  • mcp: Prevent use-after-free when running against corosync 1.x
  • pengine: Bug cl#5059 - Use the correct action's status when calculating required actions for interleaved clones
  • pengine: Bypass online/offline checking resource detection for ping/quorum nodes
  • pengine: cl#5044 - migrate_to no longer requires load_stopped for avoiding possible transition loop
  • pengine: cl#5069 - Honor 'on-fail=ignore' even when operation is disabled.
  • pengine: cl#5070 - Allow influence of promotion score when multistate rsc is on the left-hand side of a colocation
  • pengine: cl#5072 - Fixes monitor op stopping after rsc promotion.
  • pengine: cl#5072 - Fixes pengine regression test failures
  • pengine: Correctly set the status for nodes not intended to run Pacemaker
  • pengine: Do not append instance numbers to anonymous clones
  • pengine: Fix failcount expiration
  • pengine: Fix memory leaks found by valgrind
  • pengine: Fix use-after-free and use-of-NULL errors detected by coverity
  • pengine: Fixes use of colocation scores other than +/- INFINITY
  • pengine: Improve detection of rejoining nodes
  • pengine: Prevent use-of-NULL when tracing is enabled
  • pengine: Stonith resources are allowed to start even if their probes haven't completed on partially active nodes
  • services: New class called 'service' which expands to the correct (LSB/systemd/upstart) standard
  • services: Support asynchronous systemd/upstart actions
  • Tools: crm_shadow - Bug cl#5062 - Correctly set argv[0] when forking a shell process
  • Tools: crm_report: Always include system logs (if we can find them)