From c6f1835d897715af4c9ae68addd07a39fbac0994 Mon Sep 17 00:00:00 2001 From: Michael Klishin Date: Mon, 4 Mar 2024 13:48:29 -0500 Subject: [PATCH 1/3] Clustering: don't mention RAM nodes, they are completely incompatible with Khepri --- docs/clustering.md | 116 --------------------------------------------- 1 file changed, 116 deletions(-) diff --git a/docs/clustering.md b/docs/clustering.md index 4ec770a9e1..3fa5b53917 100644 --- a/docs/clustering.md +++ b/docs/clustering.md @@ -1192,119 +1192,3 @@ connection to nodes within a cluster is beyond the scope of this guide, and we recommend the use of other technologies designed specifically to address these problems. - -## Disk and RAM Nodes {#cluster-node-types} - -A node can be a disk node or a RAM node. -(Note: disk and disc are used -interchangeably). RAM nodes store internal database tables -in RAM only. This does not include messages, message store -indices, queue indices and other node state. - -In the vast majority of cases you want all your nodes to be -disk nodes; RAM nodes are a special case that can be used -to improve the performance clusters with high queue, -exchange, or binding churn. RAM nodes do not provide -higher message rates. When in doubt, use -disk nodes only. - -Since RAM nodes store internal database tables in RAM only, they must sync -them from a peer node on startup. This means that a cluster must contain -at least one disk node. It is therefore not possible to manually remove -the last remaining disk node in a cluster. - - -## Clusters with RAM nodes {#ram-nodes} - -RAM nodes keep their metadata only in memory. As RAM nodes -don't have to write to disc as much as disc nodes, they can -perform better. However, note that since persistent queue -data is always stored on disc, the performance improvements -will affect only resource management (e.g. adding/removing -queues, exchanges, or vhosts), but not publishing or -consuming speed. 
- -RAM nodes are an advanced use case; when setting up your -first cluster you should simply not use them. You should -have enough disc nodes to handle your redundancy -requirements, then if necessary add additional RAM nodes for -scale. - -A cluster containing only RAM nodes would be too volatile; if the -cluster stops you will not be able to start it again and -**will lose all data**. RabbitMQ will prevent the creation of a -RAM-node-only cluster in many situations, but it can't -absolutely prevent it. - -The examples here show a cluster with one disc and one RAM -node for simplicity only; such a cluster is a poor design -choice. - -### Creating RAM nodes {#creating-ram} - -We can declare a node as a RAM node when it first joins -the cluster. We do this with -`rabbitmqctl join_cluster` as before, but passing the -`--ram` flag: - -```bash -# on rabbit2 -rabbitmqctl stop_app -# => Stopping node rabbit@rabbit2 ...done. - -rabbitmqctl join_cluster --ram rabbit@rabbit1 -# => Clustering node rabbit@rabbit2 with [rabbit@rabbit1] ...done. - -rabbitmqctl start_app -# => Starting node rabbit@rabbit2 ...done. -``` - -RAM nodes are shown as such in the cluster status: - -```bash -# on rabbit1 -rabbitmqctl cluster_status -# => Cluster status of node rabbit@rabbit1 ... -# => [{nodes,[{disc,[rabbit@rabbit1]},{ram,[rabbit@rabbit2]}]}, -# => {running_nodes,[rabbit@rabbit2,rabbit@rabbit1]}] -# => ...done. - -# on rabbit2 -rabbitmqctl cluster_status -# => Cluster status of node rabbit@rabbit2 ... -# => [{nodes,[{disc,[rabbit@rabbit1]},{ram,[rabbit@rabbit2]}]}, -# => {running_nodes,[rabbit@rabbit1,rabbit@rabbit2]}] -# => ...done. -``` - -### Changing node types {#change-type} - -We can change the type of a node from ram to disc and vice -versa. Say we wanted to reverse the types of -`rabbit@rabbit2` and `rabbit@rabbit1`, turning -the former from a ram node into a disc node and the latter from a -disc node into a ram node. 
To do that we can use the -`change_cluster_node_type` command. The node must be -stopped first. - -```bash -# on rabbit2 -rabbitmqctl stop_app -# => Stopping node rabbit@rabbit2 ...done. - -rabbitmqctl change_cluster_node_type disc -# => Turning rabbit@rabbit2 into a disc node ...done. - -rabbitmqctl start_app -# => Starting node rabbit@rabbit2 ...done. - -# on rabbit1 -rabbitmqctl stop_app -# => Stopping node rabbit@rabbit1 ...done. - -rabbitmqctl change_cluster_node_type ram -# => Turning rabbit@rabbit1 into a ram node ...done. - -rabbitmqctl start_app -# => Starting node rabbit@rabbit1 ...done. -``` From ea5bccee09b0f9ba529ba88c41797c0a693ea566 Mon Sep 17 00:00:00 2001 From: Michael Klishin Date: Mon, 4 Mar 2024 14:17:25 -0500 Subject: [PATCH 2/3] Move queue leader replica locator section to Clustering since CMQs are deprecated and scheduled to be removed later this year. --- docs/clustering.md | 44 ++++++++++++++++++++++++++++++++++++++++---- docs/configure.md | 9 +++++++++ docs/ha.md | 36 ++++++++++-------------------------- 3 files changed, 59 insertions(+), 30 deletions(-) diff --git a/docs/clustering.md b/docs/clustering.md index 3fa5b53917..2c69e23240 100644 --- a/docs/clustering.md +++ b/docs/clustering.md @@ -31,6 +31,7 @@ This guide covers fundamental topics related to RabbitMQ clustering: * [How clusters are formed](#cluster-formation) * How nodes [authenticate to each other](#erlang-cookie) (and with CLI tools) * Why it's important to [use an odd number of nodes](#node-count) and **two-cluster nodes are strongly discouraged** + * [Queue and stream leader replica placement](#replica-placement) strategies * [Node restarts](#restarting) and how nodes rejoin their cluster * [Node readiness probes](#restarting-readiness-probes) and how they can affect rolling cluster restarts * How to [remove a cluster node](#removing-nodes) @@ -427,6 +428,42 @@ however. 
See [Connecting to Streams](https://blog.rabbitmq.com/posts/2021/07/connecting-to-streams/#well-behaved-clients) to learn more. +### Queue and Stream Leader Replica Placement {#replica-placement} + +Every queue and stream in RabbitMQ has a primary replica. That replica is called +_the leader_. All publishing operations on queues and streams go through the leader +replica first and then are replicated to followers (secondary replicas). This is necessary to +guarantee FIFO ordering of messages. + +To avoid some nodes in a cluster hosting a significant majority of queue leader +replicas and thus handling most of the load, queue leaders should +be reasonably evenly distributed across cluster nodes. + +Queue leaders can be distributed between nodes using several +strategies. Which strategy is used is controlled in three ways, +namely, using the `x-queue-leader-locator` [optional queue argument](./queues#optional-arguments), setting the `queue-leader-locator` +policy key or by defining the `queue_leader_locator` +key in [`the configuration file`](./configure#configuration-files). + +There are two options available: + + * `balanced`, the default strategy, uses the data on how many replicas peer nodes host, + when there are relatively few (say, no more than 1000) queues in the cluster; it falls back + to a more efficient strategy of picking a random node when there are many queues + * `client-local` will always pick the node the client is connected to + +The following example sets the `queue_leader_locator` setting in `rabbitmq.conf` to its default value: + +``` ini +queue_leader_locator = balanced +``` + +The client-provided queue argument takes precedence when both are used. + +Note that all Raft-based features, namely quorum queues and streams, use this value as a suggestion. +The Raft leader election algorithm involves a degree of randomness, so the suggested +node will have a replica placed on it but it will not always be the leader replica. 
+ ## Clustering and Observability {#clustering-and-observability} @@ -1158,9 +1195,9 @@ Learn more in the [section on ports](#ports) above and dedicated [RabbitMQ Netwo ## Erlang Versions Across the Cluster {#erlang} -All nodes in a cluster are *highly recommended* to run the same major [version of Erlang](./which-erlang): `22.2.0` -and `22.2.8` can be mixed but `21.3.6` and `22.2.6` can potentially introduce breaking changes in -inter-node communication protocols. While such breaking changes are relatively rare, they are possible. +All nodes in a cluster are *highly recommended* to run the same major [version of Erlang](./which-erlang): `26.2.0` +and `26.1.2` can be mixed but `25.3.2.8` and `26.2.0` can potentially introduce breaking changes in +inter-node communication protocols. While such breaking changes are rare, they are possible. Incompatibilities between patch releases of Erlang/OTP versions are very rare. @@ -1191,4 +1228,3 @@ In general, this aspect of managing the connection to nodes within a cluster is beyond the scope of this guide, and we recommend the use of other technologies designed specifically to address these problems. - diff --git a/docs/configure.md b/docs/configure.md index 3edbb16a7c..d5c36c76c2 100644 --- a/docs/configure.md +++ b/docs/configure.md @@ -844,6 +844,15 @@ disk_free_limit.absolute = 50MB

+ + + `queue_leader_locator` + + Controls the [strategy used when selecting a node](./clustering#replica-placement) to host the leader replica of a newly declared + queue or stream. + + + `log.file.level` diff --git a/docs/ha.md b/docs/ha.md index e8b24ed6b7..360fd58c27 100644 --- a/docs/ha.md +++ b/docs/ha.md @@ -18,17 +18,17 @@ See the License for the specific language governing permissions and limitations under the License. --> -# Classic Queue Mirroring +# Classic Queue Mirroring (Deprecated) -## Wait, There's a Better Way: Next Generation Highly Available Queues and Streams {#interstitial} +## Wait, There's a Better Way: Next Generation Replicated Queues and Streams {#interstitial} This guide covers a [**deprecated and scheduled for removal feature**](https://blog.rabbitmq.com/posts/2021/08/4.0-deprecation-announcements/): mirroring (queue contents replication) of classic queues. [Quorum queues](./quorum-queues) and/or [streams](./streams) should be used instead of mirrored classic queues. -Quorum queues are a more advanced queue type, which offers high availability using replication and focuses on data safety. From RabbitMQ 3.10, quorum queues [support message TTL](https://blog.rabbitmq.com/posts/2022/05/rabbitmq-3.10-release-overview/) and provide [higher throughput and more stable latency](https://blog.rabbitmq.com/posts/2022/05/rabbitmq-3.10-performance-improvements/) compared to mirrored classic queues. You can [Migrate your RabbitMQ Mirrored Classic Queues to Quorum Queues](./migrate-mcq-to-qq) now. +Quorum queues are a more advanced queue type, which offers high availability using replication and focuses on data safety. Quorum queues [support message TTL](https://blog.rabbitmq.com/posts/2022/05/rabbitmq-3.10-release-overview/) and provide [higher throughput and more stable latency](https://blog.rabbitmq.com/posts/2022/05/rabbitmq-3.10-performance-improvements/) compared to mirrored classic queues. 
Please [migrate from Mirrored Classic Queues to Quorum Queues](./migrate-mcq-to-qq) now. -[Streams](./streams) is a messaging data structure available as of [RabbitMQ 3.9](./changelog), -and is also replicated. +[Streams](./streams) is an [alternative messaging data structure](./blog/2021/07/13/rabbitmq-streams-overview) supported by RabbitMQ. +Just like quorum queues, streams are replicated. Quorum queues should be the **default choice** for a replicated queue type. Classic queue mirroring will be **removed in a future version** of RabbitMQ: @@ -44,7 +44,7 @@ using policies before changing the default version in the configuration. Topics covered in this guide include - * [Next generation replicated queue type](#interstitial) and why it should be preferred over classic queue mirroring. + * [Next generation replicated queues and streams](#interstitial), and why they should be preferred over classic queue mirroring * What is [classic queue mirroring](#what-is-mirroring) and how it works * How to [enable it](#ways-to-configure) * What [mirroring settings are available](#mirroring-arguments) @@ -62,7 +62,8 @@ This guide assumes general familiarity with [RabbitMQ clustering](./clustering). ## What is Queue Mirroring {#what-is-mirroring} **Important**: mirroring of classic queues will be **removed in a future version** of RabbitMQ. -Consider using [quorum queues](./quorum-queues) or a non-replicated classic queue instead. +Consider using [quorum queues](./quorum-queues), [streams](./streams), or a non-replicated classic queue +v2 instead. By default, contents of a queue within a RabbitMQ cluster are located on a single node (the node on which the queue was @@ -263,27 +264,10 @@ See [Runtime Parameters and Policies](./parameters#policies) to learn more. ## Queue Leader Replicas, Leader Migration, Data Locality {#leader-migration-data-locality} + ### Queue Leader Location {#queue-leader-location} -Every queue in RabbitMQ has a primary replica. 
That replica is called -_queue leader_ (originally "queue master"). All queue operations go through the leader -replica first and then are replicated to followers (mirrors). This is necessary to -guarantee FIFO ordering of messages. - -To avoid some nodes in a cluster hosting the majority of queue leader -replicas and thus handling most of the load, queue leaders should -be reasonably evenly distributed across cluster nodes. - -Queue leaders can be distributed between nodes using several -strategies. Which strategy is used is controlled in three ways, -namely, using the `x-queue-master-locator` [optional queue argument](./queues#optional-arguments), setting the `queue-master-locator` -policy key or by defining the `queue_master_locator` -key in [`the configuration file`](./configure#configuration-files). Here are the possible strategies and how to set them: - - * Pick the node hosting the minimum number of leaders of the same queue type: - `balanced` - * Pick the node the client that declares the queue is - connected to: `client-local` +This section has been moved to the [Clustering guide](./clustering#replica-placement). ### "nodes" Policy and Migrating Leaders {#fixed-leader-promotion} From 7fc41b8a691de167a48a663736dde15711634867 Mon Sep 17 00:00:00 2001 From: Michael Klishin Date: Mon, 4 Mar 2024 14:33:34 -0500 Subject: [PATCH 3/3] Fix deployment build --- docs/confirms.md | 4 ++-- docs/ha.md | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/confirms.md b/docs/confirms.md index bdd7d8d8c6..0a08e4ef78 100644 --- a/docs/confirms.md +++ b/docs/confirms.md @@ -72,7 +72,7 @@ are very similar features that solve similar problems in different contexts: The two features, however, are entirely orthogonal and unaware of each other. **Publisher confirms are not aware of consumers**: they only cover publisher's interactions -with node it is connected to, and the queue (or [stream](/streams)) leader replica. 
+with node it is connected to, and the queue (or [stream](./streams)) leader replica. **Consumer acknowledgements are not aware of publishers**: their goal is to confirm to a RabbitMQ node that a given delivery was successfully received and processed successfully, @@ -80,7 +80,7 @@ so the delivered message can be marked for future deletion. Sometimes publishing and consuming applications need to communicate via requests and responses that need an explicit acknowledgement from the peer. [RabbitMQ tutorial #6](/tutorials) -demonstrates the basics of how that's done, and [Direct Reply-to](/direct-reply-to) provides +demonstrates the basics of how that's done, and [Direct Reply-to](./direct-reply-to) provides a way to do it without declaring a lot of short-lived temporary response queues. This type of communication, however, is not covered in this guide, and is mentioned only to diff --git a/docs/ha.md b/docs/ha.md index 360fd58c27..32a9a107dd 100644 --- a/docs/ha.md +++ b/docs/ha.md @@ -27,7 +27,7 @@ This guide covers a [**deprecated and scheduled for removal feature**](https://b Quorum queues are a more advanced queue type, which offers high availability using replication and focuses on data safety. Quorum queues [support message TTL](https://blog.rabbitmq.com/posts/2022/05/rabbitmq-3.10-release-overview/) and provide [higher throughput and more stable latency](https://blog.rabbitmq.com/posts/2022/05/rabbitmq-3.10-performance-improvements/) compared to mirrored classic queues. Please [migrate from Mirrored Classic Queues to Quorum Queues](./migrate-mcq-to-qq) now. -[Streams](./streams) is an [alternative messaging data structure](./blog/2021/07/13/rabbitmq-streams-overview) supported by RabbitMQ. +[Streams](./streams) is an [alternative messaging data structure](https://blog.rabbitmq.com/blog/2021/07/13/rabbitmq-streams-overview) supported by RabbitMQ. Just like quorum queues, streams are replicated. 
Quorum queues should be the **default choice** for a replicated queue type.
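
As an illustration of the `balanced` and `client-local` strategies described in the Clustering section added by PATCH 2/3, here is a minimal Python sketch. It is not RabbitMQ's implementation: the helper name `suggest_leader_node`, the `QUEUE_COUNT_THRESHOLD` constant (taken from the "say, no more than 1000" wording), and the "node with the fewest replicas wins" reading of `balanced` are assumptions made for illustration only.

```python
import random

# Assumption: mirrors the "say, no more than 1000" wording in the patch;
# this is not an actual RabbitMQ constant.
QUEUE_COUNT_THRESHOLD = 1000


def suggest_leader_node(strategy, replicas_per_node, client_node, queue_count, rng=random):
    """Suggest a node for a new leader replica (hypothetical helper).

    replicas_per_node maps a node name to the number of replicas it hosts.
    client_node is the node the declaring client is connected to.
    queue_count is the total number of queues in the cluster.
    """
    if strategy == "client-local":
        # Always pick the node the client is connected to.
        return client_node
    if strategy == "balanced":
        nodes = sorted(replicas_per_node)  # sorted for deterministic tie-breaking
        if queue_count <= QUEUE_COUNT_THRESHOLD:
            # Few queues: use the per-node replica counts.
            return min(nodes, key=lambda node: replicas_per_node[node])
        # Many queues: fall back to the cheaper random pick.
        return rng.choice(nodes)
    raise ValueError(f"unknown strategy: {strategy!r}")


if __name__ == "__main__":
    counts = {"rabbit@rabbit1": 3, "rabbit@rabbit2": 1, "rabbit@rabbit3": 2}
    print(suggest_leader_node("balanced", counts, "rabbit@rabbit1", queue_count=6))
    print(suggest_leader_node("client-local", counts, "rabbit@rabbit1", queue_count=6))
```

For quorum queues and streams the returned node would only be a suggestion, since the patch notes that Raft leader election may still place the leader on a different replica.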