Wednesday, February 9, 2011

What is RabbitMQ

General:
  • Some reading on clustering http://www.rabbitmq.com/clustering.html
  • DNS errors cause the DB (Mnesia) to crash
  • A RabbitMQ instance won’t scale to LOTS of queues with each queue having fair load, since queue metadata for every queue is kept in memory. Also, in a clustered setup, each queue’s metadata (but not the queue’s messages) is replicated on each node. Hence, every node in a cluster carries the same amount of queue overhead
  • No ONCE-ONLY semantics. RabbitMQ may deliver the same message to the consumer(s) twice
  • Multiple consumers can be configured for a single queue; each message goes to only one of them, so the consumers see mutually exclusive sets of messages
  • Unordered; FIFO delivery is not guaranteed (a re-delivered message, for example, can arrive out of order)
  • Single socket, multiple channels. Each connection (one socket) can have multiple channels and each channel can have multiple consumers
  • No provision for ETA (delivering a message no earlier than a given time)
  • Maybe auto-requeue (based on timeout); needs investigation
  • Only closing the connection NACKs a message. Removing the consumer from that channel does NOT. Hence, to NACK a message the current consumer has to give up every queue it is listening to on that channel/connection
  • NO EXPONENTIAL BACKOFF for failed consumers. Failed messages are retried almost immediately. Hence, a bug in the consumer logic that crashes the consumer on a particular message can block the whole queue, so the consumer needs to be programmed well and error free. However, apps are, well, apps…
  • Consumer has to do rate limiting by not consuming messages too fast (if it wants to); no provision for this in RabbitMQ
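
Since duplicates and immediate retries are left to the consumer, here is a rough sketch of what consumer-side protection could look like. It is plain Python with no RabbitMQ client involved; `SafeConsumer`, `backoff_delay` and the idea that every message carries a unique `message_id` are assumptions made for illustration, not anything RabbitMQ provides.

```python
import random


def backoff_delay(attempt, base=0.5, cap=60.0, jitter=0.0):
    """Seconds to wait before retry number `attempt` (1-based).

    The delay doubles on each attempt and is capped at `cap`. With a
    non-zero `jitter`, up to `jitter * delay` of random extra wait is added.
    """
    delay = min(cap, base * (2 ** (attempt - 1)))
    return delay + random.uniform(0, jitter * delay)


class SafeConsumer:
    """Wraps a handler with de-duplication and retry bookkeeping (hypothetical)."""

    def __init__(self, handler, max_attempts=5):
        self.handler = handler
        self.max_attempts = max_attempts
        self.seen_ids = set()   # IDs of messages already processed successfully
        self.attempts = {}      # message ID -> number of failed attempts so far

    def on_message(self, message_id, body):
        """Return ("ack", 0.0), ("retry", delay_seconds) or ("drop", 0.0)."""
        if message_id in self.seen_ids:
            # Duplicate delivery: acknowledge it without running the handler again.
            return ("ack", 0.0)
        try:
            self.handler(body)
        except Exception:
            n = self.attempts.get(message_id, 0) + 1
            self.attempts[message_id] = n
            if n >= self.max_attempts:
                # Give up, so that one bad message cannot block the queue forever.
                return ("drop", 0.0)
            return ("retry", backoff_delay(n))
        self.seen_ids.add(message_id)
        self.attempts.pop(message_id, None)
        return ("ack", 0.0)
```

The caller decides what "retry" and "drop" mean in practice (sleep and re-process, move the message to an error queue, and so on). A real consumer would also need to bound or persist `seen_ids`, which grows without limit here.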
Persistence:
  • It will use only its own DB; you can’t configure MySQL or any such thing
Clustering and Replication:
  • A RabbitMQ cluster is just a set of nodes running RabbitMQ. No master node is involved.
  • You need to specify the hostnames of the nodes in a cluster manually, on the command line or in a config file.
  • Nodes in a cluster do basic load balancing by redirecting requests to other nodes
  • A node can be a RAM node or a disk node. RAM nodes keep their state only in memory (with the exception of the persistent contents of durable queues, which are still stored safely on disk). Disk nodes keep state in memory and on disk.
  • Queue metadata is shared across all nodes.
  • RabbitMQ brokers tolerate the failure of individual nodes. Nodes can be started and stopped at will
  • It is advisable to have at least one disk node in a cluster
  • You need to specify which nodes are part of a cluster during node startup. Hence, when A is the first one to start, it will think that it is the only one in the cluster. When B is started, it will be told that A is also in the cluster, and when C starts, it should be told that BOTH A and B are part of the cluster. That way, if either A or B goes down, C still knows one live machine in the cluster. This is only required for RAM nodes, since they don’t persist metadata on disk. So, if C is a RAM node and it goes down and comes back up, it will have to be manually told which nodes to query for cluster membership (since it doesn’t store that state locally).
  • Replication needs to be investigated (check Addtl Resources); however, from initial reading, it seems queue data replication does not exist
  • FAQ: “How do you migrate an instance of RabbitMQ to another machine?” Seems to be a very manual process.
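
The startup rule described above (each node is told about every node started before it) can be sketched as a toy model. This is plain Python, not RabbitMQ configuration; `startup_peers` and `can_rejoin` are hypothetical helper names, and the node names A, B and C are the ones used in these notes.

```python
def startup_peers(start_order):
    """Map each node to the nodes it should be told about when it starts.

    The first node knows nobody; every later node is told about all the
    nodes that were started before it.
    """
    return {node: list(start_order[:i]) for i, node in enumerate(start_order)}


def can_rejoin(known_peers, running):
    """A restarting RAM node can rejoin if any peer it was told about is up."""
    return any(peer in running for peer in known_peers)


peers = startup_peers(["A", "B", "C"])
print(peers["C"])                     # the nodes C is told about at startup
print(can_rejoin(peers["C"], {"B"}))  # A is down, B is still running
```

Telling C about both A and B is what keeps the second call true when A is down. If C had only been told about A, it would have nobody left to ask for cluster membership.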
Transactions:
  • Any number of queues can be involved in a transaction
Addtl Resources