
dataflow notes from zeromq

Philip (flip) Kromer edited this page May 9, 2012 · 1 revision

wukong/dataflow -- Notes from ZeroMQ

ZeroMQ Guide

note: this file alone is GPL; do not copy/paste from it to other documents or code.

The built-in core ØMQ patterns are:

  • Request-reply -- connects a set of clients to a set of services. This is a remote procedure call and task distribution pattern.
  • Publish-subscribe -- connects a set of publishers to a set of subscribers. This is a data distribution pattern.
  • Pipeline -- connects nodes in a fan-out / fan-in pattern that can have multiple steps, and loops. This is a parallel task distribution and collection pattern.

There's one more pattern that people tend to try to use when they still think of ØMQ in terms of traditional TCP sockets:

  • Exclusive pair -- connects two sockets in an exclusive pair. This is a low-level pattern for specific, advanced use-cases. We'll see an example at the end of this chapter.
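The fan-out / fan-in shape of the pipeline pattern can be sketched without ØMQ at all, using plain Python queues in place of PUSH/PULL sockets. Everything here (names, the squaring "work") is illustrative, not part of the ØMQ API:

```python
import queue
import threading

tasks = queue.Queue()    # ventilator fans work out to workers (like PUSH)
results = queue.Queue()  # workers fan results in to a sink (like PULL)

def worker():
    while True:
        n = tasks.get()
        if n is None:          # sentinel: no more work for this worker
            break
        results.put(n * n)     # the "work" is just squaring the number

# Ventilator: fan ten tasks out across three workers.
threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
for n in range(10):
    tasks.put(n)
for _ in threads:
    tasks.put(None)
for t in threads:
    t.join()

# Sink: collect the fanned-in results (order is nondeterministic, so sort).
collected = sorted(results.get() for _ in range(10))
print(collected)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

With real sockets the ventilator, workers, and sink would be separate processes connected by PUSH/PULL pairs, and the pattern could have multiple steps.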

These are the socket combinations that are valid for a connect-bind pair (either side can bind):

  • PUB and SUB
  • REQ and REP
  • REQ and ROUTER
  • DEALER and REP
  • DEALER and ROUTER
  • DEALER and DEALER
  • ROUTER and ROUTER
  • PUSH and PULL
  • PAIR and PAIR

Any other combination will produce undocumented and unreliable results and future versions of ØMQ will probably return errors if you try them. You can and will of course bridge other socket types via code, i.e. read from one socket type and write to another.
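The valid pairings above can be captured as a small lookup table (socket-type names only; this helper is a sketch, not part of the ØMQ API):

```python
# Each valid connect-bind pairing from the list above, as an unordered pair.
VALID_PAIRS = {
    frozenset({"PUB", "SUB"}),
    frozenset({"REQ", "REP"}),
    frozenset({"REQ", "ROUTER"}),
    frozenset({"DEALER", "REP"}),
    frozenset({"DEALER", "ROUTER"}),
    frozenset({"DEALER"}),   # DEALER and DEALER
    frozenset({"ROUTER"}),   # ROUTER and ROUTER
    frozenset({"PUSH", "PULL"}),
    frozenset({"PAIR"}),     # PAIR and PAIR
}

def can_connect(a, b):
    """True if socket types a and b form a documented pairing."""
    return frozenset({a, b}) in VALID_PAIRS

print(can_connect("PUB", "SUB"))        # True
print(can_connect("DEALER", "DEALER"))  # True
print(can_connect("REQ", "PULL"))       # False
```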

topologies

http://www.digistan.org/spec:1/COSS

  • ZMQ Queue is a simple forwarding device for request-reply messaging; it lets you aggregate requests from many requesters and distribute them to one or more reply sockets, acting as a chokepoint.

  • ZMQ Streamer serves a similar function, but is designed to handle the pipeline pattern instead.

  • ZMQ Forwarder allows you to aggregate data from multiple publishers, and distribute these messages via a fanout to all the connected subscribers.

  • Majordomo Protocol (MDP) defines a reliable service-oriented request-reply dialog between a set of client applications, a broker and a set of worker applications. MDP covers presence, heartbeating, and service-oriented request-reply processing. MDP is an evolution of 6/PPP, adding name-based service resolution and more structured protocol commands. The goals of MDP are to:

    • Allow requests to be routed to workers on the basis of abstract service names.
    • Allow both peers to detect disconnection of the other peer, through the use of heartbeating.
    • Allow the broker to implement a "least recently used" pattern for task distribution to workers for a given service.
    • Allow the broker to recover from dead or disconnected workers by resending requests to other workers.
    [Majordomo Protocol diagram]
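The "structured protocol commands" are multipart messages. As a sketch, an MDP client request follows the frame layout given in the 7/MDP spec (an empty delimiter frame, then the "MDPC01" protocol header); treat the details here as illustrative rather than a normative encoding:

```python
def mdp_client_request(service, body):
    """Build the multipart frames a client sends to an MDP broker."""
    return [
        b"",        # empty delimiter frame (added automatically by REQ sockets)
        b"MDPC01",  # MDP/Client protocol identifier
        service,    # service name the broker routes on, e.g. b"echo"
        body,       # request body (one or more frames in general)
    ]

frames = mdp_client_request(b"echo", b"hello")
print(frames)  # [b'', b'MDPC01', b'echo', b'hello']
```

Name-based service resolution is just the broker reading the service frame and picking a worker registered under that name.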
  • Paranoid Pirate Protocol (PPP) defines a reliable request-reply dialog between a client (or a queue acting for clients) and a worker peer. PPP covers presence, heartbeating, and request-reply processing. The goals of PPP are to:

    • Allow both peers to detect disconnection of the other peer, through the use of heartbeating.
    • Allow the client to implement a "least recently used" pattern for task distribution to workers.
    • Allow the client to recover from dead or disconnected workers by resending requests to other workers.
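The "least recently used" distribution in the goals above is simple bookkeeping: idle workers join the back of a queue as they signal ready, and work is always handed to the worker that has been idle longest. A socket-free sketch (worker names are illustrative):

```python
from collections import deque

idle = deque()  # workers waiting for work, oldest first

def worker_ready(worker_id):
    idle.append(worker_id)   # most recently used goes to the back

def dispatch():
    return idle.popleft()    # least recently used comes off the front

for w in ("w1", "w2", "w3"):
    worker_ready(w)
print(dispatch())   # w1 -- idle the longest
worker_ready("w1")  # w1 signals ready again
print(dispatch())   # w2
```

Combined with heartbeating, a worker that stops signalling ready simply never re-enters the queue, so its work is resent elsewhere.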
  • Majordomo Management Interface (MMI) defines a namespace and set of management services that MDP brokers may provide. MMI is layered on top of the 7/MDP protocol. The goals of MMI are to:

    • Define a namespace for management services provided by a MDP broker to MDP client applications.
    • Define a default set of management services that MMI-compatible brokers SHOULD implement.
  • Titanic Service Protocol (TSP) defines a set of services, requests, and replies that implement the Titanic pattern for disconnected persistent messaging across a network of arbitrarily connected clients and workers. The Titanic pattern is developed in Chapter 4 of the Guide[4] as a simple design for disk-based reliable messaging. Titanic allows clients and workers to work without being connected to the network at the same time, and defines handshaking for safe storage of requests and retrieval of replies. Titanic is a layer built on top of the Majordomo Protocol (7/MDP): TSP clients use MDP/Client to talk to an MDP broker, and Titanic requires no modifications to workers, which use the MDP/Worker protocol to speak to an MDP broker. The Titanic pattern places the persistence outside the broker, as a proxy service that looks like a worker to clients, and a client to workers.

    [Titanic pattern diagram]

  • Freelance Protocol (FLP) defines brokerless reliable request-reply dialogs across an N-to-N network of clients and servers. It connects a (normally large) set of clients with a (normally small) set of servers, each capable of replacing the others. Though clients may prioritise servers (primary, secondary, etc.) this is irrelevant to FLP. In the Freelance pattern, clients connect to servers, and address them explicitly. Servers can only reply to clients that have first sent a command.

  • Message Transfer Layer (MTL) is a connection-oriented protocol that supports broker-based messaging. MTL connects a set of clients with a central message broker, allowing clients to issue commands to the broker, send messages to the broker, and receive messages back from the broker. It separates all client-server activity into two flows: a synchronous req/resp low-volume control flow and an asynchronous bidirectional high-volume data flow. The main goals of MTL are to provide:

    • support for arbitrary messaging semantics based on extensible profiles.
    • a simple synchronous flow for commands from the client to the server.
    • a fast asynchronous flow for data from the client to or from the server.
    • authentication of clients using SASL[4].
    • safe forwards and backwards compatibility.
    • detection of errors such as dead or blocked peers.
    • robust error handling.
  • Worker-Manager Protocol is a generalization of the request-reply pattern, allowing many workers to talk to many managers (servers) through intermediate devices and custom load-balancing.

  • Clustered Hashmap Protocol (CHP) defines a cluster-wide key-value hashmap, and mechanisms for sharing this across a set of clients. CHP allows clients to work with subtrees of the hashmap, to update values, and to define ephemeral values. CHP originated from the Clone pattern defined in Chapter 5 of the Guide.
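The "subtrees" in CHP are just key prefixes: a client mirrors only the keys under a prefix it has asked for. A sketch of that mechanism (key names here are illustrative, not from the spec):

```python
def in_subtree(key, subtree):
    """A key belongs to a subtree when the subtree is a prefix of the key."""
    return key.startswith(subtree)

# A stream of cluster-wide updates; this client only wants /client/ keys.
updates = [("/client/alpha", "1"), ("/server/beta", "2"), ("/client/gamma", "3")]
mirror = {k: v for k, v in updates if in_subtree(k, "/client/")}
print(mirror)  # {'/client/alpha': '1', '/client/gamma': '3'}
```

Ephemeral values add a time-to-live on top of this: a key whose owner stops refreshing it is dropped from the map.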

  • Size-Prefixed Blob format (SPB) is a portable optimal wire-framing format for opaque blobs of data carried over streaming protocols such as TCP/IP.
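The general idea of SPB — each blob preceded by its size, so a reader can recover blob boundaries from an undifferentiated byte stream — can be sketched as follows. The 4-byte big-endian prefix here is an assumption for illustration; the actual wire encoding is defined by the SPB spec:

```python
import struct

def frame(blob):
    """Prefix a blob with its length (assumed 4-byte big-endian here)."""
    return struct.pack(">I", len(blob)) + blob

def unframe(stream):
    """Split a byte stream back into the blobs it was framed from."""
    blobs, i = [], 0
    while i < len(stream):
        (size,) = struct.unpack_from(">I", stream, i)
        i += 4
        blobs.append(stream[i:i + size])
        i += size
    return blobs

wire = frame(b"hello") + frame(b"") + frame(b"world")
print(unframe(wire))  # [b'hello', b'', b'world']
```

The payoff is that blobs stay opaque: TCP can fragment the stream anywhere, yet the reader always recovers the original blob boundaries.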

  • ZeroMQ Message Transport Protocol (ZMTP) is a transport layer protocol for exchanging messages between two peers over a connected transport layer such as TCP. ZMTP consists of these layers:

    • A framing layer that imposes a size-prefixed regularity on the underlying transport.
    • A connection layer that allows two peers to exchange messages.
    • A content layer that defines how application data is formatted, according to the socket type.
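The framing layer's "size-prefixed regularity" can be sketched in the style of ZMTP/1.0: each frame carries a length octet (counting the flags octet plus the body) and a flags octet whose low bit marks "more frames follow", which is how multipart messages hang together. This sketch omits the long-frame escape and should be read as illustrative, not as a conforming encoder:

```python
MORE = 0x01  # flags bit: another frame of the same message follows

def zmtp_frame(body, more=False):
    """Encode one short frame: length octet, flags octet, body."""
    assert len(body) + 1 < 255  # short frames only in this sketch
    return bytes([len(body) + 1, MORE if more else 0]) + body

# A two-part message: first frame flagged MORE, final frame not.
wire = zmtp_frame(b"key", more=True) + zmtp_frame(b"value")
print(wire)  # b'\x04\x01key\x06\x00value'
```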
  • ZeroMQ Property Language (ZPL) is an ASCII text format that uses whitespace - line endings and indentation - for framing and hierarchy. ZPL data consists of a series of properties encoded as name/value pairs, one per line, where the name may be structured, and where the value is an untyped string.
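A minimal reader shows how little machinery ZPL needs: "name = value" lines, with 4-space indentation nesting subtrees, and a name alone opening a subtree. Comments and the edge cases of the full ZPL spec are ignored in this sketch, and the sample document is made up:

```python
def parse_zpl(text):
    """Parse a ZPL-style document into nested dicts of string values."""
    root = {}
    stack = [(-1, root)]  # (indent level, container) pairs
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        indent = len(line) - len(line.lstrip())
        while indent <= stack[-1][0]:
            stack.pop()  # close subtrees deeper than this line
        name, _, value = line.strip().partition("=")
        name, value = name.strip(), value.strip()
        if value:
            stack[-1][1][name] = value      # leaf property
        else:
            child = {}
            stack[-1][1][name] = child      # bare name opens a subtree
            stack.append((indent, child))
    return root

doc = """context
    iothreads = 1
main
    type = zmq_queue
"""
tree = parse_zpl(doc)
print(tree)  # {'context': {'iothreads': '1'}, 'main': {'type': 'zmq_queue'}}
```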

transports

  • in-process (inproc)
  • inter-process (ipc)
  • TCP (tcp)
  • multicast (pgm/epgm)