0MQ looks very interesting, but most of the information available seems to be "how to use it". I can't find detailed specifications of its behavior, nor a detailed description of the implementation that I could use to figure that out myself. (Short of wading through the source code.) So it's hard to understand exactly how load balancing works, exactly what happens to in-flight messages on a server crash, exactly what guarantees are provided for message ordering (e.g. in a complex multi-participant scenario), at-least-once vs. at-most-once message delivery, etc.
Does anyone know of a good, detailed explanation of how 0MQ is implemented?
You're right, the exact protocol and many other important details are undocumented. As I understand it, the "receiver down" scenario has the rather nice property of guaranteeing neither that messages will be sent nor that they won't be.
ZeroMQ is amazing. The last few months I have been using it for all kinds of things -- from creating multiprocessing Python apps to replacing an HTTP REST backend with a high-performance ZeroMQ server.
ZeroMQ is more like an interesting sockets library. It's really not very much like AMQP or XMPP.
The "Zero" in ZeroMQ means there is no broker (server) between the publisher and the subscribers. Instead what happens is that each process runs a separate thread which deals exclusively with ZeroMQ messaging.
Another effect of having no broker is there is no disk persistence at all, so unless you take your own steps you will lose messages if any component crashes.
Last time I looked into zmq, it was not recommended for use in public internet services -- not resilient enough against a potential DoS attack or buggy client implementation. Has this changed?
Through versions 2.x, there were assert()s peppered throughout the code that could be triggered if malformed data was sent over the wire, making it unsuitable for internet exposure. There was also behavior in durable sockets where messages delivered to an identity that had disappeared would accumulate in memory, effectively manifesting as a memory leak.
3.x removes the asserts so you can't segfault remotely. 3.x also transparently drops messages on the floor destined for unroutable identities. This silent dropping sounds annoying, but you can work around it via a well-designed application protocol. Remember, 0MQ is effectively a transport layer, not a full-blown messaging system.
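To make the "well-designed application protocol" point a bit more concrete, here is a rough sketch of one possible approach: the sender stamps every message with a sequence number and the receiver notices gaps. Everything here (`frame`, `GapDetector`, the 8-byte header) is hypothetical and made up for illustration; none of it is 0MQ API, and a real protocol would also need some way to request retransmission.

```python
import struct

# Hypothetical wire format: 8-byte big-endian sequence number, then the payload.
HEADER = struct.Struct("!Q")


def frame(seq, payload):
    """Prefix a payload (bytes) with its sequence number."""
    return HEADER.pack(seq) + payload


class GapDetector:
    """Receiver-side bookkeeping: tracks which sequence numbers never arrived."""

    def __init__(self):
        self.expected = 0   # next sequence number we expect to see
        self.missing = []   # sequence numbers skipped so far

    def accept(self, blob):
        (seq,) = HEADER.unpack_from(blob)
        payload = blob[HEADER.size:]
        if seq >= self.expected:
            # Anything between expected and seq was dropped (or is late).
            self.missing.extend(range(self.expected, seq))
            self.expected = seq + 1
        elif seq in self.missing:
            # A late arrival fills a previously recorded gap.
            self.missing.remove(seq)
        return seq, payload
```

The receiver can then decide what to do with `missing`: ask for a resend, or just log it if the data is disposable.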
If I recall, the issue with a public 0mq socket was that anyone driving by sending something 0mq didn't like would cause an assertion to fail, so your process would crash. It wasn't so much a "supposed to be exposed" as "expose it and crash".
Yes, you are right. I guess what I wanted to say is that 0MQ relies on trusted code and a trusted network. It's okay for code to assert, as long as there is someone to fix it, and it's probably expected. (Crash early, fix early.)
My favorite - "I recall the hot summer in remote Bulgarian village, almost on the Turkish border, with car broken and no way to get home, where I devised the first version of ØMQ wire protocol. Thanks to the guy who fixed the car! Without him I would have been herding donkeys today. There would be no ØMQ."
Well, if "CERN is considering ØMQ as a control infrastructure for Large Hadron Collider." and they manage petabytes of data, also "There are trading firms using ØMQ as a secret weapon in low-latency arms race." I think it could be well suited for machine learning.
It can be used to stitch together low-latency, high-throughput distributed systems. Think of it as transport Lego blocks: it lets you connect different parts of your system. It supports a number of common messaging patterns: request-response, publish-subscribe, etc.
But it stops there (and that is the beauty of it). It doesn't prescribe or make assumptions about the types of messages that are sent. Those are just byte blobs as far as it is concerned. You can choose to represent your messages as JSON, protobuf, XML, or raw video frames.
So can you use it for HPC? It is hard to tell. It is like asking if a hammer can be used to fix a car. Yeah, it can, in some cases, but it is just a good versatile tool that can be used for more than that.
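A small sketch of what "just byte blobs" means in practice: the application picks an encoding, turns a message into `bytes` before sending, and decodes it on the other side. The helper names and the two formats below (JSON and a fixed binary layout for a pair of floats) are my own illustration, using only the Python standard library; the transport never looks inside.

```python
import json
import struct


def encode_json(obj):
    """Serialize a JSON-compatible object to a byte blob."""
    return json.dumps(obj, sort_keys=True).encode("utf-8")


def decode_json(blob):
    """Inverse of encode_json."""
    return json.loads(blob.decode("utf-8"))


# Hypothetical fixed-size binary message: two big-endian doubles (x, y).
POINT = struct.Struct("!dd")


def encode_point(x, y):
    """Pack a 2D point into exactly 16 bytes."""
    return POINT.pack(x, y)


def decode_point(blob):
    """Unpack a 16-byte blob back into an (x, y) tuple."""
    return POINT.unpack(blob)
```

Either blob could be handed to a socket's send call as-is; which format to use is purely an application decision.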
That depends more on whether your algorithm is easily distributed. If it is, zeromq will be good for you.
I am currently using it to transfer model parameters learned on machine #1 to machine #2 where real time prediction takes place. The predictions are then sent out via 0mq to a robot controller.
The great thing about building systems with 0mq is that you can just add another component without having to think too much about 'but then I will have to implement a protocol for it, and HTTP will probably be too slow, and the request/reply pattern doesn't really fit there, blabla'.
Looks very interesting! I cannot find any details about its release though, even though the article talks about September 19. Will it be open source? proprietary?
I used it to build some simple topologies to parallelize collecting statistics over huge data sets. I never got around to doing dynamic programming using a 0mq topology, but I wrote up some fun plans for it.
Congrats! I've been using 0MQ for a few years on a distributed db, and now it's hard to architect distributed/threaded apps without the 'message passing' design that 0mq pushes for. Plus the mailing list is pretty helpful and friendly.
0MQ is a product; AMQP and STOMP are protocols. In addition, both AMQP and STOMP assume a classic star topology with a messaging server in the middle, whereas 0MQ allows you to connect your applications directly and create arbitrary topologies.
dotCloud (http://dotcloud.com) supports binding arbitrary TCP ports to your application - and you can easily drop your favorite client implementation regardless of language.
However, many services won't let you make full use of it; e.g. last year when I tried to get multicast working between my EC2 nodes, I discovered it was unsupported.
If you're only using ZeroMQ in tcp or inproc mode, it is nice, but in a pub/sub scenario you are wasting a lot of bandwidth and CPU time.
If you're not using multicast in pub/sub, you're establishing individual TCP connections between nodes, which is no different from doing N point-to-point communications. With multicast, the data is only copied when needed, so it fans out where that makes sense for your network topology. I think if you look at the picture it'll be clear.
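A back-of-the-envelope version of that argument, as a sketch. It assumes the publisher sends one full copy per subscriber over TCP and a single copy over multicast, and it ignores protocol overhead, retransmissions, and any subscriber-side filtering; the function name is made up for illustration.

```python
def publisher_bytes_out(message_size, subscribers, multicast):
    """Bytes the publisher itself must put on the wire for one message.

    Simplifying assumptions: one copy per subscriber over TCP, one copy
    total over multicast, no protocol overhead.
    """
    if multicast:
        return message_size
    return message_size * subscribers
```

With a 1000-byte message and 50 subscribers, the TCP case costs the publisher 50 times what the multicast case does under these assumptions.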
http://www.zeromq.org/intro:read-the-manual
Watch the first video by Ian Barber. He's an excellent presenter.
The guide ( http://zguide.zeromq.org/page:all ) is very lengthy and comprehensive, but the "hello world" example and the "Divide and Conquer" example (fan out jobs, work, fan in results) will get you up and running and start showing you a sliver of the amazing power of 0MQ.
They've even got a lot of their examples in many languages:
C++ | C# | Clojure | CL | Erlang | F# | Haskell | Haxe | Java | Lua | Node.js | Perl | PHP | Python | Ruby | Scala | Ada | Basic | Go | Objective-C | ooc
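For a taste of what the "hello world" pattern looks like, here is a minimal request-reply sketch in Python. It assumes the pyzmq binding is installed, and it runs client and server in one process over the `inproc` transport just to stay self-contained; the function name, endpoint name, and round count are arbitrary choices of mine, and this is not the guide's own listing.

```python
import threading

import zmq  # assumption: the pyzmq binding is installed


def hello_world(endpoint="inproc://hello", rounds=3):
    """Send "Hello" `rounds` times and collect the "World" replies."""
    ctx = zmq.Context()

    # Bind the reply socket first; inproc endpoints share one context.
    rep = ctx.socket(zmq.REP)
    rep.bind(endpoint)

    def server():
        # The socket is handed to this thread and not touched by the
        # main thread again until after join().
        for _ in range(rounds):
            rep.recv()            # wait for a request
            rep.send(b"World")    # REP must answer before the next recv

    worker = threading.Thread(target=server)
    worker.start()

    req = ctx.socket(zmq.REQ)
    req.connect(endpoint)
    replies = []
    for _ in range(rounds):
        req.send(b"Hello")        # REQ must send, then recv, in lockstep
        replies.append(req.recv())

    worker.join()
    req.close()
    rep.close()
    ctx.term()
    return replies
```

Swapping the endpoint for something like `tcp://...` and running the two halves in separate processes gives the usual two-program version.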