ØMQ: Mission Accomplished (250bpm.com)
256 points by nephics on Nov 10, 2011 | 44 comments



I think 0MQ is brilliant. If this is your first time looking at 0MQ, I'd suggest the following:

http://www.zeromq.org/intro:read-the-manual

Watch the first video by Ian Barber. He's an excellent presenter.

The guide ( http://zguide.zeromq.org/page:all ) is very lengthy and comprehensive, but the "hello world" example and the "Divide and Conquer" example (fan out jobs, work, fan in results) will get you up and running and start showing you a sliver of the amazing power of 0MQ.

They've even got a lot of their examples in many languages:

C++ | C# | Clojure | CL | Erlang | F# | Haskell | Haxe | Java | Lua | Node.js | Perl | PHP | Python | Ruby | Scala | Ada | Basic | Go | Objective-C | ooc
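For anyone who wants the shape of the "Divide and Conquer" example before opening the guide, here is a minimal sketch of fan out / work / fan in. It does not use 0MQ at all: standard-library queues and threads stand in for the guide's sockets and processes, and the names `run_pipeline`, `jobs`, `results` and `STOP` are made up for illustration.

```python
import queue
import threading


def run_pipeline(tasks, n_workers=3):
    """Fan tasks out to workers, then fan the results back in to one collector."""
    tasks = list(tasks)
    jobs = queue.Queue()      # stand-in for the channel that distributes jobs
    results = queue.Queue()   # stand-in for the channel that collects results
    STOP = object()           # sentinel telling a worker to exit

    def worker():
        while True:
            item = jobs.get()
            if item is STOP:
                break
            results.put(item * item)   # the "work": square the number

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()

    for task in tasks:        # fan out
        jobs.put(task)
    for _ in threads:         # one sentinel per worker
        jobs.put(STOP)

    collected = [results.get() for _ in tasks]   # fan in: one result per task
    for t in threads:
        t.join()
    return sorted(collected)  # arrival order is not deterministic, so sort
```

Calling `run_pipeline(range(10))` returns the squares of 0 through 9. In the guide's version the three roles are separate programs connected by 0MQ sockets, so workers can be added or removed without touching the other two.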


Yes, these are good as well...

ZeroMQ: Modern & Fast Networking Stack, by Ilya Grigorik http://www.igvita.com/2010/09/03/zeromq-modern-fast-networki...

ZeroMQ an Introduction, by Nicholas Piël http://nichol.as/zeromq-an-introduction

Advanced Network Architectures with ZeroMQ, by Zed Shaw http://blip.tv/pycon-us-videos-2009-2010-2011/pycon-2011-adv...


0MQ looks very interesting, but most of the information available seems to be "how to use it". I can't find detailed specifications of its behavior, nor a detailed description of the implementation that I could use to figure that out myself. (Short of wading through the source code.) So it's hard to understand exactly how load balancing works, exactly what happens to in-flight messages on a server crash, exactly what guarantees are provided for message ordering (e.g. in a complex multi-participant scenario), at-least-once vs. at-most-once message delivery, etc.

Does anyone know of a good, detailed explanation of how 0MQ is implemented?


I found this, more than anything else, to help me get started with 0mq: http://api.zeromq.org/

A lot of the behavior is on a per-protocol and socket type basis, which is well documented here: http://api.zeromq.org/2-1:zmq-socket


The man reference is a good place if you want an API-centric view: http://api.zeromq.org/ The guide does show "how to use it", but chapter one gives a good overview of the capabilities: http://zguide.zeromq.org/page:all#Chapter-One-Basic-Stuff


As for in-flight messages and server crashes, some explanation is given in the zguide in the part about durable sockets.

And you're right that some explanation of implementation is needed. I spent a few hours digging through the code today and it's not easy to read.



You're right, the exact protocol and many other important details are undocumented. As I understand it, the "receiver down" scenario has the rather nice property of guaranteeing neither that messages will be sent nor that they won't be.


http://rfc.zeromq.org/spec:15 documents the protocol ZeroMQ uses. The previous version of the protocol (used by 2.1.x) can be found here: http://rfc.zeromq.org/spec:13


ZeroMQ is amazing. The last few months I have been using it for all kinds of things -- from creating multiprocessing Python apps to replacing an HTTP REST backend with a high-performance ZeroMQ server.


How does this compare to XMPP? Are there any benefits to learning ZeroMQ for web notifications over using, say, Strophe.js?


ZeroMQ is very low level, compared to XMPP. ZeroMQ considers itself as part of the networking stack: tcp/udp/0mq (or perhaps just above tcp and udp).

ZeroMQ and XMPP are not really competitors.


ZeroMQ is more like an interesting sockets library. It's really not very much like AMQP or XMPP.

The "Zero" in ZeroMQ means there is no broker (server) between the publisher and the subscribers. Instead what happens is that each process runs a separate thread which deals exclusively with ZeroMQ messaging.

Another effect of having no broker is there is no disk persistence at all, so unless you take your own steps you will lose messages if any component crashes.


0MQ allows you to interconnect components directly. There's no need for a messaging server.


Last time I looked into zmq, it was not recommended for use in public internet services -- not resilient enough against a potential DoS attack or buggy client implementation. Has this changed?


Through versions 2.x, there were assert()s peppered throughout the code that could be triggered if malformed data was sent over the wire, making it unsuitable for internet exposure. There was also behavior in durable sockets where messages delivered to an identity that had disappeared would accumulate in memory, effectively manifesting as a memory leak.

3.x removes the asserts so you can't crash the process remotely. 3.x also transparently drops messages on the floor destined for unroutable identities. This silent dropping sounds annoying, but you can work around it via a well-designed application protocol. Remember, 0MQ is effectively a transport layer, not a full-blown messaging system.


It has gotten much better. The latest stable version should be fairly resilient.


I don't think it's supposed to be exposed in this way; rather, it's very good for internal wiring between trusted systems.


If I recall, the issue with a public 0mq socket was that anyone driving by sending something 0mq didn't like would cause an assertion to fail, so your process would crash. It wasn't so much a "supposed to be exposed" as "expose it and crash".


Yes, you are right. I guess what I wanted to say is that 0MQ relies on trusted code and a trusted network. It's okay for code to assert, as long as there is someone to fix it, and it's probably expected. (Crash early, fix early.)


Wow, programming can really be romantic.


My favorite - "I recall the hot summer in remote Bulgarian village, almost on the Turkish border, with car broken and no way to get home, where I devised the first version of ØMQ wire protocol. Thanks to the guy who fixed the car! Without him I would have been herding donkeys today. There would be no ØMQ."


What HPC products use 0mq?

Would it be applicable to large-scale optimization problems, e.g. in context of machine learning?


Well, if "CERN is considering ØMQ as a control infrastructure for Large Hadron Collider." and they manage petabytes of data, also "There are trading firms using ØMQ as a secret weapon in low-latency arms race." I think it could be well suited for machine learning.


It can be used to stitch together low latency and high throughput distributed systems. Think of it as transport lego blocks: it lets you connect different parts of your system. It supports a number of common messaging patterns: request-response, publish-subscribe, etc.

But it stops there (and that is the beauty of it). It doesn't prescribe or make assumptions about the types of messages that are sent. Those are just byte blobs as far as it is concerned. You can choose to represent your messages as JSON, protobuf, XML, or raw video frames.
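To make the "byte blobs" point concrete, here is a small sketch of two ways an application might turn a message into bytes before handing it to the transport, and back again on the other side. Nothing here touches 0MQ; the function names and the binary layout (a 32-bit id followed by a 64-bit float) are assumptions chosen only for illustration.

```python
import json
import struct


def encode_json(payload):
    """Serialize a dict to UTF-8 JSON bytes."""
    return json.dumps(payload, sort_keys=True).encode("utf-8")


def decode_json(blob):
    """Inverse of encode_json."""
    return json.loads(blob.decode("utf-8"))


def encode_binary(sensor_id, value):
    """Pack an unsigned 32-bit id and a 64-bit float in network byte order."""
    return struct.pack("!Id", sensor_id, value)


def decode_binary(blob):
    """Inverse of encode_binary; returns a (sensor_id, value) tuple."""
    return struct.unpack("!Id", blob)
```

Either result is a plain `bytes` object, which is all the transport ever sees. Sender and receiver have to agree on the format themselves, since the message carries no type information unless you add it.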

So can you use it for HPC? It is hard to tell. It is like asking if a hammer can be used to fix a car. Yeah, it can, in some cases, but it is just a good versatile tool that can be used for more than that.


That depends more on whether your algorithm is easily distributed. If it is, zeromq will be good for you.

I am currently using it to transfer model parameters learned on machine #1 to machine #2 where real time prediction takes place. The predictions are then sent out via 0mq to a robot controller.

The great thing about building systems with 0mq is that you just add another component without having to think too much about 'but then I will have to implement a protocol for it and HTTP will probably be too slow and the request/reply pattern doesn't really fit there blabla'.



Looks very interesting! I cannot find any details about its release though, even though the article talks about September 19. Will it be open source? proprietary?




I used it to build some simple topologies to parallelize collecting statistics over huge data sets. I never got around to doing dynamic programming using a 0mq topology, but I wrote up some fun plans for it.


He's talking about 3.1, while publicly 3.0 is still in beta. Does anyone know when 3.1 will come out?


The community vote taken last week was to continue the 3.x series of releases with 3.1.x versions.


Congratz! I've been using 0MQ for a few years on a distributed db, and now it's hard to architect distributed/threaded apps without the 'message passing' design which 0mq is pushing for. Plus the mailing list is pretty helpful and friendly.


Anyone know of a good and recent ØMQ/AMQP/STOMP comparison, specifically with attention paid to the implementations available?


0MQ is a product, AMQP and STOMP are protocols. In addition to that both AMQP and STOMP assume classic star topology with a messaging server in the middle, whereas 0MQ allows you to connect your applications directly and create arbitrary topologies.


Not saying you're wrong, but I wouldn't phrase it this way.

I've only been using 0MQ as a library and AMQP in the form of RabbitMQ - so it can be quite the opposite.


Are there any hosting services out there that currently support using ZeroMQ?


dotCloud (http://dotcloud.com) supports binding arbitrary TCP ports to your application - and you can easily drop in your favorite client implementation regardless of language.


Any VPS host, or EC2, will let you run ZeroMQ in whatever way is appropriate to your problem.

edit: and same for if you run your own server or rent a server, obviously


However, many services won't let you make full use of it; e.g. last year when I tried to get multicast between my EC2 nodes, I discovered it was unsupported.

If you're only using ZeroMQ in tcp or inproc mode, it is nice, but in a pub/sub scenario you are wasting a lot of bandwidth and CPU time.


Can you explain why you're wasting a lot of bandwidth/CPU time in pub/sub?


If you're not using multicast in pub/sub, you're establishing individual TCP connections between nodes, which is no different from doing N point-to-point communications. With multicast, the data is only copied when needed: your data will fan out, if that makes sense with your network topology. I think if you look at the picture it'll be clear:

http://en.wikipedia.org/wiki/Multicast
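As a rough back-of-the-envelope version of the same argument, the sketch below counts only payload bytes leaving the publisher. It ignores protocol headers, retransmissions and whatever the network does downstream, and the function name and parameters are made up for illustration.

```python
def publisher_bytes_sent(message_size, n_messages, n_subscribers, multicast=False):
    """Payload bytes the publisher itself has to put on the wire.

    Over unicast (e.g. TCP) every subscriber gets its own copy.
    Over multicast the publisher sends each message once and the
    network duplicates it where paths diverge.
    """
    copies = 1 if multicast else n_subscribers
    return message_size * n_messages * copies
```

For 10 messages of 1000 bytes and 50 subscribers, the unicast case works out to 1000 * 10 * 50 = 500,000 bytes from the publisher, against 10,000 with multicast. The gap grows linearly with the number of subscribers.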


But when you have to add a second ZeroMQ node (say, two Linode VPSs), how do you secure communication between those nodes?



