Uses For A Message Queue (iron.io)
117 points by paddyforan on Dec 11, 2012 | 20 comments



Not listed: the ability to route through very restrictive network setups ("galvanically isolated" networks). MQs, being latency-insensitive, can even go over protocols like e-mail.


Nice one. #11.


Side note: I think the picture is of White Geese and not ducks.


I had the same thought. However.

http://www.flickr.com/photos/david-hilgart/4142396713/

If the photographer calls them ducks, who am I to argue?


Great stuff. We used message queues when rewriting major components from an ancient, client-heavy VFP application to newer C# code running on an application server. It enabled both asynchronous work and ordered operations.


In the freezing cold at Clapham waiting for a train, so not much time here to dig deeper. However, the elasticity part is interesting, and I'd like very much to know what that really means in context.

Other than that, I've worked with message queues a lot over the years, especially in the late '90s/early 2000s. Cutting to the chase - I have a problem with message queues, in that they tend to add a lot of bloat (in terms of both deployment size and integration cost) and in return don't solve a very difficult problem.

First and foremost, "message queue" is really just store and forward. This is hugely relevant in today's mobile world. Being able to edit a row in a database while on the London Underground, and knowing that it will just sync up the next time I connect to a network is a big thing.

That said, other than elasticity I'd like very much to know how this is different to MQ Series or MSMQ. Things like in-order delivery and guaranteed delivery are not a big deal to achieve.

Guaranteed delivery is super easy. When uploading, you call the server with your payload and inspect the server response. If the response is an acknowledgement/success, you've guaranteed delivery. Downloading a message requires two calls - the first to get the payload, the second to confirm back to the server that you've received it. Again, guaranteed delivery, done! If either the upload or download fails, just keep trying until it works. It doesn't matter if the server gets the data 20 times; it only needs to record one receipt. "Only once" delivery is the same - the server just discards subsequent uploads from clients that didn't get the appropriate response, for whatever reason.
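A minimal sketch of that acknowledge-and-retry loop in Python over HTTP. The endpoint paths, the use of the requests library, the 204-means-empty convention and the message-id scheme are illustrative assumptions, not any particular product's API:

    # Hypothetical sketch of "retry until acknowledged" delivery over HTTP.
    import time
    import uuid
    import requests

    SERVER = "https://example.com/queue"  # placeholder URL

    def upload(payload):
        """Send until the server acknowledges; duplicates are the server's problem."""
        msg = {"id": str(uuid.uuid4()), "body": payload}  # a stable id lets the server discard resends
        while True:
            try:
                resp = requests.post(SERVER + "/messages", json=msg, timeout=10)
                if resp.status_code == 200:               # acknowledged: delivery guaranteed
                    return
            except requests.RequestException:
                pass                                      # lost request or lost response: same remedy
            time.sleep(5)                                 # back off, then resend the same message

    def download():
        """Two calls: fetch a message, then confirm receipt so the server can drop it."""
        resp = requests.get(SERVER + "/messages/next", timeout=10)
        if resp.status_code == 204:
            return None                                   # queue empty
        msg = resp.json()
        requests.post(SERVER + "/messages/" + msg["id"] + "/ack", timeout=10)
        return msg["body"]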

In-order delivery is solved by adding a sequence number to the batch/payload packets. You can actually send them in any order you like -- the server simply re-assembles the messages in the required order when the whole batch has been delivered. Where messages are spaced apart in time this is even less of a requirement.
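The reassembly side is little more than a sort keyed on that sequence number. A rough sketch, where the batch_id/seq/total packet shape is an invented example rather than any specific wire format:

    # Rough sketch: reassemble a batch from out-of-order packets using sequence numbers.
    from collections import defaultdict

    pending = defaultdict(dict)   # batch_id -> {seq: data}

    def receive(packet):
        """Store a packet; return the ordered batch once every sequence number has arrived."""
        batch = pending[packet["batch_id"]]
        batch[packet["seq"]] = packet["data"]
        if len(batch) == packet["total"]:                 # all packets present, in whatever order they came
            ordered = [batch[i] for i in range(packet["total"])]
            del pending[packet["batch_id"]]
            return ordered                                # hand the whole batch to the application, in order
        return None                                       # still waiting for the rest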

What's really interesting to me, and what isn't mentioned at all, is whether they do push from the server. Everything there can be done using client pull. MSMQ and MQ Series both fall down with push, because you can't see through a NAT'd network.

Finally, store and forward, by definition, provides resilience and buffering. The same concept of store before forward means that any half-decent implementation will handle spikes (hence my question on "elasticity"). Async comms are always a good idea, because they help you scale. Not sure how others do comms, but I haven't used a synchronous call since .NET first came out in 2002.

I think the real innovations around store and forward/message queues come from performance (relevant especially on mobile networks, where I prefer to avoid chatty formats like XML or JSON) and security (encrypting a bitstream without requiring SSL). Because of this I wrote a library that I simply drop into every one of my mobile app projects. It took me a week to build and stabilise (I add that to demonstrate that this is not a difficult engineering problem - and I'm no rocket scientist by any stretch of the imagination).

The other innovation, a fair bit harder to pull off, is handling the case where a single item is edited by two parties while one is offline, and then merging the updates when both versions sync up.

[Edit] I've had a look now, and elasticity means little more here than the caching nature of message queues (the store part of store and forward). As an architect I see yet another middleware vendor, albeit with hipster terms like spike and elasticity. As a programmer I imagine they solved this problem for themselves the same way I did - and if they can make some money from their efforts then they've done more with it than I have, and I think that's awesome for them!


> In-order delivery is solved by adding a sequence number to the batch/payload packets.

And just as I let TCP do this for me at a lower layer, I'd personally be quite happy to farm this out to a queue library or service. That a problem has been solved in a well-known pattern, and that I could solve it with my own code, doesn't mean that I ought to spend time and effort doing so.

> security (encrypting a bitstream without requiring SSL). Because of this I wrote a library that I simply drop into every one of my mobile app projects.

I am wary of not using SSL. I mean it's a heavy protocol, but that's what you need to rebuff the many, many clever attacks against homegrown protocols.

Recently I filed a patent for a protocol involving cryptographic elements, and it took me 11 go-arounds with a Well-Known Expert to get a protocol that wasn't broken beyond all comprehension. That was after spending a year reading about crypto and security for my honours project. I tried, but really, it's one of those topics where soaking for a few years pays and being out of the loop for a month can cost.

Anyhow. I guess my question is, why not use SSL? Performance/battery life?


It is indeed heavy, and I avoid it both for performance and payload size and, more importantly, because some implementations of SSL have known vulnerabilities and others may have ones not yet discovered. The generalisation of the protocol I use is well known -

A -> R: A, B

R -> A: {A, B, K_AB, T}_K_AR, {A, B, K_AB, T}_K_BR

A -> B: {A, B, K_AB, T}_K_BR, {M}_K_AB

Where A = Alice, B = Bob, R = Robert (the trusted key server), K_XY = key shared between X and Y, T = time stamp, M = message
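For readers who don't recognise it, that's a classic trusted-third-party key-distribution exchange (the Needham-Schroeder/Kerberos family): R mints a session key for A and B, wraps it once for each of them, and A forwards B's copy along with the encrypted message. A toy sketch of that flow in Python, using the cryptography library's Fernet purely as a stand-in cipher - the JSON packaging, the function names and the freshness window are my assumptions, not the commenter's actual implementation:

    # Toy sketch of the three-message exchange above; Fernet stands in for the real cipher.
    import json, time
    from cryptography.fernet import Fernet

    # Long-term keys shared with the trusted server R (provisioned out of band).
    K_AR = Fernet.generate_key()
    K_BR = Fernet.generate_key()

    def r_issue_session_key(a_id, b_id):
        """R: mint a session key K_AB and wrap it once for A and once for B."""
        payload = json.dumps({"a": a_id, "b": b_id,
                              "k_ab": Fernet.generate_key().decode(),
                              "t": time.time()}).encode()
        return Fernet(K_AR).encrypt(payload), Fernet(K_BR).encrypt(payload)

    def a_send(message, for_a, ticket_for_b):
        """A: recover K_AB from R's reply, encrypt the message, forward the ticket to B."""
        k_ab = json.loads(Fernet(K_AR).decrypt(for_a))["k_ab"]
        return ticket_for_b, Fernet(k_ab.encode()).encrypt(message)

    def b_receive(ticket_for_b, ciphertext, max_age=300):
        """B: unwrap the ticket with K_BR, check the timestamp, decrypt the message."""
        ticket = json.loads(Fernet(K_BR).decrypt(ticket_for_b))
        if time.time() - ticket["t"] > max_age:
            raise ValueError("stale ticket")              # T guards against replay
        return Fernet(ticket["k_ab"].encode()).decrypt(ciphertext)

    for_a, ticket = r_issue_session_key("alice", "bob")
    ticket, ct = a_send(b"hello over the queue", for_a, ticket)
    print(b_receive(ticket, ct))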

A planning system that uses my store and forward implementation is used by a defence force, and went through some pretty thorough accreditation.

As for your willingness to farm this out to a third party - hey that's cool. It is, after all, a commodity service. Personally? I prefer to avoid dependencies on third parties where I can, because I'm not comfortable with the longer-term versioning dependencies, patch management and general operational support that's added over and above my code, which must already be supported.


> Guaranteed delivery is super easy.

Oh, then why do you propose a wrong solution right there in your next paragraph?

> you call the server with your payload and inspect the server response.

Wait, what if the response gets lost on the way? Are your messages idempotent? Do they depend on state on the other end? How long are they valid? Is each message atomic?

You may want to take a look at http://en.wikipedia.org/wiki/Two_Generals%27_Problem and http://en.wikipedia.org/wiki/Atomic_commit

Guaranteed delivery can become quite tricky depending on which constraints you can and cannot enforce in your system.


Messages are indeed idempotent. As I mentioned, if the response is lost you resend, until you do get a response. The server will accept the same message over and over, until the client stops sending the message (i.e. the server response was received by the client). The server simply discards any subsequent messages. That message integrity is verified with a hash value goes without saying.
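The receiving end of that is little more than a seen-set plus an integrity check. A toy sketch, where the SHA-256 digest of the body as the hash and an in-memory dict in place of a real table are assumptions for illustration:

    # Toy sketch of an idempotent receive: verify the hash, keep the first copy, ack every copy.
    import hashlib

    received = {}   # message_id -> body; a database table in practice

    def receive(message_id, body, claimed_digest):
        """Record the first valid copy of a message and acknowledge every resend identically."""
        if hashlib.sha256(body).hexdigest() != claimed_digest:
            return "reject"                    # corrupted in transit; the client will resend
        received.setdefault(message_id, body)  # first copy is kept, later copies are discarded
        return "ack"                           # same response however many times the message arrives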

The two generals problem is a non-problem here. Messages don't equate to messengers' lives, in that there is no cost equivalent to a loss of life when resending a message. Also, the probability of not receiving a server response after n attempts is virtually nil. See how the internet continues to work.

Atomicity is only relevant where there's a batch. Delivery of all messages in a batch is taken care of as described in the previous paragraph, and I'm therefore satisfied that transmission will work. Persistence on the client and server is therefore where atomicity applies, and it is subject to the same constraints that any database transaction is subject to (again, the risk impact is high, but the probability is so low that while I could, I don't even bother[1]). I don't see a problem here.

[1] Hardware is quite sufficiently reliable these days. If I ever had to work in a scenario in which the cost of loss is sufficiently high (even considering the low probability of loss), then I'm still not sure what you suggest I do, because messages aren't placed onto a queue as a single operation. They're discrete. If they weren't, I'd bundle them as a single message, and so wouldn't face the problem in the first place.


Completely agree. One small correction: "Guaranteed delivery can become quite tricky" should roughly be: "Guaranteed delivery IS quite tricky; if you think it isn't, you have not run into all the constraints yet."


Can you actually list these constraints? If not all, at least some examples? I've not encountered any I'm not handling [1]. My code has been running on various device types and under all sorts of loads for years, and I have yet to encounter something not handled as described above.

[1] If you're referring to guaranteed delivery in the sense of guaranteeing that your message is delivered to a person that is guaranteed to action the message, then this is not the same problem that message queues attempt to solve. Such a scenario requires non-delivery timeouts and re-routing along a routing table, culminating in what the military world calls a "guaranteed action point", which is staffed by real people 24 hours a day.

I built a prototype based on Exchange and SharePoint once that tries to do this. It's difficult, but not impossible. The real issue we faced was cultural, in that a commander sending a message just "felt better" when sending a courier on a motorcycle, as opposed to clicking send in Outlook.


My commiserations on the train journey, I don't miss the Brighton <> London commute one bit (assuming it's the same Clapham, of course).

Guaranteed delivery is indeed easy if your messages are idempotent, but if they're not then life is significantly more difficult. Most messaging systems have an "at least once" guarantee, but building in an "exactly once" guarantee, especially in a system with any significant distribution, doesn't seem easy at all (tell me if I've missed something though; event ordering in distributed systems has a reasonably interesting history).

The same obviously applies to message ordering. These (presumably incrementing/timestamped?) ordering identifiers are fine if they don't have to cross a sender/batch boundary, but plenty of situations are more complex than that. How do you "simply" avoid the problem of either centralised synchronisation or time slip in ordering, as well as out of order delivery with major partition events?

Just curious as to whether you're saying that these are all easy problems, or that the domain is easy if you have enough constraints!


I think I'm understanding now why others below have such doubts about the simplicity of what I describe ;-)

You're describing two problems - the first is around message delivery (in-order, exactly once) and the second is a distributed transaction. Both of these problems are solved by separating message transmission from message processing.

First, in-order and exactly once. Let's assume we have two systems. One we'll call the messaging client (MC) and server (MS), which sends and receives messages. The other we'll call a system of record, also client (SoRC) and server (SoRS), which is the system that has, say, an exactly-once requirement:

SoRC -> MC -> MS -> SoRS

Let's say that I'm ordering a pair of shoes from Amazon. I don't want to receive two pairs, and I also don't want to receive no pair. Receiving a single pair of shoes exactly once is a business function. The fact that the message containing my order has been received exactly once provides no assurance. What if the ordering system on the other side rejects my order because I entered an invalid product code? Even a semantically correct and valid product code doesn't help me if the shoes are out of stock. I want those shoes, not the certainty that the message was delivered. The assurance that my order was delivered once and only once is therefore a business rule, and completely immaterial at the messaging level.

Therefore I can send a once-and-only-once message to the server 1,000 times over. The server-side messaging component (MS) will submit the first valid incoming message to the server system of record (SoRS). If the client re-sends the same message because an acknowledgement failed, then the MS simply discards that resent message.

Given the above, the use case for reliable messaging evaporates. Joe Gregorio (http://bitworking.org/news/201/RESTify-DayTrader) explains it well (search for "Orders should be reliable").

Second, distributed transactions: This one is solved using the same philosophy - that a transactional requirement has nothing to do with a messaging requirement.

Assume we need to debit one account, and credit another. On the client I'd wrap that in a local database transaction, because databases happen to do transactions really well. Then I'd send the message as described in my first post above. When received on the server I'd use another database transaction to persist it in the business application.
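A stripped-down illustration of that split, with SQLite standing in for the local database and a plain table acting as the outgoing queue - the schema, table names and amounts are invented for the example:

    # Sketch: keep the debit, the credit and the outgoing message in one local transaction;
    # the store-and-forward layer drains the outbox separately, with a retry loop like the one above.
    import json
    import sqlite3

    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER);
        CREATE TABLE outbox   (id INTEGER PRIMARY KEY, payload TEXT, sent INTEGER DEFAULT 0);
        INSERT INTO accounts VALUES ('savings', 100), ('current', 0);
    """)

    def transfer(src, dst, amount):
        """Debit, credit and enqueue atomically; actual transmission happens later."""
        with db:  # one local transaction: all three statements commit, or none do
            db.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
            db.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))
            db.execute("INSERT INTO outbox (payload) VALUES (?)",
                       (json.dumps({"from": src, "to": dst, "amount": amount}),))

    transfer("savings", "current", 25)
    # The messaging layer sends unsent outbox rows and marks them sent; the business
    # transaction never depends on the network being up.
    print(db.execute("SELECT payload FROM outbox WHERE sent = 0").fetchall())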

Again, the use case for reliable messaging evaporates, because this is a business logic problem, and not a message delivery problem.

More info using REST is available from http://www.infoq.com/articles/tilkov-rest-doubts

Hope that clears it up?


I'll go for qualified agreement to some extent :)

You're right in saying that, for example, exactly once delivery is a symptom of a business requirement. I do agree, but even something like message de-duplication has issues with scaling in a genuinely distributed system. Admittedly, whether you choose to look at that as part of messaging (a term which I admit to possibly over-broadening sometimes, as I tend to look on it as more of an approach than a technology) is open to debate, but I do get where you're coming from much better now.

On the second point, I wouldn't describe my scenario as a distributed transaction - there's also very possibly no single server on which to do a classic database transaction - as soon as you do there's a scalability barrier. Anyway - it's more a difference of scenarios I think, and I'd agree that in the ones you've described the solutions sit outside of a messaging layer/domain.

Always interesting to hear, thanks for clarifying!


I'm a little in love with you for the phrase "hipster terms".

I am not ignoring your technical points--I think we may have to agree to disagree, here. You certainly have a valid opinion, and are entitled to it. We think a little differently, but variety is the spice, etc. etc.

I just saw your edit and couldn't contain my glee at finally earning the moniker "hipster", so I had to comment.


What is meant by elasticity is that you can handle varying load on demand. You don't need to provision/scale for spiky/unexpected loads in advance, your infrastructure will expand and contract on demand. This is a very difficult problem to solve.

Also, this is on a much different scale from the small queue on a mobile app that you're comparing it to. A queue on a mobile app doesn't really need elasticity or scale. But when you're dealing with millions or billions of messages, the complexity is on a whole different level. And providing the ability to scale and grow with zero effort is extremely compelling.

Now if you had millions of users of your mobile app and they all sent messages for processing, an elastic, scalable message queue would be a godsend.


You're not describing anything here that isn't just an inherent attribute of message queuing. It just sends when it can, and keeps trying until the local persistence cache is cleared.

What am I missing?


Nothing, that's what the article is about, "uses for a message queue". The inherent attributes of a queue are what make a queue useful.




