Hacker News
Ask YC: Why the excitement over XMPP on the backend?
39 points by gstar on July 31, 2008 | 31 comments
I'm developing something now that implements a push client, and I keep seeing people getting excited about XMPP for publish-subscribe type things.

The slide deck that keeps getting press lately is here: http://en.oreilly.com/oscon2008/public/schedule/detail/4359

It really seems to me that using XMPP (which is a stateful chat protocol) for RPC might be the wrong approach. Have the XMPP advocates just re-invented message queues?

In the OSS world, ActiveMQ exists, so does AMQP. MQ theory is very old, having been used for decades for things like financial systems.

Am I missing something?




Message queuing is old, boring, and well understood. The same kinds of people that are trying to convince us that REST is a huge improvement over WS-* (despite the fact that once-RESTful protocols are becoming increasingly similar to WS-*) are trying to convince us that XMPP is to message queuing as REST is to WS-*.

More cynically, XMPP vendors generally give everything they have away in hopes that somebody with money will buy support for it. Since customers of message queuing products are people with money, it is only natural for people to hype XMPP as the "worse is better", cheaper alternative to "enterprisy" MQ products in hopes of having some money fall into their pockets.

Less cynically, the people hyping XMPP understand it very well from their personal experiences with chat systems and "chatbot"-like applications, including Twitter. It is easy to say "hey, if a computer can IM me, I bet a computer could IM another computer." It is a little trickier to abstract that into "hey, that is the same thing as that old, boring message queuing stuff," and anybody who gets that far realizes it's not that exciting to blog about.


And this is what scares me when I read about things like Vertebra (http://www.slideshare.net/ezmobius/vertebra): how on earth could a company that sells cloud computing not already have a message queueing architecture?


> once-RESTful protocols are becoming increasingly similar to WS-*

How so?


It is pretty common for a RESTful protocol to end up with batch-fetching and batch-updating operations. Once you add those operations, you've already violated the "uniform interface" constraint and removed all the benefits (primarily caching, authentication, authorization, and parallelism) that derive from the uniform interface constraint. Once you have batch-fetch and batch-update, those two operations can generally do everything that you usually did via individual PUT/POST/DELETE operations. So, why do you need the separate PUT/POST/DELETE operations at all? It is just more surface area for you to maintain.

Now, maybe you want to combine your batch-update and batch-fetch operations into a single "batch-update-and-fetch" POST operation. Well, that is exactly what WS-* does.

On the flip side, sometimes instead of batching you want to fetch or update only parts of individual resources via some extension to partial GET requests and/or a PATCH method. There is currently no uniform interface for those, so, just like in the case of batching, you lose all the uniform-interface benefits.

Now, let's say you want to combine your partial-resource operations with batch operations over multiple resources to reduce the number of requests you receive and/or reduce the aggregate size of requests, so you add the ability to do patching and partial fetches to your batch-update-and-fetch mechanism.
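To make that concrete, here is a minimal sketch of what such a combined endpoint tends to look like. Everything in it is hypothetical (the `RESOURCES` store, the `handle_batch` name, and the operation format are made up for illustration, not taken from GData, OpenSocial, or any real API):

```python
# Hypothetical in-memory resource store; paths and fields are illustrative only.
RESOURCES = {
    "/users/1": {"name": "alice", "email": "alice@example.com", "visits": 3},
    "/users/2": {"name": "bob", "email": "bob@example.com", "visits": 7},
}


def handle_batch(operations):
    """Apply a list of operations in order and return one result per operation.

    Each operation is a dict with:
      "op":     "fetch" or "patch"
      "path":   key into RESOURCES
      "fields": (fetch only) optional list of field names to return
      "set":    (patch only) dict of fields to overwrite
    """
    results = []
    for operation in operations:
        resource = RESOURCES.get(operation["path"])
        if resource is None:
            results.append({"status": 404})
            continue
        if operation["op"] == "patch":
            # Partial update: only the named fields change.
            resource.update(operation["set"])
            results.append({"status": 200})
        elif operation["op"] == "fetch":
            # Partial fetch: return only the requested fields, or all of them.
            fields = operation.get("fields") or list(resource)
            body = {name: resource[name] for name in fields if name in resource}
            results.append({"status": 200, "body": body})
        else:
            results.append({"status": 400})
    return results


# One request body carries a patch, a partial fetch, and a fetch that misses.
response = handle_batch([
    {"op": "patch", "path": "/users/1", "set": {"visits": 4}},
    {"op": "fetch", "path": "/users/1", "fields": ["name", "visits"]},
    {"op": "fetch", "path": "/users/9"},
])
```

Note that the method and URL of the outer request no longer say anything about which resources were read or written; that information lives only in the body, which is why intermediaries can't cache or authorize per resource anymore.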

Now take a step back. Doesn't your batch-update-and-fetch-with-patch-and-partial-fetch look a lot like SOAP with WS-ResourceTransfer? It will probably look a lot like SOAP; not even the newish SOAP 1.2 that actually is somewhat HTTP cache friendly, but old school RPC-only SOAP + WS-ResourceTransfer.

Look at Google's GData and OpenSocial's RESTful API for examples.

And, that is totally ignoring anything to do with transactions.


OK, I want to mention one more thing.

Take any WS-* specification, and you can probably find one or more ad-hoc proposals for adding similar features for RESTful protocols. For example, AtomPub is adding a mechanism that is almost exactly like SOAP-with-attachments, but more limited. The various HTTP message signing mechanisms like those found in Amazon's web services or the one in OAuth (and OpenID? I forget if there is a different one for that) are analogous to the various WS-Security stuff. As another example, I've seen a couple of RESTful protocols that tried to define something like a simpler WS-Policy.
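For readers who haven't seen one, the general shape of an HTTP message signing scheme is roughly the following. This is only a sketch of the idea: the canonical string layout, the function names, and the key are assumptions for illustration, and it is not the actual Amazon or OAuth algorithm.

```python
import hashlib
import hmac


def sign_request(secret_key, method, path, timestamp, body):
    """Return a hex HMAC-SHA256 signature over a simple canonical string.

    The canonical form (method, path, timestamp, body hash joined by
    newlines) is made up for this sketch; real schemes define their own.
    """
    canonical = "\n".join(
        [method.upper(), path, timestamp, hashlib.sha256(body).hexdigest()]
    )
    return hmac.new(secret_key, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_request(secret_key, method, path, timestamp, body, signature):
    """Recompute the signature on the server and compare in constant time."""
    expected = sign_request(secret_key, method, path, timestamp, body)
    return hmac.compare_digest(expected, signature)


key = b"shared-secret"  # hypothetical key known to client and server
signature = sign_request(key, "post", "/entries", "2008-07-31T12:00:00Z", b"<entry/>")
ok = verify_request(
    key, "POST", "/entries", "2008-07-31T12:00:00Z", b"<entry/>", signature
)
tampered = verify_request(
    key, "POST", "/entries", "2008-07-31T12:00:00Z", b"<entry>x</entry>", signature
)
```

Here `ok` is true and `tampered` is false, because changing the body changes the hash that went into the signed string.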

I don't think that the WS-* stuff is that great either; a lot of it is really complex. But WS-* stuff is all decomposed in ways that make it relatively simple to combine just the parts you need. SOAP 1.2 isn't complicated at all. WS-Policy and WS-Security and other stuff is more complicated than it needs to be, but it isn't so complicated that it needs to be totally redone anew. And a lot of it can be (and is) subsetted pretty easily.


What you say makes a certain kind of sense if you think of REST as a protocol. But REST is not a protocol.

There seem to be quite a few people these days who think that REST means nothing more than "uniform interface" and/or "resources". But that's only part of the picture.

http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch...

"REST is defined by four interface constraints: identification of resources; manipulation of resources through representations; self-descriptive messages; and, hypermedia as the engine of application state."

P.S. I'm not sure why you think batch operations are missing from REST. GET/PUT/POST/DELETE can apply just as well to collections as to individual resources.


Thanks for your detailed replies.

I'm not convinced that this is a major trend, but I see your point.


I'm not sure this is an either/or proposition. I haven't seen many message queue protocols with tools for identity, user authentication, federation, etc. I'm also pretty sure that XMPP doesn't even begin to replace message queuing on the backend, but when you look at what folks like EngineYard are doing with it (basically distributing messages), it seems like the value for them was more along the lines of 1) easy UI integration, 2) problems and tools well understood by developers, 3) servers well understood by admins.

More on the proposal of mixing the two: the RabbitMQ folks have done some interesting work integrating with XMPP. See their XMPP transport (http://www.lshift.net/blog/2008/07/01/rabbitmq-xmpp-gateway-...) and rabbiter (http://github.com/tonyg/rabbiter/tree/master).

So there we have an interesting mixture of the two technologies: you get the insanely fast message routing and throughput of RabbitMQ while leveraging the tools that XMPP brings with it (users, federation, etc.).


The merging of XMPP and RabbitMQ is fascinating -- I've bookmarked this and will follow up on it. I'm reading through the documentation now, but from what I can see this is great tech. (mod parent up!)


Almost none of the current generation of large-scale web applications, including Twitter, are doing anything remotely new.

The investment banking world has, as you say, had to deal with huge feeds as well as mashups from multiple sources.

http://www.zeromq.org/ for example is coming out of the financial world and looks like it's going to be the nginx of AMQP systems. It looks like it's way faster and more "scalable" than building something over XMPP.


Also check out this Erlang/OTP AMQP server: http://www.rabbitmq.com

AMQP is very interesting. Check it out if you are building a distributed system or if you have a requirement for reliable asynchronous messaging. When I last looked into it about a year ago, the implementations (all two of them that I found, Apache Qpid and RabbitMQ) were still a little rough, but it looks like a lot has happened since then.
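If "reliable asynchronous messaging" sounds abstract, the core idea is that a message stays the broker's responsibility until the consumer acknowledges it. Here is a toy in-memory sketch of that idea; the `AckQueue` class and its method names are hypothetical, and this models the concept only, not the AMQP wire protocol or any real broker's API:

```python
from collections import deque


class AckQueue:
    """Toy queue that redelivers messages a consumer never acknowledged."""

    def __init__(self):
        self._ready = deque()   # messages waiting for a consumer
        self._unacked = {}      # delivery tag -> message handed out but not acked
        self._next_tag = 1

    def publish(self, body):
        self._ready.append(body)

    def get(self):
        """Hand out the next message with a delivery tag, or None if empty."""
        if not self._ready:
            return None
        tag = self._next_tag
        self._next_tag += 1
        body = self._ready.popleft()
        self._unacked[tag] = body
        return tag, body

    def ack(self, tag):
        """Consumer finished the work; the queue may forget the message."""
        self._unacked.pop(tag)

    def requeue_unacked(self):
        """Called when a consumer goes away: put its messages back in order."""
        for tag in sorted(self._unacked, reverse=True):
            self._ready.appendleft(self._unacked[tag])
        self._unacked.clear()


queue = AckQueue()
queue.publish("charge card 1")
queue.publish("charge card 2")
first = queue.get()
second = queue.get()
queue.ack(first[0])          # first job completed
queue.requeue_unacked()      # consumer died before acking the second job
redelivered = queue.get()
```

The second job comes back out of the queue with a fresh delivery tag, so a crashed worker does not lose it.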

It's weird how few people in the web world are aware of asynchronous messaging.


No. You didn't.

But the context is different. XMPP as an MQ mechanism may work better in an Internet-scale distributed system, just as HTTP works better as a client-server mechanism in an Internet-scale distributed system than the protocols before it.

I worked at a startup called PubSub from 2003 to 2006. In 2004 we in fact used XMPP internally to relay the error logs from our cluster of machines to our office's archive server in real time. It was an experiment, but it seemed pretty useful. Within the company we called it a fuselet.


We use XMPP for that now. Any major exceptions, errors or site activity across our farm is pumped to a monitoring app via XMPP. I can also control the cluster as well.
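A sketch of how that kind of relay can be wired up on the application side, assuming Python's standard `logging` module. The `RelayHandler` class is hypothetical, and the `send` callable stands in for whatever actually delivers the text (an XMPP client's send-message call, for instance); here it just appends to a list so the example runs on its own:

```python
import logging


class RelayHandler(logging.Handler):
    """Forward formatted log records to a send(text) callable."""

    def __init__(self, send, level=logging.ERROR):
        super().__init__(level=level)
        self._send = send

    def emit(self, record):
        try:
            self._send(self.format(record))
        except Exception:
            # Never let a broken transport take the application down.
            self.handleError(record)


sent = []  # stand-in for the network; a real setup would send a chat message
logger = logging.getLogger("farm.web01")
logger.setLevel(logging.DEBUG)
logger.propagate = False
handler = RelayHandler(sent.append)
handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))
logger.addHandler(handler)

logger.info("request served")             # below ERROR, not relayed
logger.error("database connection lost")  # relayed
```

Only the error reaches `sent`, so the monitoring side sees the important events without being flooded by routine ones.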


"XMPP as MQ mechanism may work better in Internet scale distributed system."

Why do you think so? What advantages does XMPP have over other MQ technologies at "internet scale"?


Is there any existing MQ system that is decentralized? I said "may work better" because it is a conjecture, limited by my own ignorance. The Internet is essentially decentralized. So if an existing MQ system can be modified to work in a decentralized environment, it may work better than XMPP. I believe the throughput of those systems may be higher than XMPP's.

On the other hand, if I only have an internal cluster, I will choose an existing MQ system instead of XMPP. I may want higher throughput, and I don't care about authentication of endpoints because they are all trusted. Why would I need all the extra stuff that XMPP demands?

Regarding the case I mentioned at PubSub: it was just an idea. In fact, I was the one who proposed it. Our CTO and VP of Eng thought it was interesting, so we did a test implementation just to see how it would work out. It turned out to be pretty flexible for us.

Of course XMPP is not the first MQ system on the Internet. SMTP was the first one, right? In SMTP, we don't know anything about the endpoints. We assume senders and recipients exist and are taken care of by SMTP servers. I think XMPP, because its original application was chat, adds a protocol to announce the "presence" of endpoints.

Edit: I think when I said "internet scale", I actually meant "a system of a billion nodes that are loosely coupled together".


"Is there any existing MQ system that is decentralized?"

Yes. http://softwarelivre.sapo.pt/broker/ for example.

But "decentralized" does not mean "federated" which is I think your point.

Best regards,


XMPP pierces NAT, for one thing. This means you can have XMPP clients behind many more firewalls than you can with AMQP.

Also, XMPP libraries are available for almost every language and platform, and traditional MQs do not have presence or federation (i.e., horizontal scalability).

It's not about using XMPP as a message queue per se. It's about using XMPP as a p2p messaging system for machine-to-machine communication as well as integration with chat.


The MQ people don't seem to be explaining their value to the Web 2.0 kids. When I read the ZeroMQ site the other day I kept thinking "but what is it used for?"


It probably doesn't help that books on the subject have desperately unhip titles like "Enterprise Integration Patterns". The introduction to that book is actually a great essay on when and how to use asynchronous messaging, and it is online here: http://www.enterpriseintegrationpatterns.com/Introduction.ht...


You're missing that they don't intend it as an RPC or REST interaction. It's to "stream" updates out of services, that can have multiple listeners.
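In case the distinction isn't obvious, this is the publish-subscribe shape being described, reduced to an in-process sketch. The `Hub` class and the topic name are made up for illustration; a real deployment would put a network protocol (XMPP pubsub or otherwise) between the publisher and the listeners:

```python
from collections import defaultdict


class Hub:
    """Minimal publish-subscribe hub: one publisher, any number of listeners."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, item):
        """Deliver item to every subscriber of topic; return how many got it."""
        callbacks = list(self._subscribers.get(topic, []))
        for callback in callbacks:
            callback(item)
        return len(callbacks)


hub = Hub()
search_index = []  # e.g. a search engine's ingest queue
feed_reader = []   # e.g. an aggregator
hub.subscribe("blog-posts", search_index.append)
hub.subscribe("blog-posts", feed_reader.append)

# The service publishes once; both listeners receive the update.
delivered = hub.publish(
    "blog-posts", {"title": "Hello", "url": "http://example.com/hello"}
)
```

The publisher never addresses a particular listener and never waits for a reply, which is what separates this from an RPC call.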

This is already happening, but in isolated cases. Behind the scenes, Six Apart streams all the blog posts on its services directly to Google Blog search.

There are quite a few well-known services that get more API traffic (search engines, etc, pinging for updates) than actual human traffic. So something like this is ultimately necessary, whether it's XMPP or something similar.

I think they are aware of message queues; didn't they allude to them in the talk? Anyway, they billed their talk as two guys who build web services making a tentative foray into the message-queue world, so if you think you can help, send feedback.


The OP understands that XMPP provides pub/sub message queues. He's asking what XMPP provides over the other MQs.


Federation.

I read a bit about ActiveMQ, RabbitMQ, and the recent and lovely ZeroMQ. As far as I could find, none of them provide federation on an Internet scale, especially open federation using TLS.

I've been an XMPP developer for 6 or 7 years now, and if someone thinks that XMPP is going to replace MQ systems, they are delusional. But for Web integration, XMPP beats the crap out of current MQ systems because of federation.
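For anyone unfamiliar with the term, federation here means roughly what it means for email: addresses carry a domain, and a server hands off anything that isn't for its own domain to the server responsible for that domain. A deliberately simplified sketch of that routing decision (the `route` function and the domain names are hypothetical, and it ignores parts of real XMPP addressing such as resources and bare-domain addresses):

```python
LOCAL_DOMAIN = "example.org"  # hypothetical domain this server is responsible for


def route(address, local_domain=LOCAL_DOMAIN):
    """Decide where a message for 'user@domain' goes.

    Returns ("local", user) when this server owns the domain, otherwise
    ("remote", domain), meaning: open a server-to-server link to that
    domain and hand the message over.
    """
    user, separator, domain = address.partition("@")
    if not separator or not user or not domain:
        raise ValueError(f"not a user@domain address: {address!r}")
    domain = domain.lower()
    if domain == local_domain:
        return ("local", user)
    return ("remote", domain)


local_hop = route("alice@example.org")
remote_hop = route("bob@Example.NET")
```

The point is that no central broker or prior configuration between the two organizations is needed; the domain in the address is enough to find the other side.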

Sure, you can get 2.5M messages/sec on ZeroMQ. So what? I've personally seen an XCP XMPP server doing 1M messages per second too. But for Internet-scale stuff, even slower servers like OpenFire would work just fine for the current loads they are seeing.

It also helps that for commonly used scripting languages (like Perl, Ruby, Python), XMPP bindings are commonly available, while AMQP bindings are still rare. Please please prove me wrong :).

My recommendation for startups? Use an MQ system internally from day one to develop your solution, but use XMPP to talk to the outside world.

RabbitMQ (and I'm suggesting it based only on what I read; I don't use it myself, I use another one) is the best-placed MQ system right now, because it offers a bridge between the local MQ system and the XMPP world.

Best regards,


"which is a stateful chat protocol"

You're missing that it's not a stateful chat protocol; it's a stateful XML protocol (that has, so far, been used mostly for chat).

People are excited that it's fast, stable (some of the implementations anyway), and they don't have to reinvent the wheel.


"stateful xml protocol" makes as much sense as "stateful ASCII protocol" or "stateful bits protocol".


"Stateful XML protocol" makes complete sense, it just isn't very informative. It describes a protocol that is both stateful and XML-based. Those are independent attributes of a protocol; obviously, you can have both stateful non-XML-based protocols and XML-based stateless protocols, but I don't see why the OP's statement is nonsensical.


Totally agreed. That doesn't negate anything I just said. It's stateful. It's XML. It's a stateful XML protocol (and it happens to also have presence, which I left out, but does make some people's knickers wet).


You make a category error when you compare ActiveMQ to AMQP; they exist at different layers of the MQ stack (AMQP is a transport protocol).

http://activemq.apache.org/how-does-activemq-compare-to-amqp...


Noted, but my intention wasn't to list products but rather to name some technologies that are available and open for MQ.


Why not? It's extensible, it builds on tech people are already familiar with, and it's fast enough for most purposes.

It's also got lots of good libraries for many languages, and neat tricks like bridges between it and the web browser.


Nobody's proposing using it for RPC. They're suggesting it as a replacement for polling.
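A back-of-the-envelope sketch of why replacing polling matters. All the numbers are made up for illustration (1,000 clients, one poll per minute, 20 updates a day), and the function names are hypothetical:

```python
def polling_requests(clients, poll_interval_seconds, period_seconds):
    """Requests the service receives when every client polls on a fixed interval."""
    return clients * (period_seconds // poll_interval_seconds)


def push_deliveries(clients, updates_in_period):
    """Messages the service sends when it pushes each update to every client."""
    return clients * updates_in_period


DAY = 24 * 60 * 60
polled = polling_requests(clients=1000, poll_interval_seconds=60, period_seconds=DAY)
pushed = push_deliveries(clients=1000, updates_in_period=20)
```

Under those assumptions polling costs 1,440,000 requests a day, almost all of which return nothing new, while push costs 20,000 deliveries and every one of them carries an update.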


Nope.



