Let's say you want to create some new sort of database because Redis, Cassandra, TokyoTyrant, Postgres, MongoDB, DabbleDB, CouchDB, HBase, etc. just don't serve your needs that well. You create an amazing in-memory tree representation for your data and have a blazing-fast indexer. Now all you need is some sort of messaging layer so that different clients can talk to your server, preferably with implementations in different programming languages and with clustering capabilities. You could of course create such a messaging framework all by yourself, but that is a lot of hard work.
A simple solution is to implement your database as a ZeroMQ server and pick a message protocol (e.g., JSON). As you have seen by now, implementing such functionality with ZeroMQ is really easy, and on top of this you get almost instant scalability because of the way ZeroMQ can route messages.
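A minimal sketch of that idea, using pyzmq: a dict standing in for the "amazing in-memory tree", served over a REP socket with JSON as the message protocol. The endpoint name and the get/set/quit command format are illustrative assumptions, not anything ZeroMQ prescribes.

```python
import json
import threading
import zmq

ctx = zmq.Context.instance()
bound = threading.Event()

def serve():
    # Toy "database" server: a dict behind a REP socket speaking JSON.
    rep = ctx.socket(zmq.REP)
    rep.bind("inproc://toydb")  # endpoint name is an assumption for this demo
    bound.set()
    store = {}
    while True:
        req = json.loads(rep.recv().decode())
        if req["op"] == "set":
            store[req["key"]] = req["value"]
            rep.send(json.dumps({"ok": True}).encode())
        elif req["op"] == "get":
            rep.send(json.dumps({"ok": True, "value": store.get(req["key"])}).encode())
        else:  # treat anything else as "quit"
            rep.send(json.dumps({"ok": True}).encode())
            break
    rep.close()

server = threading.Thread(target=serve)
server.start()
bound.wait()  # inproc transport requires bind before connect

client = ctx.socket(zmq.REQ)
client.connect("inproc://toydb")
client.send(json.dumps({"op": "set", "key": "a", "value": 1}).encode())
client.recv()
client.send(json.dumps({"op": "get", "key": "a"}).encode())
value = json.loads(client.recv().decode())["value"]
client.send(json.dumps({"op": "quit"}).encode())
client.recv()
client.close()
server.join()
print(value)
```

The point is how little of the code is messaging: swap the dict for a real index, swap inproc for tcp, and clients in any language with a 0MQ binding can talk to it.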
Interesting scalability and replication problems are not solved by (this kind of) routing. Put another way, the problems that are solved at this level are also solved by running a bunch of Memcached servers, where the standard client already includes a kind of sharding logic.
ZeroMQ works very well in combination with Apache Zookeeper [1] for service discovery and coordination.
Zookeeper is often overlooked because of the general aura of bloat surrounding Hadoop. It's actually quite lean, and now has python bindings, courtesy of Cloudera.
Some things in there seem to be contradictory. In particular, there's a claim about being brokerless:
ZeroMQ follows a brokerless design so that there is no single point of failure.
But then there's this:
ZeroMQ greatly simplifies this pattern by allowing you to have a single socket connect to multiple end points.
How does that work? Seems like a cognitive disconnect to me. AFAIK, a single socket connects exactly two end points (one being a client, the other a server). Right? Sounds like a broker would be connecting that one socket up to multiple end points...
Maybe there's no CENTRAL broker, but certainly (at least in some configurations and uses), it sounds like there's an intermediary process involved in the communication.
Anyone know if 0MQ (or similar) can have multiple publishers send to the same queue? Some context:
- I currently have an app where users submit content to a single machine, where it is queued (using Python's built-in Queue object), then processed and displayed by threads consuming from the queue.
- In future, I've been thinking about using a network queue to easily publish from a single source to multiple display servers, with a load balancer in between them. This neatly handles my 'view only' users.
- But what about if I wanted to allow users to submit from any server, i.e., spreading the load for contributing users as well?
I guess I'm looking for a multiple publisher / multiple subscriber queue. Does anyone know if they exist?
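If I understand the question right, 0MQ's PUSH/PULL sockets do the fan-in half of this: the collector binds one PULL socket and any number of producers connect PUSH sockets to it. A sketch with pyzmq (the endpoint name and producer IDs are made up):

```python
import threading
import zmq

ctx = zmq.Context.instance()

# Fan-in: one PULL socket bound by the collector; many PUSH sockets connect.
pull = ctx.socket(zmq.PULL)
pull.bind("inproc://workqueue")  # endpoint name is an assumption for this demo

def producer(ident):
    # Each producer connects its own PUSH socket to the same queue.
    push = ctx.socket(zmq.PUSH)
    push.connect("inproc://workqueue")
    for i in range(3):
        push.send(f"{ident}:{i}".encode())
    push.close()

threads = [threading.Thread(target=producer, args=(p,)) for p in ("p1", "p2")]
for t in threads:
    t.start()
for t in threads:
    t.join()

# All six messages from both producers arrive on the single PULL socket.
messages = sorted(pull.recv().decode() for _ in range(6))
pull.close()
print(messages)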
There's no SPOF unless you create one. So the sentence would be more accurate this way: "ZeroMQ follows a brokerless design, so it doesn't impose a single point of failure."
Also, not relevant to the article but to this comment: the idea of scaling Redis pub/sub out horizontally with node.js-based clients. This is my NodeRed project.
Isn't the final "." supposed to be inside the quotes?
IIRC, that's one of the rules that I intentionally break. IMHO, the stuff inside quotes should have whatever punctuation it needs and there should be sentence-appropriate punctuation outside the quotes as well.