Clearly a lot of thought and work has been put into this and I give the authors credit for that.
I appreciate any new entry, but this is a really crowded space, with some very battle-tested entries in the field, as people have pointed out.
There are a few questions not in the readme that I think need answering for a queue system:
- Execution guarantees (at least once, at most once, exactly once?)
- Order guarantees (FIFO, approximately FIFO, nondeterministic, etc)
- Throughput compared to other systems
- Fault tolerance characteristics: how many nodes can I lose before it stops working, and when it does stop, how do I recover?
As I said, I have a lot of choices in this field. I'd like to see all the data up front.
Agreed, there are a lot of more robust products already in this space.
My biggest concern is its reliance on MySQL. There is no way this could be a valid option for high-volume messaging when it is essentially database-as-a-queue.
> Its interface is generic, but was originally designed for reducing the latency of page views in high-volume web applications by running time-consuming tasks asynchronously.
The protocol is easy to drive directly and there are good libraries for most common languages.
Beanstalkd is always the queue engine I reach for. Sometimes I'll build something over SQL if it's low-volume, but I always design my queue APIs to match beanstalkd's so it can be swapped in.
Beyond being a great piece of software, I find the protocol to be really well-designed for a work queue.
It would be interesting to hear the rationale behind using this over something like RabbitMQ, which has its own storage layer, as well as the queue aspect.
Database-as-queue tends to run into performance ceilings before dedicated "true queue" systems, in my experience. As you point out, though, sticking with an RDBMS gives you nice transactionality. Using a database for a queue for as long as you can before switching also offers the benefits of a time-tested, usually widely-supported protocol with well-documented reliability guarantees. Your queues are also easily introspectable, which can be nice (this is part and parcel of why databases tend to hit perf ceilings as queues, though).
RabbitMQ also supports transactions for some queue operations, but its notion of "transaction" and what you can do inside of one is much more limited than a typical database's: https://www.rabbitmq.com/semantics.html
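For anyone who hasn't seen database-as-queue up close, here's a rough sketch of the pattern. I'm using SQLite so the example is self-contained; in MySQL or Postgres the claim step would typically use `SELECT ... FOR UPDATE` (or `SKIP LOCKED`) instead. The table and column names are illustrative, not from the project under discussion:

```python
import sqlite3

# isolation_level=None puts sqlite3 in autocommit mode so we can manage
# transactions explicitly with BEGIN/COMMIT.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("""CREATE TABLE jobs (
    id INTEGER PRIMARY KEY,
    body TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending')""")

def enqueue(body: str) -> int:
    cur = conn.execute("INSERT INTO jobs (body) VALUES (?)", (body,))
    return cur.lastrowid

def claim():
    """Atomically claim the oldest pending job, or return None."""
    conn.execute("BEGIN IMMEDIATE")    # take the write lock up front
    row = conn.execute(
        "SELECT id, body FROM jobs WHERE status = 'pending' "
        "ORDER BY id LIMIT 1").fetchone()
    if row is None:
        conn.execute("COMMIT")
        return None
    conn.execute("UPDATE jobs SET status = 'running' WHERE id = ?", (row[0],))
    conn.execute("COMMIT")
    return row

def complete(job_id: int) -> None:
    conn.execute("UPDATE jobs SET status = 'done' WHERE id = ?", (job_id,))
```

The introspectability point above falls out for free: queue depth is just `SELECT status, COUNT(*) FROM jobs GROUP BY status`. The perf ceiling also falls out: every claim contends on the same hot rows and indexes.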
If the requirement is for persistence, i.e. no data loss if the queue process dies, then Redis won't fit.
EDIT: TIL Redis has the option to turn on fsync-to-disk on every write. Probably not what people are thinking of when they suggest Redis as lightweight.
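For reference, these are the standard Redis persistence settings in redis.conf that control this:

```conf
appendonly yes        # enable the append-only file (AOF) for durability
appendfsync always    # fsync after every write: safest, slowest
# alternatives: "everysec" (the default, fsync once per second) or "no"
```

With `appendfsync always`, Redis stops being the lightweight in-memory option people usually have in mind and starts paying disk-durability costs much like an RDBMS does.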
Having the number of parallel consumers configured per-queue (as opposed to consumers dynamically being able to join and leave) seems like it imposes many of the same restrictions that make Kafka less than ideal at being a job queue.
Basically, it bites you if you have messages that can take different amounts of time to process, or if you need to quickly and dynamically scale the number of consumers on a queue in response to volume.
How well does the "update max_workers" queue-modification command work in situations of very high message volume and/or high consumer counts?
Hm, I've been looking for a lightweight, language-neutral job queue. RDBMS-backed is my preference since I don't need massive scalability, but do want persistence that's easy to reason about (so not Redis), transparency (so not beanstalkd) and a long-lived history (so not most of the other ones).
But an RDBMS-backed queue only makes sense if you can use your existing installation, so MySQL-only is a nonstarter.
Interesting, but why is there a polling_interval? Given that there is a master server for each dispatch queue, one would expect a design where polling is unnecessary.
I may not be reading the documentation correctly, but I don't think this is a push queue; it uses the term "push", I think, to describe delivering messages to a queue and not as it is traditionally used to describe handing messages to consumers.
In answer to your question generally: push-based models, while more complex, tend to be higher performance (by dint of improving throughput: the broker can push messages to a consumer while it's working on other things rather than waiting for a "gimmie" request; the broker can also coordinate when and how it delivers messages to which consumers for maximum performance, which can lead to significant speedups in high throughput situations).
A very powerful pattern that gives a reasonable amount of control in a push-based situation is combining a push-based model with client acknowledgements, and a client "window" of a number of messages that the client may or may not have noticed yet, especially if messages can be taken back from that window programmatically in the event of slow or dead consumers. This is what RabbitMQ/AMQP calls "QoS".
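The window idea is easier to see in code. Below is a toy in-memory illustration (all names are mine, not RabbitMQ's API): the broker pushes messages to a consumer until that consumer holds `window` unacknowledged messages, then pushes more only as acks come back.

```python
from collections import deque

class PushBroker:
    """Toy push broker with a per-consumer unacked window (QoS-style)."""

    def __init__(self, window: int):
        self.window = window       # max unacknowledged messages in flight
        self.backlog = deque()     # published but not yet pushed
        self.unacked = set()       # pushed to the consumer, awaiting ack
        self.delivered = []        # everything pushed so far, in order

    def publish(self, msg):
        self.backlog.append(msg)
        self._pump()

    def ack(self, msg):
        self.unacked.discard(msg)
        self._pump()               # an ack frees a window slot

    def _pump(self):
        # Push while the consumer's window has room.
        while self.backlog and len(self.unacked) < self.window:
            msg = self.backlog.popleft()
            self.unacked.add(msg)
            self.delivered.append(msg)
```

With `window=2`, publishing three messages delivers only two; the third goes out as soon as the first ack arrives. A real broker would also requeue a consumer's unacked messages when it dies, which is the "taken back from that window" part.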
In my experience, push-based messaging models should typically not be adopted up front, unless throughput requirements are known to require such a model. The added complexity (mental and in code) of managing a push-based queue model is very rarely worth investing in at the beginning of development.
Furthermore, it's possible to have extremely high performance in a simpler pull-based model, provided you make some tradeoffs. This is what Kafka does.
I would recommend switching from pull to push only when it becomes necessary (though this can be a non-trivial amount of effort depending on how tightly integrated your code is with your messaging system). RabbitMQ/ActiveMQ/Redis/Resque as brokers will shine here, since they all support both pull models ("get" in AMQP) and push models.
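The Kafka-style tradeoff mentioned above boils down to this: the broker is just an append-only log per partition, and each consumer pulls batches while tracking its own offset, so the broker does no per-message bookkeeping. A minimal sketch (names illustrative, not Kafka's actual API):

```python
class PartitionLog:
    """Append-only log standing in for one Kafka partition."""

    def __init__(self):
        self._log = []

    def append(self, msg) -> int:
        self._log.append(msg)
        return len(self._log) - 1          # offset of the new message

    def fetch(self, offset: int, max_batch: int):
        # The broker's only job: hand back a contiguous slice of the log.
        return self._log[offset:offset + max_batch]

class Consumer:
    def __init__(self, log: PartitionLog):
        self.log = log
        self.offset = 0                    # consumer-side position in the log

    def poll(self, max_batch: int = 100):
        batch = self.log.fetch(self.offset, max_batch)
        self.offset += len(batch)          # "commit" by advancing the offset
        return batch
```

Because all delivery state lives with the consumer, the broker side is basically sequential file I/O, which is how a pull model stays fast. The tradeoff is exactly the job-queue awkwardness mentioned upthread: no per-message acks, retries, or dynamic work redistribution.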
According to its documentation, this does deliver messages to consumers.
I'm just struggling to imagine a use case where a push queue would be preferable over a pull queue. I'm sure they exist, I've just never encountered one before. Seems like the major difference is centralized throughput control, which would allow you to minimize variance in message processing latency. There are similar use cases in e.g. operating systems for minimizing latency variance for better UI responsiveness, but I can't think of any concrete use cases for using this in high scale backend queues.
Well, my original message was a bit snarky, but what I meant was that, at the very least, this project advertises the wrong features. I mean come on, "lightweight" and "high-performance"? Maybe advertise safety properties, resilience, I don't know... but the way it is built, it can't possibly distinguish itself on the "high performance" front. As for "lightweight"... you can use ZMQ for intra-process communication. That's lightweight - not using an external RDBMS.
I'm not saying this project is bad - I have no way of knowing. It might have legitimate use cases where it excels. But what it advertises can't be true.
"Lightweight" is really becoming a turnoff for me when people use it in their description. It (mostly) means "I haven't worked on this long enough to add the features all the other projects in this space immediately found were too useful to go without" or even "I haven't worked on this long enough to produce much code".
"Lightweight" is really only interesting to me in two cases: First, you're designing it for limited resources, like an embedded system, for which the standard answer is simply too large to even consider. Second, when the standard answer in the field is so "heavy" (an ill-defined term itself, but moving on) that it causes problems of its own. JVM solutions sometimes get to this point, where administering the solution itself gets bogged down in merely administering the JVM.
I do not personally have the problem that my job queues are too heavy, nor have I heard anyone else complain that ZMQ or Redis are just so heavy for what they do.
This is a cool idea, but having the worker use HTTP is just asking for problems. What happens if the HTTP connection times out?
In a worker architecture, some jobs can be expected to take north of 5 minutes. HTTP will have timed out by then, but the worker will still keep on going.
I'm not involved in the project, but you do this if you want the same robustness as an RDBMS, e.g. if a client gets an acknowledgement back from the queue then the data is definitely captured and won't be lost if the queue process then crashes.
I agree that it makes sense in certain circumstances and definitely depends on requirements (not only the transactional enqueues you mentioned, but also, a lot of the time, a need for O(lg n) lookup/update/deletion of jobs in flight), and I've also seen a few MySQL-backed queues in production that handle fairly high throughputs. I'm mostly interested in why this particular project does it.