Postgres: A better message queue than Kafka? (dagster.io)
65 points by tylermcginnis on Oct 4, 2022 | 45 comments



Not just Postgres.

You can do exactly this with MySQL and SQL Server too, because they both support SKIP LOCKED.
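
A minimal sketch of that SKIP LOCKED pattern (illustrative only: the jobs table, its columns and the connection string are made up; this uses psycopg2 against Postgres, and the same FOR UPDATE SKIP LOCKED clause also exists in MySQL 8+):

    # Hypothetical worker using FOR UPDATE SKIP LOCKED to claim one row at a time.
    import psycopg2

    conn = psycopg2.connect("dbname=queue_demo")  # assumed connection string

    def claim_next_job():
        """Claim one unprocessed row; concurrent workers skip rows that are already locked."""
        with conn, conn.cursor() as cur:
            cur.execute(
                """
                SELECT id, payload FROM jobs
                WHERE processed_at IS NULL
                ORDER BY id
                LIMIT 1
                FOR UPDATE SKIP LOCKED
                """
            )
            row = cur.fetchone()
            if row is None:
                return None  # queue is empty, or everything is already claimed
            job_id, payload = row
            # ... process the payload here ...
            cur.execute("UPDATE jobs SET processed_at = now() WHERE id = %s", (job_id,))
            return payload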

Interestingly, the plain old file system on Linux also makes a perfectly acceptable basis for a message queue in many use cases - the thing that makes it work is that the file move operation is atomic. Atomic moves are what make queuing systems possible.

You could write a file system based message queue in 100 lines of async python, which I did here:

https://github.com/bootrino/arniesmtpbufferserver

File system based message queues can be written in any language, are extremely simple and, most importantly, need zero configuration. One of the most frustrating things about queuing systems is configuration - that includes database backed queuing systems. They can also be fast - I wrote one in Rust which maxed out the hard disk's random write capability well before maxing out the CPU; from memory, it beat most of the common queuing systems in terms of messages per second.

Not all use cases for queues need globally distributed message queues with the sort of guarantees needed for financial transaction processing. I would suggest that in fact most queues out there are used as outbound SMTP queues, which are then over-engineered with something like Celery, which is a nightmare to configure and debug.


We had difficulty with the purely advisory locking that Linux has. The possibility of network filesystems also made it a pain.

Do you have any experience with either?


You don't lock, you move the file that is next to be processed. File moves are atomic. You move the file out of the set of files that are being picked up for processing.

Lock free.
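
A minimal sketch of that claim-by-rename step (directory names are illustrative, not taken from the linked repo):

    # Each worker tries to rename the next file out of the shared outbox into its own
    # processing directory. rename() is atomic within one filesystem, so exactly one
    # worker wins the claim; the others get FileNotFoundError and move on.
    import os

    OUTBOX = "outbox"          # files waiting to be processed (assumed layout)
    PROCESSING = "processing"  # files this worker has claimed

    def claim_one():
        for name in sorted(os.listdir(OUTBOX)):
            src = os.path.join(OUTBOX, name)
            dst = os.path.join(PROCESSING, name)
            try:
                os.rename(src, dst)  # atomic move; this is the whole "lock"
                return dst           # we now own this message
            except FileNotFoundError:
                continue             # another worker claimed it first
        return None                  # outbox is empty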


Network file systems do not support atomic moves, but you should not run such an application on a network file system.


I also found that advisory locking has a lot of gotchas, especially when used in multithreaded contexts (apparently you can lose the lock because a different thread closed a different file descriptor on the same file).


What makes it atomic is running publishers and consumers on the same box (since you're sharing a filesystem between them).

Also listdir is a big bottleneck here:

    while True:
        # get files in outbox
        files_in_outbox = [f'{PREFIX}/outbox/{x}' for x in os.listdir(f'{PREFIX}/outbox')]


>> What makes it atomic is running publishers and consumers on the same box (since you're sharing filesystem between those).

It's the move/rename that is atomic.

https://man7.org/linux/man-pages/man2/rename.2.html


Not really.

       However, there will
       probably be a window in which both oldpath and newpath refer to
       the file being renamed.
But that's not even the main point.

1. The move happens after the email is sent, so there is a window where the email is already being sent but the file still exists.

2. Even if you do the move before, there's still a window between os.listdir() and os.remove().

3. Complexity is O(N^2) due to listdir() + getctime() being called on every iteration.

If you just want to ensure order, it probably works fine at a small scale. But it would be unwise to run multiple consumers on a single instance, and impossible to run them on multiple instances.


I worked with a network filesystem that supported atomic renames and we based an entire large-scale production system on the idea that it would work (it did). The system supported Youtube and Google Play model training, regularly processing increments of hundreds of terabytes.


love to see this kind of hackery


I love to see effective yet brutally-simple "redneck engineering" solutions to software problems, particularly ones that straddle the line between genius and stupidity in a way that makes architecture astronauts feel uncomfortable.

I used to work at a brokerage that worked with a panel of around 12 providers, all of whom offered 5+ products that were updated multiple times per year to meet changing regulatory requirements. When a sales adviser recommended product X from provider Y to a customer, the adviser would then need to fill in an application form that was precisely tailored to that particular revision of that particular product. Bear in mind that these were complex, multi-sectioned forms. Needless to say, this created a huge workload for the devteam to keep track of all the product changes and update the UI accordingly each time.

At some point, someone on the devteam had the genius idea to simply take the PDF application forms from the provider, extract the individual pages as PNGs and overlay HTML form elements on top of them. The provider would essentially be doing our UI design for us. Add in an editor tool so the sales managers could set up the forms themselves and a tagging system so specific fields could be pre-filled from customer data we already had stored in the DB and the devteam's workload dropped by maybe 90%. Simple, stupid perhaps, but effective.


A controversial, but very pragmatic take.

Queues are great for semi-infinite scalability, but you rarely need it.

There are numerous subtle benefits to using a DB compared to regular message queues that are often overlooked.

Being able to delete, reorder, or edit specific messages can be a lifesaver when things go wrong.

The most common issue with queue-based systems is getting overwhelmed with messages: either your consumers went down, or your producers had a bug / executed too frequently.

Being able to recover with a simple SQL query is a blessing.


Here are just a few examples from my experience (a sketch of the DB-side fixes follows the list):

1. A huge number of messages from the test system were accidentally inserted into production.

Queue solution: disable consumers, move messages to a temporary queue while filtering them, move messages back to the old queue, enable consumers

DB solution: just delete rogue messages

2. We want to store some of the messages, but we're not ready to process them yet

Queue solution: create a separate queue for each message type, insert messages into different queues, and manually move them when you're ready to process them (and keep in mind that not every queue can persist messages forever; SQS, for example, can't hold messages longer than 14 days)

DB solution: just skip those messages while processing

3. Consumers went down and the queue now contains a large number of duplicate messages. While it was fine to just wait a couple of hours to let it stabilize, a whale customer started complaining

Queue solution: none (any hacky solution would take longer than it takes for the system to naturally stabilize)

DB solution: move whale customer messages to the front of the queue
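
To make those DB-side fixes concrete, here is roughly what such one-off recoveries can look like (purely illustrative: the messages table and its source, customer_id, priority and processed_at columns are hypothetical, as is the connection string):

    # Ad-hoc recovery queries of the kind described above; the schema is made up.
    import psycopg2

    conn = psycopg2.connect("dbname=queue_demo")  # assumed connection string
    with conn, conn.cursor() as cur:
        # Example 1: drop the rogue messages that came from the test system
        cur.execute("DELETE FROM messages WHERE source = 'test'")

        # Example 3: pull a whale customer's pending messages to the front of the queue
        cur.execute(
            "UPDATE messages SET priority = 0 WHERE customer_id = %s AND processed_at IS NULL",
            (42,),  # hypothetical customer id
        )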


1. Kafka isn't a message queue, but you can advance consumer offsets past the bad data. Hit the person who inserted test data into prod with a cricket bat, and then hit the person who designed a system where they could do that with the same cricket bat, but harder.

2. Kafka - do what you did with your DB consumers.

3. Kafka - consumers going down can't cause duplicates, what's even going on with your queue?


> what's even going on with your queue?

A message with the latest state of an object gets pushed into the queue. When no one is consuming those, you essentially get duplicates: the queue contains all of the updates, but you only care about the most recent one.
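
With a DB-backed queue that case can be collapsed in one query. A sketch (DISTINCT ON is Postgres-specific, and the messages schema here is hypothetical):

    # Keep only the newest pending update per object, which is all that matters
    # when each message carries the full latest state of the object.
    import psycopg2

    conn = psycopg2.connect("dbname=queue_demo")  # assumed connection string
    with conn, conn.cursor() as cur:
        cur.execute(
            """
            SELECT DISTINCT ON (object_id) object_id, payload
            FROM messages
            WHERE processed_at IS NULL
            ORDER BY object_id, created_at DESC
            """
        )
        latest_updates = cur.fetchall()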


Throughput is often where an RDBMS falls over here. Unless you have partitions/isolation around your high throughput workloads (or throw tons of horsepower/$$$ at it) you can’t achieve the same ops/sec as a dedicated queueing mechanism.

But you do have a lot of valid points, particularly about failure recovery. My approach tends to involve a hybrid of the two if high speed is a necessary component somewhere. The system can restore its runtime state from a db, but the active mechanics of it are running on message passing of some kind.


"The write path. We built a daemon that would select log entries that were older than two weeks, copy them into a file in S3, and delete the rows from the database"

Seriously... How can you ever consider saying "an RDBMS is just fine as a Kafka alternative" under those conditions?


How does Kafka solve this problem any better?

Your messages expire so if you want to archive you need a consumer to write them to disk. That sounds very similar.


Why not?


Ideology.

There are a lot of people who are ideologically opposed to database-backed message queues. They're usually reluctant to give detailed explanations why, because it's an emotional thing.


Because that would be equivalent to saying "an RDBMS is fine as a Kafka alternative provided you build a compensation mechanism for all the things it does worse than Kafka". It doesn't make any sense.


> Postgres: a better message queue than Kafka?

"We didn't actually evaluate Kafka for this use case, because we know Postgres, so we used that."

Weird article angle.


I suppose they're asking the question here too


File this under "If a headline is asking a question, then the answer is NO." Honestly, I'm not even sure what point the author is trying to make besides "anecdotally and at low scale, unusual-for-the-purpose technology X solved problem Y." A near-infinite number of bad patterns can solve problems along happy paths and resolve plenty of edge cases to boot. 99.9% availability was a goal post here? There are systems where 7 9s is unacceptable. The author didn't even provide data backing the measurement goals.


This solves their problems, not some theoretical other problem.

They avoided over-engineering and building more complexity than required.


What are those mythical systems where seven nines is unacceptable? What are the chances you're going to work on one of those?

Google Spanner for example is up to five nines. Are there a lot of systems that need to be three orders of magnitude more reliable than Google Ads?


You're right, nothing is more important than serving ads.


Nothing is more profitable than serving ads.

I'm still waiting for an answer.


While I have no idea what kind of distributed system would need 7+ 9s, there are plenty of solution providers who at least promise it as part of their marketing. I truly disbelieve that these companies don't have any point of failure within their system that would take them under that threshold, even if it hasn't been challenged to date, but that level of availability is still the published goal. Hopefully needless to say, but 99.9% is not acceptable for most "important" (and yes, advertising is important) applications outside of US-East-1 somehow. Quick examples, no experience with any of these:

* https://www.infinidat.com/en/news/press-releases/infinidat-h...

* https://vindicia.com/blog/9999999-global-system-uptime/

* https://www.ibm.com/downloads/cas/A856LOWK

Bad examples? Certainly yes. Are there any good examples? Maybe, doubtful. Does any system actually need 7 9s? Maybe, doubtful.


Something I think needs to be stated: Kafka is not a message queue. If you need things like work stealing, you're going to have a bad time. It doesn't seem to apply in this case, but the terminology misuse is going to lead some poor developer astray.

It seems like they only tried Postgres, so they didn't compare it with Kafka at all... So yeah, you can make a message queue or log table in an RDBMS.


"Many haters want you to buy their message queue product" lol


You can spew messages into sqs then have a lambda on sqs sending to pg.

Funny, we were going to use kafka too, but just sending to mysql worked just fine <shrug>. One day that will change, obviously.
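
A rough sketch of that SQS-to-database Lambda idea (entirely illustrative: the table name, environment variable and connection handling are assumptions, and batching and error handling are omitted):

    # Lambda handler on an SQS event source that writes each message into a DB table.
    import os

    import psycopg2

    conn = psycopg2.connect(os.environ["DATABASE_URL"])  # reused across warm invocations

    def handler(event, context):
        with conn, conn.cursor() as cur:
            for record in event["Records"]:  # SQS delivers a batch of records
                cur.execute(
                    "INSERT INTO messages (payload) VALUES (%s)",
                    (record["body"],),
                )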


It works fine on the happy path, but if mysql goes down, you lose things. Having a distributed, resilient queue has availability benefits.


> It works fine on the happy path, but if mysql goes down, you lose things.

Why do you have to lose things if the DB goes down? Agreed, untuned & unconfigured MySQL (and MongoDB) out of the box can lose things due to bugs and design issues, but that is the case even when they are running. However, DBs in general are made precisely for the purpose of safely storing things and not losing them.

OTOH, the number of Kafka setups I have seen that'd lose things when something goes down ... maybe this is not a guaranteed win for the Kafka side of arguments.

> Having a distributed, resilient queue has availability benefits.

High availability is not a function exclusive to Kafka. On the other hand, there are some functions that may come in handy in a queue that Kafka simply cannot provide, but DBs can. Off the top of my head: ACID, instant scalability (both up and down) of consumer groups, and the sheer flexibility (and power) that comes with a DB in general.

----

Overall, there are some merits to using a distributed log as a message queue, sure, but there are also merits to using a DB for that.


> OTOH, the number of Kafka setups I have seen that'd lose things when something goes down ...

Was the data actually acked across all replicas by Kafka? If so, very interested in how it was lost.


That's why we have an RDS cluster. We're not multi-region because, frankly, if the region barfs we can blame it on Amazon. And in any case we'll still deliver just fine as long as cloudfront keeps going.


Also, we use SQS as an intermediary between our logging and mysql, so even if mysql goes down SQS will keep things for...a while.


I don’t think this article made a compelling reason not to use Kafka. In fact it may have made the opposite point.

Wouldn’t it have been easier to just use Kafka?


From TFA, Postgres was an established tool in their org that they already had expertise with. The author argues that adopting a new tool would have been risky and costly.


Learning how to deploy something at scale costs time.


I don’t want to doubt Postgres’ effectiveness as a message queue (however difficult it may be) but I have to wonder if maybe they were doing something weird or wrong in the consumer for it to be ineffective?

Maybe it’s not that Postgres is better than Kafka so much as that Postgres is a better solution for what they were trying to do with Kafka?


“Today these are not major problems for us, and frankly, I’m not sure how we’ll tackle these problems. Maybe we’ll grow multi-region support and buy really big Postgres boxes. Maybe we’ll move to sharded Postgres. Maybe we’ll switch to Kafka or some other distributed queue.”

LOL


Since they're already doing offload to S3 for older log messages, it seems like they could do this from the get-go and save a bunch of storage and interzone transfer costs in RDS. Depends on the log message sizes though of course.


There are also commercial products like NServiceBus that can use SQL Server as a connector to provide storage. I always used to think of doing this in the database as a poor man's message solution. Perfect for small projects.


Honestly that's comparing apples and oranges. Also, that title is clickbait. It feels like a PR article disguised as a tech article.



