
All very good points. What I like about Kafka is that you can queue up a bunch of messages without needing to be able to handle that load immediately. It lets you build very resistant patterns: if your message-senders overwhelm your message receivers in HTTP you can end up with connection failures, get stuck waiting, etc. In Kafka what happens is you now have a large backlog to work through, but at least your messages are somewhere accessible to you and not dropped on the floor.
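To make the backlog-versus-dropped difference concrete, here is a toy simulation, not a model of Kafka or HTTP internals. The function name `simulate`, the tick-based timing, and the buffer sizes are all made up for illustration: an unbounded buffer stands in for a durable log, and a small bounded buffer stands in for a receiver that rejects work it cannot hold.

```python
from collections import deque


def simulate(arrivals, capacity_per_tick, buffer_limit=None):
    """Return (processed, dropped, max_backlog) for per-tick arrival counts.

    buffer_limit=None keeps every message (log-like behaviour);
    a small buffer_limit rejects messages once the buffer is full.
    """
    backlog = deque()
    processed = dropped = max_backlog = 0
    for tick, count in enumerate(arrivals):
        for i in range(count):
            if buffer_limit is not None and len(backlog) >= buffer_limit:
                dropped += 1              # sender sees a failure
            else:
                backlog.append((tick, i))
        # The receiver handles at most capacity_per_tick messages per tick.
        for _ in range(min(capacity_per_tick, len(backlog))):
            backlog.popleft()
            processed += 1
        max_backlog = max(max_backlog, len(backlog))
    return processed, dropped, max_backlog


burst = [10, 0, 0, 0, 0]  # one burst of 10 messages, then quiet
log_like = simulate(burst, capacity_per_tick=2)
direct = simulate(burst, capacity_per_tick=2, buffer_limit=3)
print(log_like)  # (10, 0, 8): everything is processed eventually
print(direct)    # (3, 7, 1): most of the burst is rejected
```

The log-like run trades latency for completeness: nothing is lost, but the last message waits several ticks.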

HTTP definitely has the edge when it comes to library support. In fact, Confluent et al. offer HTTP endpoints for Kafka so that you don't have to deal with the vagaries of actually connecting to a broker yourself. (The default timeout in Python for an unresponsive broker is _criminal_ for consumers: you will spend several minutes wondering when the message will arrive.) We use an in-house one. But that introduces HTTP's problems back into the process; you need to worry about overwhelming your endpoint again...

Regarding application patterns, ideally you're writing applications that read data from one topic (or receive messages, parse a file, etc) and write to another topic. Treating it as a request that will somehow be responded to later in time scares me and I wouldn't do it. What if your application needs to be restarted while some things are in-flight?




> It lets you build very resistant patterns: if your message-senders overwhelm your message receivers in HTTP you can end up with connection failures, get stuck waiting, etc.

I think the biggest drawback to HTTP in this space is that there's typically no coordination between clients and the server. Clients send requests when they want and the server has to respond immediately.

That becomes a big issue when you have an outage and all your clients are in retry loops, spiking your requests per second to 3x what they would normally be, on top of whatever the actual issue is.
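One common way to soften that retry spike on the client side is exponential backoff with random jitter, so clients spread their retries out instead of hammering the server in lockstep. The sketch below is an illustration, not anything from this thread: `call_with_retries`, its defaults, and the choice of `ConnectionError` as the retryable failure are all assumptions.

```python
import random
import time


def call_with_retries(fn, attempts=5, base=0.1, cap=5.0, sleep=time.sleep):
    """Call fn(), retrying on ConnectionError with capped, jittered backoff.

    The wait before retry k is drawn uniformly from
    [0, min(cap, base * 2**k)], often called "full jitter".
    """
    for attempt in range(attempts):
        try:
            return fn()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
```

Passing `sleep` as a parameter is only there to make the function easy to test without real waiting.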

Most of the retry logic is shared either way, i.e. your code should still have handlers for when Kafka isn't responding properly. Kafka only preserves messages that are already on the queue; it won't help if you lose network connectivity, your ACLs get messed up, etc.

> Regarding application patterns, ideally you're writing applications that read data from one topic (or receive messages, parse a file, etc) and write to another topic. Treating it as a request that will somehow be responded to later in time scares me and I wouldn't do it. What if your application needs to be restarted while some things are in-flight?

The pattern I've seen is to make the processing itself idempotent, and only commit a message's offset once it has been successfully processed. So if you restart the app while it's processing, the uncommitted messages are still sitting there in Kafka. Once the consumer's session times out, the consumer group rebalances and Kafka assigns the partition to another node, which resumes from the last committed offset.
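A toy sketch of that pattern: apply each message idempotently, and record progress only after the work has succeeded. `PartitionLog` and `run_consumer` are made-up names that stand in for a partition and a consumer; this is not a Kafka client API, and a real consumer would record progress through its client library.

```python
class PartitionLog:
    """Toy stand-in for one partition plus one group's committed position."""

    def __init__(self, messages):
        self.messages = list(messages)
        self.committed = 0  # next offset to read after a restart

    def poll(self):
        """Return (offset, message) pairs from the committed position on."""
        return list(enumerate(self.messages))[self.committed:]

    def commit(self, offset):
        self.committed = offset + 1


def run_consumer(log, store, crash_at=None):
    """Apply each message, recording progress only after it is applied."""
    for offset, (key, value) in log.poll():
        store[key] = value  # idempotent: replaying yields the same state
        if crash_at == offset:
            raise RuntimeError("simulated crash before progress was recorded")
        log.commit(offset)


log = PartitionLog([("a", 1), ("b", 2), ("c", 3)])
store = {}
try:
    run_consumer(log, store, crash_at=1)
except RuntimeError:
    pass  # the app "restarts" here
# Offset 1 was applied but never recorded, so it is delivered again.
run_consumer(log, store)
print(store)  # {'a': 1, 'b': 2, 'c': 3}
```

The message at offset 1 is handled twice, which is harmless only because writing the same key and value again changes nothing. A non-idempotent handler (say, incrementing a counter) would double-count here.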

As far as RPC, I'm not advocating that it's a good idea, but you could implement timeouts and retries on top of an event bus. Edge cases will abound, and I wouldn't want to be in charge of it, but you could shove that square block into the round hole if you push hard enough.
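For what it's worth, the square-block version might look roughly like the sketch below: tag each request with a correlation id, wait for the matching reply until a deadline, and resend on timeout. Everything here is hypothetical: `Bus`, `serve`, and `call` are illustrative names, the in-process `queue.Queue` objects stand in for a request topic and a reply topic, and only a single caller is assumed to be reading replies.

```python
import queue
import threading
import time
import uuid


class Bus:
    """Two in-process queues standing in for request and reply topics."""

    def __init__(self):
        self.requests = queue.Queue()
        self.replies = queue.Queue()


def serve(bus, handler):
    """Consume requests forever, replying with the same correlation id."""
    while True:
        correlation_id, payload = bus.requests.get()
        bus.replies.put((correlation_id, handler(payload)))


def call(bus, payload, timeout=0.5, retries=2):
    """Send a request and wait for its reply, resending on timeout."""
    for _ in range(retries + 1):
        correlation_id = uuid.uuid4().hex
        bus.requests.put((correlation_id, payload))
        deadline = time.monotonic() + timeout
        while (remaining := deadline - time.monotonic()) > 0:
            try:
                reply_id, result = bus.replies.get(timeout=remaining)
            except queue.Empty:
                break  # this attempt timed out
            if reply_id == correlation_id:
                return result
            # Otherwise: a stale reply from an earlier attempt, ignore it.
    raise TimeoutError("no reply after retries")


bus = Bus()
threading.Thread(target=serve, args=(bus, lambda x: x * 2), daemon=True).start()
print(call(bus, 21))  # 42
```

Note that a resend means the handler may run more than once for the same logical request, which is one of the edge cases: the handler has to be idempotent, or the caller has to reuse one request id and the server has to deduplicate on it.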


How does that still make Kafka a better choice than any of the other queueing systems out there? SQS, Redis, ActiveMQ, RabbitMQ: there are tons of queues out there that are far easier to use than Kafka.


I was comparing queues to HTTP endpoints for sending messages. I can't speak to any others besides Kafka and Redis.

Redis is an order of magnitude easier to work with but struggles under loads that Kafka has no problem with. Also, every once in a while our Amazon-managed Redis queue will have a bad failover or melt down because someone runs a bad command on it, but our Amazon-managed Kafka has been rock solid since we switched to it. When we ran Kafka ourselves, though, we definitely watched it melt down a few times because we threw too much at one broker or made obscure config mistakes. And figuring out why a consumer isn't getting messages is always a pain, whereas Redis is always a dream to use.



