
You are absolutely right. That's what I said in my sibling comment, so I'm not sure what's backwards. However, if your millions of little topics don't fit on a single machine - what do you do? You need a fat pipe. Hence you put your little topics into the fat pipe and send them over it to the other MQTT brokers that handle egress.

Example - you have 1 topic with 1 producer and 20M consumers. Each consumer is a TCP connection. Say that you can do C1M happily, you still need 20 brokers to serve egress for all your connected clients. Now imagine that you have 100 brokers, 100M connected clients, and your connections are randomly distributed over your brokers. You don't want to route every message to every broker, because most brokers won't have any subscribers for it. So you need a fat pipe and some middle man that knows which brokers a message must be routed to, because there are currently consumers connected to those brokers, subscribed to topics and waiting for messages. As someone who works on MQTT, you for sure understand the problem.
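To make the "middle man" idea concrete, here is a minimal sketch (not any particular broker's implementation; all names are illustrative) of the subscription-aware routing table such a component would maintain:

    # Hypothetical sketch: track which brokers currently have subscribers for
    # a topic, so a message is forwarded only to those brokers instead of
    # being broadcast to all 100 of them.
    from collections import defaultdict

    class SubscriptionRouter:
        def __init__(self):
            # topic -> set of broker ids with at least one subscriber
            self.routes = defaultdict(set)

        def on_subscribe(self, broker_id, topic):
            self.routes[topic].add(broker_id)

        def on_unsubscribe(self, broker_id, topic):
            self.routes[topic].discard(broker_id)

        def brokers_for(self, topic):
            # Only these brokers need the message over the fat pipe.
            return self.routes.get(topic, set())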

I have never heard of Eclipse Amlen. However, I have been working with MQTT and Kafka since 2012 and have seen a nation-wide successful MQTT rollout where MQTT and Kafka worked in tandem to solve exactly the problem you are talking about - millions of little concurrent connections distributed over a large fleet of devices with sub-100ms round trips.

It's not a competition, it's a cooperation.



I agree there are cases where Kafka and MQTT are used together. If you have lots of MQTT clients producing messages fanned-in to a small number of backend apps (or consuming a small number of wide fan-out messages), people often combine Kafka as a fat pipe behind MQTT brokers (though there are alternatives, e.g. consuming messages from the brokers using MQTTv5 shared subscriptions).
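As a sketch of that fan-in pattern (assuming paho-mqtt with the 1.x callback API and confluent-kafka; hosts and topic names are illustrative), several bridge instances could share the load via an MQTTv5 shared subscription and forward into one wide Kafka topic:

    # Hedged sketch of an MQTT -> Kafka "fat pipe" bridge.
    import paho.mqtt.client as mqtt
    from confluent_kafka import Producer

    producer = Producer({"bootstrap.servers": "kafka:9092"})

    def on_message(client, userdata, msg):
        # Fan-in: forward every device message into one Kafka topic,
        # keyed by the MQTT topic so related messages land on one partition.
        producer.produce("device-ingress", key=msg.topic, value=msg.payload)
        producer.poll(0)  # serve delivery callbacks without blocking

    client = mqtt.Client(client_id="bridge-1", protocol=mqtt.MQTTv5)
    client.on_message = on_message
    client.connect("mqtt-broker", 1883)
    # $share/<group>/<filter>: multiple bridges split the subscription load.
    client.subscribe("$share/bridge-group/devices/+/telemetry", qos=1)
    client.loop_forever()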

In more complicated situations (e.g. "outgoing" persistent messages buffered for individual clients, i.e. each client has a "message inbox"), Kafka is less obviously useful (it's an anti-pattern to try to random-seek messages out of Kafka topics as clients connect). In this kind of pattern, the main architecture I see is clients partitioned across (highly available) MQTT brokers. If the messages go from MQTT clients directly to other clients (e.g. instant messaging - Facebook Messenger uses MQTT), having these brokers in a cluster sharing a topic tree is very useful.

If the outgoing messages don't get buffered while the clients are offline (e.g. because they are responses to client requests), then you don't need each client to be routed to a "home" broker that buffers its messages - it can connect anywhere.
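When you do need the inbox pattern, a minimal sketch of how a connection gateway might pin each client to a home broker (plain hashing over the client id; broker names are illustrative, not any specific product's mechanism):

    # Hypothetical sketch: pin each client to a "home" broker that buffers
    # its inbox, by hashing the client id onto a fixed broker list.
    import hashlib

    BROKERS = ["broker-00", "broker-01", "broker-02", "broker-03"]

    def home_broker(client_id: str) -> str:
        digest = hashlib.sha1(client_id.encode()).digest()
        index = int.from_bytes(digest[:4], "big") % len(BROKERS)
        return BROKERS[index]

    # The same client always lands on the same broker, so its buffered
    # messages are waiting there when it reconnects.
    print(home_broker("vehicle-123456"))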

It all depends on the shape of the message flow you're designing the system to support (but saying MQTT is a toy for small numbers of topics, as you seemed to argue up-thread, strikes me as misguided).


> So you need a fat pipe and some middle man [...] subscribed to topics and waiting for messages.

Your use case is not everybody's use case, nor the one presented in the article. Most usages of Kafka I have encountered in the wild are for notification delivery or telemetry reporting, on the order of a few thousand msgs/s.

You do not need a fully distributed ordered log system for that. MQTT does the job for a fraction of the complexity and operational cost.
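For that order of magnitude, the whole producer side can be a plain MQTT publish loop, no log infrastructure involved. A sketch assuming paho-mqtt (1.x API); host and topic are illustrative:

    # Sketch of the "few thousand msgs/s" telemetry case handled by MQTT alone.
    import json, time
    import paho.mqtt.client as mqtt

    client = mqtt.Client(client_id="sensor-42")
    client.connect("mqtt-broker", 1883)
    client.loop_start()  # network loop in a background thread

    while True:
        reading = {"ts": time.time(), "temp_c": 21.7}
        client.publish("site/plant-1/sensor-42/telemetry",
                       json.dumps(reading), qos=1)
        time.sleep(1.0)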

> Say that you can do C1M happily, you still need 20 brokers to serve egress for all your connected clients. Now imagine that you have 100 brokers, [...]

Even at these scales, you can find commercial MQTT brokers doing over 20M msgs/sec nowadays.

With OSS solutions, you could get there with HAProxy + your favorite MQTT broker behind DNS load balancing; as long as you do not require HA, scale alone should not be the issue.

It would even play pretty nicely with anycast if you want to place your brokers at the edge, close to your customers, and do some proper partitioning.

That is pretty much the case presented in the article: they just advertise telemetry reporting (very likely not HA) ingested into a time series database.

Once again, if what you need is a fully ordered distributed commit log for a complex event-sourcing scenario: go for it, go for Kafka, it has been designed for that. But that is just not the case for most Kafka instances I see deployed in the wild; those are generally the result of quick Google-search-driven engineering.




