
That document acknowledges inherent JVM limitations:

> Java garbage collection becomes increasingly fiddly and slow as the in-heap data increases.

So they had to work around that. In my view it's not a tiny issue. I'd say that instead of working around such inherent limitations, it's better not to have them to begin with when building high-performance systems. That was my main point above. Time spent dancing around such problems defeats the purpose of the supposed ease of development.



You didn't even bother to read past the section about the limitations - the engineers were well aware of them in advance, and they re-emphasize them precisely because they're a common concern for people who question their choice of the JVM.

The next line reads: "As a result of these factors using the filesystem and relying on pagecache is superior to maintaining an in-memory cache or other structure—we at least double the available cache..."

And if you don't know about the pagecache: it's an in-memory cache managed by the OS and has nothing to do with the JVM's heap at all.
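To make the "rely on the pagecache" idea concrete, here's a minimal, hypothetical Java sketch (class and method names are mine, not Kafka's). It uses `FileChannel.transferTo`, which on Linux can use `sendfile(2)`, so the segment's bytes move from the OS pagecache to the destination channel without ever being copied onto the JVM heap:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: serve a log segment straight from the pagecache.
// The message bytes never become Java objects, so the GC never sees them.
public class SegmentSender {
    public static long sendSegment(Path segment, WritableByteChannel dest)
            throws IOException {
        try (FileChannel ch = FileChannel.open(segment, StandardOpenOption.READ)) {
            long size = ch.size();
            long sent = 0;
            // transferTo may send fewer bytes than requested, so loop.
            while (sent < size) {
                sent += ch.transferTo(sent, size - sent, dest);
            }
            return sent;
        }
    }
}
```

The point of the design: the heap stays small regardless of how much data is served, which is exactly why the GC concern quoted above doesn't bite here.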

And you forgot that C++ isn't the easiest language when it comes to designing a distributed system. Scala, as I mentioned, offers many other features that suit the needs of the team. Of course, if you're a good engineer you'll know there are trade-offs such as compile times, but that's true of every engineering decision.


They were aware and had to design around them. See my point above.


Designing around platform flaws can be worth it if there are commensurate benefits. You're very lucky if you've never had to do this, or perhaps just blind to the tradeoffs you were making.


That's like saying "we had to consider memory management" in a C++ system.

When you are designing a high performance system you have to consider everything. Different platforms have different tradeoffs, but the tradeoffs on the JVM have been well proven over time.


They didn't "work around it". Kafka was designed from the start to avoid memory pressure.

BTW, I've talked to finance people who have JVM applications that haven't had a major GC in months. Really, it's not a big deal.


Having written a few low-latency systems, some of them on the JVM, I will say that in all of those cases allocating/freeing memory is always a slowdown, GC or not (cache coherence is almost always the deal-breaker here). So in those systems you simply do not allocate/deallocate along the critical path.

Is this difficult in Java? Yes. It's also difficult in C++. Just because it is difficult doesn't mean it is impossible.
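For a concrete picture of what "no allocation on the critical path" means in Java, here's a hypothetical sketch (names and structure are mine): every message slot is preallocated at startup, and the hot loop only reuses slots, so nothing is handed to the GC while traffic flows.

```java
// Hypothetical sketch of an allocation-free hot path: a fixed ring of
// preallocated message slots. The 'new' keyword only appears at startup.
public class MessageRing {
    public static final class Slot {
        public final byte[] payload;
        public int length;                 // bytes in use this cycle
        Slot(int capacity) { payload = new byte[capacity]; }
    }

    private final Slot[] slots;
    private long head;                     // next slot to hand out

    public MessageRing(int size, int slotCapacity) {
        slots = new Slot[size];
        for (int i = 0; i < size; i++) slots[i] = new Slot(slotCapacity);
    }

    // Claim a preallocated slot for reuse: no allocation here, ever.
    public Slot claim() {
        Slot s = slots[(int) (head % slots.length)];
        head++;
        return s;
    }
}
```

A real system would add bounds-checking against slow consumers (as disruptor-style ring buffers do), but the essential trick is the same: the working set is fixed, so neither the allocator nor the collector runs on the hot path.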

So if I am resorting to managing my own memory anyway, why would I use the JVM? Because typically the code on the critical path is a small percentage of the entire code base, and the other advantages of the JVM (tooling, language features, libraries, etc.) outweigh the downsides.

That's not always the case, and I don't have any specific knowledge of Kafka, but just because something needs to be low/consistent latency doesn't mean it can't (or shouldn't) be written on the JVM.


You are right that this can be a major problem with the JVM and working around it can be a lot of work. But you need to consider what kind of system we're talking about in this particular case.

This is a persistent message queue for log messages. Messages come in sequentially and subscribers read them sequentially. It makes zero sense to keep tons of messages in memory inside complex data structures, as they are never indexed, searched, or analyzed.

So in this particular case it's not actually a workaround. It's just sensible design and I wouldn't do it any differently in C++ either.
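That sequential-append, sequential-read design is simple enough to sketch in a few lines of Java. This is a hypothetical illustration (not Kafka's actual format): each record is a length-prefixed blob, producers append at the tail, and each consumer reads forward from its own byte offset, so no in-heap structure ever holds the messages.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical append-only log: 4-byte length prefix, then the payload.
public class AppendLog {
    private final FileChannel ch;
    private long tail; // next write position

    public AppendLog(Path file) throws IOException {
        ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE);
        tail = ch.size();
    }

    // Append a record at the tail; returns the byte offset where it begins.
    public long append(byte[] record) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(4 + record.length);
        buf.putInt(record.length).put(record).flip();
        long offset = tail;
        while (buf.hasRemaining()) tail += ch.write(buf, tail);
        return offset;
    }

    // Read the record starting at 'offset' (a consumer's position).
    public byte[] read(long offset) throws IOException {
        ByteBuffer len = ByteBuffer.allocate(4);
        while (len.hasRemaining()) ch.read(len, offset + len.position());
        len.flip();
        ByteBuffer rec = ByteBuffer.allocate(len.getInt());
        while (rec.hasRemaining()) ch.read(rec, offset + 4 + rec.position());
        return rec.array();
    }
}
```

Note what's absent: no tree, no hash map, no per-message object. The OS pagecache ends up caching exactly the hot head and tail of the file, which is the "sensible design" point above.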



