Seriously, the one time I was in a situation where much of the team seemed hellbent on this "just put all in Kafka" idea (without really understanding why, exactly) the arguments they came up with were not too dissimilar from what you've shared with us above. It all seemed to come down to "OMG databases are hard, schemas are hard, our customers don't understand the data they're shoving at us. But Kafka will take care of all of that for us. Because, you know, shiny."
That said I'd still like to have a more ... balanced understanding of why Kafka may not necessarily be The Answer, and/or have more hidden complexity or other negative tradeoffs than we may have bargained for.
My take was humorous but it didn’t hide anything. Kafka was built so that LinkedIn could shove all its real-time click data through a single funnel: terabytes upon terabytes. It has since been evangelized and created a cottage industry of Confluent salespeople who will give your manager a course in how to lobby their engineers into using Kafka. Have scaling problems? Kafka. Have business events that need to be ordered? Kafka! Have “changing schemas”? KAFKA!! I’m always suspicious when a company gives a product away for free but then charges $$$ for “support”.
I worked for a high profile recently-failed project from a company that rhymes with Brillo, and our data was just beginning to be too big for Google Sheets (!). However, we were also having organizational problems because the higher ups were seeing the failing project losing money so they of course decided to hire 100 extra engineers. Our communications (both human and programmatic) were failing and the Confluent salespeople began circling like buzzards. Of course by the time it was suggested we use it the project was already 6 months past the point of no return.
My advice is that if your data fits in a database, use a database. Anyone who says that isn’t scalable should have to tell you the actual reason it doesn’t scale and the number of requests/users/GBs/uptime/ etc that is the bottleneck.
To be very clear, Confluent doesn't give _their_ product away for free. Confluent Platform has many differing features to Apache Kafka that never make it into upstream.
E.g., Confluent Replicator vs. Mirror Maker 2, Confluent Platform's tiered storage has been available for quite some time (right now a bunch of people from AirBnB are doing a stellar job bringing tiered storage to FOSS Kafka, I'm hoping 3.1 or 3.2).
Actually, easiest thing to do to see the differences is grep all the subpages of this link for properties that start with "confluent":
I only had a brief exposure to it, but my impression is that it's sort of a message queue optimized for very large data (TB or more). So, for example, there's no way to easily answer questions like "How many requests did server X generate between 1pm and 2pm and how many of them were served by server Y?" because when your data doesn't fit in a single machine, supporting such queries requires a lot of bookkeeping. If you never need them, you don't want to pay for them.
Of course, when you have a few megabytes of data and you route it through Kafka, then all you get is an opaque message queue where you can't see which message went from where to where. Good luck debugging any issues. But, hey, you got to use Kafka.
> for example, there's no way to easily answer questions like "How many requests did server X generate between 1pm and 2pm and how many of them were served by server Y?"
There's many ways to answer that using data streamed over Kafka - ingest it into your preferred query engine, go query it.
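To make that concrete, here's a rough sketch of the "ingest it, then query it" approach, using SQLite as a stand-in for whatever query engine you prefer. The table name, column names, server names and timestamps are all made up for illustration, and the list of events stands in for messages a consumer would have read off a topic.

```python
import sqlite3

# Hypothetical schema: one row per request event consumed from a topic.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE requests (origin TEXT, served_by TEXT, ts TEXT)")

# Stand-in for messages a consumer would have read off the topic.
events = [
    ("server-x", "server-y", "2021-11-16T13:05:00"),
    ("server-x", "server-z", "2021-11-16T13:40:00"),
    ("server-x", "server-y", "2021-11-16T14:10:00"),  # outside the window
    ("server-w", "server-y", "2021-11-16T13:15:00"),  # different origin
]
conn.executemany("INSERT INTO requests VALUES (?, ?, ?)", events)


def count_requests(conn, origin, served_by, start, end):
    """Return (requests from origin in [start, end), how many served_by handled)."""
    return conn.execute(
        """
        SELECT COUNT(*), COALESCE(SUM(served_by = ?), 0)
        FROM requests
        WHERE origin = ? AND ts >= ? AND ts < ?
        """,
        (served_by, origin, start, end),
    ).fetchone()


total, served_by_y = count_requests(
    conn, "server-x", "server-y", "2021-11-16T13:00:00", "2021-11-16T14:00:00"
)
print(total, served_by_y)
```

ISO-8601 timestamps sort lexicographically, which is why plain string comparison is enough for the time window here.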
It's nice when you have a bunch of discrete events that you want any number of clients to work on without interfering with each other. Think of it as a fire-and-forget pub-sub. You can always have a worker dumping the queue into a database for later if you want to. It's a bit cantankerous but once you get it running it can handle messages at an impressive scale. It isn't a magic bullet to replace databases, and you can actually add schemas to the data you put in it (and it's generally agreed to be a good idea to do so; it's on our wish list because it would save us from duplicating the schema on the consumer and producer ends.)
I've worked at some places that used Kafka (including LinkedIn), although I have never been responsible for running the platform itself. I'll chip in with what I see as the negatives.
Kafka sits at roughly the same tier as HTTP, but lacks a lot of the convention we have around HTTP. That convention is what allows people to build generic tooling for any app that uses HTTP. Think visibility, metrics, logging, etc, etc. Those are all things you effectively get for free with HTTP in most languages. Afaict, most of that doesn't exist for Kafka in a terribly helpful form. You can absolutely build something that will do distributed tracing for Kafka messages, but I'm not aware of a plug-and-play version like there are for most languages.
The fact that Kafka messages are effectively stateless (in the UDP sense, not the application sense) also trips up a lot of people. If you want to publish a message, and you care what happens to that message downstream, things get complicated. I've seen people do RPC over event buses where they actually want a response back, and it became this complicated system of creating new topics so the host that sent the request would get the response back. Again, in HTTP land, you'd just slap a loadbalancer in front of the app and be done. HTTP is stateful, and lends itself to stateful connections.
Another issue is that when you tell people that they can adjust their schema more often, they tend to go nuts. Schemas start changing left and right, and suddenly you now need a product to orchestrate these schema changes and ensure you're using the right parser for the right message. Schema validation starts to become a significant hurdle.
It's also architecturally complicated to replace HTTP. An HTTP app can be just a single daemon, or a few daemons with a load balancer or two in front. Kafka is, at minimum, your app, a Kafka daemon, and a Zookeeper daemon (nb I'm not entirely sure Zookeeper is still required). You also have to deal with eventual consistency, which can make coding and reasoning about bugs dramatically harder than it needs to be. What happens when Kafka double-delivers a message?
My pitch is always that you shouldn't use Kafka unless it becomes architecturally simpler than the alternatives. There are problems to which Kafka is a better solution than HTTP, but they don't start with unstable schemas or databases being difficult. Huge volumes of data are a good reason to me; not being sure what your downstreams might be is another. There are probably more, I'm not an expert.
> our customers don't understand the data they're shoving at us. But Kafka will take care of all of that for us
Kafka isn't going to help with this at all. If your HTTP app can't parse it, neither will your Kafka app. Kafka does have the ability to do replays, but so does shoving the requests in S3 or a database for processing later. I promise you that "SELECT * FROM requests WHERE status='failed'" is drastically simpler than any Kafka alternative. It is neat that Kafka lets you "roll back time" like that, but you have to very carefully consider the prospect of re-processing the messages that already succeeded. It's very easy to get a bug where you have double entries in databases or other APIs because you're reprocessing a request.
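One way to guard against the double-entry problem on replay, sketched very roughly: key every write on a unique message ID and let the database reject repeats. This assumes each message carries such an ID, which is not a given; the `payments` table, the `message_id` column and the sample batch are hypothetical, and SQLite is just a convenient stand-in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The PRIMARY KEY on message_id is what makes a replay harmless.
conn.execute("CREATE TABLE payments (message_id TEXT PRIMARY KEY, amount INTEGER)")


def apply_message(conn, message_id, amount):
    """Insert the row at most once; return True only if it was newly applied."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO payments (message_id, amount) VALUES (?, ?)",
        (message_id, amount),
    )
    conn.commit()
    return cur.rowcount == 1


batch = [("m-1", 100), ("m-2", 250), ("m-3", 75)]

first_pass = [apply_message(conn, mid, amt) for mid, amt in batch]
replay = [apply_message(conn, mid, amt) for mid, amt in batch]  # simulated replay

row_count, total = conn.execute(
    "SELECT COUNT(*), SUM(amount) FROM payments"
).fetchone()
print(first_pass, replay, row_count, total)
```

This only covers writes to a store you control. Side effects against other APIs need their own idempotency story.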
All very good points. What I like about Kafka is that you can queue up a bunch of messages without needing to be able to handle that load immediately. It lets you build very resistant patterns: if your message-senders overwhelm your message receivers in HTTP you can end up with connection failures, get stuck waiting, etc. In Kafka what happens is you now have a large backlog to work through, but at least your messages are somewhere accessible to you and not dropped on the floor.
HTTP definitely has the edge when it comes to library support. In fact, Confluent et al offer HTTP endpoints for Kafka so that you don't have to deal with the vagaries of actually connecting to a broker yourself (the default timeout in python for an unresponsive broker is _criminal_ for consumers. You will spend several minutes wondering when the message will arrive.) We use an in-house one. But that introduces HTTP's problems back into the process; you need to worry about overwhelming your endpoint again...
Regarding application patterns, ideally you're writing applications that read data from one topic (or receive messages, parse a file, etc) and write to another topic. Treating it as a request that will somehow be responded to later in time scares me and I wouldn't do it. What if your application needs to be restarted while some things are in-flight?
> It lets you build very resistant patterns: if your message-senders overwhelm your message receivers in HTTP you can end up with connection failures, get stuck waiting, etc.
I think the biggest drawback to HTTP in this space is that there's typically no coordination between clients and the server. Clients send requests when they want and the server has to respond immediately.
That becomes a big issue when you have an outage and all your clients are in retry loops, spiking your requests per second to 3x what they would normally be, on top of whatever the actual issue is.
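The usual client-side mitigation for that kind of retry storm is exponential backoff with jitter, so clients spread their retries out instead of hammering in lockstep. A minimal sketch, where the function names, the base delay, the cap and the choice of `ConnectionError` are all arbitrary picks for illustration:

```python
import random
import time


def backoff_delay(attempt, base=0.5, cap=30.0, rng=random):
    """Full-jitter delay: uniform in [0, min(cap, base * 2**attempt)] seconds."""
    return rng.uniform(0, min(cap, base * 2 ** attempt))


def call_with_retries(operation, max_attempts=5, sleep=time.sleep, rng=random):
    """Call operation(), retrying on ConnectionError with jittered backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, let the caller see the failure
            sleep(backoff_delay(attempt, rng=rng))


# Usage: an operation that fails twice and then succeeds.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated outage")
    return "ok"


slept = []  # record delays instead of actually sleeping
result = call_with_retries(flaky, sleep=slept.append, rng=random.Random(0))
print(result, len(slept))
```

Passing `sleep` and `rng` in as parameters is only there to make the behaviour easy to test without waiting.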
Most of the retry stuff seems largely shared; i.e. your code should still have handlers for when Kafka isn't responding right. Kafka will only preserve messages on the queue, it won't help if you lose network connectivity, or your ACLs get messed up, or etc, etc.
> Regarding application patterns, ideally you're writing applications that read data from one topic (or receive messages, parse a file, etc) and write to another topic. Treating it as a request that will somehow be responded to later in time scares me and I wouldn't do it. What if your application needs to be restarted while some things are in-flight?
The pattern I've seen is to make the processing itself idempotent, and only ack messages once they've been successfully processed. So if you restart the app while it's processing, the message will sit there in Kafka as claimed until it hits the ack timeout, and then Kafka will give it to a new node.
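A toy model of that pattern, with no real Kafka involved. In Kafka terms the "ack" is, as far as I know, an offset commit per partition rather than a per-message acknowledgement, so that is what this sketch imitates. `PartitionLog`, `run_worker` and the in-memory `seen_ids` set are invented for illustration; a real worker would need the dedupe state to live somewhere durable.

```python
class Crash(Exception):
    """Simulates the worker dying mid-flight."""


class PartitionLog:
    """Toy stand-in for one partition: a list plus a committed offset."""

    def __init__(self, messages):
        self.messages = list(messages)
        self.committed = 0  # next offset to hand out after a restart

    def poll(self):
        """Yield (offset, message) starting from the last committed offset."""
        for offset in range(self.committed, len(self.messages)):
            yield offset, self.messages[offset]

    def commit(self, offset):
        self.committed = offset + 1


def run_worker(log, seen_ids, results, crash_at=None):
    for offset, (msg_id, value) in log.poll():
        if msg_id not in seen_ids:  # idempotency check: skip work already done
            results.append(value * 2)
            seen_ids.add(msg_id)
        if offset == crash_at:
            raise Crash()  # processed, but died before committing
        log.commit(offset)  # commit only after successful processing


log = PartitionLog([("a", 1), ("b", 2), ("c", 3)])
seen_ids, results = set(), []

try:
    run_worker(log, seen_ids, results, crash_at=1)
except Crash:
    pass

committed_after_crash = log.committed  # "b" was processed but never committed
run_worker(log, seen_ids, results)  # restart: "b" is redelivered, then skipped
print(results, committed_after_crash, log.committed)
```

The message that was in flight at the crash is delivered twice, which is exactly why the handler has to be idempotent.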
As far as RPC, I'm not advocating that it's a good idea, but you could implement timeouts and retries on top of an event bus. Edge cases will abound, and I wouldn't want to be in charge of it, but you could shove that square block into the round hole if you push hard enough.
How does that still make Kafka a better choice than any of the other queueing systems out there? SQS, Redis, ActiveMQ, RabbitMQ, there are tons more queues out there that are far easier to use than Kafka.
I was comparing queues to HTTP endpoints for sending messages; I can't speak to others besides Kafka and Redis.
Redis is an order of magnitude easier to work with but struggles under loads that Kafka has no problem with. Also every once in a while our Amazon managed Redis queue will have a bad failover or melt down because someone runs a bad command on it, but our Amazon managed kafka has been rock solid since we switched to it. When we ran Kafka ourselves though we definitely watched it melt down a few times because we threw too much at one broker or we made obscure config mistakes. And figuring out why a consumer isn't getting messages is always a pain, whereas redis is always a dream to use.
> If you want to publish a message, and you care what happens to that message downstream, things get complicated.
Definitely agree. The basic concept of Kafka is that the publisher doesn't care, so long as data isn't lost. If you need the producer to redo stuff if the consumer failed, then Kafka is the square peg in your round hole.
And yeah, the best use case for Kafka is, IMO, "I have to shift terabytes or more of data daily without risking data loss, and I want to decouple consumers from producers".
Our company is currently looking in to kafka and microservices. The problem we have is that the volume of actions going on has gone past what a single rails app with sql server can handle. When I look in to it, it seems like it would mostly be used as some kind of job queue where worker microservices churn through the entries in kafka to do some kind of data processing without needing sql.
But then there are blog posts saying kafka is a terrible job queue because you can only have one worker per partition and it's hard to get more partitions dynamically.
Sure, you can only have one consumer in a consumer group per partition. But partitions are cheap. And it's reasonably trivial to add more partitions should you find you need more concurrent consumers.
A very basic rule of thumb is, on an X broker cluster, have N partitions, where N % X = 0 (that is, N is a multiple of X, so the partitions spread evenly across the brokers).
There's no harm in choosing something like 20 - 30 partitions for a topic, and increasing that when you need to scale consumers horizontally.
Dropping partitions is harder, but again, they're cheap, you won't need to for most use cases.
Only caveat to increasing partition count is when you're relying on absolute ordering per partition - key hashing can point to different partitions when you have 10 vs 50. It can still be done, but it requires a careful approach.
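To illustrate both the rule of thumb (read here as "partition count should be a multiple of broker count") and the key-hashing caveat, a small sketch. Note the hash is a stand-in: Kafka's default partitioner uses murmur2, while this uses `zlib.crc32` purely to show the modulo effect. The key names and helper functions are made up.

```python
import zlib


def spreads_evenly(num_partitions, num_brokers):
    """Rule of thumb: partition count is a multiple of broker count."""
    return num_partitions % num_brokers == 0


def partition_for(key, num_partitions):
    # zlib.crc32 is a stand-in hash; Kafka's default partitioner uses murmur2.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


keys = [f"customer-{i}" for i in range(1000)]

# Keys whose partition changes when the topic grows from 10 to 50 partitions.
moved = [k for k in keys if partition_for(k, 10) != partition_for(k, 50)]
stayed = [k for k in keys if partition_for(k, 10) == partition_for(k, 50)]

print(spreads_evenly(30, 3), spreads_evenly(20, 3))
print(f"{len(moved)} of {len(keys)} keys land on a different partition")
```

Any key that moves will have its older messages on one partition and its newer ones on another, which is where the per-key ordering guarantee quietly breaks unless you plan the migration.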
I take it you haven't consulted the references I listed second-hand? They are likely far more insightful than anything I can write here.
Before making my earlier comment, I read the start of the Hiss and Franks paper at https://web.archive.org/web/20140310113612/http://www.nacacn... to make sure the citation I gave wasn't misrepresenting the topic (it wasn't). Here's text from the abstract:
It "examines the outcomes of optional standardized testing policies in the Admissions offices at 33 public and private colleges and universities, based on cumulative GPA and graduation rates."
That is, UC isn't the first to do this, and we can look at real-world evidence from previous universities where test scores were optional, to gauge how useful test scores are and what effect they have.
It found: "Few significant differences between submitters and non-submitters of testing were observed in Cumulative GPAs and graduation rates, despite significant differences in SAT/ACT scores."
That is, SAT/ACT scores don't seem to affect metrics like Cumulative GPAs and graduation rates. (As I recall from elsewhere, they do correlate with first year grades, but that's a different and less important metric.)
Further: "Optional testing policies also help build broader access to higher education: non-submitters are more likely to be first-generation-to‐college students, minorities, Pell Grant recipients, women and students with Learning Differences."
That is, using SAT/ACT scores in the selection process appears to have a measurable effect on the student population: reducing what is sometimes referred to as "diversity."
As to your correct observation, "it does not follow that predicts(A) > predicts(A ∪ B)", one of the issues is that college success is also correlated with other factors, including parental wealth. And success on the SAT/ACT is also correlated with parental wealth, since wealthier parents can afford special training on how to pass those tests.
We know this by looking at the early history of college boards, which emphasized the topics taught at prep schools (like Latin grammar) over those taught at public schools, because college admissions preferred rich white male Protestants, who were likely to go to prep school.
To be clear, the SAT has done a lot of work to de-bias their tests, and I don't know enough about the topic to say anything about what factors are actually involved.
But I don't need to, since the tests don't seem to be that effective in predicting college success.
Which, if I can attempt to boil it down to one sentence, would seem to be: "Tests don't seem to be a good predictor of college success, even when taking GPA into account. But they do seem to promote diversity."
And to which I would like to add, solely from anecdotal observation[1]:
"They also provide a second chance for a significant number of people. Even with a bad year, or even two, in high school -- this doesn't necessarily mean you are doomed to a lifetime of minimum wage servitude. If you have the intrinsic cognitive skills, you can easily do moderately well on the SAT, even without prepping."
I would like to consider diversity and the possibility of a second chance to be not merely incidental, but essential benefits to be striven for in the admissions process. Instead of simply stack-ranking based on the most easily demonstrable predictors of success.
[1] Including observations of some of the smartest and most creative and inspiring people I have ever met -- but again, that's anecdotal.
Umm, the other way around. "Tests don't seem to be a good predictor of college success, even when taking GPA into account. But they do seem to reduce diversity."
(You wrote "promote" instead of "reduce".)
What is the origin of your quoted paragraph ending 'If you have the intrinsic cognitive skills, you can easily do moderately well on the SAT, even without prepping.'?
I don't know how the source quantifies "a significant number of people" or "intrinsic cognitive skills". If "intrinsic cognitive skills" means "does well on the SAT or other forms of standardized testing" then of course there's no need for prepping.
And the problem is the other way around. Given the emphasis on the SAT, a single bad day (or if you have the money to re-test, a couple of bad days) - if on the day of the test! - can "doom" you.
I'm also curious about the quote because a lot of people have a good life without going to college and without "minimum wage servitude". Median pay for plumbers, pipefitters, and steamfitters is $27.08 per hour (https://www.bls.gov/ooh/construction-and-extraction/plumbers...), done by apprenticeship and/or vocational-technical training.
And don't tell me that plumbers (or Automotive Service Technicians and Mechanics, or Welders, Cutters, Solderers, and Brazers, or other careers which don't require a college degree) don't require 'intrinsic cognitive skills'.
> consider diversity
As a reminder, I used the term "diversity" in quotes, because I don't like the vagueness of the term. The specific sub-populations in the paper I linked to were "first-generation-to‐college students, minorities, Pell Grant recipients, women and students with Learning Differences".
> Instead of simply stack-ranking based on the most easily demonstrable predictors of success.
I don't know what you mean. The linked-to document about UC says:
] Training admissions office staff on the “comprehensive review” process that looks at grades, extra-curriculars, the socio-economic factors in which students grow up and other non-test criteria becomes even more important, Estolano said.
That sure does not look like simple stack-ranking to me.
> the most easily demonstrable predictors of success
FWIW, two easily demonstrable predictors of success are: 1) applicants from a rich family, and 2) applicants with at least one parent who has been to college.
I think you can see that using these predictors - and if you believe college is an important influence on life-time earnings - means the rich stay rich and the poor stay poor.
> how you get to be "America's preeminent living statesman"
Mostly by having been in the room during historical decisions, getting old, and refusing to ever go away even when your contributions proved to be disastrous.
Kissinger's plan to bomb rice production in Cambodia not only failed to stop the North Vietnamese, but directly led to the rise of Pol Pot. It also starved millions of civilians. The guy is a monster.
The $500k fine is of course a complete joke, considering the seriousness of the matter at hand.
But "as the company enjoyed booming and historic sales with its stock price doubling, Amazon failed to adequately notify warehouse workers and local health agencies of COVID-19 case numbers", the state's attorney general Rob Bonta said.
"This left many workers understandably terrified and powerless to make informed decisions to protect themselves and to protect their loved ones."
Amazon lurkers out there - especially L6 and above - do you have anything to say about this? Anything at all?
Or are you so used to hearing such patent bullshit from your higher-ups (of the sort pasted below) that it just ... goes through you?
Nothing personal - I'd just really like to know.
"This settlement is solely about a technicality specific to California state law surrounding the structure of bulk employee COVID-related notifications," the spokesperson added.
"There's no change to, or allegations of any problems with, our protocols for notifying employees who might have been in close contact with an affected individual.
"We've worked hard from the beginning of the pandemic to keep our employees safe and deliver for our customers - incurring more than $15bn (£11bn) in costs to date - and we'll keep doing that in months and years ahead," they added.
It proves, perhaps, that writing is important to communicate and persuade others of your thoughts. But some people already know the truth I’m pointing at. I don’t need to justify it to those who know, and didn’t intend to convince those who don’t - so I understand if you are not convinced ;)
The same way we ramp up on anything: by building a system that (actually) needs it. The XYZ (service mesh, provisioning, etc) will fall into place, from there. Or not -- if it isn't really needed.
But in general: learning by building stuff based on a real need -- even if that "need" is just to scratch the itch of some personal project you've always been wanting to do -- has about 10x the staying power (and "interview cred") as what one gets when starting from the mindset of "oh shit, I'm falling behind, I gotta cram on this stuff".
And on top of that: knowing when you really need X or Y, and when you don't -- is often as valuable as, or even more valuable than simply knowing how to do X or Y.
Roman plasterwork was applied in layers, and in the first layer (which is quite rough) a diamond pattern is scratched. This is to increase the adhesion of the next layer (also known as the key). This is also true of lime plasterwork done through the ages.
As far as I can see only the first layer with the diamond pattern is visible on the left wall. The other layers may not have survived. Why would you bother to create a diamond pattern if you're not going to put more layers on?
In fact, the wall at the far end does show more layers, and also some decoration right in the middle.
I wonder how they came up with that 1750 BC figure of 400 hours of work for one hour of light. Any ideas? It sounds ridiculously high, so I assume I'm missing something. What are they even talking about? Olive oil or tallow lamps? Firewood? Something else?
"One hour of light (referred to as the quantity of light shed by a 100 watt bulb in one hour) cost 3200 times as much in 1800 in England than it does today"
Oil lamps are not so bright, and you would need lots of them to reach the same brightness as a 100 watt bulb.
In other words, they still had light, but not so much and not so often.
Roman slaves in some cases could become full citizens. Slavery was different for different cultures, and treatment varied greatly from situation to situation. The closer the slave was to the rich, the better they were treated in general.
If you are a toxic moron who will start badmouthing their former colleagues
Or are dealing with a genuinely painful day-to-day work situation and are struggling to keep a lid on it.
I get the point that venting about the problems of one's current or previous environments is bad form, and not something anyone really wants to hear about. But the touch of schadenfreude in your characterization of the (often very real and painful) situations these people are dealing with doesn't exactly help, either.
How else should an uninvolved outsider view someone who can't keep drama out of an introduction call with a stranger besides as a toxic moron?
You can certainly voice displeasure with your current working arrangement tactfully (who's going to be able to verify what you have to say? nobody) but if you can't keep yourself emotionally under control when asked an open-ended, indirect, and predictable question, you're throwing out all sorts of warning signs that you should not be hired.
There are all sorts of situations in work life that are problematic.
'Badmouthing' one's former employer would more likely just indicate a general lack of tact, and it wouldn't necessarily even say that much about the current employment situation.
Not having a good fit in your current situation is just fine; you just have to communicate it thoughtfully to prospective employers, not be cynical or petty about it, and not belabour the issue or let it sour the overall disposition of your candidacy.