
I feel like there are competent, competing visions talking past each other on the subject. There's kind of a spectrum:

1. Everything is a monolith. Frontend, backend, dataplane, processing, whatever: it's all one giant, tightly coupled vertically-scaled ball of mud. (This is insane.)

2. Everything is a monolith, but parts are horizontally scaled. Imagine a big Flask app where there are M frontend servers and N backend async task-queue workers, all running the same codebase but with different configuration for each kind of deployment; see the sketch after this list. (This is perfectly reasonable.)

3. There are a small number of separate services. That frontend Flask server talks to a Go or Rust or Node or whatever backend, each appropriate to the task at hand. (This is perfectly reasonable.)

4. Everything is a separate service. There are N engineers and N+50% servers written in N languages, and a web page load hits 8 different internal servers that do 12 different things. The site currently handles 23 requests per day, but it's meant to vertically scale to Google size once it becomes popular. Also, everything is behind a single load balancer, but the principal engineer (who interned at Netflix) handwaves it away as "basically infinitely scalable". (This is insane.)
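A rough sketch of what 2 can look like in practice (the Redis broker URL and the task are made up for illustration): the same codebase serves web requests on the M frontend boxes and runs queued work on the N worker boxes, and only the launch command differs.

    # app.py - one codebase, two deployment roles.
    from flask import Flask, jsonify
    from celery import Celery

    app = Flask(__name__)
    # Hypothetical broker URL; in a real deployment this comes from per-role config.
    celery = Celery(__name__, broker="redis://localhost:6379/0")

    @celery.task
    def send_welcome_email(user_id):
        ...  # slow work runs on the N worker boxes

    @app.route("/signup/<int:user_id>", methods=["POST"])
    def signup(user_id):
        send_welcome_email.delay(user_id)  # the M frontend boxes only enqueue
        return jsonify(status="queued")

    # Frontend deployment:  gunicorn app:app
    # Worker deployment:    celery -A app worker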

These conversations seem to devolve into fans of 1 and 4 arguing that the other is wrong. People in 2 and 3 make eye contact with each other, shrug, and get back to making money.





Logical separation, the modules, is what allows you to preserve developer sanity. Physical separation, the (micro-)services, is what allows you to ship things flexibly. Somewhere on the distant high end, microservices also play a role in enabling colossal scale, which only a relatively small number of very large companies actually need.

The key problem of developing a large system is allowing many people to work on it without producing gridlock. A monolith, by its nature, easily produces gridlock once enough people need to work on it in parallel. Hence modules, "narrow waist" interfaces, etc.

But the key thing is that "you ship your org chart" [1]. Modules allow different teams to work in parallel and independently, as long as the interfaces are clearly defined. Modules deployed separately, aka services, allow different teams to ship their results independently, as long as they remain compatible with the rest of the system. Solving these organizational problems is much more important for any large company than overcoming any technical hurdles.
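A toy sketch of that narrow waist (all names here are hypothetical): teams depend on the interface, and whether the implementation is an in-process module or a separately deployed service is a shipping decision, not something callers see.

    from typing import Protocol

    class BillingService(Protocol):            # the narrow waist other teams code against
        def charge(self, customer_id: str, cents: int) -> str: ...

    class LocalBilling:                        # today: an in-process module
        def charge(self, customer_id: str, cents: int) -> str:
            return "invoice-123"               # placeholder for real billing logic

    class RemoteBilling:                       # later: the same contract behind a service
        def __init__(self, base_url: str):
            self.base_url = base_url           # hypothetical billing service endpoint
        def charge(self, customer_id: str, cents: int) -> str:
            ...                                # e.g. an HTTP/gRPC call, elided here

    def checkout(billing: BillingService, customer_id: str) -> str:
        return billing.charge(customer_id, 4999)   # callers never know which one they got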

[1]: https://en.wikipedia.org/wiki/Conway%27s_law


> Physical separation, the (micro-)services is what allows you to ship things flexibly.

If you are willing to pay a price. Once you allow things to ship separately, you are locked into the API mistakes of the past (Hyrum's law) until you can be 100% sure that all uses of the old thing are gone in the real world. This is often hard. By shipping things together you can verify they all work together before you ship, and more importantly, when you realize something was a mistake there is a set time when you can say "that is no longer allowed" (this is a political problem - sometimes they will tell you no, we won't change, but at least you have options).

Everything else is spot on, but I am feeling enough pain from mistakes made 15 years ago (that looked reasonable at the time!) that I feel the need to point it out.


Scale? At what "scale" does Linux run, perhaps the best-known monolithic piece of software ever?

(And also the subject of perhaps the most well studied flamew^Wdiscussion about mono- versus microservice architecture.)

It only runs most of the servers and most of the mobile devices in the world. Where exactly is that distant high end where microservices unlock colossal scale?

Architecture matters. Just not always the way you think. But it serves as catnip for everyone who loves a good debate. Anyone who gets tired of writing code can always make a good living writing about software architecture. And good for them. There's a certain artistry to it, and they have diagrams and everything.


Linux can scale to a machine with hundreds of cores; the largest is 1000+ cores IIRC. That's because it scales horizontally, in a way, and can run kernel threads on multiple cores. But a NUMA configuration feels increasingly like a cluster the more cores you add, just because a single-memory-bus architecture can't scale too far; accessing a far node's memory introduces much more latency than accessing local RAM.

It's easy to run a thousand independent VMs. It's somewhat more challenging to run a thousand VMs that externally look like one service and can scale down to 500 VMs, or up to 2000 VMs, depending on the load. It's really quite challenging to scale a monolith app to serve tens of millions of users on one box, without horizontal scaling. But it definitely can be done; see the architecture of Stackoverflow.com. (Well, this very site is served by a monolith, and all logged-in users are served by a single-threaded monolith, unless they rewrote the engine. Modern computers are absurdly powerful.)


I also laugh at 30+ microservice projects where almost every service connects to a single Oracle database with no sharding/partitioning. Inb4 AWS DynamoDB outage.

The number of services at my job that are just gRPC wrappers for a database, with endpoints that are only accessed by services that have their own connections open to the same database, has been driving me insane.

Isn't that a microservices 101? No shared databases?

An important and common exception would be for performance-critical reasons, as mentioned elsewhere. Say most of your code's in Python, but you rewrote parts in a faster language. There you're not necessarily separating them for isolation or encapsulation purposes. It'd be as much of a mistake to force them to hit separate DBs.

Wasn't it two pizza teams and using the right tools for the right job?

Once you open the door to 3 it's a lot harder to stop the slide into 4. Devs love greenfield codebases and once that's an acceptable way to solve a problem they'll reach for it more and more. Especially if your setup has low friction to "just" spinning up a new service.

True, and some org-level discipline is important here. I'm all for polyglot architectures, but only for reasonably small values of poly. Like if you wrote the frontend in Python but really needed some performance-critical Rust, fine. Or maybe you merged with another shop that already had a bunch of Go, far out.

Resist the urge to roll out that Prolog-driven inference service that your VP Eng vibe coded after reading an article about cool and strange programming languages.


I generally think the right rule of thumb is about 10 languages - but that includes the build system, CI system, and so on. Not everyone needs to know all 10, but you need a good set of options for those that need them. Most people should be using the same 2 or 3 of those 10, but a few teams need to do something weird and so need the other options that are overall rarely used.

That's very scale-dependent. I can imagine why a FAANG might need 10 languages, between Python APIs and Java services and Rust and Go things and a Node system and a smattering of Perl and some ETL in Scala and BI in C#, etc. A startup with fewer than 50 employees almost certainly does not.

Maybe I'd compromise on no more than, say, 2-3 languages per department. If you're small enough that "engineering" is 1 department, then there you go: choose wisely. If you have a whole department for frontend, and one for backend, and one for ops, and one for analytics, etc., then you can somewhat treat those as encapsulation boundaries and have more flexibility.


Depends on what you consider a language. There are a lot of DSLs out there, and once you add a couple of configuration files for Jenkins / Git{Hub, Lab} / Bazel / Buck and Vagrant / Docker / Bash and so on, you're already at 5 languages - and you haven't even added the lethal trifecta of HTML+CSS+JS for your web front-end, or Swift+Kotlin if you also want to roll out on mobile.

Yeah, that's a rabbit hole to be sure. I personally only count languages that you write a non-trivial amount of production executable code in. Examples of things I wouldn't count: HTML and CSS, JS in the browser, Bash used in the build system. If you wrote your webserver in Bash for some wild reason, it's included. Mise and Just and Makefiles aren't.

If you exclude HTML/CSS/JS/Makefiles, that means you are giving your people an excuse to make a mess of them. You have to count everything. You can say that HTML/CSS/JS count as one, since they are tightly coupled and you have to know all 3, if that is what it takes to get under the 10 limit - but you should make it clear you are breaking the spirit of the rule because of an external factor, and that you are not happy about it.

If you write even one line of something you need at least one person who is an expert and takes responsibility to ensure the code written in the language is good.


Not department, project. If you can get to 10 across a whole company that would be good, but that's unreasonable in the largest companies. Even with only 50 people in a startup, staying under 10 will take some effort, as there is always some cool new toy. (Don't let them replace languages with frameworks either.)

This is important. Sometimes you have multiple departments working on a single project (a large embedded system). Sometimes a department will work on multiple projects - the department needs to be careful that people don't need to know too many languages (which is hard - it often makes sense for a department to develop for all apps, so you have to know the languages of iPhone, Android, and Windows).

People should stay mostly in their lanes (department), but they should have the ability to cross to others when that is needed instead of just throwing APIs over a wall and accusing the other side of using it wrong.


1 could maybe make sense for a proof of concept, but realize that you're probably throwing it away (sooner rather than later).

Start with 2, and think about the separation of deployable targets and shared libraries/plugins. You can eventually carve out separate infra for deployable targets when resource contention becomes a problem (e.g. DB load is affecting unrelated services).

3 is rarely the right first step for a small team

4 is never the right first step for a small team


Why is #1 insane (no horizontal scaling) if you only have 23 requests per day?

It's not insane. The best codebase I ever inherited was about 50kloc of C# that ran pretty much everything in the entire company. One web server and one DB server easily handled the ~1000 requests/minute. And the code was way more maintainable than any other nontrivial app I've worked on professionally.

I work(ed) on something similar in Java. And it still works quite well. But the last few years have increasingly been about getting berated by management about why things are not modern Kubernetes/microservices-based by now.

I feel like people forget just how fast a single machine can be. If your database is SQLite the app will be able to burn down requests faster than you ever thought possible. You can handle much more than 23 req/day.
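As a rough single-box illustration (the schema is made up and this is not a careful benchmark), indexed SQLite reads in WAL mode are cheap enough that the database is almost never what caps you at 23 req/day:

    import sqlite3, time

    conn = sqlite3.connect("app.db")            # hypothetical local database file
    conn.execute("PRAGMA journal_mode=WAL")     # readers no longer block the writer
    conn.execute("CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)",
                     [("user%d" % i,) for i in range(10_000)])
    conn.commit()

    N = 100_000
    start = time.perf_counter()
    for i in range(N):
        conn.execute("SELECT name FROM users WHERE id = ?", (i % 10_000 + 1,)).fetchone()
    print("%.0f indexed lookups/second on one core" % (N / (time.perf_counter() - start)))

HTTP parsing, templating, and the rest add overhead on top, but the point stands: one ordinary machine has throughput to spare.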

In the not-too-distant past I was handling many thousands of DB-backed requests per hour in a Python app running plain PostgreSQL.

You can get really, really far with a decent machine if you mind the bottlenecks. Getting into swap? Add RAM. Blocked by IO? Throw in some more NVMe. Any reasonable CPU can process a lot more data than it's popular to think.


Anytime someone talks about scale I remember just how much data the low-MHz CPUs used to process. Sure, the modern stuff has nicer UIs, but the UIs of the past were not bad, and a lot of data was processed. Almost nobody has more data than what the busiest 200MHz CPU in 1999 used to handle alone, so if you can't do it, that isn't a scaling problem, it's a people problem. (Don't get me wrong, this might be a good trade-off to make - but don't say you couldn't do it on a single computer.)

It's not. It's kind of bonkers to pursue that when you have a lot of traffic, but it's a perfectly sane starting point until you know where the pain points are.

In general, the vast number of small shops chugging away with a tractably sized monolith aren't really participating in the conversation, just idly wondering what approach they'd take if they suddenly needed to scale up.


I'm not even sure it's bonkers if you have a lot of traffic. It depends on the nature of the traffic and how you define "a lot". In general, though, it's amazing how low latency a function call that can handle passing data back and forth within a memory page or a few cache lines is compared to inter-process communication, let alone network I/O.

The corollary to that is, it's amazing how far you can push vertical scaling if you're mindful of how you use memory. I've seen people scale single-process, single-threaded systems multiple orders of magnitude past the point where many people would say scale-out is an absolute necessity, just by being mindful of things like locality of reference and avoiding unnecessary copying.
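A quick and unscientific sketch of that gap (the add endpoint and payload are invented, and exact numbers vary enormously by machine), comparing an in-process call with the same call over loopback HTTP:

    import json, threading, time, urllib.request
    from http.server import BaseHTTPRequestHandler, HTTPServer

    def add(a, b):                      # the "service" as a plain function call
        return a + b

    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):              # the same logic behind loopback HTTP + JSON
            body = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
            out = json.dumps({"result": add(body["a"], body["b"])}).encode()
            self.send_response(200)
            self.send_header("Content-Length", str(len(out)))
            self.end_headers()
            self.wfile.write(out)
        def log_message(self, *args):   # keep the timing output readable
            pass

    server = HTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d" % server.server_port

    t0 = time.perf_counter()
    for i in range(1_000_000):
        add(i, i)
    print("in-process call:    %.0f ns each" % ((time.perf_counter() - t0) * 1e9 / 1_000_000))

    t0 = time.perf_counter()
    for i in range(1_000):
        data = json.dumps({"a": i, "b": i}).encode()
        urllib.request.urlopen(urllib.request.Request(url, data=data)).read()
    print("loopback HTTP call: %.1f us each" % ((time.perf_counter() - t0) * 1e6 / 1_000))
    server.shutdown()

And loopback is the friendly case: a real network hop between services adds serialization, queueing, and retries on top.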


if you have 23 requests per day the insane thing is wondering whether or not you've chosen the correct infrastructure, because it really doesn't matter.

do whatever you want, you've already spent more time than it's worth considering it.


most productive applications have more RPS than that. we should ideally be speaking about how to architect _productive_ applications and not just mocks and prototypes

Don't know if this is sarcasm or not. If you have 23 req/day, then there's no tech problem to solve. Whatever you have is good enough, and increasing traffic will come from solving problems outside tech (marketing, etc)

"Tightly coupled" is the sticking point. Tight coupling is bad architecture whether you have a monolith or microservices, which is the general point of the article.

Or there's 3.5: separate services where it makes sense, but not necessarily small in number. "Makes sense" would entail things like, does it have distinct resource utilization or scaling characteristics, or do you want to enable your service to more gracefully degrade if that module becomes unavailable.

(This is basically the definition of 3 without the implication that it will be rare.)

As opposed to 4 which is about proactively breaking everything down into the smallest possible units on the expectation that the added complexity is always worth it.


Lots and lots and lots of people make no. 1 work. The issue isn't whether it is a monolith or not. The issue is whether it has good or poor internal architecture.

No 1 can scale to 1000 requests per second. One machine, one db, one application. It is totally doable because so many people do it.

It's just not sexy and doesn't pad your resume. It's boring stuff that just works.


I wish our app only got 1krps. Maybe these conversations should be bucketed by the traffic volume people are handling - it really leads to completely different designs.

you can get pretty far with (2) and (3), haven't really understood the need for (4) unless you're FAANG

4 doesn't really work for serious big load applications either: nobody will be able to understand the codebase. You want 3.

During my days at Mojang we did some variation of 3. We had ~250k requests/second and handled it just fine (we had four nines of availability, forever chasing the fifth, and sub-20ms response times).

I think even among those who do see big loads, few see as much malicious traffic as we did. This was one of the arguments for a microservice architecture. If a DDoS took down our login service, already-logged-in players would be unaffected (until their tokens expired, anyway).

Well, that was a long-winded way to say that 3 is about as micro as you want to go. I've only seen 4 done once, and that site actually went under whenever they had more than 30 requests per minute. (Admittedly they had made a bunch of other really bad decisions not covered in the above description, but having ~30 services on a team of 12, in order to handle a handful of requests per hour, was certainly their biggest mistake.)


Facts! #1 is not insane as long as you keep your internals modular (all-in-one deployment doesn't mean ball of mud... you can avoid putting service calls into your domain objects or your data plane code). And you can go from #1 to #2 once you see the need and slice the services out that need it (such as decoupling async batch processing into a 2nd service that shares the domain and the data plane code and does not include the front-end).

In fairness, it being a tightly coupled ball of mud is part of my definition of #1 here. What you describe is basically #2 waiting to happen, just that no one's needed to do it yet.

Makes sense, I agree then that #1 as you exactly define it is insane.


