I can confirm. I've had quite a few projects that made it to the front page of HN and handled the traffic like a piece of cake. All of them ran on $5 DigitalOcean droplets.
I accept that some projects are more resource-hungry than others, but the majority of the time you can get away with a bit of asynchronous responses plus a scheduler/queue to spread the load out over time.
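A minimal sketch of that pattern, with Python's stdlib standing in for a real web framework and job queue: the handler only enqueues and returns immediately (an HTTP app would answer 202 Accepted), while a single background worker drains the backlog at its own pace, so a spike fills the queue instead of saturating the box.

```python
import queue
import threading

jobs = queue.Queue()
results = []

def worker():
    # Drain the queue at the worker's own pace instead of doing the
    # work inline in the request handler.
    while True:
        job = jobs.get()
        if job is None:
            jobs.task_done()
            break
        results.append(job * 2)  # stand-in for the expensive work
        jobs.task_done()

def handle_request(payload):
    # The "web" side only enqueues and returns immediately.
    jobs.put(payload)
    return "accepted"

threading.Thread(target=worker, daemon=True).start()

for i in range(5):
    handle_request(i)

jobs.join()      # wait for the backlog to drain
jobs.put(None)   # signal shutdown
print(results)   # → [0, 2, 4, 6, 8]
```

In a real deployment the `queue.Queue` would be something durable (Redis, SQS, a database table), but the shape of the solution is the same.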
Unpopular opinion: I blame the new-age devops culture that made cloud app deployments unnecessarily complicated with k8s and cool new tech (that'd get them high-profile jobs). I've never come across a devops person who'd say "Hey, that software is too simple to prematurely scale for sudden spikes of irrational amounts of traffic, so why not just deploy it on a cheap vps?"
I'm convinced it's the hiring that has shaped the scene.
If you're hiring, you want to be able to add/replace people as easily as possible. If you're being hired, you want to charge as much as you can.
And to satisfy those two demands, the current web stack is great. Almost like it's been built for it.
It's got very little to do with the tech itself, a lot more to do with market dynamics. That's the problem it's trying to solve.
The root cause is the decade of zero interest rates which led to companies intentionally overcomplicating their stacks to justify neverending VC rounds. Early prospective employees took notice and adjusted their skills as a result.
The dangerous part is that in the meantime we've got brand new and budding talent that actually took this charade seriously and effectively got high on their own supply, seeing this performance art as the actual normality even in a post-ZIRP world where tech/engineering is primarily there to drive business profits and not a VC mating ritual.
The only winners are the cloud/infra/tooling providers who got an entire generation of "engineers" to perpetuate their con without even realizing it.
> The root cause is the decade of zero interest rates which led to companies intentionally overcomplicating their stacks to justify neverending VC rounds.
Extended hot take: the true customer of a company are the current and prospective holders of capital, whether VC, private investment, or public markets. Keeping them happy is the main goal of the exec.
If your owners' portfolios include real estate, you push "back to the office". If they include container startups, you deploy on k8s. Your "strategic partnerships" are determined by how much your owners care about propping up investment A or investment B.
Except zero interest rates ended almost four years ago, and we're still seeing VCs ape into AI this last year.
So clearly the underlying cause of this squandering of resources must come from somewhere else.
I point the finger at the rising class of super rich who don't know what to do with their money. Why do they exist? Why is taxation seemingly not applying to them any longer?
> Why is taxation seemingly not applying to them any longer?
Investment returns were never taxed; as long as they don't use their wealth for consumption, they won't get taxed. This is a good thing, since it encourages investment over excessive consumption: building a startup is much better than buying another yacht.
That's a very simplistic view.
Like with all things, there is a point where more investment money does not make things faster or better. In fact, since pretty much everything is still dependent on people and social groups, if all the relevant people are already busy, you won't get much for your money.
And too much money chasing too few choices creates bubbles, which is exactly what has happened. And not just in tech: it's the problem of the real estate market in many parts of the rich world, and similar bubbles can be found in many markets.
I believe this is actually a modern problem: a good part of the rich world is getting paid a lot more than is really needed, and since we've glorified "investment" as essentially a way to make even more money, too much money is sitting idle instead of being useful right now.
If you look at society as a whole, spending every resource every year would be bad, but setting too much aside for later is also wasteful.
It's a lot like stocking a pantry: you need to put in enough that you don't easily run out of food, but if there's stuff in there that still hasn't been used 5 years later, the "investment" was too much.
Food gets stale and loses nutritional value over time (even canned) but money also gets stale in a way.
As for the startup vs yacht I would say it largely depends on how useful the startup can ultimately be to society and how many people it can employ (how much actual value is being created).
Because even though yachts have fossil-fuel consumption problems, it actually takes a lot of people to build, maintain, and service them. A yacht can potentially have a better social impact than a startup...
And this is the root of the issue, people getting a lot of money actually have a responsibility to redistribute it (in an intelligent manner preferably); but what is happening is that people try to get even richer even though it stops having much value to anyone.
All this money that is aping into an AI gold rush could have been taxed to fund schools and other broadly useful things.
I'm very much a free-market capitalist, but seeing all the VCs ape into AI at the same time does not make me think there is value in having so many super-rich people in the world.
I hate it when discussion devolves into high taxation vs low taxation, or high regulation vs low regulation. There must be such a thing as the appropriate type of taxation and the appropriate type of regulation. And that depends on what we want as a society. Do we want a big class of super rich who don't know what to do with their money, so they squander it on an AI gold rush? Or do we want to tax them so we can put that money to a use we all benefit from?
I think you're mixing up cause and effect. Low interest rates caused investors to chase higher returns in things like VC, VC had pressure to make investments, startups had easy money, so they were less scrappy and had more funding for overengineering.
I can’t confirm, but it does look like a significant portion of the current software engineering zeitgeist is driven by compartmentalized careerism with a good splash of non-value-adding complexity.
If that keeps food on the table and the wheels of business spinning, fine, but I’ve seen this lead to situations where a simple thing expands into a role, then a department, and … oh, hi, enterprise software, it’s you!
Hipster devops was overcomplicated long before kubernetes.
In my first job we had autoscaling on AWS to handle peaks… except that our servers took about 30 minutes to run upgrades, download gcc, and compile all the needed Python modules. We'd always hit the autoscaling limit because the new servers weren't serving anything at all; all of them would be downloading and compiling the same things.
I was very junior, but I asked my boss whether we shouldn't use base images that already contained everything, instead of a blank default Ubuntu for the servers.
The boss said no, because that wouldn't be agile.
The whole thing of using YAML files to configure servers already existed; it worked terribly.
It was basically meme-driven development there. Including using MongoDB for no reason at all, and using it so badly that every write actually moved thousands of records around.
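For what it's worth, the base-image approach suggested above is exactly what container builds give you for free today: the compiling happens once, in CI, and boot time is just pulling the image. A hedged sketch (package names, `requirements.txt`, and `server.py` are stand-ins):

```dockerfile
# Build-time work happens once, at image build -- not on every
# freshly autoscaled instance.
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y --no-install-recommends \
        gcc python3 python3-pip \
    && rm -rf /var/lib/apt/lists/*

# Compile/install the heavy Python modules at build time.
COPY requirements.txt /tmp/requirements.txt
RUN pip3 install -r /tmp/requirements.txt

COPY . /app
WORKDIR /app
CMD ["python3", "server.py"]
```

A new instance then only has to pull the finished image, so scale-out takes seconds instead of half an hour.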
> I was very junior, but I asked my boss whether we shouldn't use base images that already contained everything, instead of a blank default Ubuntu for the servers.
Honestly, nowadays Docker and other OCI containers do this pretty well. Spinning up new instances and even provisioning nodes has become very easy, on top of the load-balancing features those provide.
12 factor apps are also an amazingly concise way to develop and manage multiple services without going into YAML hell: https://12factor.net/
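Factor III ("store config in the environment") is the part that pays off immediately: one build runs everywhere because nothing environment-specific is baked into the image. A minimal sketch in Python (the variable names are illustrative, not from any particular framework):

```python
import os

def load_config(env=None):
    """12-factor style: all config comes from environment variables,
    with sane development defaults, so the same build runs in dev,
    staging, and prod."""
    env = os.environ if env is None else env
    return {
        "database_url": env.get("DATABASE_URL", "postgres://localhost/dev"),
        "port": int(env.get("PORT", "8000")),
        "debug": env.get("DEBUG", "false").lower() == "true",
    }

print(load_config({})["port"])                 # → 8000
print(load_config({"PORT": "9000"})["port"])   # → 9000
```

The same app then configures itself identically whether it's launched by systemd, Compose, or a k8s pod spec.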
The problem is that the management layers around containers are typically overcomplicated to the point of comedy.
Kubernetes via something like K0s or K3s might make some sense, but it's still inherently a full-time position to run on prem (with updates and observability) and will cause headaches. HashiCorp Nomad is better, but is meant for scales greater than those of most companies. Docker Swarm hits the spot (especially with something like Portainer), but nobody seems to care because it's not a trendy piece of tech.
The day Docker Swarm dies is the day I go back to writing PHP in a shared hosting environment in protest.
Docker made doing the right thing really easy, to the point where doing so is a no-brainer. I guess there are people out there who might still use OCI containers as glorified stateful VMs, but luckily the majority of documentation and examples out there build proper images that have everything included by the time they actually run, and you have to actively go against that if you want the other approach.
The problem with devops replacing the old title “sysadmin” was that the dev part dragged in the worst thing about developer culture: the love of complexity and the tendency to build massive towers of it.
Sysadmins usually avoided complexity because their attitude toward it was more sensible: it’s expensive, fragile, and tends to actually multiply failure modes.
I also blame cloud marketing. This stuff is a gigantic money printer for cloud companies. It’s in their interest to encourage as much over engineering as possible, especially if it locks you into things like Kubernetes that are hard to run and thus usually used as services.
I swear, I've come to identify as a simple stacker. Complexity should be avoided wherever reasonably possible. But where it's not, it's best left with, and maintained by, the devs who want it.
If you guys don't see the value of being able to click a button on a website to deploy, perfectly, every time, over some guy sshing into the box and running git pull, I dunno what to tell you.
The discussion here is not the method of deployment, but the infra architecture.
You don't need k8s, containers, and multiple cloud instances to automate deployment.
It's perfectly possible and simple to implement a button to deploy on a single machine.
Heck, on OVH or Hetzner you can have a dedicated bare-metal machine, with many cores and plenty of RAM exclusive to you, cheaper than the famous cloud instances. These bare-metal machines will handle what sometimes takes hundreds of containers, with much simpler and easier-to-maintain infra.
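As a hypothetical sketch of that "deploy button" for a single box: a manually triggered CI workflow (GitHub Actions syntax here; the host, user, path, and secret name are all placeholders) gives you the one-click, repeatable deploy without any orchestrator in sight.

```yaml
# .github/workflows/deploy.yml -- a literal "deploy" button in the CI UI.
name: deploy
on:
  workflow_dispatch: {}   # manual trigger: a button on a website
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Deploy to the single machine over ssh
        run: |
          echo "${{ secrets.DEPLOY_KEY }}" > key && chmod 600 key
          ssh -i key -o StrictHostKeyChecking=accept-new deploy@example.com \
            'cd /srv/app && git pull --ff-only && sudo systemctl restart app'
```

Same button, same repeatability, one machine, no k8s.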
You can also have ugly manual ad hoc Kubernetes workflows where random devs hack YAML in prod directly. I’ve even seen people ssh into running containers and change things.
This isn’t any better than Joe sysadmin and could be worse since the complexity is higher and there’s a greater chance to do more damage with one change. Adding complexity makes bad process worse.
The last time I checked, Docker was on libcontainer and not running LXC any longer.
libcontainer and LXC all use the same API calls to the kernel... so in effect you're still correct (mostly).
They all have overhead if you start using their features... cgroups can have some very funny impacts on app performance, more so if you're running containers on top of a Linux install on a hypervisor vs *nix directly on hardware.
Why would someone want to self-torture into k8s if they only need a single machine?
If you need isolation for multiple services, just use a container... I don't see an orchestration need that justifies using k8s on a single machine. For such context, it's too much hassle for very little value in return.
Setting up testing & deploys via a CI script is basically free. AWS gives away CodeDeploy for free. Ansible is open source. I learned all this working on open source projects, where platforms like GitHub give you free compute time.
> If you guys don't see the value of being able to click a button on a website to deploy, perfectly, every time, over some guy sshing into the box and running git pull, I dunno what to tell you.
Maybe clicking a button that sshes into the box and runs git pull?
PSA: please don't follow a manual checklist over ssh. At least write an Ansible playbook that does those things repeatably, or even better, package idempotent changes as an .rpm or .deb to install.
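A minimal playbook along those lines (the host group, package, and file names are placeholders). Each task is idempotent, so re-running the playbook converges the machine to the described state instead of blindly re-applying commands:

```yaml
# deploy.yml -- run with: ansible-playbook -i inventory deploy.yml
- hosts: webservers
  become: true
  tasks:
    - name: Ensure nginx is installed
      apt:
        name: nginx
        state: present

    - name: Deploy app config (changes only when the template changes)
      template:
        src: app.conf.j2
        dest: /etc/nginx/conf.d/app.conf
      notify: Reload nginx

  handlers:
    - name: Reload nginx
      service:
        name: nginx
        state: reloaded
```

The handler only fires when the template task actually changed something, which is the repeatability the manual ssh checklist can't give you.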
Yeah, my search engine, back when it was hosted on a PC in my living room off domestic broadband, would shrug off HN[1][2] without the fans even spinning faster than usual.
And internet search should be more resource-heavy than the sort of websites that regularly do keel over under HN traffic. Every query is up to something like 50 MB in disk reads.
The shift to cloud-based workloads (with oversubscribed CPUs and mandatory networked storage) means that a lot of people lost track of just how fast physical hardware (even mid-range consumer-grade) has become.
That’s also a deliberate thing: cloud providers have consciously avoided increasing the per-unit performance of a vCPU; you still get the same Sandy Bridge performance in 2024 as you did in 2012. They go as far as having AMD design smaller, higher-density “cloud” cores that don’t clock as high, to avoid ever increasing that vCPU unit.
Not so unpopular. I worked at a startup where they wasted huge amounts of money on a massively complex set up using kubernetes (and this was the early days of kubernetes). Despite this, or maybe because of it, our AWS bill was killing us.
The irony was that the cloud was supposed to be the simplest bit. All the computation and cryptography occurred on the mobile clients; the cloud was really just there to provide storage. The same team then rewrote the mobile clients to have a "beautiful" API that took hundreds of times more resources than the original code.
I guess they just loved complexity for the sake of it.
Maybe that still counts as an unpopular opinion in the large, but I’ve never seen it be an unpopular opinion by the standards of really effective teams or really productive hackers I had the privilege to be around at times, and it seems to me that your good idea is coming back in a big way recently.
When I worked on teams that denominated egress in terabits/s, TPS in millions or higher (sometimes much higher), and daily warehouse ingest in petabytes, it was just the default thing to spin up an instance and a hot standby (sometimes per region or something) if that’s all that was required. Containers and whatnot were used only in the context of bare metal: you do usually want one level of indirection, so containers are really useful if you’re racking metal from a small menu of SKUs.
But as for why it ever became conventional wisdom to wrap a venv inside a container on a hypervisor, often with multiple images composed on a dizzying array of low-friction SKUs?
>I've never come across a devops person who'd say "Hey, that software is too simple to prematurely scale...
I'd argue it's just as much the dogmatic nature of all things software-related, together with the attendant shiny new object syndrome.
See SPAs, NoSQL, micro-services, etc. There's generally a use case for all of these, but they tend to be too easily extrapolated into, "if you're not using these, you're not doing it right."
> I've never come across a devops person who'd say "Hey, that software is too simple to prematurely scale for sudden spikes of irrational amounts of traffic, so why not just deploy it on a cheap vps?"
I’ve lost startup jobs for basically this. As in “let’s focus on our mvp instead of adopting k8s and discussing “the definition of done” for literally six hours a week”. That one ran out of money and folded, sadly. They had tons of potential and a few bad hires.
But also, on the other side: my time is expensive, engineering time is expensive, and downtime is catastrophic for a new business. Some people want to spend 4, 5, or 6 figures to save twenty dollars a month. Imagine my time is worth $500 an hour; if the effort to make something cheaper doesn’t pay for itself in a year, then it’s probably a waste of resources.
Once when working in devops I asked my team lead why we didn't just move everything to Heroku, rather than reinventing a bad in-house version of their features.
I was firmly told not to suggest it again unless I wanted to put all of us out of a job.
That seemed ridiculous to me — we had a laundry list of ways we could help the business if we got basic platform stuff off of our plates. But I sure learned something about how incentives affect otherwise-good engineers.
The current DevOps culture comes from the FAANG guys who really are getting a bazillion requests per second. In the last decade I worked for Amazon, Avalara and Audible Magic. None of these could build an app around a Digital Ocean droplet.
But I think you're pointing out there are PLENTY of useful webapps that can run on a minimal system.
I'm just curious where the middle ground is. Because I think we've all seen sites blow up after a reference on HN or Slashdot or whatever, and by the time you get there the only thing you see is a stock error from the PHP engine saying "My MySQL Engine is Melting" or somesuch.
It would be very cool if one could write an app using {NODE|Ruby|Python|Whatever} and have the infrastructure around it notice when things spike and do some magic under the hood to spin up new containers in geographically distributed data centers and scale up a simple persistence tier.
That way you could move forward with a SIMPLE application instance and not freak out that you'll disappoint new users if there's a spike in demand. You know, sort of like what AWS Lambda was supposed to be.
Hmm... I think I might have come up with a plan for my next startup. Thank you for speaking your truth.
[Edit: To re-iterate, I think I'm saying it's just as wrong to think a small, simple app needs Amazon-level redundancy and immediate scalability as it is to say Amazon could run on a single machine. But... the "Slash-Dotted Website Goes Down" scenario is real and there should be SOMETHING the industry could do that's easier on people than to force them into a custom AWS ECS solution across multiple continents.]
sure. but there's a middle ground there somewhere. it's not just service blackout after 5 requests per second or a $10k monthly aws bill. if your budget was $500, you could set off alarms or auto-shutoff after some threshold. and turn on syn cookies if you're worried about the kiddies.
the point is... there's a middle ground there somewhere, and I think different people put the cost/availability tradeoff at different places.
> I've never come across a devops person who'd say "Hey, that software is too simple to prematurely scale for sudden spikes of irrational amounts of traffic, so why not just deploy it on a cheap vps?"
I work on the Ops side partially, and our platforms are defined by the highest level of complexity we need to support. That is to say that your basic web app can run on k8s, but that auto-scaling, message-based behemoth that 85% of the business flows through will not run on a VPS.
So I can either support k8s, or I can support k8s _and_ old-school rsync deployments to VPSes.
The complexity of running a basic web app on k8s is entirely too high, but the cost of keeping an entirely separate deployment/monitoring/oncall/permissions stack for VPSes is worse. Better hope your monitoring vendor has an agent that can run on a VPS with the OS you want, or you're back to running Nagios yourself.
If at least 1 group is going to write an app that runs on k8s, you might as well run most of the company on k8s. Otherwise you're either going to manage 2 separate stacks, or try to write an abstraction layer that will probably itself come to resemble k8s.
> I've never come across a devops person who'd say "Hey, that software is too simple to prematurely scale for sudden spikes of irrational amounts of traffic, so why not just deploy it on a cheap vps?"
Was a devops engineer at my previous job.
We already had k8s clusters setup, pre-made CI templates and pre-tailored helm charts (along with monitoring and much more). All those things you (a developer) could mostly clone, slightly customize and ship both to a development k8s cluster and to a prod k8s cluster (with all the safety nets already in place).
Creating (and maintaining) a single vm for a pet project is way more work than using the pre-made and pre-customized and curated toolkit.
This was at a 100+ developers organisation.
If you think you could easily get away with a single VM, then you've never seen devops done right, I'm fairly sure.
EDIT: I probably fell for the bait, but the post I'm replying to really made me remember why we went on a killing spree to eradicate everything that was not k8s at my previous job, and to remove as much developer access to prod as possible. Some idiot developers think they know better, and usually end up reinventing a square wheel that breaks as soon as it's not running on their laptop anymore.
IMO the vast majority of software development happens in much smaller organisations than that. Dev Ops still matters there, and the requirements are different.
I am working in an organisation with one and a half developers. I am lobbying that the third or fourth developer concentrates on Dev Ops here.
It is very important. Look at the backup/recovery procedures at your organisation. Has there ever been a fire drill? Are you sure the backups are sound? Can you recover? What if data corruption occurred a week/month ago? Do you have a backup of the uncorrupted data?
That is a very unsexy aspect of Dev Ops, and without somebody dedicated to the job, your backups will not be any of those things.
> I am working in an organisation with one and a half developers. I am lobbying that the third or fourth developer concentrates on Dev Ops here.
not sure you're doing the right thing here. you might want to consider hiring some kind of linux guy that can do some basic devops, or maybe hire a devops contractor that can work with you on a part-time basis and "curate" some specific aspects of your operations.
I've seen this done in the past: you've got this consultant on retainer, and you tell them something like "i've got this issue, can we do something about it? our constraints would be x y z", the consultant makes 1-3 proposals (different approaches, different pricing levels, different ETAs, etc.), and then you agree on what gets done. The key aspect here is that a good devops consultant can get stuff done very quickly.
> It is very important. Look at the backup/recovery procedures at your organisation. Has there ever been a fire drill? Are you sure the backups are sound? Can you recover? What if data corruption occurred a week/month ago? Do you have a backup of the uncorrupted data?
Yes (to all questions). I ended up working in heavily regulated environments. All the things you mentioned were not just niceties, but legal requirements.
> That is a very unsexy aspect of Dev Ops, and without somebody dedicated to the job, your backups will not be any of those things.
That's basic system administration. Most devops engineers are former sysadmins.
> IMO the vast majority of software development happens in much smaller organizations than that
I guess it's okay to have an opinion about that, but this seems like something that should probably be a fact. Unfortunately, I'm not sure I can find reliable stats on the sizes of engineering organizations.
The thing is, while there are obviously lots of small companies, there are also some really big software development organizations out there. A company like Netflix has 2,500 engineers. Microsoft employs over 100,000 engineers. Walmart employs over 15,000 software developers.
You need a lot of little 10-50 engineer dev shops to add up to the combined size of the engineering orgs of the Fortune 500.
According to https://www.statista.com/statistics/507530/united-states-dis..., at least, 29.4+25.8 = 55.2% of the US "IT Industry" workforce are employed in companies with >100 employees. That's a long way from telling us about sizes of engineering orgs, though.
But still... I'd be careful assuming that the vast majority of developers are in organizations of less than 100 people.
Why do you assume that only the US landscape is being discussed here?
Plus I'm not very sure those statistics are reliable.
Anecdotal evidence: I have a 22-year career and I've only worked in big organizations twice, for a total of a year. Everything else was much smaller.
But you’ve got 100 developers. That’s not “most web apps”; that’s firmly in the set of companies that need standardisation and potential scale, where the devops team makes the lives of many devs far better. When it’s just a few devs and cash is limited, the business doesn’t need the complexity. It’s mostly for ego and branding.
When the company is set up well on k8s, then choose k8s. If the company is set up well on VPS, then choose VPS.
If it has neither, I'm unsure which is the better way to go on a greenfield project.
k8s has nice tooling, but part of that is required because it is massively complex.
But with managed k8s providers (e.g. GKE with Autopilot), you can just put in proper CPU and memory limits and essentially get yourself a provisioned VPS without having to worry about anything, by k8s standards.
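Concretely, that "VPS via k8s" amounts to little more than a Deployment with explicit resource requests and limits; on Autopilot the requests are what you're billed for. A hedged sketch (image name and sizes are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 1
  selector:
    matchLabels: {app: web}
  template:
    metadata:
      labels: {app: web}
    spec:
      containers:
        - name: web
          image: registry.example.com/web:1.0
          resources:
            # Sizing these is effectively "provisioning your VPS".
            requests: {cpu: 500m, memory: 512Mi}
            limits:   {cpu: 500m, memory: 512Mi}
```

Everything below that (nodes, patching, scaling the pool) is the provider's problem, which is the whole argument for managed k8s at small scale.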
If you have different pricing or location constraints, then it might be better to go with different models and some custom orchestration.
Yeah, I'm an independent, selling these solutions. My minimum stack costs about $500/mo in AWS costs- and you could save 90% of that. But then you'd be paying me $20k more to set it up again when you expand to another dev team, while this way I can add your second dev team for $10k and practically zero additional AWS cost.
Going straight to overkill is good business sense for any company that's going to make it to a medium-sized business, and it's not going to be the differentiating factor for a company that burns out.
> All those things you (a developer) could mostly clone, slightly customize and ship both to a development k8s cluster and to a prod k8s cluster (with all the safety nets already in place).
How did/do the devs create and test new code and debug issues? Can they do that locally on their laptops? If so, how?
>> How did/do the devs create and test new code and debug issues? Can they do that locally on their laptops? If so, how?
I used to buy the idea of this when there were monoliths.
Then I had the joy of running a shop with dozens of web properties.
Local dev became untenable. My systems admin was an early adopter of Xen. That shop ran like a dream... devs could come in, have a new environment, up to date and in place, and just start working. Staying in sync with prod was never a problem.
By making systems guys keep devs fed, and by making devs work closely with systems folks, you get better software. Containers just hid developers' shitty decision-making in a wrapper that systems folks can tolerate.
And how DO you debug... because what you do to figure a problem out locally is not how you troubleshoot when the shit hits the fan in prod. These tools should be the same, sane, and well used and loved by everyone. Local debugging is part of the problem, even more so if your service-based app lives and dies on the wire.
Then I assume you have a custom Kubernetes LB that can handle non-HTTP TCP and UDP traffic, because your choice of Kubernetes, and the design restrictions that come with it, surely doesn't affect how the dev solves problems?
The underlying orchestrator definitely affects how the software needs to behave and is definitely not irrelevant.
> Then I assume you have a custom Kubernetes LB that can handle non-HTTP TCP and UDP traffic, because your choice of Kubernetes, and the design restrictions that come with it, surely doesn't affect how the dev solves problems?
nginx-ingress-controller does that. no custom stuff required.
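For reference, ingress-nginx handles raw TCP/UDP through ConfigMaps that the controller is pointed at with the `--tcp-services-configmap` / `--udp-services-configmap` flags; the namespace and backing service below are illustrative:

```yaml
# Maps an external port to "namespace/service:port".
apiVersion: v1
kind: ConfigMap
metadata:
  name: tcp-services
  namespace: ingress-nginx
data:
  "5432": "default/postgres:5432"
```

A parallel `udp-services` ConfigMap does the same for UDP, so no custom load balancer is needed for non-HTTP traffic.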
I think the $5 tier is a little tight for a web app as opposed to a CRUD app, but 2x $40 tiers is enough for a decent amount of traffic, with one as a failover.
The problem is that containers are excellent, and IMO there's a gap in the market between "I want to run one container" and "I want a fully managed k8s cluster"
For my personal projects (that right now seem to revolve around April 1st jokes for people in my industry) - I've found a good middle ground to be K3s on a single Hetzner VM that I scale up and down if I think more/less people are going to be looking at it (i.e. between the 5 - 10 dollar/month price range)
I set this up with Terraform and some bash scripts. Infrastructure as code is just too convenient to pass up for something I might not come back to for months at a time (and then ask: how did I set that up?). And containers mean I can play with some shiny fun technology one time, leave it in a box, and come back to a clean slate for the next thing I want to play with, without having to pay for a new VM, etc.
The other great thing about containers is that you can have entire isolated stacks for your projects, and you can stand the entire thing up with `docker compose up -d` in a matter of seconds. Gone are the days of accidentally connecting to the wrong database with WAMP.
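That isolation is mostly free because Compose scopes each project to its own network. A hedged sketch of one such stack (service names and credentials are placeholders):

```yaml
# docker-compose.yml -- one isolated stack per project.
# `db` is only resolvable from inside this project's network, so
# connecting to the wrong database by accident is off the table.
services:
  app:
    build: .
    ports:
      - "8080:8080"
    environment:
      DATABASE_URL: postgres://app:app@db:5432/app
    depends_on:
      - db
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: app
      POSTGRES_DB: app
    volumes:
      - dbdata:/var/lib/postgresql/data

volumes:
  dbdata: {}
```

`docker compose up -d` brings the whole thing up; `docker compose down` tears it down without leaving residue on the host.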
> The problem is that containers are excellent, and IMO there's a gap in the market between "I want to run one container" and "I want a fully managed k8s cluster"
A single container: Docker
A few containers on the same node: Docker Compose
Containers across multiple nodes with load balancing and networking: Docker Swarm (with Portainer to manage it)
Alternatively, Podman is also pretty nice.
If you need something that runs not just containers in clusters: Hashicorp Nomad
If you want to go for Kubernetes but without it being too hard: K0s or K3s or MicroK8s or RKE (with Portainer or Rancher to manage it)
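On the Swarm rung of that ladder, the jump from Compose is small: the same file gains a `deploy:` section and is shipped with `docker stack deploy`. A sketch (the image name is illustrative):

```yaml
# Deployed with: docker stack deploy -c docker-compose.yml mystack
services:
  web:
    image: registry.example.com/web:1.0
    ports:
      - "80:8080"
    deploy:
      replicas: 3
      update_config:
        parallelism: 1
        order: start-first   # rolling update: start new task before stopping old
      restart_policy:
        condition: on-failure
```

That buys replication, rolling updates, and restart-on-failure without leaving the Compose file format.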
I don't think I ever got around to making it self-healing if a container dies, but it does support GitOps-style deployments through a cronjob / config repo, similar to Argo CD.
It's been running happily on a <$10/month AWS Lightsail instance for a few years now, though tbh I'd still reach for k8s for anything serious.
All my team's infra runs on Fargate + ECS, so I'm pretty familiar (and happy) with it. Running in Fargate requires knowledge of AWS: VPCs, public and private subnets, security groups, ECR, ALBs, target groups, IAM policies + roles. Then, when you want to add a database, you're back into all of the above, plus the database-specific ones like database subnet groups.
When it comes to health checks you have Docker health checks, container health checks, and load balancer health checks, all of which are configured separately. Not to mention doing it "properly", where your task doesn't have a public IP and is only accessible through your load balancer: [0] might be one of the most infuriating responses on Stack Overflow.
Meanwhile, with DO droplets, it's pretty much "here's a registry URL, and some configuration in YAML, go for it".
A ton of the problem is people turning everything into dynamic this or that. WordPress culture, basically.
Most pages could really just be served as jekyll/astro static-generated pages and be fine. But if you shove a database and PHP in the middle, it's going to be multiple orders of magnitude slower.
Kubernetes is/was a way to fight off walled gardens from cloud providers. The other path would have been to learn the bespoke implementation of each cloud provider depending on what that employer ended up using.
Kubernetes was at the right place at the right time: just as AWS was trying to force-feed people their own proprietary solution, as Azure was trying to wall people off into their own garden, and as GCP was being Google, just not giving a damn about any use case other than what works great at a massive search company.
With Kubernetes, developers can learn one API to deploy their applications and hopefully it works on AWS, Azure, GCP, DO, OVH or a laptop at home.
So that way, developers can learn one thing and transfer their knowledge at an employer that hosts on AWS, and then another that hosts on Azure and so on.
This is in contrast to the experience of a Python developer who's mastered FastAPI/Flask/SQLAlchemy and feels absolutely lost in a Django project, or an Angular developer who stares at a Next.js project wondering what the heck is happening and how it all works. Neither a Next.js nor an Angular developer would start off with an AWS Amplify solution if they could help it.
> With Kubernetes, developers can learn one API to deploy their applications and hopefully it works on AWS, Azure, GCP, DO, OVH or a laptop at home.
That's one of the lies developers tell themselves, because at some point you're going to need to manage Accounts, VPCs and ELBs, Certificates, Security Groups, IAM policies, and everything else. All of those underlying primitives that are required and have massive differences in behavior that are expressed differently in GCP, Azure, and AWS.
On top of that Kubernetes is itself a walled garden.
You will inevitably end up cargo culting the entire ecosystem of plugins, like Cilium and Helm and so on. All of this IaC is meaningless outside of Kubernetes. Soon enough, you have 10,000 lines of YAML configuring highly proprietary infrastructure with multiple variants for each cloud. At some point you will have to rewrite controllers to add functionality or correct bugs the upstream maintainers don't want to prioritize, and so on.
Your "knowledge" of the stack ends up being the ability to orchestrate 15 levels of templated YAML. Eventually your company ends up hiring people who only know how to copy/paste YAML, and lose institutional knowledge of how underlying systems work. You didn't break out of the walled garden, you created an elaborate prison. And Amazon and GCP and Azure love you, because you're their #1 customer. The more complex you make it to deploy a CRUD app the more they profit.
> I've never come across a devops person who'd say "Hey, that software is too simple to prematurely scale for sudden spikes of irrational amounts of traffic, so why not just deploy it on a cheap vps?"
It Is Difficult to Get a Man to Understand Something When His Salary Depends Upon His Not Understanding It
> Unpopular opinion: I blame the new age devops culture that made cloud app deployments unnecessarily complicated with k8s and cool new tech (that'd get them high profile jobs.)...
Ah yes, time for the annual debate on the complexities of Kubernetes versus the unparalleled genius of custom scripts that seem to work...sometimes. Because reinventing the wheel is always superior to something with a standardized API.
And let's not forget the sheer elegance of homegrown scripts that rival the structured approach of Kubernetes. A true testament to intuitive design.
Sure, Kubernetes might have a few minor benefits beyond 'scaling out'. But honestly, who needs the ability to manage complex applications with any semblance of ease?
> honestly, who needs the ability to manage complex applications with any semblance of ease?
The argument being made here is that the majority of applications are rarely complex, and hence don't require managing that complexity.
A simple webservice fronted by a simple reverse proxy like Caddy running on a single "modern PC" can do wonders without any Kubernetes needing to get involved.
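As a sketch of how little configuration that takes: a minimal Caddyfile (the domain and backend port here are made up) gets you automatic TLS, HTTP/2, and reverse proxying in a few lines:

```
example.com {
    # Caddy obtains and renews the TLS certificate automatically
    reverse_proxy localhost:8080
}
```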
At Standard Ebooks we serve a respectable number of page views and ebooks each month - and have been on the front page of HN three or four times - all of it done with a single 4GB VPS. And the only reason we upgraded to 4GB from 2GB is because we needed more RAM for the server to build the extremely large Decline and Fall of the Roman Empire ebook - if it weren't for that, our 2GB server would still have been just fine for all that traffic.
More applications should consider git as a content management database. It's great architecture. Statically serving files built by a CI process running on the server is very tidy.
So sure, you can run your stuff on a single server, but you're relying on a bunch of other people running much more sophisticated services on a lot more infrastructure in order to do it.
Sure, at some point you're going to be relying on some 3rd party somewhere. We use a VPS and not a bare-metal hand-installed rack, and we rely on an electrical company and not a hand-turned crank to power our servers. As far as email goes, It's simply not possible to self-host transactional email in 2024 if you want it to arrive in an inbox and not a permanent spam blackhole; this is a people problem and not a technical one. Likewise, another people problem is that it's not possible to accept money online without involving a 3rd party service like Fractured Atlas or Stripe or PayPal.
(Moving away from GitHub towards a self-hosted Git solution, and away from Google Groups to a self-hosted mailing list, is actually on our long-term todo list[1]).
All those things don't mean one can't run one's web app on a single tiny server, like we do. I still argue that outsourcing the basic fundamentals of one's web app, like the OS, runtime, or database to some cloud service, or resorting to flavor-of-the-month frameworks or containers, or doing silly things like using Javascript to render one's entire frontend, often simply result in complexity, slowness, and bloat.
But my point is... those are all web applications too, and they don't have the option of outsourcing everything. Someone has to build a system that does more than just serve static files.
The claim that 'the majority of web applications can run on a single server' is kind of belied by the example of a site where not even the majority of sub-applications that are required to provide the full functionality of the system are running on a single server.
Of course! Stripe, Paypal, Postmark/Sendgrid/whatever are not part of the majority I'm talking about. (Although Fractured Atlas, which is merely a wrapper for the Stripe API, is; we use it because it solves a people problem, not a technical one.) There are certainly projects and businesses that will require many high-power servers and more complex technical machinery than basic LAMP. However most people on HN are not developing the next Stripe, even if they don't realize it yet.
Those other workloads don't sound particularly taxing to me. Many get very sparse traffic; hosting a donation page, web newsgroup/discussions, and user management need not drastically scale up the serving footprint here.
Those hosted services mainly are about not needing to pay the human management/ownership costs.
The real win this architecture has is outsourcing its content management database to GitHub. That's where all the complicated stuff like permissions and authentication and change notifications and approval workflows, that make up the bulk of complexity in most bespoke business applications, as well as all the tricky stuff of managing the actual files that the contributors are managing, is all happening. It's a smart decision! There's a lot involved in running a system like that reliably - outsourcing it is a great idea.
If they were to do that in house, by switching to a self-hosted GitLab, say, well... that could be run on a single machine (https://docs.gitlab.com/ee/administration/reference_architec...) at the cost of having to manage scheduled downtime for upgrades. If the user base or activity level grew beyond what that server could handle (and given that the intention of this project is to cultivate an ever growing community of contributors that might be a concern)... the next stop up is an eight node system: https://docs.gitlab.com/ee/administration/reference_architec....
Again though, that's not actually computationally expensive. It's just hard to get right & maintain, with few do-all open-source offerings ready to stand in. It's not like having 10x the computing resources at hand would highly motivate someone to bring it onboard: it's a difficult thing to manage well, but not actually computationally expensive to do so.
A lot of resources go to scaffolding for containers, runtimes, caching, etc. 4GB would make for a chugging experience if you run a Java Spring Boot backend, for example. But an old-school PHP + Postgres stack would be fine, or modern dotnet, Rust, etc. And honestly I'm not sure it matters, since RAM and compute are cheap for most small-medium sites (at worst maybe $30-50/mo vs $10-20/mo).
VPS services are pretty cheap these days: Hetzner, DigitalOcean, Genesis, etc. Or you spend a couple grand to build a dedicated machine in RAID6 and only pay utilities.
Of course. The entire point is that so much of that is just unnecessary for 90% of web projects. Most projects that are basically a front-end website backed by a local DB can get by on a 2GB VPS if they embrace classic web tech and don't get sucked in to whatever framework/cloud service/container craziness is hot this week. Once they start hitting millions of page views a month, we can talk about upgrading to 4GB :)
If you can give examples of how Rust leaks memory, that would be informative.
I had two services run uninterrupted for months in 2GB RAM containers. Memory increased steadily the first 5-10 minutes and then stayed there indefinitely long.
Oh man I absolutely love the work that you guys do. I'm actually in the process of learning Ebook production using the 'Step by Step' guide on your website. I'm essentially learning it all from scratch as I have little to no programming/SWE experience (I learned a bit of Lua because of KOReader[1]) but the technical side of ebook production has always fascinated me enough to keep learning. Also because I wanted to contribute more than just typos and grammatical errors (as important as they are).
We use k8s at $JOB for literally no reason other than some engineer decided he wanted to learn k8s.
It's the worst, I don't even get to see my server logs because "that would mean giving you access to the entire thing". I'm not a k8s person but surely that has to be missing something.
We could literally make do with a $5 Cloudflare plan and have room left to work with. With better integration and better DX, too.
* everything running in Docker containers, but you aren't supposed to run them locally; the only local dev/test is through unit and component tests.
* all k8s is maintained by a separate group pulling in those images through a seemingly very over-engineered solution
* when it is all combined in a deployment, things break, or scarier, they don't, and you need to manually test and hope it's working.
Issues take too long to track down and resolve, and the reasons are very obvious to everyone but technical leadership, who are constantly high-fiving each other over their amazing creation that is already collapsing under its own weight.
> It's the worst, I don't even get to see my server logs because "that would mean giving you access to the entire thing". I'm not a k8s person but surely that has to be missing something.
I'm pretty sure it's possible to configure access to logs, not to the entire thing (whatever that means). He's probably lazy and does not want to bother.
Besides, you should have centralized logging using loki or something similar. Kubernetes logging is not enough for any reasonable use-case.
I know, but we don't get any. I legit had to do logging on my end to a _database_ because they weren't a fan of something like say new relic or similar.
For small cluster you can just create service account for a user, create token for it and write it in the kubeconfig. Then assign role to this service account and that's about it.
The main issue with this approach is that you can't organize those "users" into a groups. But for a small number of users you can just create all rolebindings and be done with it.
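A minimal sketch of that setup, assuming a namespace called `myapp` (all names here are illustrative): a ServiceAccount plus a read-only Role bound to it lets someone view pods and their logs and nothing else:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: log-viewer
  namespace: myapp
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-log-reader
  namespace: myapp
rules:
  - apiGroups: [""]
    resources: ["pods", "pods/log"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: log-viewer-binding
  namespace: myapp
subjects:
  - kind: ServiceAccount
    name: log-viewer
    namespace: myapp
roleRef:
  kind: Role
  name: pod-log-reader
  apiGroup: rbac.authorization.k8s.io
```

A token for the account (`kubectl create token log-viewer -n myapp`) then goes into the developer's kubeconfig.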
I dunno if k8s is actually right for your company, but it most certainly is possible to configure it to give engineers limited access, including viewing logs, pod status, services and deployments, etc, but not reading sensitive data or updating anything. In my opinion, deploying production services using devops technology nobody on the team has much understanding of is what's a bad idea, not any particular technology.
Maybe without details or knowing exactly what he means. But I've worked with people that decided you don't get to see any logs on a server except for a select few through some web ui or a log aggregator.
There are production systems handling PII (finance, healthcare etc), where logs are treated as radioactive and only people with trust, training and protective gears should be touching them. This is the case where these are legal/ compliance requirements.
Non-prod of those systems can have wide open access to logs, unless someone decides to import prod data for "testing" in non-prod.
While I agree access to logs may seem too restrictive, there are some very solid reasons for that... and then we have some cargo-cult and some lack of time/ understanding which causes too restrictive log access where it might not be required.
I know what you are talking about, unfortunately it's way worse than that.
The same guy that runs the k8s is the guy that "reviews" my PRs. It's also the same guy that can't get eslint into our PR process, and the same guy that merges code from other people that just out right brick the server because it had some bad syntax.
I _wish_ my problems were caused from too strict standards and procedures. But really is just a lack of understanding and cooperation.
If your app doesn't serve more requests than sqlite.org does daily, you shouldn't pay more than it does.
> sqlite.org answers more than 500,000 HTTP requests per day (about 5 or 6 per second) delivering about 200GB of content per day (about 18 megabits/second) on a $40/month Linode. The load average on this machine normally stays around 0.5.
If you’re just serving static content then of course you can get away with one small box.
If you have user content that’s being constantly updated and inserted you need a lot more.
You need databases, caches, and in our case elastic search (with ~10 billion documents). The data needs to be indexed 16 ways to Sunday to make sure that a user hitting the page with this filter or that sort selects just the right records.
If you care about reliability you should have read replicas of your DBs as well.
Where are your logs going? I’d expect servers for that as well.
Now alternative to a bunch of this you could just use SaaS products but that costs an arm and a leg.
How many of the request on sqlite.org go to the "dynamic pages"? I would assume by far most are on the static docs. The forum and other "dynamic" content is quite hidden.
About 10% overall (9.96% to be precise), according to server logs over the previous 10 days.
Robots hit dynamic content at about twice the rate of humans: 14.12% versus 7.8%. About 34% of traffic is from robots, from what I can tell (though to be fair, many robots these days work hard to disguise themselves as human, so the actual percentage of robot traffic is likely much higher).
I don’t think this is controversial. 5rps is nothing.
Anyone saying otherwise failed to do basic capacity planning. Granted, at any given time the number of junior/untrained/nontraditional engineers outnumbers senior engineers, so it's not surprising things are built out of proportion, and I suspect that's what's behind the majority of the anecdotes.
I have been running multiple web apps with decent traffic (each ~10K requests per day) at no cost at all. Oracle Free Tier [1] for backends with Cloudflare tunneling [2] and pages [3] for frontend integration. Works fully seamless.
[1] https://www.oracle.com/cloud/free/ (4 cores, 24 GB ram for free; switch to PAYG to avoid idleness shut down, but you still pay nothing)
Somehow hearing of getting free hosting from Oracle makes me think of the advice to not stick your "thing" in crazy. It's probably fine, but I'd not touch it if it can be avoided, especially if I can just pay a small and fair fee at another provider (or, in my case, self host it)
It was my reaction at first too. Luckily, I am not running anything super important or something requiring much privacy there (it's mostly computational services that don't store data). Since I am a student with no residence (and bad university network uptime), I can't self-host, but will do as soon as I move-in to my own place.
Two multiplayer game servers for the games I built;
frp for tunnelling;
and I'm planning on squeezing more in until it gives up.
All reverse-proxied by nginx under subdomains of my personal domain address, absolutely seamless.
Similarly, other applications I built so far all also run under five euro VMs. There's no denying you might need more because you have serious peaks in traffic that you cannot handle with only one server, but do the accounting. It's a worthwhile decision to think through.
> Similarly, other applications I built so far all also run under five euro VMs. There's no denying you might need more because you have serious peaks in traffic that you cannot handle with only one server, but do the accounting.
A simple proportional-integral-derivative (PID) controller attached to your server's resource metrics can help you see whether a traffic spike is building. The question is, what kind of person has that kind of perceptive ability and is willing to integrate it into their daily work? Not the average software developer, we can say with great confidence.
You're supposed to be repurposing the controller into a reporter. Since you don't need control, you can instead just use the PID portion of the PID controller.
But you can also attach an actual automatic control for when the sensor reports a positive traffic influx prediction value.
And I think stuff like Kubernetes includes this feature. Go figure.
It's a sensor that informs its users how much control is needed for a networked server that can experience volatility in the form of traffic spikes. The numerical calculus from PID controllers helps provide that crucial information. A bang-bang controller would, for example, provide very sub-optimal control, as volatility doesn't work under simple on-and-off models.
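To make the idea concrete, here's a minimal sketch in Go of using the PID calculation purely as a reporter on request-rate error, as described above. The gains and traffic numbers are made up for illustration, not tuned for any real workload:

```go
package main

import "fmt"

// pid holds the controller state; here it is used purely as a
// reporter on request-rate error, not to actuate anything.
type pid struct {
	kp, ki, kd float64
	integral   float64
	prevErr    float64
}

// update returns the control/alert signal for the latest sample.
func (c *pid) update(setpoint, measured, dt float64) float64 {
	err := measured - setpoint // positive when traffic exceeds the target
	c.integral += err * dt
	deriv := (err - c.prevErr) / dt
	c.prevErr = err
	return c.kp*err + c.ki*c.integral + c.kd*deriv
}

func main() {
	c := &pid{kp: 0.5, ki: 0.1, kd: 0.2}
	// Feed a ramping request rate against a 100 req/s baseline; the
	// derivative term makes the signal jump as the spike accelerates.
	for _, rps := range []float64{90, 95, 110, 140, 200} {
		fmt.Printf("signal: %.1f\n", c.update(100, rps, 1))
	}
}
```

A rising positive signal is the "spike is building" alert; wiring it to an autoscaler instead of a log line is the optional control step the comment mentions.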
I once worked a job where I easily spent 4 times as much time fighting with Terraform files than actually writing features. This company got less than 1,000 hits per day. I think about this a lot
I've had gigs where it felt like the company was spending more time fighting self-inflicted consequences of their "best practice" "cloud-native" architecture than developing actual revenue-generating features.
So many problems they were dealing with would magically disappear if all the services were running on a single high-end physical machine (the scale didn't mandate anything bigger than that) with a standby one sitting in a different DC for redundancy purposes with incremental DB snapshots shipped to it every 15 mins.
The cloud-native, "modern" infra became a liability and impediment to business but too many people would lose face to admit it.
I don't think the point here changes all that much if you change this to two servers and a load balancer. That's still a pretty simple setup.
But I also think that many applications are far more tolerant of small outages than you imply. And a more complex setup also adds more points that can lead to an outage even though it reduces the chance that hardware will cause one.
Wouldn't you also need two load balancers then? Otherwise you've still got a single point of failure. And how do you keep the failover system in sync? It's a whole can of worms to promise 100% uptime. It's super rare that a server physically breaks and suddenly goes dark with less than a day's warning, and for most applications such a once-in-5-years event is tolerable if that means the hosting costs are divided by five as well (so far I'm ~10 years in and haven't experienced such an event yet; disks in RAID have broken but not a cpu or other SPOF)
I kinda want to know where you shop, because commodity hardware has a reputation as just utter crap. Before cloud went mainstream, I worked somewhere that had some racked servers in a colo literally catch fire. (The remote hands put them out, disconnected, and refused to touch them ever again.)
Nearly no one is competent and able to set up a true load balancer. If you have one, either it's provided by your (most likely VM-based) hosting provider or you are a big enterprise.
If you are in one of those two categories then setting up something more robust than a single server isn’t much more complicated than a single server.
That is to say, I think it very much does change the equation around the original statement.
Well if you can spin up a new environment in less than a minute – does it matter? I mean if you are Facebook yes, but having worked with a lot of people building things like web shops – having a minute downtime I would say is fine because I've seen so many of these people shoot themself in the foot trying to build high-availability solutions and having constant downtimes just because they can't handle their own complexity.
If you have the ability to spin up a new machine when the old one fails, and deploy your app onto it in one minute, it’s not a big leap to also run your app on two machines and avoid that downtime altogether.
Running two instances of a stateful application in parallel forces you to consider nasty and hard problems such as CAP theorem, etc. If your requirements allow, it's much easier to have an active-standby architecture over active-active.
Most applications as a whole are absolutely stateful. Individual components of them might not be (app servers are stateless with the DB/Redis containing all state), but the whole app from an external client's perspective is stateful.
If we're talking about reliability/outage recovery, we're considering the application as one single unit visible from the external client's perspective - so everything including the DB (or equivalent stateful component) must be redundant.
Sadly this is also where a lot of cloud-native tooling and best practices fall short. There are endless ways to run stateless workloads redundantly, but stateful/CAP-bound workloads seem to be ignored/handwaved away.
I've seen my fair share of stacks that are doing the right thing when it comes to the easy/stateless parts (redundancy, infinite horizontal scalability), but everyone kinda ignores the elephant in the room which is the CAP-bound primary datastore that everything else depends on, which isn't horizontally scalable and its failover/replication behavior is ignored/misunderstood and untested, and they only get away with it because modern HW is reliable enough that its outage/failover windows are rare enough that the temporary misunderstood/unexpected/undefined behavior during those flies under the radar.
That’s a pretty pedantic interpretation of the word application. In the context of software owned by most teams, that they may decide to run on single vs multiple hosts most applications are absolutely stateless. Most applications outsource state to another system, like a relational database, a managed no-SQL store, or an object store.
And so no, most teams don’t need to worry about the hard problems you bring up.
Is it really an application if it’s not stateful? Maybe you’re managing the state client-side which makes it easier but I wouldn’t call a plain website an application, or am I missing something?
At the smallest level, even every byte of an in-flight HTTP request is still state. State, and for that matter "uptime" really depend on what the application/service ultimately does and what the agreement/SLA with the end-customer is.
The correct high-availability solution should take business requirements into account and there is no silver bullet. Running everything on a $5 VPS is no silver bullet, but neither is your typical "cloud-native" "best practice" stack that everyone keeps cargo-culting which often leads to unnecessary cost while leaving many hard questions (such as replicating CAP-bound stateful databases) unanswered.
Our environments are entirely automated on AWS. It takes about 10 minutes to get an ALB and a mysql instance just on the AWS side. We've configured ECS to have the quickest health checks we can manage, and it still takes 3 minutes from the initial call before everything is happy.
Our environments are easily repeatable, they're very maintainable, but they're not quick to start up
You'd be surprised how close to 100% uptime you can get on one machine with well-engineered software. Schedule an automated update and reboot for a time and day of the week you have the fewest customers monthly, and if you have an SSD as the boot drive, unless you are a huge company nobody will notice or care that at 4 AM on Sunday your service was down for a minute and a half.
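The scheduled-maintenance approach fits in a single crontab line (the time slot and Debian-family package manager are just one possibility):

```
# Sunday 04:00: apply updates and reboot during the quietest window
0 4 * * 0 apt-get update -q && apt-get upgrade -y -q && systemctl reboot
```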
Indeed. I come from the Windows world where things are a lot slower and there are less options for things like livepatching. So my "minute and a half downtime" is definitely pessimistic compared to what you can accomplish on a Linux server.
> Do most of your customers expect close to 100% uptime?

Yes.
I think this is mostly a self-imposed requirement. Banks regularly have overnight technical breaks. My country's national rail has 30 minutes of downtime every night (!).
Unless your product is already global, most people won't have a problem with occasional overnight scheduled downtime.
Most people never even test number 2. If you're a novel or niche product, you can afford a "lot" of downtime.
For number 3, it really all depend on your load. I’ve run most early stage startups on an incredibly simple setup with a proxy placed in front of an app server.
Occasionally, our monitoring will start reporting high p99’s so we go in and bump the servers to the next tier or scale horizontally just a bit more. Eventually, that breaks down but many startups will be well into Series A or Series B point. At that point, you know your customer’s needs and can hire dedicated engineers to solve reliability and uptime.
Pretty much any provisioned VM can be split down the middle. Got 4 CPUs and 4GB of RAM? Congrats, you have two VMs with 2 CPUs and 2GB of RAM. Throw them into a 50/50 load balancer and never have any downtime.
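A sketch of that split with nginx as the balancer (the addresses are illustrative): by default nginx round-robins across the upstreams evenly and temporarily skips one that stops responding:

```nginx
upstream app {
    server 10.0.0.1:8080;  # VM one
    server 10.0.0.2:8080;  # VM two
}

server {
    listen 80;
    location / {
        proxy_pass http://app;
    }
}
```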
Two vms on the same machine doesn't help when the machine fails.
Also, I sure hope your loadbalancers don't suck. I've worked with some that had worse uptime than my servers, and worse capacity too. Went back to putting the two host addresses in DNS, which is mostly fine, but means a lot of waiting when you want to take a host out of rotation for disruptive maintenance.
VMs going down is not an unexpected event. It's not going to happen every day, but hosts crash, networks go down, datacentres experience thermal events, etc. Your solution may work for your own system crashes, but not for most of the rented platform's failure modes.
Multiple machines don't automatically mean a better uptime. Outside of the regular maintenance, many problems can affect multiple instances, not just one.
(Regular maintenance meaning OS updates needing a reboot, which happens like what? Once per month for a minute?)
We know. Absolutely no one has ever said that your super basic webapp that has the latency requirements of "make sure it works", an SLO of "it's fine most of the time", and a code base under 10k lines of code worked on by one guy, needs these super complicated systems. Even FAANG will run internal services and dashboards on a single binary, but when every second of downtime can be translated into lost money, then you begin to care.
Unfortunately many of these super-complicated systems run at odds to having a reliable and fast, system even when every second of downtime can be translated into lost money.
Realistically, they are required. I don't care how you do it, but you need: 1. seamless failovers, 2. easy horizontal scaling, 3. location-based load balancing, 4. a way to deploy and roll back all of these automatically. As a result, things are gonna get complicated no matter what you do. But as it turns out, using something like Kubernetes, Elixir, or App Engine can make all of that very manageable.
#3 is debatable for many, many businesses. The other three are (almost) trivial with a decent load balancer and blue/green deployments. This is something very achievable without an over-complicated architecture.
3 is incredibly important for global businesses. You ever try using a Japanese website?
As for how "easy" the others are, the truth is that the options I provided will be easier to make, easier to maintain, and work better than your home grown solution once you need to account for multiple machines.
this is the correct take, imo. i think a lot of projects start out overly complicated with the expectation of getting huge and needing to scale horizontally in a hurry, which a monolith cannot really do well. of course these expectations rarely become reality and you end up with some huge web of microservices that two devs need to orchestrate and manage for no benefit.
at Grafana we're now decoupling a monolith primarily so teams working on different parts can manage their own deployments/rollbacks/incidents independently of "official" monolith versions. it's a very hard/expensive refactor once you're at a scale of millions of users and hundreds of engs. but really an impossible thing to predict at inception.
Most over engineering is career driven development. You may laugh at the people who unnecessarily use loads of microservices, specialized tools and libraries, exotic cloud solutions etc. But the architects and senior developers who do this stuff now have great high paying jobs. If you choose the simple cheap solution the client might love you (realistically they wont know the difference) but you wont be able to get those high paying jobs. Ask me how I know.
A $5 VPS, with almost any app, can handle the front page of Hacker News. If I remember correctly, I'd even venture to say HN itself runs on a one- or two-core box. It should be the standard.
Hacker News might not be the best example, as it seems to get overloaded once every week or two, usually when a contentious/high volume topic comes up.
My impression is that your standard WordPress + Apache + mod_php + MySQL same-server install cannot, without plugins that improve WordPress's caching behaviour, and that's a large portion of the sites that do fall over.
You don't even need containers. Write your program as one Go executable and drive it from FastCGI. Go is fast enough that you can get a lot of work done on a minimal server. FastCGI provides "orchestration" of multiple processes, plus crash and restart handling.
Also, there's much less attack surface. If your minimal Go program can only respond to specific requests, there's not the problem of an attacker targeting some unused feature of the site. Go has subscript checking, so you don't have buffer overflow vulnerabilities.
What kind of systems are people imagining when they picture 'the majority of web apps'?
In this thread, I see people citing HN (a simple bulletin board system); or sites that will get linked from HN and have to handle load - so, presumably blogs, or CMS-backed marketing sites, maybe with a sign-up form or a single product ecommerce storefront?
Sure, I can buy that you can run a pretty robust bulletin board or CMS on a single server. PhpBB and Wordpress with a Postgres backend will get you a long way.
But if you think that's what all web apps are like, you're not thinking anywhere near big enough. "I've seen some stuff that's over engineered" is not evidence for "everything is over engineered".
The real reason for cloud micro service architecture is legacy code.
Imagine you’ve inherited a 10 year old PHP monolith. The code is almost indecipherable and the business wants new features in a timely and predictable manner.
The easiest way is to implement the new features in their own microservices.
Sure it makes the complexity problem worse, but that’s a problem for someone else in five years time.
You could solve this other ways. Build a new better monolith and have a reverse proxy route between them and slowly update and move routes over. You will get through a rewrite slowly one bit at a time.
If you keep the same database schema this should generally work pretty well.
I migrated a terribly written web app this way, it worked pretty well.
Apparently I don't understand. If the code isn't poor, why throw it away for a new even more complex system? I don't know how code isn't poor, yet is indecipherable.
Regardless, the method works even in a complex environment.
No one's gonna promote you for proposing to build a monolithic app with the DB and queue on the same machine. Gotta have microservices built in Rust talking to micro-frontends built with the latest and greatest JS framework, all orchestrated by K8S running on a fleet of machines to seem hip!
Totally agree. Problem is that if you run your stuff on a single server you are at risk of being unhireable. In every interview I have been to, nobody asked about pragmatic, simple solutions that do the job; they wanted to hear about microservices, k8s, Redis and other stuff. Based on the job postings, my company wouldn't hire me because I don't work with the stuff they are asking for. It seems a pragmatic move for a dev to jump on these bandwagons. Right now it's probably a good idea to push AI somehow into your systems. Doesn't matter if it makes sense or not.
I have a small web development hobby company and I run it all off a $200 OVH server. We do like 10m pageviews, too. No problems once we figured out how to not crash during cronjob calculations.
True, I'm assuming commodity hardware. I've never had the privilege of working on really high end systems, but my impression is that you buy or lease a cluster in a box that is simulating one reliable computer.
"The more they overthink the plumbing, the easier it is to stop up the drain." - Scotty, and good IT people everywhere.
When you add all of the necessary layers of abstraction to get three computers to act as one computer, you added dozens of new points of failure. In the conventional case, yeah, you can now shut one down and people aren't impacted. But there's now a bunch of new ways it can break and take the system down and it is much harder to figure out why.
There's a lesson to be learned from the trend to move everything to Kubernetes / Docker / one VM per function.
Aside from the disadvantages of being reliant on others for timely updates and security fixes, people now believe that they need multiple gigabytes of memory to run what used to easily and comfortably fit on a single machine.
This move is being sold through security arguments that attempt to convince people that there's no real security in Unix OSes, that if one service gets compromised, the whole server gets compromised. Hmmm... That seems very familiar... Where have we seen that before? Oh, right! Windows!
Really, though, user-level security in Unix OSes has been refined and well understood for decades, yet now we're supposed to assume that people are too inexperienced to apply user-level best practices, and are therefore expected to just run everything in separate containers / VMs and not care about security because it's been delegated, roughly, to others?
That's both kicking the can down the road and opens up nefarious possibilities like wide distribution of backdoored / compromised images and containers. "Don't learn for yourself - trust us, instead" is what I see.
Really, though, I can, I have, and I do run more services on old Raspberry Pi and similar hardware than many people run on huge, 500 watt servers. It's both a waste of resources and a lost opportunity for people to participate more in their own systems' security, rather than delegating to the rest of the world where evil people constantly scheme to take things from us.
I'll happily keep showing people how to properly set up software to coexist on a single Unix OS, and will continue to call out the "containerize everything" movement for trying to dumb things down.
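For what it's worth, plain systemd on top of ordinary Unix users already gives per-service isolation without containers. A sketch of a hardened unit file (the service name and binary path are made up; the directives themselves are standard systemd options):

```ini
# /etc/systemd/system/myapp.service -- hypothetical service
[Unit]
Description=Example app isolated with plain Unix/systemd facilities
After=network.target

[Service]
ExecStart=/usr/local/bin/myapp
DynamicUser=yes          ; runs as a throwaway unprivileged user
ProtectSystem=strict     ; filesystem is read-only to the service
ProtectHome=yes          ; no access to /home
PrivateTmp=yes           ; private /tmp, invisible to other services
NoNewPrivileges=yes      ; the process can never gain privileges

[Install]
WantedBy=multi-user.target
```

A compromised service running under such a unit is confined by the same kernel mechanisms containers rely on, with none of the image-distribution machinery.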
> websites and apps get <10 rps and 50 on a busy day.
This is very different from the majority of requests. The majority of requests are going to Google/Youtube/GCP, AWS, Azure, Meta, Netflix, and Cloudflare. I'm sure I'm forgetting someone, but you get the idea.
Medium-sized companies are also past the single server point, and a lot of us will end up working at these companies. While the poster's statement is true, it's only relevant for hobby projects (unless your hobby is devops) and early-stage startups.
A long time ago, an F100 company acquired my messaging startup. We hosted client APIs and a web application for 2M monthly users sending 1B messages/month on a single AWS medium CPU server and a SQL Server DB. The rest of our infra was entirely for redundancy and monitoring to enable our 99.995% uptime.
Post-acquisition, the AWS budget I was given to maintain our infra was almost 20x the actual cost, and based on infra that company used to host similar traffic.
You don't have to worry about spiking because it's a "turn-based" business transaction site. So if you need two hours to shift to a larger server as the number of buyers and sellers increases, you have it.
To a large extent this is true. But a lot of the complexity is introduced as a way to deal with the operational effort of running software (HA, security, platform upgrades, backup/recovery, incident response, all that stuff). That doesn't mean it shouldn't be on a single server -- but a _lot_ of the motivation is preventing the dreaded 3am call.
Majority of tech companies could run on just a handful of nerds. But VC comes in, pretty soon you've got a bunch of lawyers telling you you need HR people, more lawyers, people to run ERGs. Soon you've got server farms all over the world to handle a couple dozen simultaneous users, thousands of employees, and you've made it!
Docker (and to an extent, Docker Swarm) is a good alternative where simplicity, light weight and stability are key points. Been using it for almost 4 years in many projects with thousands of views each day, on single or two VMs. No failures, no surprises, no pitfalls, automated CI via Git. No devops required.
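A single-VM setup like that can be as small as one compose file (image names, ports and the volume are all placeholders):

```yaml
# docker-compose.yml -- minimal single-VM sketch
services:
  app:
    image: registry.example.com/myapp:latest   # hypothetical image
    restart: unless-stopped                    # crash/reboot recovery
    ports:
      - "80:8080"
    depends_on:
      - db
  db:
    image: postgres:16
    restart: unless-stopped
    volumes:
      - dbdata:/var/lib/postgresql/data        # survives container rebuilds

volumes:
  dbdata:
```

`restart: unless-stopped` plus a CI job that runs `docker compose up -d` on push covers most of what a far larger orchestration stack would be doing for an app this size.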
Do not forget, for many there is a certain satisfaction and bragging rights in operating a large, complicated environment. Hundreds of nodes and thousands of microservices are much sexier than a simple few-node cluster.
It has always amazed me how efficient software can be when excessive capital is not available.
As far as I can tell, much of architectural choice is driven either by the need to be entertained by shiny new toys, or the need to pad the resume with shiny new acronyms.
“Simplicity is a great virtue but it requires hard work to achieve it and education to appreciate it. And to make matters worse: complexity sells better.” - Edsger W. Dijkstra
I built a site for my dad recently. There were no delusions that it was going to become the next YouTube or anything, but it does require a lot of number-crunchy stuff, so I could justify a slightly-beefier-than-the-minimum box on Hetzner.
I had to resist every urge to do an elaborate Docker Swarm or Kubernetes setup, with replicated databases and async processing across message queues across different machines. I want to do that stuff, but realistically this site is going to get like 12 users a day at most, and doing that would have drastically increased the cost for really no perceivable benefit.
I ended up just installing NixOS, and doing everything with Flakes, on a single machine with a Postgres database. It was cheap and easy, and it probably has saved me a lot of unnecessary headaches; most websites aren't going to be the next Google.
Tech debt exists for the same political reasons as bullshit jobs.
There are few incentives to simplify, and the bigger your budget the more internal political power you have. It's especially true at larger companies where it's ok to be inefficient thanks to your monopoly.
I agree w/ the sentiment, but the author doesn't seem to understand why people tend to like touching the customer at the edge. Since everyone is doing TLS these days, you have to do a TCP connect round-trip and a TLS round-trip before you can start yammering over a connection securely. So that's at least three round-trips. If your one-way latency is 200ms, that's about a second of delay before things start happening. The benefit of the edge is you can establish a connection with an endpoint that's "nearby" with less latency for the TCP & TLS round-trips and then use a previously established connection (that does not require you to do the TCP & TLS roundtrips.)
There's still latency, but the impact of the round-trips at connection time is reduced.
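That round-trip arithmetic is easy to make concrete. A sketch (assuming TLS 1.3's single handshake round-trip; TLS 1.2 adds one more):

```go
package main

import "fmt"

// setupDelayMs returns the time in milliseconds before the first byte
// of a response can arrive on a brand-new HTTPS connection:
// 1 RTT for the TCP handshake, tlsRTTs for the TLS handshake, and
// 1 RTT for the request/response itself.
func setupDelayMs(oneWayMs, tlsRTTs int) int {
	rtt := 2 * oneWayMs
	return (1 + tlsRTTs + 1) * rtt
}

func main() {
	// 200ms one-way to a distant origin: three round-trips.
	fmt.Println(setupDelayMs(200, 1)) // 1200 -- "about a second"
	// The same client against a nearby edge node at 20ms one-way.
	fmt.Println(setupDelayMs(20, 1)) // 120
}
```

The edge only wins on these setup round-trips; once a connection is established and reused, the advantage shrinks to the per-request latency difference.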
Also, web-apps use horizontal scaling largely to reduce the impact of head of line blocking (where requests are queued up at the client waiting for the current request to complete.) Browsers will sometimes (often?) open multiple connections to the same endpoint to reduce HoL blocking impact. But, sometimes apps are inherently serial (you don't know what the next request is until you get the response back from the current request) so YMMV. Also... HTTP/2 and QUIC are cool, and I think most major browsers support either or both.
Amazon is also sort of (in)famous for triple-redundancy. Their apps run in at least three different availability zones (think different data centers) to avoid the impact of one data-center going down. Sure, it doesn't happen very often, but Amazon REALLY doesn't want their site to be down at all.
So sure... you COULD run the overwhelming majority of web apps on a single container in a small VM on a single machine somewhere, but it's more than just resource utilization. It's also responsiveness and resistance to single data-center failures.
But don't confuse my comments with a callous disregard for the OP's concerns. Dude has a point. I think anyone who tries to deploy a web app to AWS (or Azure or GCP or ...) is barraged with messages about how you have to deploy things in multiple data-centers and you're not a REAL developer until you've crafted your own customized templating engine for spitting out Terraform or CloudFormation scripts that automagically deploy to N+1 unique datacenters.
There's probably a market out there for a simplified system that's "here's a simple {Node|Python|Ruby} app. Automagically distribute the backend to N different data-centers, maybe only one is up at a given time and the other N-1 are hot spares. And make the persistence tier indistinguishable from a local DB like Mongo or PostgreSQL (or some weird Prevayler-like thing). Don't make me craft custom TF or CF to manage redundancy."
Please note. I'm not talking about using SAM.
To recap... I think the OP has a very valid point, but may not be familiar with all the current triple-redundancy dogma.
Sometimes API calls make 5-10 queries to the database. So it's 600ms to set up TLS and make a call at the origin server, and the number of db calls is irrelevant.
Or a quick TLS connection to the edge and 1000-2000ms to fulfill because of db distance.
Maybe you can do some of the calls in parallel, but sometimes not. Session lookup plus another query after permissions are established and your edge latency savings are almost entirely gone.
Follow-up API calls will always be worse because TLS is a one time issue and db distance will continuously bite you call after call after call.
Then there's a cold start where the edge function has to link up with the database and do its own TLS or equivalent.
Edge functions have a lot of issues you have to worry about that increase system complexity to solve.
It depends. I've had close to 100% uptime with my single server, whereas one of my customers who has hundreds of AWS servers has had some pretty severe downtime (hours at a time) due to some screw-up on their end. Complexity creates its own risk.
Most companies are too full of themselves to do an honest assessment of their uptime requirements though.
Every company out there loudly claims they need 100% uptime, which grifters will be happy to sell them knowing nobody will actually put their solution to the test and if it does fail they'll have a myriad of moving parts to shift the blame to.
If you look at the very few things that do actually need 100% uptime (real-time stock market, etc) you'll note the absence of most "cloud-native" and supposedly "best practice" bullshit. The reality there is much more boring, serious and rigorous - that's what you get when you actually need 100% uptime.
* we are not mission critical for our customers. Let’s not engineer for disaster scenarios when our customers really won’t care if we have a very very rare outage.
* our customers primarily used our services during US business hours. If we needed risky maintenance, we'd just do it a bit early or a bit late in the day.
There are very specific reasons why real time stock market doesn't run in the cloud and there's a lot they're doing that very much aligns with best practices. Your example doesn't really say what you seem to think it does.
Bloomberg is a large company with many different products, people/teams and priorities. But as far as I know Bloomberg doesn't actually provide order execution which is what I was referring to when I said "stock market".