AWS and Azure Are at Least 4x–10x More Expensive Than Hetzner (umh.app)



>While AWS and Azure are industry leaders, their advantages often only materialize at massive scales. [...]

Your comparisons are similar to many others out there that focus on measuring basic CPU and memory. This type of easy comparison, where AWS/Azure/GCP is treated as a "dumb" datacenter, is easy for alternatives like Hetzner or self-hosting to "win".

>Do you really need the advanced features of AWS and Azure right now? Or would a simple virtual machine at a reasonable price be sufficient? [...] There’s a growing movement among tech companies and startups to opt for more cost-effective hosting solutions like Hetzner. The high costs associated with AWS and Azure

Many (most?) YC startups are not using AWS as a low-level dumb data center with blank EC2 virtual machines and installing infrastructure software like Linux and PostgreSQL on it. Instead, they are using higher-level AWS managed services such as DynamoDB, Kinesis, SQS, etc.

Therefore, the more difficult comparison (that almost no blog post ever does) is the startup's costs for its employees to re-create/re-invent the set of higher-level AWS services that they need.

Sure, there's the "but you don't need to pay expensive AWS costs for DynamoDB when one can just install open-source Cassandra at Hetzner; and instead of AWS Kinesis, install your own Kafka, etc.". Well, you add up more and more of those "just install and manage your own X,Y,Zs" and you can end up crossing the threshold where paying AWS cloud fees costs less than your staff maintaining it. The threshold for AWS isn't just massive scale of 100+ million users. The threshold can be the complexity and scope of higher-level services you need the cloud to take care of on your behalf so your small team can concentrate on the aspects of the business that are true differentiators. In other words, instead of employees installing Cassandra, they're adding features to the smartphone app.

If your company doesn't need any of the Big 3 clouds' higher-level platform services, it's easier to save money with alternatives.


Continuing this reasoning...

As soon as your startup does get big, it starts to make more sense to try and migrate to 'dumb' machines and save on infrastructure costs, especially if your business is low margin and your infrastructure costs are high.


The flip side is that when you are small, you probably don’t need all the fancy managed services that AWS offers. Simpler solutions can save you money and time.


I'm a big proponent of appengine/heroku and similar platforms for small startups.

You can almost certainly fit all your business logic into one or two appengine apps, and fit all your data into one database. While you have just a few programmers, the fact they're all sharing a process with each other won't matter.

The goal is working product and paying customers ASAP, not a nicely architected microservices backend 2 years from now.

Yes, it'll end up being a mess when the company has pivoted and changed direction a bunch of times, and when you finally get to 50M+ user scale you'll probably have to rewrite from scratch. But by then, you ought to be rewriting from scratch, because you won't know the true requirements till you get to that scale.


Unfortunately by that time you're mired in EKS, SQS, EFS and whatever other 3-letter services, unpicking which is more expensive than months of operation on AWS.


And then all of a sudden you run into more engineering costs. Companies use platform services because one dev/engineer can do a lot more on their own and focus on delivering business value rather than twiddle knobs.

And adding one dev/engineer is _massively_ more expensive, so you seldom want to scale in that axis when the option is to, say, use a managed database or even a complete data pipeline.


I agree; that's mostly because the usage patterns have become apparent and you know what features you need and what to optimise for. That's why I prefer managed services to start, and then I can self-host once price or needs push me to.


Agreed! See also the ahrefs example


The caveat to this is, you might think you need RDS, SQS, SNS, S3, Lambda, DynamoDB, Elasticache and Kinesis - but you probably only need Postgres.


SQS/SNS/S3 are so simple, reliable, and cheap that they're pretty much a no-brainer. While you can probably run those workloads in Postgres, it isn't designed for those use cases and you'll eventually run into nasty limitations like managing vacuums with high-churn tables and slow/complicated backups with big binary blobs.
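To be fair to the Postgres camp, the usual queue pattern there is only a few lines. A minimal sketch, assuming a hypothetical `jobs` table and psycopg2 as the client (not anything prescribed above):

    import psycopg2  # assumed Postgres client

    conn = psycopg2.connect("dbname=app")  # hypothetical DSN

    def claim_job():
        # SKIP LOCKED lets concurrent workers each grab a different
        # row instead of blocking on each other's locks.
        with conn, conn.cursor() as cur:
            cur.execute("""
                DELETE FROM jobs
                WHERE id = (SELECT id FROM jobs
                            ORDER BY enqueued_at
                            FOR UPDATE SKIP LOCKED
                            LIMIT 1)
                RETURNING id, payload
            """)
            return cur.fetchone()  # None when the queue is empty

SKIP LOCKED (Postgres 9.5+) is what makes this safe with multiple workers; the vacuum churn mentioned above is exactly what this table generates under load.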

If you have a good understanding of load up front, however, those are probably non-issues.


I know, I'm mostly being tongue in cheek - the joke is so many companies go straight to complex cloud configurations more for the vibes than the actual practical need; a single box (two for availability) and a solid db will get most sites and businesses very far.


S3 is mind bogglingly expensive compared to Hetzner.


You might not technically need it, but some of the things offered by those services might be 'nice to have' for your specific use case. If they are not available with just Postgres+etc. out of the box, the few hundred/thousand dollars of additional cost might be entirely insignificant compared to the additional work-hours you'd need to implement those things.


One day someone will rewrite Postgres in Rust, and I will have to switch to carpentry full-time to preserve my sanity.


You might need to do that even before it ships the complete feature set.


Unfortunately, it's a false dichotomy you present: it's not a binary choice between fully managed and entirely roll-your-own.

E.g., if you're running K8s (one thing I typically recommend you buy a managed one of), you can install your own Kafka in it, using an operator that does about 85% of what MSK does.

Sure, you'll need to dedicate person hours to support the operator, but is supporting that any more expensive than supporting AWS products? That you're already paying through the nose for?


It's also, what kind of startup are you? What kind of workload do you have?

If you are bootstrapping a CRUD-app business, then 1 beefy Hetzner box (or something slightly more reliable) with PostgreSQL is probably fine until you reach the scale where you sell the business. You care about burn rate above all.

If you are VC-backed, go all-in on GCP or AWS, because that's what you're expected to do and what the expensive people you hire are going to know.


I agree but would slightly modify it in that if you have taken VC money, growth probably matters above all else. Don't waste time on activities not related to the product being sold.


I really wonder whether a VC would rather invest into a startup with an architect focusing on KISS or one where the architect goes all in on cloud.


You can open a ticket and make the weirdest of issues with MSK Amazon’s problem to deal with.

Same with RDS, etc.

It’s pretty great not to waste time when the lottery of the bizarrest 0.000001% issues comes up.

The operator only solves the happy path. An AWS support ticket usually can solve the unhappy path.


Sure, but you can also go to a Slack channel and get help from the people who wrote the FOSS code you're using.

For free.

Yep, if your Kafka is mission critical and crashes hard, that is bad.

But things like Kafka are _never_ a black box you just spin up and never worry about, if anyone thinks so, CAP theorem will give them an awful surprise one day.

You're always going to need someone in your team who understands the tech and how to make best use of it.

MSK won't tell you how many partitions your topic needs, or whether your retention strategy should be delete, or compact, or both.

You still need that knowledge of the "managed" service to make effective use of it.

And that knowledge sits rather close to knowledge of how the system works, so given you'll need that knowledge anyway, may as well cultivate it instead.

Oh, and the operators also solve a lot of the unhappy paths too, FYI.

I tend to describe the operator approach as "half-managed" because things like multiple-AZ stretch clusters need some configuration.

But then, maybe you didn't want a 3-AZ cluster? Maybe a 2.5? MSK says no.


> You're always going to need someone in your team who understands the tech and how to make best use of it.

> And that knowledge sits rather close to knowledge of how the system works, so given you'll need that knowledge anyway, may as well cultivate it instead.

This has been my argument forever, and it’s always met with disagreement, because entirely too many people have no desire to learn their tooling. They just want an API that they can push data into, and get it back out. What happens inside is irrelevant.

It’s extremely sad to me.


Or hold off on the academic-style pretentiousness and come back down to the real world.

At some point, we have to admit that there are a lot of knowledge expectations depending on your stack, especially as parts of your application grow.

Say you're a Python-based webapp running with Postgres, Kafka, and Elasticsearch. Your stack requires pretty decent knowledge of:

1. Postgres

2. Kafka

3. Elasticsearch

4. Linux (and a lot more than what many developers I've encountered seem to have)

5. Kubernetes, because it is 2024

6. Whatever frameworks you're doing with your webapp + ensuring you're keeping up with security best practices

7. + the soup involved with exposing your webapp to customers

Being able to handle any of these seven at scale requires a different skillset. It's unreasonable to expect anyone to be an expert at all of this -- in a real, tried-and-true environment -- especially with deadlines and SLAs involved.


Counterpoint: stop the sprawl. Use boring technology.

Until you’re at quite a high scale, you probably don’t actually need Kafka. There are plenty of much lighter ways to do pub/sub, including Postgres itself.
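A rough sketch of the Postgres route, to show how little it takes — the channel name and psycopg2 usage are illustrative assumptions, not a recommendation of specifics:

    import select
    import psycopg2

    conn = psycopg2.connect("dbname=app")  # hypothetical DSN
    conn.set_isolation_level(
        psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)
    cur = conn.cursor()
    cur.execute("LISTEN events;")  # publishers: SELECT pg_notify('events', '...')

    while True:
        # Block until the connection's socket is readable,
        # then drain any notifications that arrived.
        if select.select([conn], [], [], 60) == ([], [], []):
            continue  # timed out; loop and wait again
        conn.poll()
        while conn.notifies:
            note = conn.notifies.pop(0)
            print(note.channel, note.payload)

Caveat: NOTIFY is fire-and-forget (no persistence), so back it with a table if dropped messages matter.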

Similarly, if your RDBMS schema is properly defined and your queries are well-written, you probably also don’t need Redis / EC.

Re: K8s, if you do need it, I’m not sure why people think that it’s so much easier to run EKS than your own cluster. The only thing you get to skip is the control plane; everything else is still your responsibility. Same with Postgres – you still are wholly responsible for its schema/table maintenance and optimization on major DBaaS.

In any case, nowhere did I say one person should be an expert at all of this.


> Until you’re at quite a high scale, you probably don’t actually need Kafka.

As someone who accidentally specialised in Kafka... ...bingo.

So many companies using it who don't need the sheer scale it offers, and get to pay the complexity cost anyway, with no benefits.


> Sure, but you can also go to a Slack channel and get help from the people who wrote the FOSS code you're using.
>
> For free.

Relying on volunteer support of varying degrees of quality for your business sounds insane.

Also at that point the business should really be donating or contributing to the development of the software otherwise it is considered what we call a dick move.


People within my company do contribute to the development of the FOSS software we rely on :)

> Relying on volunteer support of varying degrees of quality for your business sounds insane.

Given my experiences of Confluent paid support, and my experiences of the volunteer support around Kafka, I disagree.


> For free

Not sure we agree on the meaning of this phrase in this context.


If you ever hit an issue with Kafka or Strimzi, go to their Slack, some of the most intelligent people I've ever had the privilege to work alongside will be there, helping you.

For 0 money. That kinda free.


I would prefer to say "free of charge", because that support is not actually free, it has a cost, you're just not required to pay for it.

But you know as well as I do that what the other participant in this conversation means is this: if a for-profit entity relies on support that is "free of charge" in this way, such that it can continue to profit on the back of that product support, then the for-profit entity really ought to seriously consider a voluntary donation of some kind to support the continued maintenance and support of the product.


My company contributes to FOSS projects we use :)


And while it's super awesome that someone feels passionately enough about a piece of tech that they're willing to spend their precious resources helping others... that kind of charity is untenable. You can't expect that person to be there at 3am when systems are down and your nightly processing jobs are failing.


I am expected to produce business value at the end of the day and I wear multiple hats. Paying someone to be the expert in the room is the best value sometimes.

I’d rather focus on my expertise and mental energy in other tools that are much more significant to the stack I support.


This has not been my experience at multiple companies with AWS, even with heavy spend – your tickets have to make it through a gatekeeper who has no more idea than you on how to fix it, and it's more triage than anything else.


In my experience, It Depends

For big flagship services you can usually get pretty good support (EC2, S3, SQS, Lambda)

For smaller/more niche services where AWS stood up a managed version of some OSS it's more hit and miss (like managed RabbitMQ).

In both cases, it definitely helps to have an open line to your TAM and send them case numbers and they'll usually do some internal nudging to keep things moving. In addition, for projects, you can usually reach out ahead of time and get some dedicated SMEs to help set things up/train you.

In either case, hopefully you've never had the displeasure of working with Azure support.


I've only had the opposite: great support with amazingly deep knowledge at every level.


Same for the most part. Our TAMs have been great to work with and so have a number of engineers the handful of times we needed it. We've had moments of some back-and-forth at times, but overall I've been satisfied.


Can you? While Amazon support is one of the better ones, you are still asking for an hour or two of time from a support guy who has no idea about your use cases or internal systems.

They usually tend to be genuinely helpful but are a far cry from solving your issues themselves.


Given that AWS has been around for nearly two decades they have probably encountered and have a workaround/fix for 99.99% of the use cases.

Of course there’s a minuscule possibility of you having a new use case. But is that a good enough reason to build your own infrastructure? That is a business call you need to make.


The problem is that if you're a regular-sized company, you will never reach any support person with experience inside AWS ;) And paying for Enterprise-grade support at a medium-to-small scale is probably more expensive than just hiring 1 skilled operator. And in the latter case, it then doesn't matter anymore if the problem takes 1 hour or 10 hours, because your employee can take as much time as needed.


That's ultimately the question. It comes down to cost and time. If you have enough scale that hiring a full-time person is more cost effective than paying for managed, great. On the flip side, you don't necessarily want to take engineering hours away from building the product you sell.


Oftentimes, when you see someone proposing "just save 70% by installing open source XYZ", they are thinking like an individual and not a business. Fast-moving startups and medium businesses in areas with high cost of labor can save a ton by outsourcing labor to AWS/Azure if they are okay with the lock-in. Of course, each case is different and people shouldn't just blindly adopt AWS/Azure without thinking about it...


Honestly most of the stuff I do is internal facing tooling with usually less than 100 concurrent and 1k peak users. For those, managing a server or two, or god forbid, a small autoscaling cluster is not a hassle.

For high-scale operations, you need to think real hard about how you do things; usually simplicity is key, and trying to do as little as possible on the high-throughput parts is useful.

The costs do add up when you have professionals maintaining your Cassandra/Kafka boxes, but the same degree of complexity exists on AWS when you try to weave together a tapestry of EC2s, lambdas, and various storage services, with all the delicious complexity of multiple VPCs and networking fineries, while not blowing the budget.

It's a different skillset, but not less work.


Hear hear. I get this all the time. People just don’t get that paying for, say, platform services (managed databases, indexing, all sorts of data handling) is vastly cheaper than reimplementing those particular wheels - or hiring the people to manage them - and that the hyperscalers provide redundancy, automated deployment, backups, the works.

Even storage in hyperscalers is inherently redundant—and I keep getting folks who ask about setting up their own RAID array, or using their own containers and job management, when there are a dozen zero-code alternatives in each individual hyperscaler.


I can run a 64-core, 512 GiB server in my home office loaded with NVMe drives for $80/month (probably cheaper, depending on how many years you amortize the server purchase over)!


This is what we're trying to address at Lithus[1]. We're offering both the raw compute resources and the DevOps time needed to set up and manage the services your engineering team needs.

[1] https://lithus.eu


Depends on scale - at small scale, fully managed services are a godsend, but at <x> scale (especially per-service) it pays to self-manage or use low-cost or FOSS management tools.


I'm not sure what the cost difference is for using higher-level services, but I can easily imagine it 4x-10x'ing your costs again, or worse.

Part of me thinks, man, the engineers not afraid of setting up a Postgres or Redis really should be worth a lot more, given how absurd the prices can get. I guess the getting-started costs for these services are usually manageable though; by the time the bill is big it's a "nice problem to have", because you have significant load now, and presumably customers & revenue to show for it.

More so, I think orgs are somewhat rightfully afraid of running infra because historically we have been bad at it. It's been every sys-op or devops for themselves in the world. Everyone making their own practices, assembling their own stack of networking setup, init scripts, db procedures, monitoring, alerting, resilience/reliability. This stuff has a lot of dimensions of care to it.

And even when you go the extra mile to document everything, it's still rough to hand off ownership. A new gal joins; how long does it take her to get comfortable? And how much will her style & preferences mesh with what's been strung up so far? Or worse, what happens when someone quits? How load-bearing were they?

And this is why I'm so humongously excited about Kubernetes. Fleet was pretty sweet & cool & direct in the past, RIP, but like so many of the "way to run containers" options it was just that: a way to run containers. Having an extensible system, where operators keep networking, storage, and databases running, where tasks like backups and migrations and high availability are built into well-tested controllers: it cuts out so so so many things that operators had to discover, socialize, and test test test test test test before. There are such incredibly good load-bearing systems-that-maintain-systems (i.e. autonomic) available that they compete very much with the paid-for/managed services that have done likewise for us for so long.

And it's a consistent paradigm, for whatever you are up to. Write a manifest with what you want, send it to the api-server, wait for the operator to make it so. Instead of different dimensions or concerns having different operational paradigms & styles, there's a unified, extensible Desired State Management that does a damn good job.

It felt like running services was in a dark age for so long, that each shop was fractured & alone with its infrastructure, and it was obvious why managed services were winning. But today there's hope that we can run services, well, in a way that will be very clear & explicit if it ever needs to be handed off.


>Part of me thinks, man, the engineers not afraid of setting up a Postgres or Redis really should be worth a lot more, given how absurd the prices can get.

But only if they agree to be on call 24/7 to support what they deployed. Ask engineers to guarantee you won’t lose data and see how they tell you to buy RDS.


Not to mention having to add additional security staff.


This.

To add: if you ever want to get ISO/PCI DSS etc. certification done, then good luck implementing the gazillion checklist items which Azure/AWS/GCP have already taken care of.


Which is bullshit, because the auditors ALWAYS miss stuff, even things I would think are painfully obvious. It’s a cottage industry that allows the C-Suite to assure investors that they have taken all necessary precautions, so when they get hacked they can point and say “we were certified!”


I completely agree with you that they are mostly used as CYA. However, I'm speaking from practical standpoint where if you have to work in certain industries (banking, health, finance etc.,) the first thing you are asked is if you have XYZ certification.


It’s not a cottage industry. It is literally the law if you need to operate in some regions.


Also note: traffic costs. On Hetzner, it's almost impossible to pay for traffic. Even their tiniest machine has 20 TB of outgoing traffic included (and unlimited incoming). If you used it up (you most probably won't), that's another 1,792 USD of costs saved by your tiny $4/month VM compared to AWS (at least if I was able to use the AWS cost calculator correctly).
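For the curious, that figure falls out of AWS's public egress tiers (assuming ~$0.09/GB for the first 10 TB and ~$0.085/GB for the next tier, ignoring the small free allowance):

    first_10tb = 10 * 1024 * 0.09    # $921.60
    next_10tb  = 10 * 1024 * 0.085   # $870.40
    print(first_10tb + next_10tb)    # 1792.0 -> the $1,792 above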

They will have object storage soon, but don't hold your breath for one-click Kubernetes etc. So the fancier your infrastructure, the more your startup would need to invest in time and money to use Hetzner, and thus make it "not worth it".


There is a nice one-click kubernetes for hetzner if you use the terraform module: https://github.com/kube-hetzner/terraform-hcloud-kube-hetzne...

There is also a GPT you can use that will generate the module block for you based on your requirements.



Additionally, go for the dedicated servers from Hetzner and you get an unmetered connection (i.e., you don't pay per GB for ingress/egress at all). Not affiliated, but I've been happy with them since day 1.


Most cloud customers don't pay on-demand retail prices. For example, Azure VM Reservations or Savings Plans typically provide a 50-65% discount. AWS has similar plans.

For example, instead of the ancient F8 series used in the article, a modern D8as_v5 Azure instance under a 3-year Savings Plan is $115/mo.

Also, the article compares CPX41 to EC2 and Azure VMs with dedicated cores, not shared cores. The CCX33 Hetzner model is closer to the normal clouds, and costs $50/mo, so now we're at 2x the price instead of 10x the price. (Conversely, the B8als_v2 size uses shared cores and is also 2x the price of CPX41 at $74/mo)

For that 2x cost you get a lot more features, first-party and third-party support, more locations, faster networking, etc... That's worth it for most large enterprises that care about ticking checkboxes on audit reports more than absolute cost. Or to put it this way: the annual price difference is just $600, which is the same cost to an org as half a day of engineer-time or less. If Hetzner is the slightest bit more difficult than a large public cloud VM for anything, ever, then it's not cheaper. This could be patching, maintenance, migrations, backup, recovery, automation, encryption, or just about anything else.

There are other differences as well. Hetzner has a separate charge for load balancers and IP addresses, whereas with Azure they're included in the price of the VM.

The biggest cost difference is that the public clouds charge eyewatering amounts for Internet egress traffic. Azure is about 100x as expensive as Hetzner, which is just crazy.


I certainly can't dispute the other points... but to me the AWS Savings Plan always felt like vendor lock-in... and sort of like a virtual "on-prem", in that I have to commit to something for X amount of time (like old-school provisioning hardware and having it live for X time), and then I lose the flexibility of what I thought *the cloud* in general was supposed to provide: that is, freedom to scale up, down, or *out*, etc. I won't fault AWS and others for making their money; this is capitalism, after all, regardless of the vendor. I guess maybe the cloud sort of lost its shine, and it doesn't feel as liberating as maybe it once did, and both cost and complexity are overblown, maybe?


It's the result of MBA-driven enshittification.

"In the beginning" the clouds promised to use their scale to soak up your unpredictable demand. You as the customer didn't have to think about capacity, or planning ahead, budgets, opex, etc... Just swipe your credit card and go from zero to any number you please and back again at any time of your choosing. Because there are so many other customers using the cloud with you, the unpredictable nature of your individual usage is averaged out and the cloud vendor gets a (slightly) noisy but manageable usage level of their resources. They have to work a little harder to predict future capacity needs, but you pay a premium for this.

"A little later" the MBAs realise that they can squeeze 5% more profit out of their customers with lock-in contracts that make everything "nice and predictable" instead of the stochastic noise they had to "deal with" before. Getting rid of that makes things a lot harder for you as the customer, but they don't care. They care about that 5%.

Ta-da... we're back at having to "procure", we're back at budgets that have to be planned for 3 years in advance, we're back at having to have time machines.


Bleh and yuck! I'm finding myself nodding in agreement with all that you stated, but not feeling good about it because of the reality of things! :-)


The reality is that with the maximum discount, the public cloud is still 2x the cost of comparable hosting providers (including on-prem).

More realistically, I've found that the cost is between 3x and 7x what people were paying before.

I'm not surprised cloud adoption has slowed to a crawl. Azure and AWS won't admit this publicly because it would tank their share price, but they can't hide it from observant people. For example, they used to get the latest Intel or AMD CPUs before retail availability in huge numbers. Now? They're 2-3 generations behind because they're not rolling out new servers in significant numbers. The customers are all tightening their belts because of the global economic downturn, and one of the most expensive things they've been splurging on before was public cloud hosting.


On GCP and Azure, most folks would be better off running serverless containers via Cloud Run or Container Apps (AWS has no direct equivalent that scales to 0 and incurs no cost).

Both of these scale to zero and offer 180k vCPU-seconds and 360k GiB-seconds free per month. You incur billing only for active execution time. Cloud Run Jobs has a whole separate free monthly grant as well.

You can run A LOT for free within those constraints. Certainly a blog or website. To prevent cold starts, just set up Cloud Scheduler (also free for this purpose) to ping the container every few minutes.
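Something like this, if memory serves on the gcloud flags (job name, schedule, and URL are placeholders):

    gcloud scheduler jobs create http keep-warm \
        --schedule="*/5 * * * *" \
        --uri="https://my-service-abc123.a.run.app/" \
        --http-method=GET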

Use Supabase for a DB or one of the serverless options (if it works for your data use case) like Firestore, CosmosDB and you can run workloads for a few cents per month with an architecture that will scale easily if you need it to.

6 min video showing the receipts and how easy this is: https://youtu.be/GlnEm7JyvyY


The AWS equivalent to Cloud Run and Container Apps is called Fargate, https://aws.amazon.com/fargate/


Neither Fargate nor App Runner scale to zero (unless something changed). So there is always a baseline cost of a few dollars.


Lambda container would be the closest equivalent with that functionality, https://docs.aws.amazon.com/lambda/latest/dg/images-create.h...


Agreed. I ran a small GPT chatbot on Lambda through API Gateway with a DynamoDB storage backend and don't recall incurring any cost (or if I did, it was just pennies per month).


But if you’re constantly pinging the container (as suggested above), it will never scale to zero.


It "scales to zero" as soon as the request stops as far as billing is concerned.

However, the image remains "warm" and incurs zero cost once the last request ends. So I usually have a `/heartbeat` endpoint for this purpose and point a Cloud Scheduler job at it.

I haven't read the docs to figure out the exact heuristics of when it becomes "cold" again.


Those per-request models usually don’t pan out well. They’re conceptually simple, but you soon realize that you need at least a couple of 24/7 always on boxes and that you only really should use Cloud Run-like services for burstable workloads.

PaaS services or even VM scaling sets with volatile instances can still be stupefyingly cheaper, but that point is really hard to make to architecture astronauts.


    > They’re conceptually simple, but you soon realize that you need at least a couple of 24/7 always on boxes and that you only really should use Cloud Run-like services for burstable workloads.
This is simply not true and Cloud Run-like services offer an easy path for progressive scaling.

1. You can scale it to 0 at the outset as you build your app

2. You can set it to scale to a minimum of n instances (e.g. minimum 1, 2) to have fast response times

3. If you find a need for a 24x7 instance, take the same container image and you can launch a Compute Engine instance with the container directly and scale that way.

4. If you need more control beyond that, move those containers into GKE Autopilot or full GKE or your container orchestrator of choice.

Not only is it easy and free to get started, it provides a straightforward path to adapting the underlying deployment and compute model based on needs as the app scales without the need to pay anything until you actually need 24x7 compute (and even then, it's a matter of setting your Cloud Run service to min=1 instances to get 24x7 compute or configuring a CE instance with the same exact container).
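Roughly, steps 2 and 3 above are each a single flag or command (service, project, and image names below are made up):

    # Cloud Run: keep one instance always warm
    gcloud run deploy my-api --image=gcr.io/my-proj/my-api --min-instances=1

    # Same image, pinned to a 24x7 Compute Engine VM
    gcloud compute instances create-with-container my-api-vm \
        --container-image=gcr.io/my-proj/my-api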


I understand where you're coming from, but not everything speaks HTTP or works on a per-request basis, and PaaS services are still cheaper.

I've seen people shoehorning all sorts of batch processing into HTTP (backed by queues or not) and it has tremendous overhead over just having the cores and RAM there in the same place as the data.

I learned that lesson with Google Apps, and never designed anything to rely solely on HTTP ever again.


Thanks a lot!

Most people think it is easier to use EC2 than Fargate, since the former is the better-known one. But actually, it is the other way around!


If it absolutely has to be on AWS, I always go with ECS via (the poorly named) Copilot CLI. Makes it waaaaay easier.

https://aws.github.io/copilot-cli/


Worth noting that us large AWS customers get huge discounts, huge credits, actual real engineers on hand 24/7 on Slack, contractual service guarantees that last years, and a large market of people we can leverage to build stuff in there. And a lot of the services are low-to-zero cost that would be expensive to run on Hetzner, or don't exist and you'd have to build them out.

YMMV but all costs aren't instance costs.


Yep, most corporations have 2-3 _named_ account reps who are available on the company Slack and will visit your office 1-2 times a year to sync up that everything is working as it should.

And they're not just salespeople: they've actually told us multiple times when a feature wouldn't work for us, before we tried to hold it wrong in a dangerous (and expensive) way.


> they've actually told us multiple times when a feature wouldn't work for us, before we tried to hold it wrong in a dangerous (and expensive) way.

Can you give examples of this? I'd love to hear more about the kinds of guidance they can give.


We were building a matchmaking service for our mobile game and thought about using Amazon GameLift FlexMatch[0]. On the surface it would've worked.

BUT, it's _very_ highly geared towards fully-online games where everyone playing the game is connected to a server all the time. Our game was asynchronous PvP where the attacker was online, but the defender wasn't.

I had a 30 minute chat with them and they confirmed that it _could_ be made to work, but it'd be extremely janky and expensive in our use case.

We ended up building our own (or actually expanding our existing setup a bit).

--

We've also got pretty good estimates on how much something might cost: we have an application that needs a specific number of writes/reads with X amount of data per write; what would be the cost on different Amazon services?

Again, they came back with numbers, and with many services (DynamoDB especially) it would've been either impossible or prohibitively expensive. We ended up changing the application structure to do fewer IOPS + more aggressive caching, and ended up using plain S3 as storage.

Without their consultation (And inside knowledge about AWS internal hard limits) we would've spent weeks building a solution that will eventually fail as the data stored per user goes up.

[0] https://docs.aws.amazon.com/gamelift/latest/flexmatchguide/m...


> Do you really need the advanced features of AWS and Azure right now? Or would a simple virtual machine at a reasonable price be sufficient? That’s the main question here.

This is one of the more important points, and it's why the claim "The learning curve of a single server isn't so big, especially when compared to AWS" sits a bit wrong with me.

Sure, if you talk about 1 VM, I agree. And I wouldn't second guess doing this, at all. It would be my initial plan as well as long as I don't have to make any strong availability guarantees. And for this use case, I'd call AWS a bad choice. It's not a simple VM provider.

But once you start running e.g. a redundant Postgres cluster for updates without downtime, the amount of stuff to know also grows, a lot. Suddenly you also need backups, and tests of those backups. And this is where AWS/the cloud allows you to save time, and treadmill time.


The article was originally aimed at manufacturing companies, not IT startups: companies that currently go "all-in" on AWS and Azure with all of their managed services, when actually 95% of their workloads are in virtual machines, and the remaining stuff could easily be handled on a single VM. Or maybe a couple of VMs and a managed Postgres somewhere (e.g., maybe even at AWS or Azure).

That would probably give them way more budget for actually building applications rather than running infrastructure.

Maybe I'll extend the article to include the point of using a managed postgres at AWS / Azure / fly.io, whatever, in combination with Hetzner VMs.


I have a single VM for my personal stuff, but I use Azure’s backup and automated fail-over mechanisms as well as managed services for database and data processing for this very reason.


Summary: Hetzner does offer reliable and cheap machines compared to AWS/Azure/GCP.

The pricing is more on par with Digital Ocean/Linode.


No, it's at least 4x cheaper.


For many things DO is 2x or more though. Linode I never tried.


Not sure it still applies, but I didn't like Linode because they minded the disk arrangement too much, prescribing the layout/geometry in the UI. Something about their implementation prevents one from just loading a typical cloudy image.

Maybe, just maybe, I want to use LVM or something entirely unknown to them. Not necessarily in a privacy sense, but control.


Hetzner pro tip: https://www.hetzner.com/sb

If you're looking for a cheap one-off server, the server auction has some very good deals.


I believe that I've saved millions thanks to the fact that I stumbled on Hetzner back in the day and started using it for the company I was working at. Not saying it is a perfect service, but I very much like my money, and seeing what kind of invoices get racked up by using these cloud services, I'm pretty confident that the alternative costs would have been 4-5x more.


This matches my experience. I ran one of my side-projects on AWS for a couple of years before switching to Hetzner - AWS was around £35 a month while Hetzner was around £7 a month, so Hetzner was around 80% cheaper for an equivalent service[0]. The other big thing was all the little costs in AWS - it took 2 months to get the AWS bill down to £0 due to all the hidden extras like backups and Elastic IP address.

[0] Full details at https://blog.searchmysite.net/posts/migrating-off-aws-has-re...


It's not the same product, even if you consider just virtual machines rather than the higher-level services other commenters are referring to. Sure, public cloud is more expensive, but you pay for the reliability of not being bound to physical hardware. When you buy a dedicated machine from OVH or Hetzner, you get a great deal for the compute power, but if something goes wrong with the hardware, you're stuck waiting for a technician to fix it.

Take the recent Lichess downtime, for example. Their main server had a hardware issue that required physical intervention. This meant the site was down for over 10 hours, and there wasn't much they could do except wait for OVH to send a tech.

If Lichess had been on AWS, the provider would have automatically moved their workload to a functioning server, and the outage would have been much shorter or possibly avoided altogether.

For Lichess, a non-profit, this tradeoff still makes sense. Their service, while important to its users, isn't critical. Nobody dies if Lichess is down, and the cost savings help them keep running. But if your business can't afford downtime, the extra guarantees from a public cloud provider can definitely be worth paying for.


>Take the recent Lichess downtime, for example. Their main server had a hardware issue that required physical intervention. This meant the site was down for over 10 hours, and there wasn't much they could do except wait for OVH to send a tech.

If you're not an HN person with sysadmin skills, yes. But it is NOT that hard to have an in-house RAID HD setup with a failover server, or a failover NAT gateway. AWS and cloud providers are just a rip-off.


It is hard.

Lichess admins are highly skilled and I'm sure they already have a well designed infrastructure. You can see what they use at https://docs.google.com/spreadsheets/d/1Si3PMUJGR9KrpE5lngSk...

The issue was on a network equipment that they didn't even manage. You can't load balance when your core network is down. There was nothing they could do as I understand it.

More details at: https://lichess.org/@/Lichess/blog/post-mortem-of-our-longes...


Their architecture is not fault-tolerant. If one server goes down and the whole system goes down, then it was not designed to be fault-tolerant.

I have been running fault-tolerant systems spread across multiple dedicated servers (with multiple distributed/replicated/sharded DB/KV stores, Kafka, etc. inside the system). If one server experiences a hardware failure, the system will automatically recover within seconds to minutes (depending on which server/part of the service failed) without any data loss.

It's not that hard. You need the knowledge, but it's not rocket science.


Even something as magical as a RAID won't make a technician instantly teleport to your server, power it down in zero seconds, swap out the hard drive and boot it back up in another zero seconds.

OP's comment is valid - physical servers might incur downtime.

But I do agree with your sentiment. "Downtime" is not an argument which should tilt the discussion towards either physical servers or the cloud. AWS data centers famously also have outages, while physical servers often have uptimes of multiple years. So what's better? It's hard to tell, but at the very least, none of these solutions is downtime-free.


No, but if you have backups and DR set up, most hyperscalers will just automatically move your workload someplace else upon failure within minutes (state management complexity notwithstanding—you need to architect for that).


The offerings from Hetzner I find especially appealing are the consumer-grade hardware ones. No, I wouldn't host business-critical services on one, but I don't have those, so it's an easy win for me price-wise.


Why not? It's ECC RAM and a server grade motherboard. The only downside is 1 PSU, but you can buy 10x the amount of servers compared to AWS :-)


They're running ECC on consumer level Intel chips?


Just the AMDs.

Plus Xeons, ofc.


I would probably host even some business-critical services on Hetzner's infra. I'm thinking of "worker"-type workloads, where each machine is 100% stateless and just serves to do some compute-intensive work. With that configuration, single-node data loss doesn't really affect you, and the CPU is plentiful and cheap with Hetzner bare metal (e.g. AX101 AMD machines).


Yeah, but where would you store state? The hyperscalers give you pretty reasonable durable storage (even datalakes). Most people don’t get storage tiering or using PaaS for workers, though.


I'd shy away from storing any non-volatile state on Hetzner. As I said, I'd mostly consider it for stateless compute-bound applications.

If I was looking to scale up an existing operation considerably and minimize costs as much as possible, I'd consider spinning up e.g. a Postgres cluster or minio on their infra, which would be significantly cheaper than RDS or S3. But it's not something that I would gladly do; the storage deals provided by hyperscalers are quite reasonable, as you say.


Why not? If you need HA, you could easily spin up a Kubernetes cluster and still be cheaper.


Sssh please don't tell Hetzner this, I'm using them!


I just moved a personal project from DigitalOcean to OVH and I’m hooked. So damn simple, cheap, and powerful. Far fewer layers of abstraction.


Been using their VLE-2 offer over what Hetzner gives because it's basically the same price, but with unlimited bandwidth, and they use AMD EPYC CPUs, which can't be a bad thing benchmark-wise (especially memory-bandwidth-wise).


CPX machines at Hetzner use EPYC also


Company of mine just migrated from AWS for a high bandwidth service. Full payback in month one, plus more savings as they scale.

They're leaving other things on AWS, i.e. partial migration is quite doable.


AWS is like the tool I know. I pay roughly $30 for a CapRover install running on Lightsail.

Hetzner starts at 50 Euro, only has servers in Europe, and is going to require a ton more work.

AWS has the right idea: they give everyone who asks nicely thousands in free credits to get started. Then 2 years in, you're hooked. I don't want to learn a new system.


Hetzner Cloud has US and Singapore locations as well, and starts at 5 Euro.

It will take slightly more effort than Lightsail, yes.


I'm looking at the bare metal servers. I'm always open to being wrong...

I still don't think I feel like migrating, though. CapRover isn't exactly lightweight.


Hetzner does start at a much lower price point, and has servers also in Singapore and the US (but only since a couple of months ago).


I am actually surprised that more people haven't just recreated and hosted the same software/services offered on AWS, for cheaper, on Hetzner.

I have only stumbled on one service that does it: a Datadog alternative, so the bar is not that high for pricing.


Don’t think anyone sane who just needs cheap VM compute goes to AWS.


There's a seemingly endless supply of small to medium-sized companies doing exactly that. That's why there's consultants who offer to migrate you off EC2 onto 2-3 bare-metal hosts.


This just in: use the tool that is most cost effective for your specific use case. There is no one-size-fits-all. More to come after this advertisement


Do people want to use VMs? Imo they're much more annoying to manage than higher-level managed services. The last few places I've worked we spent our time trying to get rid of VMs and replace them with equivalent managed services.

Even with automation tools like Ansible or immutable server images, packaging as Docker images and running on a container orchestrator has always been much easier.


Depends if you want to pay more in money(a lot more) indefinitely, or pay more in time up front to set up automation. If you have immutable images, I don’t see how there is much difference at all. There are many container orchestrators available.


All these "VMs are sooo much cheaper" articles are always pricing bare VMs. It'd be more useful if someone actually calculated Total Cost of Ownership for a completed solution. You likely also need to price in load balancers, database servers, DNS, backups, access management, monitoring to get a fair quote and a meaningful comparison if you actually plan to run a real application


> load balancers

These also run on VMs.

> database servers

These also run on VMs.

> DNS

This is such a tiny cost that it’s not worth mentioning at any scale.

> Backups

This can go any number of ways, with price tags all over the place, yes.

> Access management

There are plenty of free and paid solutions available.

> Monitoring

See Backups.


If we're only considering "runs on VMs" and "cost" then I can beat Hetzner by a huge margin running on the server in my home office. The point was, saying "it can run on a VM, who sells VMs the cheapest" is a largely useless comparison.


All your points skip over the need to hire people to do Ops on those VMs.


Not only that, some things like monitoring you get for free on cloud providers and setting up your own HA stack is going to be significantly more expensive.

There's almost no testing and validation needed for something like AWS RDS Postgres backups. Occasionally you restore an instance and that's it.

Other things, like Postgres running out of disk space, are a 10-minute fix on AWS: just increment the assigned space. If your VM provider offers SAN/NAS you may be in luck; otherwise, hopefully you have a ballast file or some logs you can delete to free enough space to get things running long enough to fix the problem.


Setting up HA is approximately the same cost as with anything else: N x $cost. Web servers? Run two with HAProxy fronting them. Want HA HAproxy? Run two with a virtual IP. Etc. Not hard to calculate, not hard to set up. These are all extremely solved problems with tons of documentation available.
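For instance, the HAProxy half of that is a handful of config lines (backend addresses invented for illustration); the virtual IP side is typically keepalived/VRRP:

    frontend www
        bind *:80
        default_backend web

    backend web
        balance roundrobin
        server web1 10.0.0.11:80 check
        server web2 10.0.0.12:80 check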

I mentioned monitoring being required and having a wide variety of options and price points.

If your DB runs out of space in a VM that’s on you for not having adequate monitoring and alerting. If you don’t have networked storage available and somehow fail to see the predicted eventual out of disk, again, that’s on you.


And you have to hire people to do CloudOps. It is the same; if you think otherwise, you’re not at scale.

I promise you, running a DB in RDS is almost as difficult as running it on metal or a VM, except that when things go wrong, you don’t have as much insight into why.

Running a Linux server is not hard.


A free tool to compare Hetzner VMs with other cloud vendors: https://calculator.holori.com/?currency=EUR&payment_


Azure/AWS provide many more base services (multiple regions/AZs, DynamoDB, S3, SQS, etc.) that are pennies to operate and aren't really targeting the cheap low end that Hetzner is.


Well that is true, but I don't use AWS or Azure because I want to run servers. If you treat a public cloud like a datacenter, you're likely to have a bad time.


Hetzner, please bring dedicated servers to the US.


...when using bare metal servers.

Hetzner doesn't have the services AWS provides; those services are what most companies I know use AWS for.

If we could run our crap on any server, we would, but managed services are still cost-effective vs hiring our own 24/7/365 rotation of on-call ops people.



DHH + Basecamp aren't exactly a run of the mill company hiring mediocre talent.

They have the skills, cash flow, and resources to do whatever they want.


This.


> our crap

Yeah if people had less shaky stacks. But it is always easier to pay someone to run the hack.


You don't get valued $2T for charity.


These types of articles always read like “yeah you could buy a Land Rover, but this Kia hatchback over here still gets you from A to B and is only a fraction of the cost.”

It seems lost on the authors that yes that might work for some folks just fine, but others really do want the Land Rover and all its additional baked in features beyond getting you from A to B.


Analogies are not good for arguments. Most people do not need a 4x4.


Great Read!



