I spent the last few years developing software on AWS. I honestly cannot imagine not using it. Once you get good at infrastructure as code (CDK), microservices, and how everything wires together, any other technical stack is just painful. I still believe that engineers completely underestimate the productivity gains of people who are good at AWS. There is just so much less BS in terms of getting systems working, everything clicks together, and you aren't stitching together a million open source libraries to build a stack. Billions of dollars of effort are behind making it as seamless as possible.
Infrastructure as code can work on any cloud provider including smaller ones like Digital Ocean, Linode, OVH, Hetzner, etc which are mentioned in the article.
A lot of the things you said apply just as much to them as to AWS.
In my opinion using AWS is fine, as long as you use cloud-agnostic stuff and avoid AWS-specific services. That way, if you ever have to switch, you just have to rewrite the provisioning part of your infra-as-code; everything else can be easily migrated over with some config changes.
If you are not using cloud native features you are essentially just using a virtual datacenter at which point AWS is an expensive choice.
On the other hand, if you use Kinesis, Lambdas, RDS, VPC peering, and IAM across all of them including EC2 and EKS, then no, none of those other options can even come close. Purely the lack of system-wide IAM alone would be a dealbreaker.
This is also why being 'on the cloud' and 'cloud native' is very different. While the latter is a bit too buzzwordy for my taste, it does show the difference between "playing datacenter" and "getting stuff done in new ways, both cheaper and faster".
Do not use AWS if you just run a few virtual machines and some basic networking. That also applies to GCP and Azure. It's too expensive for what you need, and too easy to misconfigure and shoot yourself in the foot.
On the other hand, if you need API-driven infrastructure with a large gradient between IaaS and SaaS to pick from, then by all means, use AWS.
> If you are not using cloud native features you are essentially just using a virtual datacenter at which point AWS is an expensive choice.
Hence “you probably don’t need AWS”.
> On the other hand, if you use Kinesis, Lambdas, RDS, VPC peering, and IAM across all of them including EC2 and EKS, then no, none of those other options can even come close. Purely the lack of system-wide IAM alone would be a dealbreaker.
Hence you are locked in and have zero negotiating position. Your business has an existential risk based on AWS - assuming your code is critical to the business and could not practically be rewritten because of all this special sauce you’re using.
That might be okay but I sure hope your CTO made that decision consciously.
The same is true in reverse. If your team assumes they need the optionality of avoiding cloud lock-in, your CTO had better have made an explicit choice to pay the overhead of not using these cloud services.
Sure, you can play FUD games about AWS suddenly raising prices, but I doubt you'd have any historical evidence to back that up. So you'd better be able to say why being cloud-agnostic is critical to your business.
AWS prices are already high; the danger is not so much that they will raise prices, but that your business conditions change. Perhaps there’s a new competitor and you can’t afford to splurge on cloud costs anymore, or your business needs to cut costs and focus on profitability because of economic conditions - either way, you are stuck with this albatross.
High compared to what? GCP? Azure? A raspberry pi in a closet with a cable modem?
You generally pay for what you need and for what you can afford. If you don't need AWS or you can't afford AWS, it wouldn't make sense to use AWS. That goes for in-house software development, infrastructure management and pretty much anything else too...
In my startup we use mostly Linode and handle a lot of volume. I have a few VPSs and some dedicated hardware for specific tasks. The monthly bill is roughly 200-250 USD all included.
The other startup used the AWS free credits. Got extra credits and lived it up. Adopted Kubernetes and all the big AWS stuff. Brought in a few devops professionals. All super smart people.
Monthly spend is 7k USD on traffic/usage that's actually much smaller than what my startup handles. Availability is roughly the same (if anything, my startup has better uptime).
So yes, this is an apples-to-oranges comparison. I get that. They can scale better in theory, etc. But if they pay 7k right now, I can't imagine how expensive it will be when they scale.
They started just using EC2 but slowly got roped into AWS services so now they have no vendor neutrality and no viable/easy way to reduce these costs.
So none of your requirements match up with AWS services. Of course it's not going to be a great match in that case, right?
If you need 'some random compute and a bit of networking' then no, do not use AWS.
Vendor neutrality isn't worth much to a business unless you are the EFF or intend to move vendors all day long.
You don't need AWS or any other cloud for volume (traffic volume, that is), heck I'd say stay away from the clouds for volume since that is where the bulk of your money disappears anyway.
But AWS spends a lot of time selling you Lambda over any of these other solutions.
If you have less traffic then you should spend less.
This sounds like they don't have anyone who actually knows what they are doing, or did any form of cost/benefit analysis which will always mean you spend more than you should.
> Sure, you can play FUD games about AWS suddenly raising prices, but I doubt you'd have any historical evidence to back that up. So you'd better be able to say why being cloud-agnostic is critical to your business.
I'm not sure if you're being serious here. It's not about rising prices but making your business completely dependent on another entity and losing any options you might otherwise have.
As for past data, history is rife with stories of companies who made the mistake of trusting that a big player's technology would be around forever. So many invested in technologies like Flash because they were 100% sure it would be there forever.
If I were in charge of AWS, I would avoid sudden price increases and instead focus on many new features and small charges. These look insignificant: "This is just 20 bucks per TB, less than our spending on toilet paper." Then these slowly accumulate, the TCO gets higher and higher, and again some new* smart features lure you in - again, at a small cost. In the end, it turns out you can't leave even if you want to.
*Or just your old open source library/framework repainted in AWS colors.
It depends on the situation, but by keeping a simpler architecture there's not necessarily any overhead. AWS has huge complexity which is a tradeoff against the complexity of simpler solutions, regardless of the $billing cost.
That sounds exceptionally vague. Simplicity is rarely bound to some vendor and is more likely to be based on requirements and implementation engineering. It mostly gets as complex as it needs to be.
Simple things are simple: need IPv4 networking? Get an IPv4 allocation and route it to a (virtual) port of your choice. That is no different between on-prem, managed-dc, unmanaged-dc, managed server provider and cloud provider. There might be some semantics that are different, where a physical port cannot be unplugged via an API (at best you can down/up a switch port but that's not the same) and routing may or may not be implicit, may need a firewall or routing rule and may need translation. But all of that always applies anyway and just gets named differently depending on who you are paying.
If you create something very complex, but it didn't have to be, perhaps it's simply a bad implementation and not a property of the technology that was applied. There are plenty of engineers that think they can roll their own crypto libraries, write their own filesystems and their own RDBMSs... but what business case for the 99% out there really requires that? Nearly everything in a business is a generic problem any business has; it's always about products and/or services, and it's always bound to some legal and financial system the business operates in. Everything beyond that is just marketing and trust building.
The only unique thing in most cases is the data you have and the degree of integration of processes within the business. And neither of those are an actual technology problem or complexity driver.
I would say otherwise. Some basic things that come to mind:
- db snapshots that work
- S3
- CloudTrail - know who accessed a server and when, and what commands they entered in their SSH session.
- Session Manager - being able to control permissions in a single place.
- S3 + Athena + QuickSight - easy reporting at scale (sketched below)
- the list goes on and on
As soon as you have a decent amount of data that you care about, AWS becomes dirt cheap compared to on-prem when used right. Look at the list above and imagine how much ongoing effort you would have to put in to support all of this.
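To make the S3 + Athena item concrete, a minimal sketch with boto3 (the database, table, and results bucket names are made up) looks roughly like this:

    import time
    import boto3

    athena = boto3.client("athena")
    # Run a SQL query over data that already sits in S3; no servers to manage.
    q = athena.start_query_execution(
        QueryString="SELECT status, count(*) AS hits FROM access_logs GROUP BY status",
        QueryExecutionContext={"Database": "weblogs"},
        ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
    )
    qid = q["QueryExecutionId"]
    # Poll until the query finishes (no error handling here), then page through the results.
    while athena.get_query_execution(QueryExecutionId=qid)["QueryExecution"]["Status"]["State"] in ("QUEUED", "RUNNING"):
        time.sleep(1)
    for row in athena.get_query_results(QueryExecutionId=qid)["ResultSet"]["Rows"]:
        print([col.get("VarCharValue") for col in row["Data"]])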
True. But while avoiding vendor lock-in is a worthy goal, it does have its costs - as in, you are now on the hook for duplicating various functionalities that may not work as well together.
For example, let's take a simple producer/consumer system with some sort of queue. That's an easy enough problem. But add in some additional requirements:
* Data encryption - both in transit and at rest
* Access control
* Audit logs
And it becomes much more complicated. It is a trade-off. But the productivity gains from having common, consistent solutions to issues like this... shouldn't be underestimated.
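As a rough sketch of how those three extra requirements map onto SQS with boto3 (the queue name, key alias, and producer role ARN are all placeholders): SSE-KMS covers encryption at rest (clients talk to SQS over TLS for transit), a queue policy covers access control, and CloudTrail picks up the API-level audit trail on the AWS side.

    import json
    import boto3

    sqs = boto3.client("sqs")
    sqs.create_queue(
        QueueName="orders",
        Attributes={
            # Encryption at rest via KMS; transport is TLS.
            "KmsMasterKeyId": "alias/aws/sqs",
            # Access control: only the producer role may send messages.
            "Policy": json.dumps({
                "Version": "2012-10-17",
                "Statement": [{
                    "Effect": "Allow",
                    "Principal": {"AWS": "arn:aws:iam::123456789012:role/order-producer"},
                    "Action": "sqs:SendMessage",
                    "Resource": "*",
                }],
            }),
        },
    )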
Trying to utilize all three major US cloud providers with their cloud-native tools simultaneously, learning those tools, and cutting IT, dev and security over from on-prem to those, takes a massive staff in a regulated industry.
Doing it "right the first time" in one of the big three, then building a skunk works to replicate efforts as a blackout scenario, seems much more cost effective.
The key term is Total Cost of Ownership. Lock-in is a cost. Legacy stacks are a cost. Learning three times the tools is a cost.
Having to pay expensive DIY developers is also a cost (and a huge risk). And if you then need to transfer knowledge and reinvent the wheel for all the NIH applications, you're essentially becoming a cloud yourself, but less complete and not scalable at all.
> Hence you are locked in and have zero negotiating position. Your business has an existential risk based on AWS - assuming your code is critical to the business and could not practically be rewritten because of all this special sauce you’re using.
You don't have a negotiating position based on your ability to switch even if you are huge. AWS pricing negotiations are almost exclusively based on volume.
Cloud services are competitively priced. I'd honestly like to see some examples of people who invested the time to build cloud-agnostic and got actual ROI on that, in terms of pricing concessions negotiated with AWS/GCP/Azure over and beyond the additional developer time to build said platform.
Unless you intend to create your own CPU architecture, your own internet and your own human population, by your logic, you are always "locked in". Regarding negotiating positions, those are governed by contracts and by money, the same as with any other company.
There is no real lock-in other than the requirements of your own implementation. A lambda on AWS can just as easily be run on Kubernetes, but that means that on top of running the code that you wanted to run, you now have to run Kubernetes too. You could use a hosted Kubernetes control plane, or fully hosted setup, but according to your same logic you would now be locked in to Kubernetes.
It seems to me that you might have a very special use case, or simply not have had the scaled experience or requirements that made a lot of other vendors non-viable.
Business risk-wise, AWS is a very low risk when compared with pretty much anything else. What do you think will fail first, a custom in-house maintained setup (expensive, non-portable between companies and people) on Digital Ocean, Linode, OVH, Hetzner, etc or all AWS Availability Zones in all AWS Regions world-wide? (and we ignore the lack of IAM, lack of things like RDS, SQS, SNS etc)
I recommend using descriptive names rather than marketing ones. Proper naming helps you understand and think about things accurately to remove psychological bias/koolaid.
Another term which is more descriptive for cloud native is vendor cloud lock-in, which could be reasonably but descriptively shortened to cloud-locked or cloud-married maybe.
Native sounds cool but completely loses sight of the cost of lock-in, while locked or married talks to both the benefits and the costs.
Vendor lock in is an attribute, or effect, which is not the same as “another term”. You can’t start an argument about semantics only to bend this language so drastically.
Your semantics are misguided at best and malicious at worst. FaaS is cloud-native, but is a mere abstract context a lock-in? If so, what is the point of ascribing lock-ins at all at that point? Cloud-native isn’t vendor-owned; at best it could be trademarked by the CNCF, but that’s not a vendor.
I wrote a big reply to the below quote from you, and then realised we were probably arguing the same point.
> Do not use AWS if you just run a few virtual machines and some basic networking. That also applies to GCP and Azure. It's too expensive for what you need.
Currently on a contract where I've saved them 6 months of my costs over 3 weeks of part-time development on a single year-long project. They saved half of my entire contract on their first foray into AWS compared to their Digital Ocean charges.
When you know what you're doing AWS can become very cost-effective. Build it with $PLATFORM in mind and all sorts of savings are possible.
But the tricky part is whether the team is ready for it, or whether they'll spend more time learning the ropes than the company saves over $NUMBER years. Which I think is what you were getting at?
"On the other hand, if you use Kinesis, Lambdas, RDS, VPC peering, and IAM accross all of them including EC2 and EKS, then no, none of those other options can even come close. Purely the lack of system-wide IAM alone would be a dealbreaker."
If you use those, AWS starts rubbing its hands together and cackling in an evil laugh. I bet they even have an mp3 that plays whenever a major player uses a lock-in api product on AWS.
"aws lambda install blah blah" < I obviously don't know the api
Somewhere in Seattle: Mwahahahahahaha! Bwahahahahahah! Kwahahahahah!
Although RDS is just hosted SQL databases. It isn't a pure lock-in. Identity management, key management, firewalls, and networking all exist in some form at most places.
And don't underestimate how bad internal "clouds" are in large enterprises. AWS pure EC2 can still be a huge win, especially if that is the only one that has been "expertly" vetted or negotiated by some lofty level of management on a golf course somewhere.
> On the other hand, if you use Kinesis, Lambdas, RDS, VPC peering, and IAM across all of them including EC2 and EKS, then no, none of those other options can even come close. Purely the lack of system-wide IAM alone would be a dealbreaker.
I use AWS, but I suspect I'm a fair bit older than you.
Those things you say "can't come close" - they exist elsewhere. They are just software libraries. It's the reverse of "can't come close" - there is a plethora of implementations to choose from, and most are free. Lambdas are just CGI on steroids, RDS is just a database, and IAM is just one of any number of authentication / authorisation platforms, the most notable one being Kerberos, which is literally decades old.
It's true that AWS's versions of these libraries all play nicely together, whereas the others are going to require more glue. But on the other hand AWS nickel-and-dimes you on every use. The price of writing the glue can be high and has to be incurred up front, whereas the price of the nickels and dimes - that's a future-self problem. Just like smoking and crack. Not surprisingly, the cloud vendors use the same tricks as the smoking and crack vendors too - they give away their delights for free to early users by handing out free credits like candy, knowing once you're locked in as a heavy user there is no way to avoid paying the piper.
The price difference in my experience is typically of the order of an extra 0. That's not a big deal when your primary costs aren't CPU cycles but rather developer time. That's absolutely the case at the start of an enterprise's life, and it seems to remain true for many enterprises for their entire existence. So perhaps AWS works out for many.
But personally, I would not take it for my business. There are lots of implementations of all the things you mention, and much more besides, out there. Worse, many of AWS's products seem to exist to solve problems created by using other AWS products. Need to monitor your SES reputation? Don't record it yourself - send it to CloudWatch! It's only a small cost. Oh, you need to get it to CloudWatch - use SNS - it's only a small cost. Oh, you want to trigger stuff from CloudWatch - just use Lambda - it's only a small cost. Oh, you need to control access to all those things - use IAM. Maybe it's true it's all only a small cost - but it's a very complex solution compared to parsing your SMTP server logs and writing them to an open source database, which is all that is happening under the hood. At times it seems the only reason AWS structures things the way they do is to induce you into buying more AWS services.
Which is why things like Terraform exist. Some are happier to incur the expense of gluing together the many free alternatives out there, in return for the ability to roll it all out to the lowest-cost provider.
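For what it's worth, the "parse your SMTP server logs and write them to an open source database" version really is only a few lines. A minimal sketch, assuming Postfix-style log lines (the regex is illustrative, not exhaustive, and the paths are made up):

    import re
    import sqlite3

    db = sqlite3.connect("mail_events.db")
    db.execute("CREATE TABLE IF NOT EXISTS events (queue_id TEXT, status TEXT, line TEXT)")

    # Pull the queue id and delivery status out of each Postfix smtp log line.
    pattern = re.compile(r"postfix/smtp\[\d+\]: (\w+): .*status=(\w+)")
    with open("/var/log/mail.log") as log:
        for line in log:
            m = pattern.search(line)
            if m:
                db.execute("INSERT INTO events VALUES (?, ?, ?)",
                           (m.group(1), m.group(2), line.strip()))
    db.commit()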
What’s the point of reducing everything to the most primitive element and ignoring the reality? You’re one of these “I’ve seen this before, nothing new” people while completely missing the value proposition. Why do anything when everything already exists and there’s no nuance?
> using AWS is fine, as long as you use cloud agnostic stuff and avoid AWS specific services
IMO this is exactly when you should not be using AWS. AWS is a premium offering and you can find cheaper alternatives for the commoditized offerings like EC2, RDS (the older, non-AWS-specific features) and (maybe) S3.
It's the high-level services where the real value is in AWS - the cutting edge technologies that only exist in a specific cloud (or multiple clouds but where it's difficult to port across clouds due to subtle differences). This is where the extremely high development cost is able to be amortized over many customers, allowing you to get scalable, easy-to-use, and easy-to-operate services at a much lower cost than would be possible outside of that cloud. A great example of that for AWS is Serverless Aurora.
The real benefit of the cloud is their SaaS offerings. If you go cloud-agnostic you create more ongoing maintenance overhead (since you now have to manage services yourself), higher hosting costs, and you still have to rewrite most of your IaC when switching clouds anyway. Contrary to common belief, IaC platforms like Terraform don't mean the same code works across different clouds. It just means the same language can be used to provision services. However, how you define services in that language will differ from one cloud to another even if the types of services are the same.
So unless you're going multi-cloud from day one, it's much better to write your services to be specific to the cloud you're using and benefit from them, rather than boycotting them because of a theoretical multi-cloud decision you might never actually make. Most companies that talk about going multi-cloud never actually do, because the benefits of being multi-cloud are actually pretty small for most domains. Being multi-cloud is one of those things that sounds more important than it actually is.
Developer skill sets vary a lot. I know my way around Linux, but I'm no system administrator. So there's a good chance that I'll do something wrong if I build products using pure EC2. Also, frankly, I hate dealing with system administration stuff. I'll take a more expensive Aurora instance over a self-managed MySQL cluster any day of the week.
I'm sure there are tons of developers out there who know a lot about infrastructure and system administration. For those people, it might be worth spending time managing infra, but my time is absolutely better spent using a turn key solution and focusing on product development.
For smaller teams, getting feature-rich products out the door can be more important than building the most cost-optimal product. Bigger teams can afford dedicated staff to optimize infra.
> Infrastructure as code can work on any cloud provider including smaller ones like Digital Ocean, Linode, OVH, Hetzner, etc which are mentioned in the article.
Back in the ‘old days’ of the cloud you’d hear a lot of talk about “commodity computing”. Compute was fungible. No salespeople, no politics. It was a utopian environment for developers.
But in the decade and a half since, the landscape has changed. Now cloud is infested with salespeople, consultants, and worst of all, third-party solutions. It’s the same old hodgepodge of half-finished incompatible products tossed over the wall that on-prem used to be, only more expensive.
> as long as you use cloud agnostic stuff and avoid AWS specific services
That is a big tradeoff, and it's not something obvious. If being banned from AWS is an important and relatively big risk for your business, sure, go for it.
But if not, then those AWS-specific services are exactly where you get all those productivity gains from, and where AWS (and GCP for that matter) shines in comparison to smaller ones, like DO or Linode, or on-premises.
I think it's all about what features you are using, and how much you want to BYOC (Bring Your Own Code).
Infrastructure as code is something you can do on smaller deployment environments as well, even in air-gapped environments.
The benefit of AWS, for me, is the reliability and all the capabilities that work out of the box that aren't differentiators for my solution (so it's great that I can offload them to someone else, if security, etc. allows).
I agree that it's a good idea to have a layer of abstraction so you can use other service providers, even though you may take a bit of a performance hit for doing so.
> using AWS is fine, as long as you use cloud agnostic stuff and avoid AWS specific services
This is the worst of both worlds. You get the expense of AWS but without any of the things that would make the cost worth it. IMO you should either avoid using their services and move to something way cheaper (OVH, as the article mentions Hetzner), or fully buy into the benefits AWS has to offer.
I'll be honest, my experience is like literally the exact opposite. I guess "Once you get good" at anything it seems easy, and learning something else seems like a waste of time. But every time I want to do something in AWS it feels like asking for a house and having someone hand me a pile of bricks and 2x4s and then charging me for 2 houses.
Totally agree. It's been a few years since I used AWS, but back then they shipped half-assed products that didn't have a consistent feel with the rest of their products and weren't feature-complete out of the box.
I remember using their email service (SES) and wanting to read logs for emails that had been sent. The only option was to point log events at an SQS queue. So you had to create one of those, assign proper roles and access controls, etc. After that, you need to dump or consume the queue somehow. I can't remember if SQS could dump to S3, but if it could, then OK: create a bucket, assign roles, etc. Now, to parse the S3 bucket contents somehow... Wait, what was I doing? Trying to read some damn log files or set up a Rube Goldberg machine for the log files?
I get it. It's a modular system, and that can be very powerful. Sometimes I just want to do something simple without going down a multi-hour (or day) rabbit hole. The inability to perform simple actions on AWS without looping in tons of their other products means indefinite job security for a lot of people. IaC and templates for actions like the above alleviate this a bit, but it's still a dreary landscape.
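For what it's worth, the wiring described above looks roughly like this with boto3 (going from memory, SES publishes notifications to an SNS topic and you subscribe the SQS queue to that topic; the queue also needs a policy allowing the topic to send to it, which is omitted here):

    import boto3

    ses, sns, sqs = boto3.client("ses"), boto3.client("sns"), boto3.client("sqs")

    topic_arn = sns.create_topic(Name="ses-events")["TopicArn"]
    queue_url = sqs.create_queue(QueueName="ses-events")["QueueUrl"]
    queue_arn = sqs.get_queue_attributes(
        QueueUrl=queue_url, AttributeNames=["QueueArn"]
    )["Attributes"]["QueueArn"]

    # Fan the bounce notifications out to the queue.
    sns.subscribe(TopicArn=topic_arn, Protocol="sqs", Endpoint=queue_arn)
    ses.set_identity_notification_topic(
        Identity="example.com", NotificationType="Bounce", SnsTopic=topic_arn
    )
    # ...and you still need something to drain the queue and put the events somewhere useful.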
The important thing to understand about AWS (or Azure or GCP, for that matter) is that you're not eliminating learning how to deploy and use an OS underneath your apps, you're just using a different (sometimes better/easier, but always more proprietary) KIND of OS. Because that's what these cloud platforms are - replacements for the OS in the modern distributed world.
They can offer awesome leverage and benefit. They can also be horrible traps that can break a company. I've seen both. As Doug Comer once told me, Unix is a Power Tool. Power Tools can kill. So can cloud platforms. There is nothing new under the sun...
Ya, but you don't know it's going to cost you as much as two houses until you are done building and start using the house. Then the surprise bills start rolling in.
I think that's unfair. Your setup involving more distinct services than his has little bearing on the amount of data you're working with. Our setup is a lot simpler than yours (no Postgres, Lambda, MQTT) but deals with presumably a lot more data (~800 GiB daily), so it costs an order of magnitude more than yours.
Absolutely. My setup is, as stated, a dev instance. We have a substantially larger prod setup, but I can't get into the details about production scale on a public forum. That said, even the dev instance is moving a substantial amount of data; nothing like 800GB/day, but within an order of magnitude.
And yes, bandwidth costs can be brutal, especially if you try to build a hybrid solution that uses heavy services that live outside of AWS.
That said, the above comment didn't get into enough detail to get an accurate read on their bandwidth or CPU/Memory/Storage utilization, but if they can run it in a lab for $5/mo (which I doubt, since that barely covers power), then it can't be much, so I think it's fair to assume it's at least back-of-the-napkin similar scale.
Almost all of those services are pay-per-use, which means listing out the services you use means nothing. The price would be based on your level of usage. I have a bunch of Lambdas set up in AWS and they're basically free because, at the moment, I'm not running them.
Our ECS containers all run continuously processing Kafka messages. Dev has variable throughput, but it's hundreds of messages per second at least. The containers all use between 256MB and 2048MB commits (they're stream processors, so they don't hammer the memory like a stateful monolithic service would).
My point was just that it doesn't have to cost over $1k on Amazon to run a combination of services which apparently CAN run in a private lab for $5/mo, however that is calculated. I can't run my full stack even without load (due to memory commit) on my 32GB, 3 year old Xeon workstation, and that would cost a lot more than $5/mo at any provider I am aware of.
Although he doesn't mention API Gateway in his list, I used to design AWS solutions for people, and it was by far the biggest mistake I saw them make cost-wise: it can get scary-expensive in a hurry, especially if you aren't careful with your cloud architecture and don't really understand what you'll be paying for.
The difference is probably that you run real microservices with a small footprint, so you can bin-pack them onto very small instances together with a very, very small Aurora cluster.
If you listed the exact instance counts and types, we could have a real comparison.
Is it really the same, or 'similar, but worse'? AWS is expensive, but multi-zone redundancy is expensive anywhere and even worse: it's not available everywhere.
That also means that a side-project that doesn't need that redundancy would be a mismatch in terms of requirements anyway. On the other hand, if you need a multi zone automatically available system, you're running out of choices really fast.
Based upon outages when a single AWS zone goes offline, a whole hell of a lot more than side projects aren't taking advantage of multi-zone redundancy.
I think what many fail to realize is that - surprise surprise! It isn't that simple. Many think "if my zone goes out, I'll just migrate to a different zone!" Nope! AWS doesn't have the capacity for it - as we've seen many times, AWS can't seem to handle the migratory load when a zone goes out. Sure, the other zones don't really die, but good luck migrating your workload to them.
But when touting multi zones, no one ever mentions this little nugget of information.
We have that 'issue' with many of our Spark workloads where none of our desired capacity is available as spot, but we have a baseline of reserved up-front instances for anything realtime anyway, so with a bit of planning it's a non-issue.
It does cost money, but then again, so does not running certain processes. The trick becomes calculating the intersection at which point the costs outweigh the benefits, and that calculation applies everywhere.
If you're in a single AZ you're not in multiple AZs. Migrating between AZs isn't multi-zone either. Running at 130% capacity in three AZs, that is multi-AZ (to us, in our availability configuration). If an AZ goes down (which in some regions we use has happened 0 times) we lose about 30% capacity, but since that's our margin of scaling anyway we can keep going as-is, even if there was no 15% additional capacity available in the remaining AZs.
Some sort of manual active-standby configuration really doesn't require AWS or a Cloud, that stuff is the same 90's implementation it has always been and practically boils down to attaching your RAID1 USB HDDs from one PC to another PC and booting that bad boy up as 'failover'. (yes, that's an example, and yes it's an extreme one)
If you have capacity planning, and you plan accordingly, you take service provider limits into account, just like you would with anything else. Having two power feeds into a distribution warehouse doesn't help much if neither can handle 100% of the load in an industrial park. So while having two feeds might seem 'redundant' to a single tenant or customer, it's only really redundant if either can supply all the demand of all connected customers.
The same applies to fiber connections, plenty of fake-redundant connections that are suggested by customers to be 'redundant' turn out to end up at the same PoP and if the PoP goes down your redundant fibers are worthless. In the same logistics distribution scenario, your trucks can't deliver goods if the destination warehouse itself is offline, and now you need redundant warehouses.
That's obviously a weird thing to do at smaller scales, but the fact remains that AWS having an AZ go down is only a small piece of the puzzle, and only really a problem if you didn't plan for it appropriately.
Automatic multi-zone failover to survive AZ crashes without outages seems to be commonly based on optimism, as people don't effectively drill for it regularly. There are always bugs in it that need ironing out. So people just pay the price in operating costs, keeping replicas running in other AZs, and the resulting software/distributed-system/infra complexity, but don't actually take the last steps to reap any payoff. That's because single-AZ outages are rare enough, and that still leaves the relatively common AWS-wide outages you face and your self-inflicted footgun outages. And drilling for the AZ failover can be scary and risky if you haven't done it before and are already in production.
AZ redundancy is like storage backups or backup power, if you don't regularly test it you don't have it.
This is true, but I will argue again that I am proficient enough with AWS to deploy a full stack with 10+ lambdas, DNS, certs, RDS, step functions, etc. for < 5 dollars a month. You need to understand what to use and what not to. Most of that bill is from the NAT gateway/private VPC, which you really don't need.
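As a minimal sketch of the "skip the NAT gateway" point, assuming the CDK v2 Python bindings (names are made up): a VPC with zero NAT gateways and only public plus isolated subnets synthesizes fine, and anything that still needs AWS APIs from the isolated subnets can go through VPC endpoints instead.

    from aws_cdk import App, Stack, aws_ec2 as ec2
    from constructs import Construct

    class CheapVpcStack(Stack):
        def __init__(self, scope: Construct, construct_id: str) -> None:
            super().__init__(scope, construct_id)
            ec2.Vpc(
                self, "Vpc",
                nat_gateways=0,  # NAT gateways bill by the hour even when idle
                subnet_configuration=[
                    ec2.SubnetConfiguration(name="public", subnet_type=ec2.SubnetType.PUBLIC),
                    ec2.SubnetConfiguration(name="isolated", subnet_type=ec2.SubnetType.PRIVATE_ISOLATED),
                ],
            )

    app = App()
    CheapVpcStack(app, "CheapVpc")
    app.synth()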
>Deployed few services on ecs, nat gateway, shared elb, certificate and route53.
>Total cost 30$ per day ~900$ per month.
Ummm... I think you're doing something incorrectly. The biggest line items on that list are ELB (assuming you mean Application Load Balancer/ALB) and NAT Gateway. But with that setup, I could maybe imagine $100/month with on-demand plans. Not $900.
Well he said he can replicate it with a $5/mo setup elsewhere.
How much compute can you rent anywhere for $5/mo? How much compute can you get even buying the hardware and amortizing over its useful lifespan for $5/mo (~$300 total)?
Reading over the post, it says $5 in their "local lab". Which probably means using existing or older hardware, so the $5 is probably electricity costs or similar.
And it sounded more like a generalisation anyway. ;)
ECS can get up there if you're backing with large ec2 instances or something
Either way, my biggest problem with services like AWS isn't really cost but that you end up using all of their SDKs and services out of convenience, and it becomes hard to track down or move away from... it spiders out of control so fast that sometimes it's better to spend an extra few weeks doing things the harder way. That's not always an issue though, especially for startups.
Managed NAT Gateways can be a very substantial cost depending on your architecture. I believe the recommended VPC layout has 3 AZs so you're paying close to $100/month for a single VPC just for NAT.
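Back-of-the-envelope, assuming roughly us-east-1 list prices (about $0.045 per NAT-gateway-hour, and ignoring the additional per-GB processing charge):

    hours_per_month = 730
    nat_hourly = 0.045          # USD per NAT gateway per hour (assumed us-east-1 list price)
    azs = 3                     # one NAT gateway per AZ in the "recommended" layout
    print(azs * nat_hourly * hours_per_month)   # ~98.6 USD/month before any data processing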
Ding ding ding, people here don't take into account the HA and disaster prevention you sometimes have to abide by due to security certifications and regulations!
So your comparison with your $5 local lab is not just about it being less "convenient"!
Do you really need multi-multi-region failover redundancy? Do you really need multiple NATs in every region (it's easy to share one NAT Gateway across AZs, especially so if a region is only servicing DR)? Your numbers still sound high.
x $1 + y $0.60 to make up $30 is a _lot_ of regions/AZs.
What exactly do you set up for 5 bucks a month on your local lab? Sounds more like you're comparing what Lightsail offers, which incidentally costs the same even on AWS?
If you want a simple site with just core AWS, you can set one up on Elastic Beanstalk for a few bucks a month. Add an RDS instance for $15 a month. Why would you needlessly overcomplicate the system and then cry foul?
Even if it were $50 per month, it's still a lot cheaper.
And the cluster we are talking about here is a Raspberry Pi box with 10 machines, each 2 CPU / 8 GB, running k8s, replicated storage, my own domain and certificate, MetalLB load balancers, and a few other things.
It's not as redundant as AWS, but it has never broken in a few years of constant operation.
The chances of all of those machines breaking at the same time are astronomically low under normal circumstances.
If I spread the cost of that setup over two years of reservation, the monthly cost is a lot less than $50.
$900 a month? The only thing in that list that could come close might be ECS? Are you using ECS anywhere?
If you have high compute needs I've had good experience with that and it's cheap. You can put a huge box behind it. I did that (I have ATT fiber to the house) and for the $7/month I could still use all my deploy templates and AWS CLI commands and code to do stuff.
I tried this with an AMD home machine (AMD 5900X / 64 GB RAM / NVMe) and it worked surprisingly well, as did a Dell server (beefier). I put Ubuntu on them, registered them, and they came right up in the control plane.
We have around 20 API GWs, at least 20 DynamoDB tables and more than 30 lambdas, several queues and OpenSearch replicated across 2 environments (dev and prod), plus additional programming environments (Cloud9), and we barely come close to half of what you are budgeting here for. Your math is wrong.
Let's take a small production cluster of 4x r6g.large (the default proposal for prod) at $0.167/h, deployed in a VPC in private subnets (x2 for HA).
That's $16 per day for instances only and $2.5 per day for NAT gateways. So $18.5 per day and ~$550 per month.
That's of course if you disable all metrics, logs, alerting, etc., and it does not include volume costs, because those are highly dependent on the domain.
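Sanity-checking those numbers, using the $0.167/h instance rate quoted above and an assumed ~$0.045/h per NAT gateway:

    instances = 4 * 0.167 * 24      # ~16.0 USD/day for the four r6g.large nodes
    nat = 2 * 0.045 * 24            # ~2.2 USD/day for two NAT gateways, before per-GB processing
    per_day = instances + nat
    print(per_day, per_day * 30)    # ~18.2 USD/day, i.e. roughly 550 USD/month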
AWS is expensive because you get a lot of administrative work done for you out of the box (snapshots, restores, etc.). Though sometimes you still have to fix things in the source.
I've personally pushed quite a few fixes to the OpenSearch source, because version 1.0.0 had a lot of issues.
$900/month – whether that's too little or too much depends entirely on the company's revenue/margins. I've seen companies that do not care about AWS spend. It makes close to zero difference to them and just thinking about the cost costs more. On the other end of the spectrum, I've seen companies that move to bare metal servers. A couple of servers and that's all they need. $120/month down from $2,000/month, and it mattered to them.
So yeah, what you're saying is sort of meaningless without understanding the business type/stage/revenue/costs.
That $5 a month figure - does that include your internet connection? I have a far more complicated AWS setup that costs me about $100/mo. I'm assuming the extra costs you are incurring are for data?
$5 per month is pretty impressive. What hardware would you be running this solution on? Typically if you run hardware in a colocated data center there's a fee for that; is that included in the $5?
What did the hardware cost, and based on depreciation, how long will it last before requiring replacement? Is that also included in the $5 you are quoting here?
I like AWS, but K8s has reduced my cloud needs to down to a Docker registry and a K8s cluster, which I can easily get anywhere. The one thing I really like about AWS is DynamoDB, which has very cheap and straightforward pricing but can also scale up seamlessly.
Exactly - I use AWS still, but it's more or less EKS and RDS only. Could be anywhere these days - and in fact our CI cluster at Linode costs ~1/2 as much and is functionally identical.
The author touched on this, you can use cloud services without the complexity and billing nightmare that is AWS. It's not AWS vs DIY.
>Is it actually good as default choice to host a software system of any scale? Let us go through some arguments against going all-in with AWS and why many companies and individual side-project developers might be better off using something else.
To be clear, the author is suggesting that AWS doesn't need to be your first choice in smaller setups and side projects. I don't think they're intending to say people shouldn't use AWS under any circumstance.
There are plenty of smaller cloud service providers that can offer similar functionality to AWS that probably make more sense for smaller-scale uses. I personally love Linode, and you can definitely pull together their various services to build something without doing it from the ground up on VMs.
Maybe, but I'll just say again, after a few years of day-in, day-out AWS development, I can build a full-blown website, with queues, microservices, RDS, a React app, monitoring, alarms, logs, and authentication, in a few weeks. Everything is rock solid, done by the book, and CHEAP. I think you are underestimating the power of this.
I guess this sounds dismissive, but it just doesn't sound that impressive. Anybody can do that on any cloud provider in a mostly vendor agnostic way with some docker-compose or kubernetes yamls / terraform scripts.
I think you've lost touch with where the non-AWS world is up to and are assuming it's still where it was in 2012. It's not.
I'm involved with a project that is built out using colocation space in a traditional datacenter, running on VMware Vsphere. Up to now, they've been doing this very manually: customers assigned to specific, named servers; servers provisioned manually (largely scripted, but not automated); networking configured manually; etc. Worse, they have two deployments in different regions of the world, but they're managed mostly by separate teams and there's a lot of implementation details that are very different. I come from an AWS infrastructure-as-code world, and it's shocking how primitive this stuff all feels.
They had considered switching to AWS or Azure in the past, but ultimately decided to stay on this setup for cost reasons, and frankly I don't think anyone understood the benefits of infrastructure-as-code, let alone taking advantage of all the other things in the cloud providers.
I came in to help modernize it. Just as we were getting going with some Terraform-based proof-of-concept, a couple things happened: some sales opportunities came up that would require running in another region, and there were some harder questions on the disaster recovery plan (eg: "what happens if the datacenter burns down?"). The company doesn't want to invest in building out another traditional datacenter, let alone redundant ones, so we took this opportunity to discuss shifting this project over to AWS or Azure.
Once built with infrastructure-as-code on the cloud providers a bunch of these problems just disappear, even without making the app itself "cloud-native". Having a dev environment that mirrors production is easy. Having multiple multi-tenant environments in different regions -- and adding new regions -- is also easy. Recovering from a major disaster would also be relatively painless, provided you have a good backup strategy (deploy to new region, restore backups).
There's so much going on within AWS I'd suggest approaching it the other way and identifying a few examples of things you'd need, and then researching how AWS implements those things - if you need a mainstream relational database, AWS has RDS or Aurora, etc. Need to make sure your RDS database is only accessible from your private backend app? That's a Security Group. Need somewhere to run that app as a Docker image? ECS or EKS... which may require learning a bit about IAM so you can make sure your own user / deployment user has the right permissions to manage all these resources. It's pretty straightforward to pull a thread - but there are lots of threads you could pull.
I would start by understanding the infrastructure as code paradigm, built by CDK. With CDK, I would build some kind of webapp, incrementally adding more resources (lambdas, databases, etc). The true value is that you should never modify resources you provision in the console and only do it through code. Once you start developing software like this, it will be very hard to go back to the old way.
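A minimal sketch of that style, assuming the CDK v2 Python bindings (the names and the "lambda" asset directory are made up): a table plus a function, with the IAM wiring generated for you, and more resources added incrementally as the app grows.

    from aws_cdk import App, Stack, aws_dynamodb as dynamodb, aws_lambda as lambda_
    from constructs import Construct

    class WebAppStack(Stack):
        def __init__(self, scope: Construct, construct_id: str) -> None:
            super().__init__(scope, construct_id)
            table = dynamodb.Table(
                self, "Items",
                partition_key=dynamodb.Attribute(name="pk", type=dynamodb.AttributeType.STRING),
                billing_mode=dynamodb.BillingMode.PAY_PER_REQUEST,
            )
            handler = lambda_.Function(
                self, "ApiHandler",
                runtime=lambda_.Runtime.PYTHON_3_11,
                handler="index.handler",
                code=lambda_.Code.from_asset("lambda"),  # directory containing index.py
            )
            table.grant_read_write_data(handler)  # CDK emits the least-privilege IAM policy

    app = App()
    WebAppStack(app, "WebApp")
    app.synth()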
Where does this sit in relation with Terraform? Is Terraform not able to do everything this does? I was aware of something from AWS called CloudFormation templates or something but I heard that was too low-level and idiosyncratic, also JSON-based and not expressive enough. Sorry if this is a bit confused, just want some clarification, and thought that Terraform was the gold standard.
Terraform uses a declarative DSL. CloudFormation is similar in that regard.
Using a declarative language is both a pro and a con.
The pro of that approach is that you are restricted in what you can do, which often results in simpler and cleaner code.
The con is that, again, you are restricted in what you can do.
CDK, on the other hand, is just a library in your preferred language for writing infrastructure in an imperative way. For example, you write Java code that deploys infrastructure.
When using a real programming language you of course have a lot more freedom in what you do, but of course that comes at the cost of complexity.
If your developers are "creative" it can be very hard to understand what on earth is happening in the code.
—-
So yes, many people consider Terraform the gold standard (me included), but after many years of working with it, like anything in IT, it has its own flaws and you may want to consider other solutions for specific projects.
I'm a huge fanboy of CDK. I think it's superior, and that its power has yet to be realized by the tech community at large. You can essentially create object-oriented abstractions of common infrastructure patterns. You are correct, you need to be careful about how you build things, but it can be extremely productive. CloudFormation is a nightmare.
CDK is also declarative - it generates a CloudFormation YAML file, so it cannot possibly be anything else. Declarative is a property of a programming model, not a language.
It’s possible, and very easy, to generate yaml with a bunch of for-loops and side-effects in an imperative language.
So without knowing anything about CDK, even if it is declarative, I don't think your conclusion is applicable.
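One way to see both points at once, assuming the CDK v2 Python bindings: the construct code below is imperative (a plain for-loop), but what CloudFormation ultimately receives is the static, declarative template that synthesis produces.

    from aws_cdk import App, Stack, aws_sqs as sqs
    from aws_cdk.assertions import Template
    from constructs import Construct

    class FanoutStack(Stack):
        def __init__(self, scope: Construct, construct_id: str) -> None:
            super().__init__(scope, construct_id)
            for name in ("orders", "invoices", "emails"):   # imperative loop...
                sqs.Queue(self, f"{name}-queue")

    app = App()
    stack = FanoutStack(app, "Fanout")
    # ...but the output handed to CloudFormation is a fixed declarative document.
    print(Template.from_stack(stack).to_json())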
Terraform, the language, works differently. While it has variables, you can not store state in variables and depend on side effects based on the order you read/write such variables. Every assignment and execution order of resources is purely based on dependency order and not based on the order of the lines in your code.
Terraform is not a language. HCL could be considered a language, but really requires the additional context of the Terraform resource model and the individual schema pertaining to the resources under management to decode. It populates an internal resource model, before changes are effected by the Terraform core engine.
One can configure the self same resource model using JSON, which can be trivially generated - as you correctly point out - using whatever language and paradigm you like.
However, since the language doing the generation does not influence the order of operations when applying the effects, the model is declarative. This is identical to the CDK configuring the CloudFormation engine via YAML. Indeed, there is a CDK for Terraform too...
I think there's a much stronger argument that CloudFormation deploying your stacks is what edges toward imperative, rather than the way you produce your CloudFormation file.
The file is still declarative regardless, and the functions you write to create that file don't per se have side effects outside that file.
- CloudFormation is the native offering from AWS. Templates are submitted in the form of YAML or JSON documents.
- CDK compiles Typescript and other languages down to CloudFormation Templates
- Terraform uses the AWS API directly via an AWS-specific provider, and does not typically use CloudFormation stacks (though this can be useful, in some circumstances!)
- Pulumi uses TypeScript and other languages to drive an engine similar to Terraform (and the Terraform AWS provider is one of the options for provisioning AWS resources).
I'd recommend learning raw CloudFormation first, since it's the primitive CDK compiles to, and being able to read/understand it will give you full confidence in your stack.
I agree that there is weird cult-like thinking around AWS, when the reality is their base services like EC2 or S3 are pretty great but most of the others are just mediocre. Yet people will insist on using services just because AWS built them; it's mostly just brand mentality.
Running things on prem or on something like Hetzner cloud can be much simpler in some cases, and with Kubernetes now you can provide a great platform anywhere. It just isn't the solution for every use case.
Agreed that you should be careful about all the auxiliary AWS services, but can you actually show real proof that on-prem is cheaper as a whole while still moving fast?
For sure you pay a premium for EC2, but at least you get good reliability. Your data won’t disappear off S3 unless you delete it. What level of rigor do you need to put into your on-premise game to promise the same?
Even for my side projects I nowadays just float an Elastic Beanstalk Flask app and a small RDS instance. It costs $15 a month, but I’m fine with that cost for the time it saves me. If you want to be truly cost-conscious, stick to Lightsail. It’s a steal all things considered.
I think even on-prem shops should generally use S3 for the scalable storage parts like user image hosting; scaling storage sucks. But for app, database, or compute that’s anything but a one-off? Dedicated servers destroy the cloud in terms of cost/performance. And bandwidth on AWS is an incredible ripoff, so if you’re doing any heavyweight data hosting, you should absolutely bust out the calculator. OVH boxes with dedicated 1 Gbit lines are just absurdly more cost-efficient there.
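Rough numbers on the bandwidth point, assuming ~$0.09/GB AWS egress (the first pricing tier; higher volumes get somewhat cheaper) versus an unmetered dedicated 1 Gbit/s port:

    seconds_per_month = 30 * 24 * 3600
    gb_per_month = 0.125 * seconds_per_month   # 1 Gbit/s = 0.125 GB/s -> ~324,000 GB if saturated
    print(gb_per_month * 0.09)                 # ~29,000 USD/month at AWS egress list prices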
One of the major selling points of using cloud is serverless where it truly lets you build-once-and-forget with minimal maintenance overhead. Can you suggest comparable on-prem/Hetzner alternatives for the products I use: CloudWatch Logs, S3, DynamoDB, Kinesis Data Stream, Step Functions, Lambda, CloudFront, API Gateway, Athena? Having used these services in my day-to-day job, I just can't imagine myself ever going back to the old way of doing things.
CloudFront and Lambda, combined with S3 and a database (DynamoDB, RDS, Aurora Serverless), are quite powerful. You can do pretty much 90%-95% of web apps on these services.
I think people forget that they used to be cheap. They built up a lot of brand equity and mind share in that time and then proceeded to boil frogs until AWS gross margins exceeded McDonald's.
The first half of the post is like blaming the barman because you drank a bottle of Michters Celebration and are stuck with a $30K bill. It’s not hard. Just don’t buy shit that may bankrupt you.
And suggesting Hetzner dedicated servers as an alternative? Why not EC2 spot instances if you’re so budget conscious? It’s just Linux (or whatever you put on them).
We spent big on dedicated hardware in the early days of our biz. Sure we bought bandwidth at 95th percentile which was cheaper. But as our people became our biggest cost we quickly moved into aws and haven’t looked back. Our bill is $500K per year and we happily pay it because payroll would be way more. However it’s the richness of the toolset and on demand scalability that is the real win.
If you need to push a fuckton of bandwidth with little CPU usage you should buy your own metal, colo it and buy bandwidth at 95th percentile billing. We dug up a sidewalk and leased from Zayo to do just that for a specific thing. Most other applications belong in a cloud provider like aws.
> Our bill is $500K per year and we happily pay it because payroll would be way more. However it’s the richness of the toolset and on demand scalability that is the real win.
Yep. This is the value proposition.
How much of a FTE's time goes to maintaining a scalable, redundant, durable queuing system on par with SQS?
I can run something like 500k messages for free, with all of the same service guarantees as someone running orders of magnitude more.
If I need to run an order of magnitude more? It'll cost $3.60.
If our service blows up overnight and we need another order of magnitude more? There's no pagers, no alarms, no schedule getting blown up as suddenly someone needs to figure out how to scale our queuing solution to 10x capacity... it costs us $20 now.
Sure, at some point when I'm pushing like 6.5b messages/mo I'll make up half a non-SV FTE salary that I could instead dedicate to maintaining a queuing system... but why? Most places I've worked were much more constrained by finding good people than by paying them. If it's not our core value proposition, I'd rather outsource it and have people working on stuff that adds to the value _we_ provide. That will make us way more money than saving a few bucks on infra costs.
What people really don't get is that depending on where you are, 500k annual is only the payroll of 3-5 employees. You can quickly outgrow that if your on-premise infrastructure grows.
Or in SF, that’s the cost of 1 employee. Only consider running it yourself if your compute is by far your largest cost, or if it’s a huge component of what you’re selling (for instance a cloud backup service, a video hosting service, a CDN, etc., where you’re competing with others and a lower marginal cost lets you gain market share).
Why not dedicated servers? You are not "buying" dedicated hardware. You are renting it, just like a spot VM on AWS. (except the spot VM is temporary)
Anytime I see the comparison of cloud vs dedicated bare metal, it seems that the cloud advocates have been doing all their comparisons and calculations based on the alternative being to buy dedicated hardware and maintain it yourself in a colocated data center.
Dedicated servers still require staff to maintain. If you’re just comparing EC2, dedicated servers may be cheaper. Once you get to application offerings, such as databases, caching, etc., you now need to weigh the cost of administering those applications directly.
That’s just for one location. If you need redundancy, you’ll also need to factor in those costs. Alternatively, I can spin up a multi-AZ database in a few minutes on a cloud provider.
I have found AWS services not to be maintenance-free. And I think, at least for some services, the maintenance required to run them yourself is greatly overstated. Especially in the age of pre-built containers, where updating can literally consist of changing the version tag.
My experience is the exact opposite. Keep in mind I work for a Fortune 200 company and we spent $120 million expanding our data center capacity just five years ago - and it's now running out of power capacity (it has 1 MW power service). Also, our lead times for getting new servers are now months. Why? Because all the manufacturers are prioritizing deliveries for the cloud providers.
That's a general problem we have.
Now, for a specific project I have, I took it from on-premise to AWS and am saving over 75% compared to what it cost to host on-premise. The majority of the savings was due to software licensing costs. It's expensive to license software on-premise. The architecture of AWS and using their services makes it so those licenses aren't required. Moreover, we've been able to add features to our product offerings that we realistically would not have been able to add while keeping the hosting on-premise. All while saving 75% over what we were paying previously.
My experience has been AWS is faster, cheaper, and better. The business loves it and we're getting more projects. They can't believe how quickly we can deploy new solutions nor how cheap it is for on-going operational costs.
Bottom line - I wouldn't recommend hosting on-premise. AWS isn't the only cloud provider in town, but I would use them over doing anything on-premise.
> My experience is the exact opposite. Keep in mind I work for a Fortune 200 company
I bet this is exactly why you find it helpful; rapid scaling is great for when you actually need to scale. I think the post is about the people who don't need to rapidly scale (most customers probably? certainly more common than a F200 company)
I think I've beaten this dead horse for years, written multiple articles and even run a service called M3O (m3o.com) attempting to replace it for new use cases. Ultimately there's a generation of developers that have found AWS invaluable. There will be many MANY more companies who will adopt AWS because it's synonymous with cloud and the most mature solution. Yet for all it's worth I still don't think it's the end game. Throughout history we've seen industries and technologies evolve. I feel like AWS is the PalmPilot of this era and we're going to see a very disruptive evolution of Cloud soon.
It seems someone disagrees with that statement but I can't edit it anymore to ask for clarification...
Cloud is a vague term that doesn't really point to a specific thing nor does it have the same meaning to different people. "Replacing" a vague concept that may or may not be a single entity seems like a hard case to make. On top of that, the concept of a Cloud didn't replace something else either.
I think we're essentially lacking a cloud operating system. Kubernetes isn't an end solution. There's no development model, there's no consumption model. If you look at every platform in existence, there's some kind of OS that defines the experience and makes it readily available to developers and end users. I think cloud services are in their infancy and we'll see the emergence of a fully vertical solution, including an OS that standardises the entire model as others did. What does it look like? Well, I can only speculate and try out ideas myself, but effectively it's going to have direct appeal to end consumers, not just devs and execs.
Even if you just look at the current layers offered by Kubernetes there are a lot of rough edges and footguns. It’s so fantastic and yet so terrible at the same time. Put simply, someone need to make a simpler, yet equally powerful, kubernetes.
That’s only considering basic usage of deploying stuff on top of an existing cluster. Hosting your own cluster is even worse, and it's one of the reasons cloud offerings are still so popular. If anything can kill cloud it will be a one-click way to get Kubernetes in your on-prem rack, no config required: just pass in the credentials and boom, all the nodes on the network magically join and everything works, including all the often cloud-coupled stuff like ingress controllers. Operationally it needs to be as easy as, or easier than, self-hosted Linux + Ansible, and deployments need to be more powerful than docker-compose.
Amazon Linux is pretty much just CentOS rebranded with some extra goodies.
Not OP, but my take was they meant something even more "cloud native", something built from the ground up with cloud in mind. Like an OS based on Lambdas or something weird.
Yea I don't think Amazon Linux is anything new. We need an actual Cloud OS based on Kubernetes. One that's focused on API first service development. It needs to have a service lifecycle, an SDK, a programming model that takes into account multi service development. Something that bakes in the consumption model for the end user the way an iOS, Android or Windows does. So the user experience is visual while the developer experience is one around APIs.
Lambda has potential as a proprietary delivery mechanism but it has no real development model. If they doubled down on say a JavaScript SDK and provided UI elements maybe it would work.
I'm not the person you responded to, but I think you're going to see more folks self hosting things and owning everything themselves, and enjoying the cost savings. AWS has made great strides in the last decade, but so has the open source community.
AWS doesn't really remove an ops team; it should change what ops is doing. Maybe on a very small scale it replaces the ops team that would otherwise run hardware and do updates. But once you scale up, a good SRE/DevOps/production engineering team that can run the company's infra on top of your cloud of choice is a game changer, since networking, database, and system issues still show up.
It absolutely replaces an ops team for me. Well, to be fair I end up doing a small amount of ops by monitoring things but for the most part, everything I used to have to manage for scaling and deployment is handled automatically by some config files in a .aws folder in my repos now.
I don't think people account for the amount of additional complexity involved in integrating AWS services. There are three huge pitfalls of building on AWS: configuration & API complexity, artificial resource constraints like IOPS and vCPU budgets, and vendor lock-in.
Simple integrations, e.g. among CloudWatch, DynamoDB, and S3, typically require quite a lot of boilerplate configuration for networking, IAM permissions, and provisioning (e.g. in CloudFormation), and the APIs to read and write are needlessly complex. Compare the CloudWatch API to a logfile, or DynamoDB to Redis.
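To make that comparison concrete, here is a rough sketch of the two paths side by side (log group, stream, and message are hypothetical; the CloudWatch calls also assume the caller's IAM role permits them):

```python
import time
import boto3

# Local logfile: one line, no provisioning, no permissions to set up.
with open("app.log", "a") as f:
    f.write("user 42 logged in\n")

# CloudWatch Logs: group and stream must exist (created here as one-time setup),
# and every call needs matching IAM permissions.
logs = boto3.client("logs")
logs.create_log_group(logGroupName="/myapp/prod")                 # hypothetical names
logs.create_log_stream(logGroupName="/myapp/prod", logStreamName="web-1")
logs.put_log_events(
    logGroupName="/myapp/prod",
    logStreamName="web-1",
    logEvents=[{"timestamp": int(time.time() * 1000),
                "message": "user 42 logged in"}],
)
```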
If you already have unix skills, you will be infinitely more productive with a couple linode instances.
AWS (and all cloud) also has "magic constraints" like IOPS and CPU budgets that you suddenly slam into. Your app typically runs fine until a certain amount of traffic exhausts one of these budgets, and then it suddenly hangs. If you're used to traditional resource estimation and constraints, this is an unpleasant and unexpected surprise.
When you run a small startup, these constraints often cause you to drastically increase your budget because the constraint is hit in the middle of business (e.g. upsizing a volume or DB instance to add additional IOPS at the cost of $20-50k / mo)
App platforms like lambda force you to rewrite your code so that it's hard to maintain and nearly impossible to test.
If you want to leverage your skills and have a predictable experience, KISS and run your services on a dedicated host.
These kinds of articles seem to come up on tech aggregators every so often. Someone comes up with a bunch of theoretical benefits of not using AWS along with some hand-wringing about complexity (which is relative to the person but is invoked as objective fact). They then offer alternatives. But they never interview anyone who actually subscribes to these theoretical benefits and operates at any scale. There's never a discussion on the needs of the products in mind, it's always blanket dogma that can be applied in any situation.
Like any choice in technology, there's always tradeoffs. One person/team's complex is another person/team's simple. Moreover there are circumstances that definitively tip the scale in one direction or the other. If egress bandwidth is important to your product, then you're better off going with an unmetered bandwidth solution. If you want to operate as lean as possible, you're better off in a colo. But discussing tradeoffs is a lot more complicated and less satisfying than saying "AWS bad" with a bit of corporate bashing so clickbait is what we get.
I worked at several startups before working at FAANG - none of them operated at any scale that they needed to be on AWS. None of them ever grew enough to need to be on AWS either.
One of them did a bunch of stuff that made use of AWS - but it didn't deliver any value to our customers, and probably made our product substantially worse. It cost $30k a day, and unquestionably was not worth the cost, and everyone on the team was fired after a few months.
I worked at a startup that didn't use the cloud and was hemorrhaging money by only using a handful of compute in a DC despite renting a whole bunch of space/power. They could have moved to AWS and saved a ton on cost, but their CTO was comfortable with renting colo space and knew how to run a colo-based tech company. There's not much you can do when people make bad decisions. At most you can understand the tradeoffs of the decision space and do your best. There's rarely a silver bullet.
A lot of companies are fine just using a handful of VPSes.
I think the biggest problem with starting on AWS is engineers are extremely susceptible to playing with new toys - for better or worse.
Engineers are going to try all of AWS's new stuff, which is fine. The problem stems from engineers wanting to do stuff on the newest shiniest AWS toy whether that actually makes sense financially or much more importantly in terms of the product roadmap.
> whether that actually makes sense financially or much more importantly in terms of the product roadmap.
If design docs don't at least mention the cost - and your business cares about cost - then you should probably have a chat with the engineering dept.
At a prior FAANG job we included basic cost analysis in our decisions. You don't always need the cheaper option if it doesn't make sense, but you should probably know the traffic that shiny new toy will receive and the cost associated with it (e.g. solution A needs 25% more compute at internal price X, vs. solution B, which needs 10% more compute but increases caching storage costs at X per GB for ~Y GB).
I recall a case where a coworker (a Senior SDE in Silicon Valley, so easily $25k a month in salary) spent a month reducing our $500 SQS bill to $350 because our director wanted to see teams making smart financial optimizations. Meanwhile, our EC2 bill was $50k a month. Not everything is financially smart to focus on fixing. As the classic computer science line goes: premature optimization is the root of all evil.
Yeah there's tons of options. There's the big clouds/cloud-native stuff. There's the cloud-VMs/VPSes. There's colos. There's making a DC at an office through DIA. Understanding these options is key to making a good, informed decision IMO.
FAANG doesn't need to use cloud either. Notably, Facebook doesn't (and, a bunch of stuff at Amazon was still not on cloud until _long_ after AWS was invented).
Facebook doesn't use the cloud? Are you crazy? They built their own cloud. They just don't sell it as a service.
Sure - not everything at these companies needs to be on the cloud. But when you're dealing with sensitive information, you don't want people rolling their own infrastructure.
At the end of 2021 Facebook announced they would begin to use more AWS services [0]. Now, I wouldn’t call it Facebook going full cloud and I definitely expect FB to continue to use on-prem for the majority of their compute/storage; but I think it’s worthwhile to note that even FB (which may be one of the largest private clouds) sees value in using cloud providers to supplement their on-prem solutions.
Reminds me of these YouTube videos filled with bad advice for newbies. For example, "Why I never code with ELSE statements, and YOU shouldn't either". It sounds profound to a newbie, but it's all just disingenuous dogmatic click bait.
Except...these are right. Minimizing nesting, and setting up validation failures as exit-early conditions from functions to have the main function body focused on the happy path are absolutely valid principles of clean code.
Obviously there are always exceptions, but as a principle you could do a lot worse.
FWIW I rarely use else statements and prefer early returns. Might be trauma from dealing with legacy code where you'd need an ultra wide monitor to get the end of the indentation.
Just to add here, even aside from these trade-offs...
The main alternative is: host the hardware yourself, as it might be enough.
Cool. How many people here are also great sysadmins? Probably a very small number. So that's not really an alternative. And furthermore, most of the promised counterargument never really materializes.
They mainly just pointed out that "yeah, devops is hard, and lots of things devops does look simple but aren't", but they didn't make the argument that AWS is some over-engineering you could avoid if you would only do XYZ and remember that you probably operate at small scale.
> How many people here are also great sysadmins? Probably a very small number.
I am. And there probably aren’t that many around here due to the extreme prejudice and derision Developers direct towards Sysadmins. We’re apparently a bunch of low-skill knuckle-dragging hardware monkeys, while at the same time able to do things so difficult that no developer can figure it out so they just go to the cloud instead “to avoid learning all that stuff”.
No, hosting the hardware yourself is not the main alternative.
Very few non-cloud users actually host the hardware themselves. You can rent dedicated servers or even VMs just about anywhere. The hosting company manages and maintains the hardware as part of the monthly price.
A lot of small sites can probably do great without a sysadmin or AWS. If you think about it, AWS can be a premature optimization when just running your site off a Raspberry Pi in your basement will do.
Heh, technically t2 instances are burstable CPU so you can't control when you'll be pushed off CPU time. With a RasPi you have full control of the compute. Of course, the t2 has faster transit to pretty much any IX, is in a DC with a UPS, has an SLA for staying up, etc. so from a networking perspective it's leaps and bounds better. Depends on if you need the compute or not.
Right, so if and when you determine you actually need more compute, it's trivial to bump the instance type. Not so much in the basement, seems pretty pointless.
If you mean re-attach your EBS storage to a bigger instance type, you can just remove the SD card and insert into a bigger computer (or clone it to a hard drive).
You don't need to manually manage EBS. Shutdown instance, change instance type, boot.
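For reference, the resize described here is a stop/modify/start cycle; a minimal boto3 sketch (the instance ID and target type are hypothetical, and this assumes an EBS-backed instance, since instance-store instances can't be stopped):

```python
import boto3

ec2 = boto3.client("ec2")
instance_id = "i-0123456789abcdef0"  # hypothetical instance ID

# Stop the instance, wait until it is fully stopped, change the type, start it again.
ec2.stop_instances(InstanceIds=[instance_id])
ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])
ec2.modify_instance_attribute(InstanceId=instance_id,
                              InstanceType={"Value": "t3.large"})
ec2.start_instances(InstanceIds=[instance_id])
```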
I guess you could buy a Raspberry Pi with more memory (but not more cores) and swap the SD card, but beyond that you are stuck. An aarch64 rootfs will not boot on commodity x86 hardware, and SD card performance and reliability are many times worse than EBS anyway.
Put nginx in front of a bunch of Pis and load balance them! Just clone the SD card and distribute to a bunch of Pis. I mean, you'll have to scale horizontally at some point even on EC2 and it's roughly the same complexity. If the value prop of EC2 is hardware abstraction, I don't think it offers that much over Raspberry Pis, unless you are going for higher performing hardware.
This feels disingenuous, like pushing a DIY narrative, similar to what the OP is doing. Whether you think it offers value or not, a Rasberry Pi at home is _far_ from a cloud VPS. A cloud VPS is behind a net connection with an uptime SLA, has redundant power, can have an SLA of its own, is not NATed, has a fast NIC, and usually is low latency to transit to because of presence in an IX. If you're equating the two, you don't understand why datacenters exist. You can _reject_ this value, but you'd need to be clear about what your expectations are out of then IMO. It's fair to make a comparison between a coloed host or a VPS in another cloud service (like DO or Hetzner) with AWS because it comes with most of the above things (just usually worse transit), but to deny that there's value in a datacenter feels like willful ignorance.
Just FYI residential ISPs are absolutely terrible. Even my ISP, a FTTH one, has uptime lower than 99%. This is aside from layers of CGNAT and your own home equipment's SLA. If you're willing to run a product on this kind of RasPi infrastructure feel free, but don't claim there isn't any value in a datacenter.
My point is that a lot of people don't need all those things. As the article points out, a managed dedicated server offers most of those benefits and the redundancy plus "easy scaling" of AWS is not even necessary if your goal is just to avoid debugging hardware issues.
It's easy to setup a home raspberry pi. It's easy to go one step up and setup a colocated raspberry pi. It's easy to setup a managed dedicated server. Auto scaling virtually provisioned hardware is great for a big company like Netflix with actual daily fluctuating demand. Most people don't need such advanced scaling features.
I previously had Sonic.net, which was one of the best fiber offerings for consumers available. 1 gbps up and down, unmetered, and was pretty much never down during the period I had it.
Maybe I don't want to learn Raspberry Pi to start my little SaaS idea. Searching and clicking around the AWS console to get some MVP up and running cannot be beaten that easily. Can I get a Raspberry Pi site for a few hundred dollars a month, my own effort included?
I'd argue, for an average software engineer, setting up a Raspberry pi is going to be easier than setting up an EC2 instance. It almost sounds like you've never actually done the "click in the AWS console to get some MVP" running before.
Sorry if this is too harsh, but if you can't figure out how to run an app on a Raspberry Pi then I definitely would't trust you to click around the AWS console.
Tradeoffs. If you can run your service on a residential internet SLA (or lack thereof) with residential CGNAT, and you have the ops chops to maintain Linux installs yourself, then do it. If you can run your business with a business-consumer SLA do it. AWS is much more reliable than that.
Realistic AWS non-cloud alternatives are either colo-ing in a DC, using another semi-cloud provider (OVH, Linode, Hetzner, et al.), or buying a DIA circuit for your office and running your own servers from there.
Well, you still have to patch and secure your OS on EC2. As far as EC2, AWS is really just taking over the hardware portion and making it somewhat easier to scale. But rapidly scaling is not something most apps do, even big ones.
If we're talking about serverless then I think that containerization (either running containers on bare metal, or on kubernetes) changes the value prop of serverless a lot, because if you've got a container environment you can easily just clone an off-the-shelf production-ready container to deploy your app.
As far as your host OS goes get a production ready image to run your production ready container images. It's a one time thing. Keeping the OS updated? This is not brain surgery every time there is an update. Plus you can configure it to automatically install security updates.
> Well, you still have to patch and secure your OS on EC2. As far as EC2, AWS is really just taking over the hardware portion and making it somewhat easier to scale. But rapidly scaling is not something most apps do, even big ones.
You're stuck on this theme that AWS is somehow akin to scaling. That's one benefit of AWS but it's not the only one. At small scale the margins on AWS are peanuts. A t4g micro is $6.15 / mo. The equivalent on Digital Ocean is $5 / mo. Buying your own Raspberry Pi 4 would be ~$70 with an enclosure/peripherals, so you'd break even after roughly 11 months of running your t4g (about 14 against the DO droplet). This isn't counting the power used (which would probably be minimal on a Raspberry Pi). That overhead is nothing.
> As far as your host OS goes get a production ready image to run your production ready container images. It's a one time thing. Keeping the OS updated? This is not brain surgery every time there is an update. Plus you can configure it to automatically install security updates.
There's more to it than that. You're only thinking about running software, not about how packets get from a user's machine to your running software. Most residential connections don't come with a stable/static IPv4. You can update a DNS entry with your changing IP, but then you're down for however long it takes you to change your A record plus however long your domain's TTLs take to expire. If you pay for a static IPv4, you've already paid more than what you're getting from a cloud VPS. Then there's the fact that residential ISPs block tons of ports, have no SLAs on uptime, can drop your traffic without warning or recourse, etc.
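For what it's worth, the dynamic-DNS workaround described above is only a few API calls if your zone happens to be in Route 53; a minimal sketch (the hosted zone ID and record name are hypothetical, and you still eat the TTL window mentioned above):

```python
import urllib.request
import boto3

# Look up the current public IP (ipify is just one example of such a service).
public_ip = urllib.request.urlopen("https://api.ipify.org").read().decode()

route53 = boto3.client("route53")
route53.change_resource_record_sets(
    HostedZoneId="Z0123456789ABCDEFGHIJ",   # hypothetical hosted zone ID
    ChangeBatch={
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "home.example.com",
                "Type": "A",
                "TTL": 60,                    # short TTL shrinks, but doesn't remove, the downtime window
                "ResourceRecords": [{"Value": public_ip}],
            },
        }]
    },
)
```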
If you're running a tiny, mostly-static site with minimal uptime requirements then you'll pay less and spend much less effort using a shared webhosting platform. They'll do all the ops for you and you get charged peanuts since these providers usually colo their own machines and run hundreds of sites on them. Dreamhost can serve a Wordpress site for $1.99 / mo with no ops work required. That pays for 35 months of running a Raspberry Pi.
A Raspberry pi is $35, for the latest model, and you definitely do not need an enclosure. You do need a power adapter, which may be around $10. But also, an older Raspberry pi can be even cheaper!
I'm not advocating for people hosting sites on a Raspberry pi in their homes, but it's certainly easy to do and if your operation is small enough and you have an extra computer lying around the cost is actually near $0.
Pretty much every ISP I've used in the US has had mostly static IPs. They usually didn't change unless the modem got rebooted. This is good enough for your minecraft server or unimportant personal website.
But of course if your residential internet is no longer serving your hosting needs, you can go a step up and get a virtual host or colocate your old computer! Yes colocation will cost a lot more than a virtual host on a PHP shared host, but you also get more computing bang for your buck.
Complex uncapped pricing is not a theoretical problem.
Cloud lock-in is not a theoretical problem.
Loss of control in case of downtime or a "ban" is not a theoretical problem.
Cloud reducing complexity is questionable and may swing the other way.
They are valid concerns, or at least warnings. You're obviously right that there are tradeoffs, but the thing I'm sensing in our industry is that the cloud is increasingly treated as the default, "thoughtless" infra model.
It's simply a predictable swing of the pendulum. 5-10 years ago the same formula would play out except then, it was AWS solving all your problems, and how hardware is so hard because apparently you need to hire 3 full time employees because machines run on sysadmin tears rather than electricity so of course you need someone constantly lording over your 10 servers.
>with some hand-wringing about complexity (which is relative to the person
Are you sure it isn't relative to other solutions? I mean sure, you can learn it well enough that it's not complex for you anymore but there is still some objective measure of its complexity that relates to how difficult it was to learn.
IMO a good example here is ops. I grew up in the era where net businesses started by renting colo space or getting a T1 line. When I learned about starting net projects, I learned how to set up all the things needed to get a project going in one of these settings. To me, running ops on a Linux or BSD box is simple; I've literally been doing it since I was a kid. These days it's a lot harder to find people with this expertise. To _me_, buying a DIA circuit and setting up a mini-DC for a business is simple. To most others who experienced Linux through Ubuntu or mostly grew up using MacOS, using cloud-VMs is probably a lot simpler.
To come from the other direction, not every app is or will ever need to be running at Google scale. Most of what AWS offers just triggers large enterprise warm fuzzy feelings. Even when they're totally not needed.
Does anyone know an adblock rule to filter out HN articles telling me to buy an on-prem server to replace AWS? Seriously though, why are valueless bait articles like this so popular here? Is it collective satire that I'm missing? Is "you don't need AWS" a trendy counterculture thing? I could swear I've clicked through to equivalent articles at least five times from HN.
> Is "you don't need AWS" a trendy counterculture thing?
It's this. There's a growing counterculture that likes to hate on Big Tech, and a lot of these folks do so by making disingenuous criticisms against Big Tech. These criticisms are usually low-effort and inaccurate but the authors know they'll get upvotes and spread their screed because of how popular it is to hate on big tech in these countercultures. It's a shame because there's certainly valid, deep criticisms to offer Big Tech but once it became counterculture-popular to hate on Big Tech then people just began taking potshots where they can.
Criticizing the cloud is a popular one because a lot of engineers genuinely don't know a lot about ops and don't know what a cloud is and isn't, they just use it because a senior engineer at their company made the (usually reasonable) decision to opt into a cloud. Alternatively they're junior engineers that have only written software on their machines and don't know what the difference is between running on a local Linux machine and running a net-connected service. That makes it fertile ground to make unsubstantiated claims because most engineers don't actually understand what's happening.
I avoid it because of unsafe billing practices. As someone higher up in the comments said, "don't buy stuff you can't afford". The problem there is that understanding AWS billing is an industry in itself. There are numerous stories of people who thought they understood something they didn't, and of software bugs resulting in bankruptcy.
I have to have billing predictability out of my service provider.
I'm not aware of AWS employment issues other than it being a place that burns you out, but the warehouse side has had plenty of newsworthy incidents.
I think AWS improves infra work. Using CDK is a joy - primarily because it's clean, simple, and reproducible once you have stuff up and running. But the process of getting stuff up and running is its own kind of drug, it's a serious puzzle (unless you really really know what you're doing) which I find really rewarding to solve. I'm a pig in mud with CDK.
Compare that with traditional infra work and I immediately start to worry about how to make sure that everything is provisioned correctly, that other people on the team aren't messing things up, how I'm going to implement zero-trust type auth without something akin to IAM, how I'm going to implement monitoring without the monitoring system becoming its own beast....
I find the provided arguments a bit baffling - you would be better off without AWS because you might forget to turn off $80k of instances? How is that their responsibility, or different from any other provider or service? Or, if you're using Lambda - well, you're not just accidentally developing a serverless solution, are you? You know the pros and cons and have chosen to do it, knowing how it's priced and that you need to be careful... If self-hosting is potentially a better option, I'm curious how that can be done in a simpler way, while providing all the security and benefits of the mentioned VPC, EKS, etc.
I'm not particularly fond of AWS, but I think they're pretty transparent about pricing of various services - it's linked everywhere and pretty visible, calculators are available, etc. I agree about not being able to set limits, though - but how many other vendors do it?
My biggest gripe is the UX of their Console and various services - I'd rate it 3/10 compared to what could be done in terms of design, displayed information and user paths / workflows.
> I find the provided arguments a bit baffling - you would be better off without AWS because you might forget to turn off $80k of instances?
Except that AWS (like other cloud providers) specifically and intentionally doesn't give me a way to limit charges.
This ... is .. a ... big ... deal.
I'm happy with my site going down if I hit $1000 in charges in a month or $100 in a day.
Maybe some cryptobros broke into my instance because I screwed up. Maybe HN just threw a zillion people at my project. Maybe I just flat-out screwed up and opened an uber-expensive EC2 instance. It shouldn't matter. I don't want more than $1000 in charges in a month without me specifically and personally authorizing it.
The fact that I cannot do this means that AWS (and others) have specifically deemed this to be a significant source of profits. Who am I to argue with them?
Also, things can get pretty complicated when it comes to pricing on AWS. I'm thinking of things like S3 where, at first, it seems simple: $x per GB per month. But then you have egress charges, and then operations charges, and then minimum storage time charges, and a half dozen other things that can affect the pricing.
IMO the author's attitude is exactly why we have so many regulations on things. Their argument is essentially "I didn't do my reading to learn the implications of doing this thing, and now it hurt me." It's why the concept of an "accredited investor" exists.
I think I see your point, which is that sometimes safeties/regulations are necessary to prevent harm from people that don't know what they're doing. However, I'd counter your firearm example with one of the world's most popular firearms, the Glock handgun, which only has safety mechanisms that help ensure that the firing pin strikes the primer from an intentional trigger pull. You can still not know what you're doing ("what does the trigger do?") and have an ND (negligent discharge).
To me AWS are the Apple of cloud computing. Expensive, reliable, opinionated, etc. I have been in very few situations where some AWS service was the ideal tool to solve a problem, but those times it just worked.
The author claims that another drawback of AWS is that your AWS credentials can get "stolen by bad people." What? This is a risk to literally every online service. You must practice opsec and good design patterns if you have API credentials for anything online. Your secrets should be locked down with least privileges, encrypted at rest and in transit, and rotated periodically. AWS makes this extremely easy to do.
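As a small illustration of that last point, pulling a credential from AWS Secrets Manager at runtime, rather than embedding it, is a one-call affair; a minimal boto3 sketch (the secret name is hypothetical, and rotation would be configured on the Secrets Manager side):

```python
import boto3

# Fetch a credential at runtime instead of baking it into code or config files.
# "prod/db/password" is a hypothetical secret name.
secrets = boto3.client("secretsmanager")
response = secrets.get_secret_value(SecretId="prod/db/password")
db_password = response["SecretString"]
```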
>A HN user reports a story of someone who burned 80 grand overnight by provisioning bunch of EC2 instances for testing and leaving them on.
So, ~$5000 per hour? Either a company cares about $5000 per hour, or it doesn't. If it doesn't, then it doesn't care about $120,000 per day either. If it does, then WTF were they doing running a cluster at $5,000 per hour?
And by definition, anyone who wants to launch $5,000/hour of EC2 instances isn't able to rock up to a colo or IT department and say "I need 1,000 96-core Intel boxes for a couple of hours."
Literally any other provider that can give you that service will require that you sign up to pay $5,000 per hour for however long you want to run them. Nobody is running a service where you can walk in and ask for $20 on pump number 3. You're giving them a credit card, and then you're asking for an instance limit increase.
The author thinks this is a damning indictment of AWS when a small amount of analysis shows the killer feature underneath. It's what people want: to spend cash, immediately, on compute.
So when I was working at a startup, and I asked our COO to request an instance increase (to 250 instances, backed by his credit card) I made damned sure that I didn't "leave the cluster on". And that cluster gave us 2,000 cores! We couldn't have afforded to buy 250 servers!
Finally, when this happened to a colleague at a different company, we reached out to AWS and they refunded the money. Because if a company allows a developer to launch a $5,000/hour cluster, then that company is already spending a lot on AWS, and will certainly spend a lot more in the future.
I learned recently that you can rent a 42U cabinet from Hurricane Electric for $400 a month. Granted, that _only_ includes 1Gbps internet, and you have to provide your own servers, but still. I was surprised at how cheap this was. You could probably fit enough compute in there to power most businesses.
The given alternatives include Linode/DO; I would suggest Amazon Lightsail instead. Then you have a clear on-ramp to AWS if you should grow to need it, and the pricing is on par with Linode/DO. The only real differentiator is egress cost once you exceed your included monthly allotment.
That, and AWS has a massive scale that is useful even if you have a single server. When I was on Linode, they got DDoSed and the whole service was inaccessible, even though the DDoS wasn't targeted at my server. They didn't have enough capacity at their POPs to grunt through the DDoS. You can't do that to AWS.
>Furthermore, AWS billing is uncapped, which has significant potential for trouble.
Two words: billing alarms. If you have stuff on AWS, and you don't have billing alarms for actual and predicted costs, you have nobody to blame but yourself for extra charges.
EDIT>> I see they address billing alarms but claim that they only fire after you've lost the money. This is simply not true with the predictive alarms. I get the sense that this author is not using AWS correctly in general.
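For reference, the actual-charges alarm is only a few lines of boto3; forecast-based alerts are configured separately, e.g. through AWS Budgets. A minimal sketch (the threshold and SNS topic ARN are hypothetical; billing metrics live in us-east-1 and must be enabled in the account's billing preferences first):

```python
import boto3

# Billing metrics are only published to us-east-1, and only after
# "Receive Billing Alerts" is enabled in the account's billing preferences.
cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
cloudwatch.put_metric_alarm(
    AlarmName="monthly-spend-over-100-usd",
    Namespace="AWS/Billing",
    MetricName="EstimatedCharges",
    Dimensions=[{"Name": "Currency", "Value": "USD"}],
    Statistic="Maximum",
    Period=21600,                       # evaluated on 6-hour windows
    EvaluationPeriods=1,
    Threshold=100.0,
    ComparisonOperator="GreaterThanThreshold",
    # Hypothetical SNS topic that fans out to email/pager.
    AlarmActions=["arn:aws:sns:us-east-1:111122223333:billing-alerts"],
)
```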
I'm a big advocate of AWS, but there are some unexpected billing items, e.g. moving data between AZs in the same region. Yes, it's all documented, and you may be able to work around many items, but you really need to get into the weeds to cut costs.
When I was first digging into AWS at my current company I was able to save us a ton of (recurring) money simply by properly deleting the EBS volumes of terminated machines... someone didn't check a checkbox at some point (and/or didn't understand its significance): oopsie.
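A minimal sketch of that kind of cleanup with boto3, listing volumes left in the "available" (unattached) state; the delete call is left commented out on purpose, so review before running anything like this against a real account:

```python
import boto3

# List EBS volumes not attached to any instance ("available" status),
# the usual leftover when DeleteOnTermination wasn't set on the instance.
ec2 = boto3.client("ec2")
paginator = ec2.get_paginator("describe_volumes")
for page in paginator.paginate(
        Filters=[{"Name": "status", "Values": ["available"]}]):
    for volume in page["Volumes"]:
        print(volume["VolumeId"], volume["Size"], "GiB", volume["CreateTime"])
        # ec2.delete_volume(VolumeId=volume["VolumeId"])  # uncomment after review
```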
It's hard to set alarms early on when you're building stuff out. In any case I think it really needs to be someone's role to understand the billing thoroughly, somewhat regularly. Our billing line items post daily to me on slack, mostly I'll just check the total vs the previous week, and cloudability sends email alarms for specific things and suggests RIs etc.
But if traffic is high in the beginning of the month does this mean you should be allocated more budget? How much? Such an alarm doesn't really give you any insight into where the additional costs might be, or where/whether it's worth spending engineering time optimizing!
If you don't have a budget, then that's what you should establish first. If you can't get a handle on your budget, then no set of tools and techniques is going to save you.
The point is that the budget changes all the time for anything other than a simple application stack. If you're constantly experimenting and building new applications, you basically need to quickly glance at your daily usage to make sure you won't get a surprise in 30 days.
I've never worked at a company where the cloud budget fluctuated in the way that you are describing, and we certainly didn't run "simple application stacks." Costs and usage patterns are predictable. If you want to do whatever you want, costs be damned, then I'd ask why you're even bothering to be alerted about costs in the first place.
Pricing isn't an actual concern when you are just getting started. Why would you be worried about vendor-lock if you haven't built anything? If you run your entire solution on a server that you manage, you are also on the hook for backups, recovery, and every other problem that AWS already solved.
You're right. My personal successes came from pecking away a few hours per week, keeping costs low (or free) and growing slowly over multiple years without the need to monetize early.
> Why would you be worried about vendor-lock if you haven't built anything?
When else would you worry about lock-in? The early stages of a project involve setting up the foundation. If you build your foundation on vendor-specific tech, you've just locked yourself in or signed yourself up for a very painful transition in the future
Worrying about lock-in after you've validated that the project will be successful enough to pay the costs is also an option. Basically, do you knowingly plan for a future cost that only happens if you are successful, or do you put in extra work now that may end up being wasted work if the project fails?
"One option is to purchase a hardware server and keep it on premises" - really? I expected something intelligent in the article, but this option looks too dumb.
I know you're joking but a server under a desk is a fine alternative to a lot of compute-heavy background tasks where you can tolerate downtime without any user-facing effects.
IMO The best thing about AWS: The deprecation cycle. It's either never or some time in the distant future (and then pushed back multiple times)... mostly it's never.
You probably don't need that SUV/truck and are better off without it.
You probably don't need that luxury car and are better off without it.
Then go argue about the headache and the price, and when you're done, your audience will just go back to their favorite toys. AWS it is. I will never advise anyone against AWS unless they are broke and it will bleed them dry. If you can afford it and have the skillset to run it, go for it. If you also have the skillset to go cheaper, go for it.
I agree with the second half (devops complexity) but not the first half (surprise billing). And I only agree with the devops-complexity part if you're running a small-to-medium scale operation (i.e. you don't have 2+ devops engineers).
It's not that hard to avoid the infamous "AWS surprise". Just pay attention to the costs of things, and check the billing page regularly until you're comfortable with how your infrastructure is affecting the cost.
Devops complexity is a trickier topic. If you're running a startup or a small tech team, AWS can quickly mire you in delays as you become the AWS expert (or delegate one of your engineers to become the AWS expert). It's a lot easier to just use DigitalOcean or something.
That said, AWS/GCP/Azure scale in a way that other providers don't. Not in terms of technical scalability, but in terms of organizational scalability. Once you hire 2 or more devops engineers, it's likely that their expertise in AWS will pay dividends, and hamstringing them with something that isn't industry-standard is going to frustrate them and you.
"You probably don't need to pay your power company" could be a similar headline. It's true, we could all deploy generators to power our homes(and handle fuel logistics etc), or go fully offgrid with solar panels and batteries. Why don't we do that?
Let's see - there's the equipment cost, then there's the 'installation' and 'maintenance' costs, often performed by contractors. This often makes the return on investment not worth it. It would make even less sense if we had to constantly tinker with the power solution (or keep someone on call to do that, on our own payroll).
Let's say we have an aluminum plant in the middle of a desert somewhere. It might make sense to operate a power plant tailored to our own requirements. Or pay a company to build and operate one, as it's their expertise.
Somehow, these calculations look different whenever executives try to pitch their "on-prem" or "colo" solutions. I've yet to see a spreadsheet where their own staff costs are called out. Or the potential costs because someone has to rack and stack new capacity, rather than that being automatically handled.
AWS is the 'power company'. They take care of stuff so I don't have to. If I'm pulling too much power, that's on me. Linode would be a company specialized in generator rentals. Some logistics taken care of, I still have to worry about a bunch of stuff. It might be what my shed in Alaska needs.
Now, a discussion can be had on AWS pricing structure (cough network egress cough). I would expect to pay a premium. Sometimes they are reasonable, sometimes they aren't (Hi, NAT GW).
Plenty of companies don't factor staff costs into anything. In my first job out of university, absolutely crazy requirements for tiny things were accepted because it would just mean the project would take a few weeks longer.
As this seems to be a repeating class of posts, the reality generally boils down to the same thing every time:
- use requirements to find out what you need, not buzzwords
- comparing services with significantly different properties while ignoring those properties isn't helpful
- playing datacenter in a virtual environment isn't the same as cloud architecture
If you just need a virtual machine to do some stuff and loss of availability, reduced durability and lack of redundancy doesn't really matter, then you can get that pretty much anywhere. The same goes for DNS, object storage and SQL-based storage (ignoring database features). As soon as your needs for availability, durability and redundancy increase, your choices of service providers decrease because not every service provider provides the same servers at the same level. Integration of services is a whole different game as well and equally has fewer providers as the level of integration increases. Also: don't conflate integration with lock-in.
OK, if you simply use EC2, S3, and maybe a single RDS and/or ElastiCache instance - OK, maybe I agree there.
But the real magic comes when you learn to architect entire applications out of their pre-built patterns.
Random examples off the top of my head: fan out messages to a number of Lambdas when an SNS topic receives a message; process and back up a Kinesis event stream with Lambdas and S3. There are many, many cross-product integrations that make development about as simple as connecting the two ends of a (data) hose. In these sorts of cases you can trade $$ for velocity pretty easily. Yes, it will cost you a bit more to operate, but far less than the humans you save.
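As a rough illustration of the second example, a Kinesis-triggered Lambda that archives incoming records to S3 can be this small (the bucket name and key scheme are hypothetical, and the event source mapping from the stream to the function is assumed to be configured elsewhere):

```python
import base64
import json
import os
import boto3

s3 = boto3.client("s3")
BUCKET = os.environ.get("BACKUP_BUCKET", "my-stream-backup")  # hypothetical bucket

def handler(event, context):
    """Lambda handler wired to a Kinesis stream: decode each record and
    archive the raw payloads to S3 as one object per invocation."""
    payloads = [
        base64.b64decode(record["kinesis"]["data"]).decode("utf-8")
        for record in event["Records"]
    ]
    # Use the first record's sequence number as a simple, unique-enough key.
    key = f"backups/{event['Records'][0]['kinesis']['sequenceNumber']}.json"
    s3.put_object(Bucket=BUCKET, Key=key, Body=json.dumps(payloads).encode("utf-8"))
    return {"records_backed_up": len(payloads)}
```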
Also on the argument about unexpected traffic leading to unexpected bills. AWS is most applicable for businesses that have a strong correlation between traffic and revenue. If you make a dollar per million, then it's fine when you wake up to elasticity unexpectedly handling a trillion. Not so much if you don't have an actual business.
Layers of testing. You're right that unit tests wont cover the entire system though.
Typically the most important customer features will also have other suites that test end to end (an API call goes in, eventually an S3 file is generated, or something like that).
Umm ... No. This blanket statement is certifiably false; it only applies to a company running maybe 1 or 10, or at most 100, physical machines. My work needs worldwide presence, networking, fault tolerance, and a boatload of storage and compute. I can't shift all of that to on-prem or a cheaper, alternative provider.
Although you make some points, it looks like you've never owned or built an environment. At one point we had around 11 data centers, and it was a freaking nightmare to build, run, and maintain them. Nothing wrong with doing it, but there is a hidden cost to it.
Leaving cost aside, once we migrated to AWS we could focus on our business more than the environment.
I don’t know about that. AWS makes me a hell of a lot of money selling solutions to people who don’t need it and could solve the same problem cheaper elsewhere.
I can’t say I fully agree with the article but I do think that someone needs to close the gap between the hyperscalers (AWS firmly in the lead in that group) and smaller infrastructure providers like OVH, Hetzner, etc.
I think you could restate this as “smaller providers need to step up their game” as well.
As far as closing the gap, I’m working on nimbus web services [0] and that’s exactly what it’s for — I want to build the middle layer (hopefully a bit less chaotic than AWS’s) so that people on smaller clouds have access to some scale/abstractions.
It's like running Oracle RDBMS. You probably don't need it. But if you do need it, you cannot hope to build it yourself. You need to know what you need.
Almost everyone who really does need it (and is not just ridiculously inefficient) eventually moves off of AWS, including most large startups. Revenue makes it possible to hire and build whatever you want.
Dropbox is not a relevant example to the original comment IMO - they clearly need cheap storage above all else; it should be no surprise that someone can self-host something so single-minded as a storage application without all the cloud provider baubles.
Dropbox makes sense. They're an egress-heavy business. Other than media companies (Netflix, Disney Plus, et al.) there aren't that many egress-heavy businesses, so AWS continues to make sense.
I thought dropbox moved their 30PB+ data lake ONTO aws to get off of Hadoop or something because trying to do this on-prem, even with tons of tech talent and money, was not working.
They complained about onprem requiring 3 YEAR forecasts for capacity planning given their scale.
Here is what they said in 2020 about the benefits of AWS:
- Hosts 40 PB of analytics data and supports 1 PB of data growth a month
- Optimizes costs by moving cold data to Amazon S3 Glacier Deep Archive
- Uses Amazon EC2 Spot Instances for 15–50% of compute capacity
- Doubles compute footprint using Amazon EC2 Spot Instances
- Enables the testing of new technologies without damaging data or affecting users
- Improved performance by six times for some job types
- Deletes hundreds of files in a few seconds compared to 30–40 minutes
- Runs more than 100,000 analytics jobs and tens of thousands of one-time jobs daily
That's definitely not what we did at $TRENDCORP. We still use AWS all the time. The cost of netops dwarfs whatever discounts we'd get on colo costs. We're not an egress heavy business so we aren't eating bandwidth costs either. Do you have examples demonstrating your point?
Extremely incorrect. The benefits of cloud grow as the company grows. I've been at a few mega companies that moved from colo infra to cloud infra; it was a huge cost savings and helped enormously with product development speed.
My medium-sized company saved a considerable amount of money by moving to Azure from a managed dedicated server company. I wouldn't call the previous situation "a complete clusterfuck catastrophe" by any means, although I would say we were paying too much. Our Azure bill is just starting to approach what we were paying five years ago to the previous company, with far more things going on now plus all of the flexibility that The Cloud™ has to offer.
...some examples please? I literally can't even think of one, but can absolutely think of numerous companies that have moved to AWS (and other public clouds) and remained there.
Most companies I know that were using Oracle didn't actually need it, they just didn't trust open source yet. One actually paid me to move their stuff from Postgres to Oracle "because we need enterprise support." This was in the early to mid 2000's.
Does anyone actually need it when software like MySQL and PostgreSQL exist? Mature RDBMS already exists that you don't have to build yourself. Does Oracle have something special?
They are probably talking about building the cluster yourself, not the RDBMS itself. aws (and most other cloud providers) have this cluster management handled for you.
AWS billing has been the main concern in a few projects I've worked on. I took some on-prem APIs and moved them to the cloud (Lambdas, API Gateway, Route 53, and such) from an on-prem, LAMP-like, single-server application. No matter how much I demoed the advantages around security, high availability, and scalability, it wasn't good enough. I was always asked "How much would this cost us?" and everything else came to a halt.
Depends a lot on your situation. AWS is a way to trade money for engineering time. Sure, you can run your own OpenSearch/PostgreSQL/Kubernetes cluster “cheaper” than their managed offerings.
But that’s the keyword: _managed_. If you use their (or another cloud provider’s) managed services, you can just turn it on and expect it to be there, available and running all the time (or at least to however many 9s they promise).
You don’t have to worry about configuration, management and maintenance on the whole infrastructure, software upgrades, backups, etc. They take a bunch of worries off the table.
Sure, they don’t do it for free. But if you (like many companies these days) lack manpower in the engineering department, using managed services can be a way to use the people you have more efficiently.
So yeah, approach AWS with caution, and be sure you don’t set things up in ways where your costs could suddenly explode. Do the math vs. DIY infrastructure, but don’t forget that your own engineering time isn’t free.
Right now I'm very in bed with AWS Lambda, but I'm working towards the point where all my tooling can also deploy to something like Firecracker VMs, so I have the peace of mind of knowing I could roll my own cloud with the same tooling.
If you're small enough of an operation to get by with one physical server, then yeah you probably don't need AWS. If anything, use DigitalOcean. I agree on that aspect. Anything larger than that though, the value of AWS starts to grow.
The tide has totally changed. I saved a massive amount of money moving from colo to aws ~10 years ago.
I think there's a mentality about it. When you're somewhat resource-constrained just due to the nature of having to get hardware, it makes you do some upfront design. This magic capacity lets teams punt on the problem.
Like, "We saved a day not setting up nginx, and now we pay for every image we deliver," is not the same as "we saved $100,000 this quarter because we don't need a full-time sysop team." It's worth taking a beat to consider things at these margins.
I hope I can scale my company to the point we need to use aws or equivalent offerings. Currently just using VMs, managed database, and services like firebase.
I use Digital Ocean, GCP and AWS. I think each has a little bit of a different use case and as a dev we should be up for each of them. I can run a k8s cluster on digital ocean for like $20-30 a month and spin up multiple services for experiments.
However, there is not a great S3, Cloudfront equivalent on DO so I need AWS.
For ML/AI stuff, you need Google. There are probably a ton of these little variations.
(I've been pondering using it for file storage for apps that I want to design to be relatively easily shiftable to AWS later if they hit a scale such that they need it, so I'm genuinely curious what you think)
> Is it actually good as default choice to host a software system of any scale?
I think so - or its competitor, Google Cloud Platform. Anything beyond a simple software system that you might deploy on heroku up to the complexity of something that requires on-prem infra (e.g. Netflix's content servers, although they're also a big AWS customer) would be a good candidate for AWS.
If you want to easily hire people/ get experience for yourself, just use AWS.
AWS is practically an industry standard tool. Of course you might be able to find a slightly cheaper way to do things, but training people will be harder.
The only time AWS doesn't make sense is when you hit the scale where it's no longer cost effective.
I, a single developer, have an automatic CI/CD pipeline running from Github to ECR/ECS that deploys multiple high-availability services to serverless Fargate tasks with no downtime, as well as numerous executables running on cron jobs that pull the latest container and each run their own command on a separate instance of a serverless task. These are all automatically torn down when they complete, and I'm billed for the seconds or minutes they take to execute. Total bill is about $50/month.
I set this all up with Terraform, and never touch the AWS UI. The code is 99.9% ignorant of the fact that it runs on AWS.
I used to self host, and then used Media Temple, and then used DigitalOcean. Maybe I'm ignorant of the latest offerings from those providers but creating, let alone maintaining, something like the above without a cloud provider like AWS or GCP would be a significant overhead.
The downsides the author points out are unpredictable costs (a real risk, but manageable) and vendor lock-in (which is low if you code your services against abstract interfaces that hide the underlying vendors).
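A minimal sketch of what such an abstract interface can look like in Python (class and bucket names are hypothetical; migrating vendors then means adding another implementation of the same interface rather than touching callers):

```python
from typing import Protocol
import boto3

class BlobStore(Protocol):
    """What application code depends on: put/get bytes by key, nothing vendor-specific."""
    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...

class S3Store:
    """S3-backed implementation; the only place boto3 appears."""
    def __init__(self, bucket: str):
        self._s3 = boto3.client("s3")
        self._bucket = bucket

    def put(self, key: str, data: bytes) -> None:
        self._s3.put_object(Bucket=self._bucket, Key=key, Body=data)

    def get(self, key: str) -> bytes:
        return self._s3.get_object(Bucket=self._bucket, Key=key)["Body"].read()

class LocalStore:
    """Filesystem implementation, e.g. for tests or a non-cloud deployment."""
    def __init__(self, root: str):
        self._root = root

    def put(self, key: str, data: bytes) -> None:
        with open(f"{self._root}/{key}", "wb") as f:
            f.write(data)

    def get(self, key: str) -> bytes:
        with open(f"{self._root}/{key}", "rb") as f:
            return f.read()
```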
In a few years there will be a booming consulting business built up around making legacy updates, modernizing, or getting off AWS. The same goes for GCP and Azure.
These articles are pointless, they completely miss any nuance to you and your solution.
Who are you? A hobbyist developer? A small team of 3 devs? A large company with multiple teams of devs dbas sysops?
What's your use case? An occasional Cron job? A small WordPress site? A distributed HA app that manages a huge ingress with strict uptime SLAs?
Where do you want to spend your effort? Happy to apt-get a few packages and call it a day? Need to create and maintain a load balanced postgres cluster with low lag and cross region replication and backups?
There are a lot of considerations that go into choosing a provider. Making blanket statements like this just feel click baity.
Also, AWS violates the GDPR because, as an American company, it must obey the CLOUD Act. But most EU companies simply ignore this issue, hoping that nobody will notice. It's just a matter of time until this explodes into several scandals.