That's a whole lot of text for "I had a 14 GiB VM image publicly linked and people discovered it", and none of it has much to do with AWS or Cloudflare.
Presumably the author would do much better with a VM or something from OVH; they'll just shut you off or limit you before it becomes a problem (not that they would care about 30 TiB).
I would say the main story here is that AWS overcharges on traffic to such a ridiculous degree. My SaaS regularly consumes 50+ TB in traffic, but I pay €150 in monthly fees for that.
For someone coming from my perspective, that would be a huge and unpleasant surprise to be billed $2600 for a service that I know costs $180 elsewhere.
Edit: especially so because people keep repeating the mantra that one should use the cloud to save money by not paying for what you don't use. Obviously, my 3 Xeon servers with 64 GB RAM each are way overpowered for serving a few GB of static files, but I wanted to have a bit of redundancy. But with my setup, there should be plenty of obvious inefficiencies for "the cloud" to eliminate.
=> It feels like cloud should be cheaper than my dedicated servers. But it's not, and that is the unpleasant surprise.
IMO, it's not so much that they overcharge as that the service was created and priced for a different purpose. It's possible to serve huge files to tens of thousands of people, but it's not really what it's meant for. Naturally, the pricing isn't really optimized for that, and a service that's designed for that can be an order of magnitude cheaper.
Seems to me S3 is more designed for highly reliable data storage. If you wanna store a bunch of data in a way that's highly reliable and resilient to hardware failures and also not have to worry about managing RAIDs and clusters of servers for the data size, S3 is just the thing, and probably priced pretty fairly.
It's also a capable and flexible service. It's possible to use it for things it isn't really designed for and have it behave pretty well. Well enough that you can mostly ignore that the other thing isn't really what it's designed around. The "mostly" is important, though. If you start hitting any extremes while using it in a different way, then you certainly can run into some pathological cases in the pricing structure and pay way too much for something that would have been cheaper in a service dedicated to that.
S3 storage itself is much cheaper than the egress traffic. So this isn't really about S3: you'd be paying the same price for that traffic if you were serving it from EC2 instead.
Looking purely at the traffic price, I believe the word choice "overcharging" is warranted if you can buy the exact same product elsewhere for 90% less.
If serving static files to thousands or millions of users is not what CloudFront is for, then I'm not sure what else a CDN would be used for.
You don't need to manage any storage to get your bill to drop: S3 can be fronted by your own caching proxy (MinIO or just plain nginx works easily), which is trivial to set up on OVH, Hetzner, or any other VPS provider.
You get all the benefits of S3 without the egress bill of a major cloud; a minimal sketch is below.
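A minimal sketch of the nginx variant, assuming a hypothetical public bucket as the origin; the cache sizes and TTLs here are made up, so treat it as a starting point rather than a hardened config:

# goes in the http {} block of nginx.conf
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=s3cache:10m
                 max_size=50g inactive=7d use_temp_path=off;
server {
    listen 80;
    location / {
        # each object is fetched from S3 once, then served from local disk
        proxy_pass https://my-bucket.s3.amazonaws.com;
        proxy_set_header Host my-bucket.s3.amazonaws.com;
        proxy_ssl_server_name on;
        proxy_cache s3cache;
        proxy_cache_key $uri;
        proxy_cache_valid 200 7d;
        # answer client Range requests from the single cached copy
        proxy_force_ranges on;
    }
}

This way you pay S3 egress roughly once per object instead of once per download, and the VPS's flat-rate bandwidth absorbs the rest.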
“Things I should’ve known but didn’t.” Did you know that “The maximum file size Cloudflare’s CDN caches is 512MB for Free, Pro, and Business customers and 5GB for Enterprise customers.” That’s right, Cloudflare saw requests for a 13.7 GB file and sent them straight to origin every time BY DESIGN. Ouch!
It's another case of misunderstanding and misusing a tool/service, in my opinion. Cloudflare (and CloudFront) are intended to be website CDNs. Their use case is _not_ serving 13.7GB objects - and so they didn't. (Whether "passing those requests straight back to the origin every time" is a better/more customer-friendly thing than "blocking those requests since they're not being cached due to an object size rule" is a good question...)
If you want "inexpensive large file hosting/serving", you don't use a combinations of "a service with 11 nines of durability" and "a web-infrastructure and website-security company, providing content-delivery-network". As you rightly point out, duct taping this together on VPS providers competing on low-cost bandwidth is the "right thing" for this use case. The gold-plating of 11 nines of durability and a CDN with over 200 globally distributed POPs almost (perhaps "should have"?) cost this guy $2.5k for "taking the easy way out without thinking through the consequences".
In my head, it's like he had a Tesla Model S in the driveway, needed to transport 6 tons of lead bricks across town, thought "I know, I'll just load them up on the back seat and drive 50 miles!", and then was surprised the car needed expensive repairs afterwards.
The file size isn't really the problem; if he had 30TB of consumption he would still be charged that much.
Bandwidth costs are very high on the cloud, 5-6x higher than outside it, and I don't think they provide any value that justifies that for the vast majority of users.
I am pretty skeptical that anyone at all in the world needs 11 nines of durability. You are hundreds of times more likely to be hit by a meteor strike.
All the major service providers have gone down recently; Cloudflare went down just this month. Practically speaking, they are delivering 4-5 nines at best (loss of one PoP or region is still downtime).
A CDN is exactly the kind of service that needs to be significantly cheaper at scale: if you really need 200 PoPs, then you have tens of thousands of users at minimum, and consumption to match.
If the industry's position is that they only want to (or can only) serve enterprise customers for whom current pricing is within budget, then their online pricing models and marketing are incredibly misleading about who they are targeting.
To me it looks like their business model depends on predatorily pricing prosumers by hooking them with ease of use.
This is very much like the credit card industry. If everyone paid on time, CC providers would never make money; predatory loan pricing at 36% or higher is their real revenue source.
Paying on time / reading the ToS only lets them off the hook legally; morally, they are both preying on vulnerable users.
Yup, that's exactly what I was saying. Cloudflare is designed for website resources, mostly smallish images and bits of JS and CSS. They would probably regard 5MB as a rather large resource. I'm not gonna blame them for having no idea what to do when somebody tries to run a 13.7GB download through them.
They charge so much for it as a disincentive to use it for CDN purposes. It is not architected to act as a CDN, and if they didn't overcharge for egress, people would use it as a CDN because it is easy.
It's dubious that they don't want you using S3 for one of the advertised use-cases but even if we accept that, the logic wouldn't apply to EC2 since there's even less of a chance of that impacting other users. Building your own CDN on EC2 isn't going to produce a different workload than many other activities which are actively encouraged.
If you're using it properly you can even boot VMs off of it, in the same region (S3 data transfer in region to EC2 is free) like this article is talking about. Just don't enable public read on your bucket to avoid a surprise bill.
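For that last point, a hedged one-liner (the bucket name is a placeholder):

aws s3api put-public-access-block --bucket my-bucket \
  --public-access-block-configuration \
  BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true
# blocks both ACL- and policy-based public reads; EC2 instances in the same
# region can still read the bucket via IAM credentials, and that transfer stays free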
Always thought a great startup idea would be to work with dedicated server providers and offer AWS-like services on top of those cheap servers.
The most common services - S3, Kafka, RDS (Postgres and MySQL), Redis - should be enough to cover most use cases.
With k8s and a dedicated, smart team, it would be a feasible venture.
Moreover, it could also work great on-premises. Plenty of established medium-sized businesses have their own physical infrastructure and are not yet ready to move to the cloud.
> Always thought a great startup idea would be to work with dedicated server providers and offer AWS-like services on top of those cheap servers.
The problem is the business model: managing all these common services is a big deal and takes serious engineering effort to keep online. You then fairly quickly have to charge relatively high prices to make this cost-efficient.
The cloud providers, however, have a trick: they can eat these costs and make serious $$$ on the actual compute, storage and bandwidth.
If you're trying to compete with AWS et al. on price for bandwidth etc. "but with common services such as S3", you're gonna have a really hard time convincing a customer your expensive managed Kafka is reasonably priced. And then you'll just find yourself competing on plain VPS servers...
In addition to my own products, I offer consulting to medium sized companies and my sales pitch is usually:
How much additional profit would it generate if I could cut your cloud costs by 20%?
Usually, the savings are even higher, but 20% is low enough to be easily believable, yet large enough for them to invite me to learn more.
The reason I do it manually is that many companies are afraid of the workflow interruptions usually associated with moving off the cloud. That's why I first analyze their deployment to calculate possible cost savings and then I offer them to hire me to take care of the migration. I'm pretty sure those companies would not feel comfortable switching to a different cloud by themselves.
For S3, it's not about what most customers think they're buying; what Amazon is actually selling is a storage service with 11 nines of durability:
It's "Simple Storage Service", and it does that one thing (storing practically unlimited amounts of objects) spectacularly well. Like 11 nines well. It doesn't do "inexpensive egress charges" well. Amazon never claimed that though.
I think a lot of people don't care about those 11 nines, and would love to buy something with perhaps only 5 or 4 nines of durability ("Maybe I'll need to upload it again once every few years, that's cool!") but which optimises for cheap bandwidth costs instead. Right now, I think that thing is an inexpensive VPS with unlimited traffic (Hetzner were offering "unlimited traffic at 1GB/sec for the first 20TB, then throttled to 100MB/sec" for under 10EUR/month last time I went looking. And they're well above "the bottom of the barrel" pricing in cheap VPS land.)
I don't know about the US, but in Europe a ton of these smaller, local parties exist (usually called something like "managed infrastructure" or thereabouts).
Some of those companies are even more eyewateringly expensive than the story in the article.
I worked on a project a few years back where the client were paying $10k/month as a "managed service" fee, on top of about $4k/month worth of platform at full on-demand prices. I showed the client how they could have it all running on reserved instances for Prod and spot instances for dev/staging for under $2k/month - but no, somebody had signed up for $14k+ per month just to have someone to blame/shout at 24x7 if something went wrong. (And that company was ~85% likely to call me and blame it on the app before they even bothered looking to see if the platform was working...)
This is part of AWS's business strategy. Overcharge to an absurd extent (90% margins) and then if a surprise bill happens you can just refund it for 10% of the cost. This cost is a rounding error if it allows you to maintain your margins. It's worth it even if only a single person is willing to pay the regular rate.
Not the OP, but it could be Hetzner. They're in that price range, with that kind of hardware. I myself have been a happy customer there for almost 20 years now.
OP here: yes, I use the Hetzner marketplace for used dedicated servers. I have found that using hardware that's 6+ months old actually decreases the failure rate. Plus it's cheaper, and they waive the setup fee.
I've gotten servers from Hivelocity in the past; they like(d) to offer servers with the "enterprise" equivalent of these CPUs. I'm not sure if Intel still sells them, but there used to be a "Xeon i3", basically: usually 4-core CPUs clocked to the moon that supported ECC. They were really good!
Of course, AWS's actual cost for data egress is nowhere near what they charge for it.
But it's totally unfair to say they're "overcharging". The pricing is set up to encourage using the service properly. For example, did you know you get unlimited free (very fast) transfer between s3 <-> ec2?
You are free to use S3 however you want, but the pricing is set up such that people use a proper cdn backed by s3, do as much as they can between ec2 <-> s3, etc. instead of making s3 the backbone of their public site.
If you want to use it in a way it's not intended it will cost you more $$ which is how it should be.
>But it's totally unfair to say they're "overcharging". The pricing is set up to encourage using the service properly.
That's a very charitable take. A more cynical take is that they do it to encourage lock-in. If you already have your data stored on s3, they can get away with overcharging on compute because they know that if you switched to GCP, you'll end up paying more because of the egress costs.
>but the pricing is set up such that people use a proper cdn backed by s3, do as much as they can between ec2 <-> s3, etc. instead of making s3 the backbone of their public site.
I'm not sure what your point here is. CloudFront costs 8.5 cents per GB of egress, while EC2 costs 9 cents per GB. Half a cent doesn't make the egress charges any less outrageous, and that's not even including CloudFront's per-request charges, which can push it above 9c/GB depending on your usage profile.
No one prices to "encourage proper use." They price for a number of variables. And I do agree that arguing they're overcharging is unfair; however, SU-pricing standards include triggers for excessive use as part of the customer guarantee. If Amazon has no problem automating the product, they can automate some billing features, like notifications and bandwidth restrictions when a customer is nearing excessive use based on the population use of a particular service.
Amazon's systems should have notified at least twice and required direct confirmation from the customer for their continued use.
We don't just make business decisions like this based on self-interest. In this case, a failure to be resourceful on Amazon's part gave the customer one of two options: track everything manually minute-by-minute, or be surprised by an excessive bill.
AWS does have budget notifications that you can set up with a few clicks. They don't come by default because AWS doesn't know what the right amount for those notifications is. I already get enough emails from AWS for every account that I make.
I can actually think of possible/real workloads that would burn $2,700 out of the blue in two days, and those customers would not be happy if AWS blocked their account because AWS thought they did something wrong.
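For anyone who'd rather script it than click, this is roughly what those few clicks translate to in the CLI (a sketch; the account id, amount, and email are placeholders):

aws budgets create-budget --account-id 123456789012 \
  --budget '{"BudgetName":"monthly-cap","BudgetLimit":{"Amount":"25","Unit":"USD"},"TimeUnit":"MONTHLY","BudgetType":"COST"}' \
  --notifications-with-subscribers '[{"Notification":{"NotificationType":"ACTUAL","ComparisonOperator":"GREATER_THAN","Threshold":80,"ThresholdType":"PERCENTAGE"},"Subscribers":[{"SubscriptionType":"EMAIL","Address":"you@example.com"}]}]'
# emails at 80% of a $25/month actual-cost budget; note it alerts, it doesn't stop anything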
AWS Support quite often happily refunds these amounts and you don't need to post to Twitter. I've seen them refund a lot more.
Why haven't they done something to prevent these mistakes? They already refund these things; it's just that those refunds are a drop in the bucket compared to the services that enterprise customers are paying for, and AWS engineers are busy building for them.
AWS gives you a lot of power compared to what you can do on your average hosting provider, but sadly there's also a lot of room to shoot yourself in the foot if you don't know what you are doing.
This is just making excuses for a business model built around taking advantage of small mistakes. If they actually cared they'd allow you to set billing circuit breakers on setup.
It's difficult enough and error-prone. What customers want is a bulletproof way not to exceed a defined spending limit. We've been asking for it for many years, and the answer has been "Yes, we hear you, we are working on it", and then nothing ever happens, because this is a strategic business decision that would affect their bottom line. Why should they do it if they don't have to?
Network throughput is a finite resource. And spikes from noisy neighbors could absolutely degrade throughput performance for everyone else.
So it makes perfect sense for me that part of Amazon's pricing calculus would be based not just on the cost to deliver the feature but the amount of utilization of the overall network they expect for the use cases their pricing supports.
I thought the same. This guy isn't even doing business via his site, as far as I can tell. How do you end up with such a monstrosity of services instead of renting a server for a couple of bucks that comes with a decent amount of traffic per month and just caps or throttles your link if you go over it?
But I guess simply running Debian on some server and keeping that up to date just isn't cool anymore. Everyone running a blog about their dog needs Cloudflare, S3, heroku, micro services and docker. Obviously with no limit on what kind of bill they will generate should something go wrong. You can just vent on twitter and generate enough attention that the vendor will make an exception for you and pay you back to limit the negative publicity. Who wouldn't prefer that to the tedious work that is maintaining a Linux install on a server?
I mean, the “monstrosity of services” is a bucket holding files and a CDN. It's 2 services. I do this for my site (which gets no traffic and hasn't been updated in years) and it's quite nice to not have to admin a server.
Now that said it looks like his job is in cloud evangelism too so I’m sure a large part is that he wants to maintain a personal account to learn/hone skills. I’d recommend anyone do that, getting a look at aws without whatever your company is doing on top of it is pretty awesome for learning. Just don’t host massive S3 images and shoot yourself in the foot ;-)
What do you guys do when admining a server to make it so time-consuming/bothersome?
If you just host static content you don't even need to care about security updates. Just deploy a new instance, "apt install nginx", done, forget about it. I would say less work than setting it all up on AWS.
When was the last time there was a remote exploit in nginx serving static files, or in the Linux kernel? You don't even need to worry about security updates in this scenario.
> I’m sure a large part is that he wants to maintain a personal account to learn/hone skills.
Yeah, he seems to be responsible for OpenShift (marketing?) at Red Hat and he only learnt how expensive AWS is when he got a bill? Bit embarassing.
Not making any arguments about the author, but there is a bit more work to a static site than just apt install nginx. You need to make sure your firewall is set up correctly, users are added and keys managed, ssh config has the correct settings, services start automatically, and certificates are handled. You either need to deal with certs on your own or get a CDN. Either way, more work than apt install.
apt install nginx will set up the users for you, and nginx starts automatically on boot; the default ssh config is perfectly fine, and keys are set up automatically when you start the instance - any cloud provider/VPS service/even kvm with 'uvtools' does this for you (and even if not, it's one ssh-copy-id command).
But yes, you need to run certbot as well to set up certificates.
so it's:
apt install nginx
apt install certbot python3-certbot-nginx
certbot --nginx -d your.domain -d www.your.domain
# again, when you apt install certbot it installs the cron/systemd timer for automatic certificate renewal and nginx reload; the --nginx plugin also wires the certificate into your nginx config
Still less work than going the CDN route I would say.
It really seems to me that you don't trust default Linux settings and feel you need to take care of and fine-tune everything, yet you trust CDN providers, which is what makes the CDN route less work for you. If you trust the default settings of a Linux distro, it's less time-consuming to set it up yourself on Linux. Yes, that's how user-friendly it really is today.
I don't know, but on a site like Hacker News I would expect an attitude where people like to hack on this kind of stuff, and not everyone wants "the most boring" type of setup.
The problem is that, at least in the case of public clouds, there's a real risk of your bill exploding. I guess the author learned a lesson here, but I don't think it's the right attitude to start blaming the guy here with "it makes no sense to tinker with AWS services". Learning is probably one of his goals.
I wish AWS and other clouds would cater to this crowd (which I'm a part of). I just closed my AWS account to remove this exact risk. I love hacking around and playing with the various services—it satisfies my own curiosity and also helps me at work—but it's not worth putting my financial security at risk to keep these accounts open. More than once I've stress-checked my AWS bill in the middle of the night because I was worried there'd be some hidden bill or somebody hacked in and blew up the network traffic or compute or whatever. It's just not worth the loss of peace of mind.
I wonder if a potential solution is to have two billing modes that get hardlocked at signup (or require a key or something to change): one is the standard model with alerts etc. The other is a personal model that kills all of your stuff when you go over some limit. I would feel much safer if the latter were in place.
This is pretty much what the Azure Visual Studio subscription does for you. It sets up a playground with $150 monthly credit and automatic spending limit of $0. Hard lock here is a credit card being supplied, which is optional. I wish more cloud providers would offer such plans.
This credit is not allowed to be used for production workloads and they can automatically spin down your workload at any time. Also the $150 credit is only for enterprise VS, for pro it’s only $50/mo.
Because they want a hard guaranteed ceiling on charges, not after-the-fact alerting for spikes in chargeable activity.
AWS does not support a 'pre-pay' model, and to my knowledge there's no watertight way of capping your costs. Yes, you can build a watchdog to nuke all your instances if you go over budget, but there's still the risk of missing some unexpected source of costs, or misconfiguring your watchdog, or perhaps not getting there in time, etc.
AWS could support pre-pay, but they don't. I think it's a reasonable criticism. There are plenty of horror stories about surprise AWS bills. [0][1]
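For the record, the watchdog approach looks roughly like this - a rough sketch, not battle-tested; it assumes GNU date and a configured AWS CLI, and the EstimatedCharges metric updates only every ~6 hours, which is exactly the "not getting there in time" risk mentioned above:

LIMIT=50
SPEND=$(aws cloudwatch get-metric-statistics --region us-east-1 \
  --namespace AWS/Billing --metric-name EstimatedCharges \
  --dimensions Name=Currency,Value=USD \
  --start-time "$(date -u -d '12 hours ago' +%FT%TZ)" \
  --end-time "$(date -u +%FT%TZ)" \
  --period 21600 --statistics Maximum \
  --query 'max(Datapoints[].Maximum)' --output text)
[ "$SPEND" = "None" ] && exit 0  # no datapoints published yet
if [ "${SPEND%.*}" -ge "$LIMIT" ]; then
  # stop every instance in the region; note this doesn't touch S3 egress,
  # where the closest equivalent is flipping the bucket back to private
  aws ec2 stop-instances --instance-ids \
    $(aws ec2 describe-instances \
      --query 'Reservations[].Instances[].InstanceId' --output text)
fi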
Reminds me of that time we accidentally left a really large Redshift cluster online, for two weeks, before somebody noticed. It was around $12.5k if my memory serves me right.
Never managed to get the money back. They always seem to focus on building tools around reporting (i.e., "budgets" being just a report rather than an actual enforceable budget).
But I still can’t escape the fear of accidentally triggering something that costs a lot. It still happens sometimes, a sudden $1k EFS bill being the latest.
I would expect an attitude towards efficiency, driven by measurements and objectivity, engineering and professionalism, not a reckless, for fun, why-not, hype-based one.
This strikes me as a pretty narrow view of the hacker mindset. Sure, "hype-driven" is not exactly hacker-y, but "reckless," "for fun," and "why not" all sound exactly like hacker attitudes to me. A prime example would be Claude Shannon, who built tons of useless overcomplicated things just for fun in addition to coming up with information theory.
Yep, the author publicly shared a 14GiB file from his own S3 bucket. AWS would be completely justified in charging him for it. He can just thank his lucky stars that they decided to let it slide.
As for Cloudflare, hosting multi-gigabyte files is not what their service is for; I can't see how you can blame them for having a limit on how large a file they'll cache.
> I made a poor decision to distribute a trial Windows 2019 SQL Server virtual machine images (fully patched with all necessary drivers and VM extensions) in the form of a qcow2 file.
The article doesn't use the term "piracy", but I'm curious what Microsoft's license says about public redistribution.
It would perhaps legally be more acceptable, and more efficient in bandwidth, to create a binary patch that makes an official virtual machine image into what the author offered ("fully patched with all necessary drivers and VM extensions").
I'm not a lawyer, but generally a software license can set limits on redistribution. Redistribution outside the terms of the license is likely copyright infringement.
Some sources define piracy as redistribution outside the terms of the license. That implies that you can pirate free software, if you violate the license.
Without getting bogged down on definitions, I believe MS could issue takedown requests to anyone hosting their free trials, when the license forbids it.
> Presumably the author would do much better with a VM or something from OVH; they'll just shut you off or limit you before it becomes a problem (not that they would care about 30 TiB).
That's actually what I love about these dedicated providers. Often I prefer to be surprised by the service being cut instead of by a gigantic invoice. For some business-critical applications I can understand the need to have it scale (and price) accordingly; for many other use cases it is better to just have the server switched off when the traffic limit is hit.
> That's a whole lot of text for "I had a 14 GiB VM image publicly linked and people discovered it"
I really hate long-form articles that feel the need to explain the weather, someone's clothing, or their family member's eating habits.
COME TO THE DAMN POINT. This is the Internet. 99% of content is garbage, and I'm not interested in reading through pointless, content-free filler that could be generated by a neural network just to find out whether your article contains useful/interesting information somewhere.
It actually makes perfect sense. As a sysadmin deploying applications on Kubernetes, you're used to having all the details of compute/network/storage hidden away from you. (Until shit hits the fan, that is.)
I thought it wasn't that; it was a combination of Cloudflare not caching his 13-gig file and also this 3655 partial-GET bug that he didn't seem to really explain?
Disclosure: I work at AWS where I build cloud infrastructure
My hypothesis is that the client connected to Cloudflare and performed an HTTP range request for a portion of the 13.7 GB file. For an unknown reason, Cloudflare did not preserve this range request to S3 as the origin. It transferred the entire file, returned the range requested by the client, and dropped all the bytes transferred into /dev/null because it is not caching.
The end result is that Cloudflare pulled down 30 TB of data while delivering 67 GB to clients in the one month period shown in the screenshots from the blog post.
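One way to test this hypothesis from the outside (the URL is a stand-in; actual behavior depends on plan and configuration):

curl -s -o /dev/null -D - \
  -H "Range: bytes=0-1048575" \
  https://example.com/files/huge-image.qcow2
# a "206 Partial Content" response with Content-Length: 1048576 means the edge
# honored the range; comparing the origin's access logs (bytes sent per request)
# against what clients actually received shows whether origin fetches were partial or full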
I agree with this - Cloudflare's pricing and function is clear here, and it seems perfectly reasonable and expected to have a cutoff size that they don't cache at the edge. I get the OP was pissed about a big bill and felt like lashing out, but it's unfair to complain about Cloudflare when they've done nothing wrong.
Anything the cache doesn't have causes a cache miss, which passes it through.
So when you request the large file, the CDN (cache) doesn’t have it, so it immediately passes through to the original source. The details of how this is implemented don’t really matter in the sense that they are not going to host the large file(s) in the CDN, so they will always fall through.
I think this is less about the bill and more about the customer service of the 2 companies, as well as the "hidden" provisions in the ToS that are not made clear in Cloudflare's marketing or normal setup docs.
The limits aren't exactly hidden in the ToS, the author just wasn't aware of them. If you don't pay money for a service what kind of support can you expect on a holiday?
Not only are they not hidden, they should really be common sense. That he seems surprised that Cloudflare doesn't cache 14GB files is surprising. There is some common belief that every service has infinite capacity in every dimension.
And the price for what he did would have been extreme anyway: serving a 14GB file is something like $1.19 USD on CloudFront, and he served it a couple thousand times.
It sounds like he was operating in tiny land and expected bills to be commensurate, so it just seems irrational to expect a low or free tier of another service to do this for nothing.
If you literally put a massive file on the public internet, what the actual fuck are you complaining about? The quality here is such shit.
Let's snipe at amzn; everyone hates them because they're winning, and because of HN readers' dependence on AWS to run their shit web apps... It's like: ope! It's been a day since we had an "amzn scammed me" post. Never mind that the people complaining are either a) too ignorant and footgunned themselves, b) minimizing or lying about their own complicity, or c) outright scammers themselves.
> Cloudflare was the least helpful service I could have imagined given the circumstances. A long term user and on and off customer thinks they were attacked for two days and you don’t lift a finger?
> File this under, “Things I should’ve known but didn’t.” Did you know that “The maximum file size Cloudflare’s CDN caches is 512MB for Free, Pro, and Business customers and 5GB for Enterprise customers.” That’s right, Cloudflare saw requests for a 13.7 GB file and sent them straight to origin every time BY DESIGN.
I don't really see how Cloudflare has much blame here. He's an "on and off customer" which I'm guessing means currently "off". They only cache a limited number of file extensions (qcow2 isn't one of them), and it's all documented.
AWS always seems pretty generous in resolving these cases at least.
I would not be surprised if AWS actually allocates marketing dollars towards covering bills like this.
In the long term this is a brilliant plan because it helps prevent people from blacklisting the provider.
Imagine someone gets hit with a $3k bill on their personal account, feels wronged, goes to work, and pushes their employer to move off AWS.
I don't know about most HN readers, but I'd probably fall into this category, and past places I've done work for had $100k+/month corporate AWS bills.
Heroku really screwed me over once and the reps I spoke with were complete and total assholes and acted like I was in the wrong when it was their system that totally dropped the ball. I now advocate against them at my workplaces and I know it's cost them at minimum tens of thousands of dollars, though I get that isn't much money for them.
There was a fraudulent and malicious DMCA claim against one of my sites. Heroku without any notice took down ALL of my sites, one of which was a part of my livelihood. They claimed they emailed me with 24 hour notice but that absolutely did not happen. I had to pound on their door to make them reverse it, they victim blamed me the entire time and had a super snarky and shitty attitude, told me I should have been more proactive about the notice I never got.
I cannot in good faith ever recommend a service that would take down a website with zero notice and be so uncooperative in fixing what was ultimately their fault. It cost me money for that site to be down and I had to cancel my entire day to deal with them, and if the stakes were bigger it could easily wind up costing my employer hundreds of thousands of dollars.
My thoughts exactly - if I felt they burned me over $2k on my personal account they'd quickly lose a couple orders of magnitude more than that on corporate.
Given VPS providers include terabytes of bandwidth even with cheap VPSes, I'd say that a $3k bandwidth bill from AWS costs AWS pretty close to $0.
> I don't know about most HN readers, but I'd probably fall into this category, and past places I've done work for had $100k+/month corporate AWS bills.
It's happening in this article too: Cloudflare wasn't in the wrong here, even the bill wasn't from Cloudflare, and the author is already publicly advocating against them. I feel quite uneasy about that.
That strikes me as small, because CDNs are often used for software downloads and updates. Think Steam, Windows Update, OSX, Xbox Live, PSN, etc, etc. Those files are regularly larger than 5GB.
Obviously those customers have negotiated deals with the CDNs, but if there's a 5GB hard limit it sounds like Cloudflare doesn't want to compete there.
I'm not sure why I'm getting downvoted, I don't think my comment was inflammatory.
Cloudflare has always been different than a generic CDN - their tagline is "The Web Performance and Security Company". Most of their features are focused on those things, and they don't really support video streaming, etc.
I agree with you, and it's the right mental model: Cloudflare isn't a swiss-army-knife CDN. They optimize for fronting web services (and do that very, very well).
I use them on my personal site, but on the corporate side, where we need TCP acceleration, edge serving of binary resources, and POP presence in China, we turn back to crusty ole Akamai.
For giant files, you can at least break the file up into pieces. If nothing else, it's a greater technical challenge to build a caching system that scales to infinite GB. Why would Cloudflare invest in such a system when it's of dubious utility in the first place?
Off topic but on the same line. One of the most annoying things about being a consumer in the US is the ubiquitous unknown-until-last-minute pricing.
You rent a car, you don't know what the total is going to be. You go to the hospital, you don't know how much you're going to have to pay. You book a hotel and don't know the total until you check out. You go to a restaurant and even if you order just one thing and saw the exact price on the menu, that's not going to be the total. You go to the grocery store, see all the prices on the items, add them up, and then when you go pay, surprise!
Have you ever bought a house here? You go through a 1-2 month (minimum) process of providing financial documents from the past six months or more to mortgage originators, and no one can tell you how much to bring to closing until 24 hours before. It’s insane.
It is crazy. Not a house, but friend had a baby in SF/Bay Area. Neither the hospital, doctor or insurance would tell them how much the delivery would cost, not even an estimate. It took a whole year after their hospital visit to get the bill, it was $85k for a c-section and 3 days at the hospital. Fortunately their insurance took care of most of the bill. But can you imagine getting a surprise bill for $85k a year after the fact? Or not having insurance? Terrifying!
We have two kids (in NC, US); both were c-sections. We had insurance for the first one, but because she was born in February, we had to pay the deductible twice: even though it was the same pregnancy, it was "over two billing years". And even with insurance, we still ended up paying the rest of the bill over 3 or 4 years.
Second kid was with no insurance, I think we'll finish paying that one when he turns 10 years old.
I don't know about other areas in the US, but around here you can at least set up a payment plan with the hospital, and they are at a 0% interest rate.
Don't get me wrong, I'm not a fan of insurance companies, but they seem pretty up front with the fact that deductibles are based on date of service rendered.
They are, but it's still dumb. It puts extra demand on the medical system in December as people try to squeeze in their elective procedures in a year where they already paid their deductible, and it incentivizes you to not get care early in the year in the hopes that "it just goes away" and then sometimes it gets a lot worse and costs a lot more than if you had just gone in the first place.
Deductibles should be a rolling 12 month bill. If you have something major in December you should be good until next December. This would eliminate all of the issues with deductibles rolling over on Jan 1. It would even bring extra profit to the insurance companies because people might decide to stick with their provider another year since "I already made my deductible until November".
I narrowly avoided that issue when my kid was born late in the year. But yeah, you end up paying almost double for the same thing because of the deductible and out of pocket max resets in the new year.
That said, I was able to take advantage of it by making sure my surgery was scheduled before end of year. Ended up paying something like $200, rather than $1500 if it had happened in January.
Random comment - UCSF will now provide a billing estimate for procedures. They actually look up your coverage and figure out the out of pocket for you based on deductible, co-insurance, etc.
I did this and it was pretty close.
Of course, they don't make any guarantees that's it accurate. But hell, it's a start.
If they couldn't provide an estimate because unexpected things happen and they can't predict all services that will be required, that'd be one thing. But they can't even provide a list like, "if you need an aspirin, that costs $X. And if you need..."
Well, that is actually a pretty tough one. How are they supposed to predict how long the delivery will take or whether the mother will need a c-section? Even if the hospital were able to perfectly predict all of the procedures and line items, they still would not be able to tell you what you, the individual, would ultimately pay. Even if they knew your insurance information, they still would not know how close you are to hitting your deductible or annual out-of-pocket max. There are a million different things that eventually go into figuring out what the end "customer" will actually end up paying.
And just because they "billed" your insurance $85k doesn't mean they were actually paid $85k. Billable vs. allowable and all that mess.
However, if you went into the hospital and asked for the cash cost of a routine procedure they very likely would be able to give you a close approximation to what you would end up paying.
> How are they supposed to predict how long the delivery will take or whether the mother will need a c-section?
In their case it was a scheduled c-section. In any case they can at the least give you a range: min-max.
Additionally, hospitals keep very accurate track of their c-section rates, so even if they are not able to predict yours in particular, they can definitely tell you what your odds are ahead of time.
And, there are other countries and healthcare systems in which they internally take care of the stats/metrics so that they make a decent amount while you pay a reasonable amount that they tell you before you choose to get the procedure.
The hospital doesn't tell you the estimate of anything because they are afraid of the liability (or maybe just accountability).
It's just lazy, or maybe too convenient, to say it can't be done. I mean, if health insurance companies can already give you a fixed monthly amount to pay, then they know very well how much it's going to cost them, so why not tell us?
> And, there are other countries and healthcare systems in which they internally take care of the stats/metrics so that they make a decent amount while you pay a reasonable amount that they tell you before you choose to get the procedure.
That's a difference in the financing system, which hospitals have about as much say in as consumers do.
The insurance companies (including government payers like Medicaid and Medicare) don't just control reimbursement, they control what providers can bill to customers, too. And that includes (by usual and customary charge rules) influencing what they can and need to charge to customers that aren't even being reimbursed by the payer in question.
>no one can tell you how much to bring to closing until 24 hours before. It’s insane.
In some ways, yes.
In other ways, it's insane that people are able to borrow 5-10x their average annual income for an item (the house) that they have very little expertise in analyzing. In that sense, it's a process that is surprising it works at all.
That doesn't sound right. You can definitely get a million dollar loan with $250k down on a $200k salary. I'd say it's pretty common. My friends own homes and fall into that category.
Ok, fair, but for the 80%+ of the U.S. that aren't high income earners 5x won't be as easy. I obviously didn't include a ton of nuance in my original post.
Your bank should be upfront about their lender fees, points and origination costs.
Third party fees are either fixed or a simple percentage of the sale price.
The rest can be tricky but should not be a deal-breaker for a new homeowner. Basically, you are just paying the expenses for a short while up front: interest through the end of the month, real estate taxes through the end of the quarter, and homeowners insurance, mortgage insurance, and real estate taxes for the escrow account to cover 2-3 months.
The last time I attempted to get a quote on how much a medical visit would require under a PPO, I couldn't. The price was literally unknown, and seeking medical attention meant unknown, unlimited liability.
The FTC article you cite is actually in favor of the type of transparency lamented by the comment you are responding to.
> The staff comment explained the risk that the latter type of transparency might harm competition by enabling competing providers to coordinate or collude on price
Where "latter type" referred to "plan structures and contracted fee schedules between health plans, hospitals, and physician service entities." (The "former type" was "actual or predicted out-of-pocket expenses, co-pays, and quality and performance comparisons of plans or providers" which is what would effect parent's experiences; the FTC "encouraged" that type of legislature.)
I don't necessarily agree with the FTC here, but their comment isn't covering the lack of information that causes consumers to have no idea what the bill is until they've already incurred it.
Ah, that's a fair point, "unlimited" was a bit strong. The last PPO my employer offered has a maximum on the order you cite. (It does have a few exceptions and situations it doesn't apply to, however.)
My own experience here is driven by having gone in for what I expected to be a $50-$100 visit (a doctor took a look at my vocal cords); it was $4,000. How that is justified, I still don't know. (Thankfully, insurance covered some of it, but for such a minor visit, it was still way more expensive than I expected.) I don't know that "well, it can't be more than $10,000" is much comfort when you're walking in for something that seems minor, such as lingering effects of a cold (this example) or an infected finger (for which I was unable to obtain any pricing prior to the visit).
A few years later, surgery would have cost me ~$300. (But that was under an HMO. And I got a quote, in advance!)
(While my initial post specifically mentioned PPOs, as that is what I'm used to, there are also those without insurance, which doesn't have the max out-of-pocket of an insurance plan.)
Are you saying that if you get a $500k[1] bill from the hospital, "most, if not all health plans in the US" will cover at least $490k?
[1]: Given that a mere c-section (30 min, super standard procedure) + 3 days at the hospital can be $85k, it's probably pretty common for hospital bills to exceed a few hundred thousand for anything more complicated or requiring a long stay at the hospital.
Yes, that was actually passed as a part of ACA, which set requirements for maximum out of pocket for ACA-compliant plans.[1]
For 2020, it's ~$8k for an individual and ~$16k for a family. Most non-high deductible plans are a few thousand for an individual and high-deductible plans can be close to $10k.
Yeah. So when you don't have insurance, or your insurance plan has an issue and doesn't cover the absurd bill you got (which this system incentivizes to be high), you can be royally screwed. I think medical debt is the biggest cause of bankruptcy.
Around 10% of people don't have health insurance. I'd be surprised if there aren't some crappy health insurance plans around for another portion of people that end up not covering enough when things go bad.
Price transparency can harm prices. Competitors can use it as a mechanism to signal appropriate prices to each other in order to reduce pressure to reduce prices. It's highly industry/context specific but transparent pricing does not always mean better pricing. In fact, in limited contexts, price fixing can lead to better pricing.
I think Westinghouse was the big FTC price-transparency-can-be-bad case.
That's orthogonal to the post's issue, though: he paid exactly the price he was quoted for services rendered; that he didn't understand the services he was consuming isn't really Amazon's problem but it's great they jumped in to help a $0 account (which, of course, was self-serving)...
Transparency to customers cannot be harmful in a free market. Nobody is asking for the full cost structure to be published.
Price fixing will only happen if there is no actual competition for customers, otherwise being the cheaper service is the easiest tool to increase your share of the market. Besides, isn’t that already the reality with prices being fixed in partnership with insurance providers?
Agreed. Pricing transparency in healthcare has been wonderful for pricing (see Fallows's article in The Atlantic).
My point was that pricing transparency is not a panacea. It seemed that some people were reacting as though pricing transparency is a magical salve.
AWS: In my experience, AWS pricing has been transparent and great, and they generally take steps to keep you from foot-gunning yourself. This article discusses a case in which the author foot-guns himself using a predictable mechanism AWS can't really prevent. I'm unsure of what else AWS could do to protect him from himself... (Anyone can throw out simple solutions without understanding S3's implementation complexity: they could seriously throttle their own product! ... but that might be seriously complicated and might hamper the value of the product to others. Besides, if you buy an M1A1 tank, you should be very careful about using it: it could use tons of fuel... He bought an M1A1, left the keys on the dash, and then was surprised when he got a huge fuel bill...).
For those who don’t know, the issue in the US with not knowing the hotel or car rental costs is typically taxes. Different places have different taxes and those are often not shared at the time one books.
I’ve been using fixed price services (they exist) just so I don’t run into what Chris did. They aren’t evangelized as well.
That's just the excuse for doing it. Tax rates are known ahead of time; they could just add the tax to the price and put it on the sticker, just like pretty much every other place in the world does.
In my opinion, it's a very deceiving practice that could definitely be fixed if businesses really wanted to fix it. But there's no incentive, because if you show your prices including tax and fees, they'll seem more expensive than the competition.
In the case of hotels, my experience is that with a lot of them you book online and pay the total, but then when you check out, they've added some extra fees. Sometimes this happens before booking: you browse the options and choose based on price, but when you go to check out, the total has changed because they've added not only taxes but some other fees as well. My most recent experience with this issue was with Airbnb just two days ago.
Taxes are different in France and in Spain, and all the prices in every place will show you the price of what you're doing, tax-included. One can move freely between the countries, use the same currency, and settle down in either place (given you're in the EU). This works despite the languages being different, and the governments being different. It also happens to work in quite a few more countries too.
I think I'm missing something when people use the excuse of "taxes are different in different places" to say "we don't show taxes out of habit, or because it requires us to change our signs".
The store/business still knows ahead of time the exact tax rate for what you are buying at the location you are buying it. So that really is no excuse for not disclosing that info pre-purchase.
In Brazil, the tax rate also changes city by city, and the price shown to the customer already includes that tax (together with the state tax and the federal tax).
If the rental agency is paying those taxes, then they are aware of them, have them entered into their CRM, and can inform you anytime by running a simple DB query. There is no excuse in 2020 for not knowing exactly what the price is, or for keeping it from your customers.
I deal with reputable rental companies. The rental price is based on demand but everything else is a known quantity. I’ve never had the quote priced higher than the online quote. And in a few cases I’ve had it go down by using certain credit cards or loyalty programs.
There are a ton of fees added on to rental cars, especially when you rent at the airport, in addition to taxes. I wouldn't have said 2x the base rate but you can definitely hit 30-40% uplift. When comparison shopping rental car pricing, you absolutely have to know if it's the base rate or the all-in price.
Hotels have some of this but the overall uplift is usually a lot less--at least until they start adding "resort fees" to otherwise discounted rooms.
I'm not even a tech person and I know that AWS has an option to set billing alerts for whatever amount you want. Damn sure I'd be setting those up if I wasn't 100% sure I knew what I was doing (is anyone ever 100% sure?).
Problem is, it's an alert after the fact. And it doesn't actually stop anything.
Unless you have scripts in place to nuke everything (which you will have to develop yourself, since AWS does not supply them), you have to manually log in and try to shut things off while the bill goes cha-ching.
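For what it's worth, the alert half really is just one CLI call (a hedged example: it requires "Receive Billing Alerts" to be enabled on the account, the metric only lives in us-east-1, and the SNS topic ARN is a placeholder). The stopping half is the part AWS doesn't supply:

aws cloudwatch put-metric-alarm --region us-east-1 \
  --alarm-name billing-over-25-usd \
  --namespace AWS/Billing --metric-name EstimatedCharges \
  --dimensions Name=Currency,Value=USD \
  --statistic Maximum --period 21600 --evaluation-periods 1 \
  --threshold 25 --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:billing-alerts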
I always use ADAC (German AAA equivalent) to rent cars, especially for the US, and prepay them. They then send me a PDF that tells the car rental clerk in German, English and Spanish that they are to provide exactly what the voucher says and no further options, like prepaid fuel or supplemental insurances (ADAC rentals include 1M EUR in liability and comprehensive insurance).
The only suspense left is whether I get the compact I signed up for, or some monstrous SUV because they're all out of more sensible options. A few hints about how bad I am at backing up anything larger than a Camry have gotten results.
Aside from things you order in the room or charge to your room, what is changing on the hotel fees? Typically they are quite straightforward, unlike rental car quotes. You can even pre-pay.
I use the small glesys.com and pay for the box with a capped bandwidth.
If my blog or apps are slashdotted (or HNed or whatever we should call it nowadays) they just load slower, degrading gracefully, and never stop or return an error.
They used to have transfer-based pricing but then moved to bandwidth-based; reps said it was to simplify billing.
Their new KVM service gives you 1g burst and 100mbps average by default.
https://glesys.com/vps/pricing is very straightforward.
At my previous employer we migrated to them and were very happy with the VMware infra they provided.
I don't recommend them. We used to host an online text game (MUD) there, and when one of the players decided to DDoS it, Hetzner just pulled the plug. I moved to soyoustart.com (an OVH brand); in the following month they sent a couple of emails about the server being attacked and saying they were trying to mitigate the attack, but the game kept working.
Yes, they are quick to just drop all traffic to hosts being attacked; that should be noted. OVH is putting in a little more effort but will also null-route your IP if it gets too bad. Generally, though, the Hetzner network seemed more stable and less congested to me.
> Now that I’m aware of the 512 MB file limit at Cloudflare, I am moving other larger files in that bucket to archive.org for now (and will add them to my supported Causes).
...
> I don’t feel like archive.org should be my site’s dumping ground since it can turn a profit if it gets popular. archive.org is a stop-gap for two files for the time being.
I'm trying to understand... he has decided to burden a charity with his distribution expenses?
There's some symmetry to the way the Internet Archive assumes consent to reproduce other people's content, and people assuming consent to take advantage of the Internet Archive's mirroring. I don't know if this is in line with the Internet Archive's terms-of-service, or the law.
In this case they'd be getting stuck with a pretty big bandwidth hit.
He published a 14GB file, and one day there were 2,700 downloads, resulting in ~30 terabytes of traffic.
He had the file behind Cloudflare, but since Cloudflare does not cache files larger than 512MB, all the traffic went to his S3 bucket, and Amazon billed him $2,700 for it (30,000 GB at S3's ~$0.09/GB egress rate comes out to about $2,700).
I think the most important part was left out:
> AWS employee discovered that 3655 partial GETs to the object might have actually been delivered as full file requests
[...] Use of the Service for serving video (unless purchased separately as a Paid Service) or a disproportionate percentage of pictures, audio files, or other non-HTML content, is prohibited.
So 512MB limit or not, the author was already violating Cloudflare's terms of service.
Even if you read the ToS, do you really expect people to remember this? CloudFlare blends into the background when setup properly, and you don't have to think about it. If you upload a big file to your site as part of some other work -- especially personal -- most people are not running through the ToS of every service that might be involved thinking about compliance.
Terms of service are also legal "CYA" documents. If Cloudflare was actually serious about that restriction, there'd be a technical limitation in place that would, for example, serve a 503.
The issue here, IMO, is that Cloudflare deceptively hides these things in the ToS, and does not make the limits of the service clear in the signup process, documentation, and other marketing.
If you look at the marketing, it appears to be without limit. Now, "unlimited" anything when it comes to technology should always throw a red flag, but Cloudflare has used this position to corner the market for CDN and DDoS services.
I think it is deceptive to hide these limits of the service deep in the ToS, which they know very few people actually read.
They don’t hide anything. Cloudflare has always been aimed at protecting websites and applications, not file distribution. In fact I just went to their site and could not find a mention of file hosting or even “CDN” anywhere.
You are referring to their DDoS service; here we are talking about the CDN service.
And clicking through about 5 pages of marketing on the CDN, including the pricing page as well as the FAQ, I see no mention of file size limits or file type limits.
Amazon gets away with high bandwidth pricing because almost all their customers are businesses with high revenue per byte served. If you want to serve large assets economically you have to look elsewhere.
Bandwidth on Oracle Cloud is $0.0085/GB with the first 10TB free each month, so this would have cost only about $170 (20 billable TB x $0.0085/GB). Alternatively, bandwidth on Backblaze B2 costs $0.01/GB but is free out to Cloudflare, so this traffic would have been completely free.
I do recommend you spend some time looking at the quality Oracle provides. Their regions plug into transit providers (NTT, Level3, etc.), and their peering footprint is damn near nonexistent (they seem to be in the process of trying to fix it). The only reason I bring this up is that if you try to send traffic to an eyeball network at peak on Oracle vs. Google/Azure/AWS, you can see the difference in terms of packet loss / throughput. This is because you have to be directly connected to those eyeball networks, since they run their transit hot at peak.
Agreed. I'll be honest that I am not a fan of Oracle as a company with their actions. But I'm more than willing to take advantage of their 2 free VMs with a total of 10TB of free egress traffic.
The fact that the cloud allows hobbyists, small businesses, and massive enterprises ~equal access to services is amazing.
However, it means sometimes things like this happen, where a product's incentives (serve any content at any cost) are wildly misaligned with a huge percentage of users' needs (I'd rather my site, or preferably just the costly resource, be down than pay $2k).
There's endless tuning non-enterprises can do to get their ideal behavior, but that's the difference between pre-cloud and post-cloud computing. It used to take monumental effort to build high-scale, high-availability systems; your $5/mo Dreamhost site would just die under load instead of charging you thousands. Now enterprise use cases are supported by default, and it takes careful tuning to opt out.
Actually not even the typical $5/month cheap VPS offering would die under the "load" of 30TB of static content HTTP traffic being served in 2700 requests over a month. That's just a laughable drain on CPU resources that can easily be handled by even the cheapest virtual server offering, because it doesn't even get a single core of a modern CPU into the double digit utilization percentage range.
The only thing that could happen is they cap your data transfer at some point. But there are cheap VPS providers out there offering several TB of gigabit speed traffic and throttling instead of a hard cap when you reach your limit.
Ah sorry, I used Dreamhost as they were one of the biggest shared hosting providers back in the day. No VPS, just a user account on a big host. They’ve likely evolved considerably since I last used them 10+ years ago.
Point being their product was targeting me and designed appropriately. I forget the details but I know there were caps that were ample for my meager needs but would prevent this sort of accidental overage.
My point is that compute has become a commodity like electricity but without the built in fuses. My residential box can’t pull industrial amps.
I think the main problem is that there isn’t an easy way to set a hard monthly limit on these services.
I use a bunch of “freemium” services like S3 and Google Maps API and I’ve never paid a penny. I use them because they don’t cost a penny for my very limited usage, but I’m not looking forward to the day I mistakenly and disastrously exceed their free tier.
This is the nightmare scenario with a personal AWS account. AWS billing setup makes it impossible to know that there is a giant bill until after the fact. I wish there was some way to limit the bill and just have everything shut down at that point for hobby projects.
Setting up billing alarms is easy for personal accounts; it's actually only a nightmare if you go through a reseller (which is common in government contracting).
The author had an incident starting June 23rd, but didn't know about it until he got his bill July 7th, that's potentially ~14 days someone could have been abusing his account. A billing alarm would have reduced this to hours or minutes.
I would be interested in how you suggest stopping a bill before it happens otherwise. Should AWS disable your website because you got posted on hacker news and now have a bill over your $10 limit? If AWS needs to stop your billing at $10, it might need to shut down your EC2 instance and destroy your data...
Of course they could offer the ability to configure the response, but if someone doesn't take the 2 minutes necessary to put a $25 billing alarm in place, what are the chances they will go through the effort of per-service/per-object abuse policies?
At the end of the day, the issue here was that the user posted something online that people could abuse. I don't think any CDN covers 30 TB in its free tier...
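For what it's worth, the alarm itself really is only a few lines of boto3. A minimal sketch, assuming "Receive Billing Alerts" is enabled in the account's billing preferences and an SNS topic already exists (the topic ARN below is a placeholder):

    import boto3

    # Billing metrics only exist in us-east-1.
    cw = boto3.client("cloudwatch", region_name="us-east-1")

    cw.put_metric_alarm(
        AlarmName="monthly-spend-over-25-usd",
        Namespace="AWS/Billing",
        MetricName="EstimatedCharges",
        Dimensions=[{"Name": "Currency", "Value": "USD"}],
        Statistic="Maximum",
        Period=6 * 60 * 60,   # the metric only updates every few hours
        EvaluationPeriods=1,
        Threshold=25.0,
        ComparisonOperator="GreaterThanThreshold",
        # Placeholder topic ARN; subscribe your email/SMS to it.
        AlarmActions=["arn:aws:sns:us-east-1:123456789012:billing-alerts"],
    )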
>should AWS disable your website because you got posted on hacker news and now have a bill over your $10 limit?
Maybe that should be an option?
Probably not a good idea for a business but if you're just using AWS to learn and otherwise fool around with, there's a good argument that it would be nice to be able to have a hard circuit breaker for at least stateless services.
I've actually heard people argue that they consider everything they put up on AWS to be ephemeral, so a hard circuit breaker should have the option to burn everything down, but that seems like it would create its own set of problems for many people. Disabling data egress and EC2 seems as if it would go a long way towards stopping most of the unexpected-bill stories.
And, yes, I'm aware of billing alarms and even setting up Lambda functions to take actions but, especially if you use S3 to host files, it would be nice to cap expenses at mostly your storage costs. I was doing some research for a very small non-profit that needs some hosted storage and I think Backblaze B2 is a better choice for them for this reason.
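The Lambda route mentioned above can at least make an S3-hosted site fail closed. A sketch of a handler, assuming it's subscribed to the SNS topic the billing alarm publishes to (the bucket name is hypothetical):

    import os
    import boto3

    s3 = boto3.client("s3")

    def handler(event, context):
        """Fired by SNS when the billing alarm trips: block all public
        access to the bucket, so egress stops but the data stays."""
        bucket = os.environ.get("BUCKET", "my-hobby-site")  # hypothetical name
        s3.put_public_access_block(
            Bucket=bucket,
            PublicAccessBlockConfiguration={
                "BlockPublicAcls": True,
                "IgnorePublicAcls": True,
                "BlockPublicPolicy": True,
                "RestrictPublicBuckets": True,
            },
        )
        return {"blocked": bucket}

That caps the damage at roughly your storage cost, which is the behavior I'd actually want for a hobby bucket.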
I certainly would like the "don't bill me over $X" option, but the devil is in the details. Perhaps just turning off egress would help, but if your account is compromised and 50 servers are running trying to mine crypto, the hacker won't turn off those servers just because they're no longer submitting results... So AWS would probably have to stop all your servers, which means users might lose data... Makes it a difficult choice.
AWS billing alarms and Cost Explorer are not realtime services; you can expect a 6-hour delay. In this user's case, that could be after $500 has already been spent.
AWS does not do enough to prevent harm, even if a user is diligent with alarms.
Do I now have a higher liability because the alert notified me and I now need to be aware of it?
So I'm going on holiday and I'm fucked?
You know, a billing alert is a nice thing, but it's not the solution. Host a few things on AWS, tell me your endpoints, and your bill might already be in deep shit before you even read it.
The only reason Amazon did not charge the full price is that it's good marketing for them. I'm not sure this would have happened if the person were less well connected / had gotten less visibility.
> So I'm going on holiday and I'm fucked?
That's why AWS is super problematic for any company that isn't big enough to make sure there is always someone reachable by billing alert 24/7.
A more extreme case happened a while ago with Google serverless, by the way. A small startup with a limited budget made a software error which massively increased data queries; they got good publicity when they launched and were instantly bankrupt. Well, except that Google ended up not charging them, because of the bad press that would have caused.
> You know, a billing alert is a nice thing, but it's not the solution. Host a few things on AWS, tell me your endpoints, and your bill might already be in deep shit before you even read it.
I agree; I'm strongly in favor of soft and hard limits (per "project", with a dynamic limit of some multiple of what the cost had been the previous month, etc.).
> otherwise.. should AWS disable your website because you got posted on hacker news and now have a bill over your $10 limit?
YES!!! If you ask them to do so.
$10 is an arbitrarily small amount, but what if it's a web app and we are now talking about $1,000? And to be honest, for many people, especially young people, even $100 is much more than they can afford.
Or better: also have an option for a dynamic hard limit based on a multiplier of your normal traffic.
Even beyond private persons, there is good reason to have a hard limit for a lot of businesses (or parts of them), especially for non-essential things people don't rely on and which have a limited budget (like a lot of things provided for marketing only).
Heck, even for paid service providers a hard limit can make sense, given that it's set high enough, as a form of last defense against intentional or accidental DoS-like situations (in times of auto-scaling, some DoS situations have become cost-explosion situations instead).
The problem with billing alerts is that (assuming they work, get delivered in time, and are set to a sensible value):
- You need to read them in time.
- You need to be able to react to them in time.
Both are NOT trivial for private persons, nor for small companies that have just gotten started (or are mismanaged, or too tight on budget).
Like, what happens if you get the alert when you are at a party? In the cinema? Asleep? In the hospital? On holiday, currently out of reception?
The idea that you will react within just a few minutes and then also fix the problem (or shut the service down) is a bit too optimistic.
Though yes, in the given case he might have ended up with "just" a few dollars in the best case, or ~$600 or so in the worst case. Still, I think he would have been happy with the service being discontinued once the price reached a few hundred dollars (let's just arbitrarily say $250).
Sure, you can set up billing alarms and all, but the point is, for personal hobby projects, it shouldn't be this easy to screw up.
I've used AWS extensively at my current and previous 2 or 3 companies, and I had a personal AWS account but recently decided to close it just to be safe. I actually wondered if I could just remove my billing credit card, so I contacted AWS support about it, and they said you have to have at least one primary card on file, so the only option was to close the account. I only use a mix of DigitalOcean, Netlify, and Heroku for personal projects anyway, so I shut my personal AWS account down. It's just not worth keeping one open; the risk is too great.
This is why, for hobby projects, I try to stay away from pay-per-use services when possible, even though they would be cheaper. Just the peace of mind of having predictable fixed costs is worth the couple of bucks I might be paying more.
That said, unless that spike happens in a single day, you should be able to at least set a budget alarm to warn you. I think you should also be able to trigger shutdowns from those alerts within aws, but I never did.
AWS is an amazing service and has, in our past, been very accommodating about surprises, as long as it seems we know what we are doing and are going to mitigate the cause. I've rarely seen this allegiance to the “customer” from an enterprise company. They really do care, and they figured out the recipe to make caring scale.
What's odd is that the touch points are cold: ticket-system support, phone callbacks, etc. It feels like it's going to be robotic canned replies, but they figured out a way to make the people on the other side smart enough to understand the issue, empowered enough to do something about it, and empathetic enough to want to resolve things “fairly”.
The first real issue I ever had was last week though, truth be told, it was mostly on UPS. (They didn't make a scheduled delivery pickup, then their tracking system eventually said they picked something up even though they didn't. Amazon did issue a refund when the item was "picked up" so I assume it's all on UPS now.)
The problem is that I was utterly unable to talk to a human at UPS. I even went to a UPS Store but they were powerless to do anything.
The thing is that Amazon's automated chat bots and so forth just kept referring me back to UPS.
I had this same problem on the other end of things. UPS truck pulled up out front with my package, package is scanned as "left at front door", walks out with a couple packages for my neighbors, goes back to the truck and just drives off (and no, I checked, it wasn't one of the packages he left for the neighbor). It appears that the driver decided to save some time and scan packages on the truck as delivered before getting out and then forgot to actually deliver it once he got back on the truck from the neighbor's house. Either UPS doesn't have an easy way to undo erroneous delivery notifications or the driver just decided to dispose of the package when he realized what happened and had a leftover package, either way, they never attempted to deliver the package again.
As far as Amazon was concerned, the package was delivered and the only option was to return it. There are tons of options for problems with a package, but nothing for "UPS stole my package". I guess they want to make it harder for people to claim that a package was stolen, but I can't see why they would go so far to hide that, given how prominent actual package theft is. Eventually, one of the responses from Amazon's support chat was just a generic refund without returning the product, but in the end I think that just took the money straight from the vendor for a package that UPS lost or disposed of because of a driver cutting corners to save time.
The loss rate for UPS is about 25% here (DC) and their support is in name only. They won’t even let you file a complaint for (IIRC) a week after the alleged delivery date because they know how often that’s completely fictitious.
The USPS is generally rock-solid, although during the pandemic I’ve had a couple of packages show up the morning after the stated date due to the carriers being slammed (described as like Christmas but without the temporary staff).
Amazon appears to track this: after a few lost packages they never use UPS for us. They use their carriers and USPS for all sizes so I’m assuming there’s some careful price arbitrage going on.
Yeah, I got a very strong sense of the computer is the reality as far as UPS is concerned. I made numerous attempts to punch through to an agent at UPS but they forced me to enter a tracking number first at which point the call just dropped back into the automated tree starting with "Your privacy is very important to us."
Cloud services need to have cost caps, plain and simple. This isn't Cloudflare's fault; it's Amazon's, and it's the author's. Cloudflare could detect overall data transfer, but there are plenty of cases where terabytes of traffic are entirely expected. We know Amazon won't fix their service, so perhaps Cloudflare could implement bandwidth limits.
I think Backblaze does it well. You can set limits for each type of API transaction, and you receive email and SMS notifications when you get close to the cutoff limit (70% and 90%, IIRC). You have a page with all the costs and how close you are to each limit, if any. 2FA was also straightforward to set up. If your account is compromised the limits can still be lifted, but there is limited interest in compromising a Backblaze account in the first place, and the 2FA makes it unlikely.
I'm the sysadmin for a small nonprofit, so my organization qualifies for the $3,500 yearly Azure credits, but the inability to set spending limits makes me not use Azure. If I make a mistake, or worse, an admin account is compromised, the Azure bill is potentially infinite.
With Azure I think you can mitigate the issue somewhat by having a super-admin account without limits and setting quotas for everything else, but I still don't feel at ease with that.
My organization is a music festival so the infrastructure really has to work and not stop one day per year. I can keep an eye on everything during the festival and monitor spending. If things stop not during the festival people are a bit annoyed until I can look into the issue but nothing bad happens.
In a perfect world I'd like a way to set up a limit where you need to go through support to increase it. I'd really pay for that. I think it's really too easy to receive a nasty surprise bill.
Backblaze B2 also lets you put hard caps in place which can be nice for some situations where money is tight and the application isn't critical--and especially if you don't really have IT people monitoring.
I have said it over and over and will repeat it happily:
IF your service doesn't have a proper limit, you suddenly expose yourself to a much higher risk than before, and you have to be aware of this.
It is the same as when you rent a car: NEVER rent a car without proper insurance.
I work with GCP professionally, and I used AWS at my previous company. I do ask my manager if I can use it to try a few things out, and that's fine, but I will not put my own credit card behind an account with unlimited cost risk (it's probably limited, but you know what I mean).
And it's not even simple: everything costs you money. Storing data, receiving data, pushing data, making API requests, etc.
And what I find quite surprising: how often people, even on HN, present simple file-based APIs where you can upload images and edit them, or upload files and download them again, or offer free services, all with AWS as the backend.
I might just have been in this industry too long, seeing pitfalls, exploits, and risks everywhere, but I have the feeling that most people neglect a healthy respect for cloud service billing.
I believe it's the same effect as micro-transactions in mobile games.
It's really easy to justify paying $1 for a small upgrade while you're playing. And only afterwards you notice that those $1 added up and have financially ruined you.
In the same way, $0.08 per GB (the effective price in the article) sounds really small and easy to justify. And we forget how they can accumulate...
Yes, but it's hard for a third party to weaponize that against you ;-)
You want to destroy a small startup with a free alpha version and AWS (or similar) backing?
Sure, go ahead and send them tons of _legit_-looking traffic. This will first mess up their bill for this month and then mess up their statistics for the next month (when all the users they got disappear at once)...
When I was first looking at AWS, I received a billing prediction alert that was ridiculously high. I could not find the culprit (I only had an EC2 instance and some other random services I was looking into). In the end I deleted my account to avoid this unexplainable billing. The next day, I got an email saying I had received the alert in error. The damage was already done, but at that time I was only exploring what the cloud had to offer, so no real damage... until a couple of years later, when I had to do some real AWS work for my job. Because I deleted/disabled my account, I cannot open a new account with the same email address.
It’s always frustrated me how it’s not possible to set a quota and turn off services at quota.
Logistically I know this is hard for water or power, but it should be feasible for cloud computing. But I think this is an area where it’s not in AWS’ interest to set up that kind of billing control.
> It’s always frustrated me how it’s not possible to set a quota and turn off services at quota.
Which services you want to turn off, or alter your use of, as you cross various thresholds varies, so what you probably want is a billing information service that sends alerts at configurable levels and triggers programmed actions more complex than a simple shutdown.
It would be easy to stop all spend at a threshold, but that means your entire set of apps stops working and all retained data (which usually has an associated periodic cost) vanishes irretrievably. That might be okay for an account used only for toy apps, but it is going to be a business-ending error for any serious account.
There's a good reason why cloud providers don't provide a simple “nuke your account at a particular spend threshold” option, and the fact that people who haven't thought things through think that they want such an option is actually a factor in not providing it.
If people can't keep track of spend with alerts to know that they need to take some kind of controlled restriction of services, they are going to regret unexpectedly having their whole set of services and data nuked more than an unexpectedly large bill in most cases.
Initially I tried to set up something as simple as: when I get near the limit for the free tier, shut off to keep me at zero. Couldn't do that with the built-in tools. And billing data isn't real-time, so I couldn't even script a shutoff.
I would like quotas for individual services, like “only spend $100 for EC2 and shut everything off when I hit it.”
My point is that I would rather have my stuff nuked than get a $1000 or $10k or $100k bill.
Setting up notifications and triggers is sort of possible but 1) that requires a lot of work for something that should be built in, I think; and 2) notifications aren’t in real-time so by the time I get a notification that I’ve gone over my $100 threshold I might already be at $1000.
I can compensate for this with scripts and third party services, but this would be so much easier if built in.
Unix has had disk quotas for decades right? Imagine if sysadmins left it up to users to monitor and control their usage and just charged overages. It’s so much more work for the user than when the system does it, or at least offers it.
My Linux host offers bandwidth quotas with similar cutoffs. I would never want an “unlimited “ quota where I got billed by the transfer and it was up to me to turn off. Some may want that, but not me.
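For the record, the DIY version of that cutoff looks roughly like this (a sketch only; Cost Explorer data lags by hours and each query costs a cent, so this is a damage limiter rather than a hard cap, and the $100 threshold is just an example):

    import datetime
    import boto3

    LIMIT_USD = 100.0  # example threshold

    ce = boto3.client("ce", region_name="us-east-1")
    ec2 = boto3.client("ec2")

    def month_to_date_spend():
        # Naive: breaks on the 1st of the month, when Start == End.
        today = datetime.date.today()
        start = today.replace(day=1)
        resp = ce.get_cost_and_usage(
            TimePeriod={"Start": start.isoformat(), "End": today.isoformat()},
            Granularity="MONTHLY",
            Metrics=["UnblendedCost"],
        )
        return float(resp["ResultsByTime"][0]["Total"]["UnblendedCost"]["Amount"])

    if month_to_date_spend() > LIMIT_USD:
        # Stop (not terminate) running instances so data on EBS survives.
        running = ec2.describe_instances(
            Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
        )
        ids = [i["InstanceId"]
               for r in running["Reservations"] for i in r["Instances"]]
        if ids:
            ec2.stop_instances(InstanceIds=ids)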
Be aware that even with a built-in threshold, limit enforcement will not be real-time (due to parallelism, especially if multiple data centers are involved).
> but is going to be a business-ending error for any serious account
That is simply not true for every use case. We were using it for various backup jobs, until a misconfigured one racked up a big bill. This particular job was a second backup. I have no problem at all with that failing! Now I am scared to let the various engineers and consultants that make up my team use it, because I can't afford the time to check personally what they are doing.
Google App Engine used to do this (maybe still does?). Turns out that for most production services, having everything come to a screeching halt at the very moment your site suddenly goes viral is not desirable behavior.
What you want is billing alerts, so you can be notified if usage is suddenly (say) 2x usual.
> What you want is billing alerts, so you can be notified if usage is suddenly (say) 2x usual.
While billing alerts can help you against this kind of (semi) accidental overage, it won't do anything if the account gets pwned. The first thing the pwner is going to do is turn off the billing alerts.
A hard quota that required manual reentering of the credit card number (preferably in conjunction with 2fa) would help prevent that.
You don't set caps low enough to cut off traffic you actually want. You set them higher, at the point you're not willing to pay and can't afford.
But with the blunt instrument of a billing quota, how do you distinguish between traffic going 100x because you're #1 on Hacker News and everybody wants to buy your product, vs somebody pwning your VMs and using them for a DDoS botnet?
Good question; that's up to the developers, devops, and sysadmins to decide. If they don't want it, then they don't use it. If they do, then they have to be smart with it.
Why does it have to be a blunt instrument? Why not an email saying “you’ve reached 80% of your specified quota and are about to be shut off” and if you’re #1 on HN and seeing incredible sales numbers you can just go change the quota?
Or just not set a quota on your business site, but do set a quota on your personal profile page that makes you no money.
> you’ve reached 80% of your specified quota and are about to be shut off
Because that might happen very quickly. Maybe you've got a 100x sign-up boost in your SaaS app and you're missing out on the profit of your life?
Also, what do you want to shut down? EC2 instances, which might lose you data? Maybe no new instances, but your SaaS app needs those? Shut down all traffic? Delete your S3 files? All of those could be the possible reason for exploding cost, and suddenly cutting any of them could be fatal.
Nothing is going to be perfect but if you have those concerns you can just opt out.
So many people seem to forget that you’re not being forced to use any of the options available to you. You can choose not to enable that. But right now you can’t choose the opposite.
What you’re describing is called “implementation details” and they don’t negate the utility of the feature overall.
It doesn't work for every use case, but that doesn't make it worthless. I.e., if your homepage gets (as here) 30 TB of traffic in a short time, that's not just a "front page of HN".
First, what do you scope it to: data, CPU, API calls, one or multiple projects? This needs to be designed carefully.
Then, what do you base it on: money, or the unit used (e.g. data volume)? The latter is much easier.
Then, how do you technically implement the trigger? With many technical solutions at Amazon-like scale, it's hard to stop at an exact limit: either you do some interpolation and maybe shut down a bit before the limit is reached, or you trigger the shutdown when the limit is reached but then slightly overshoot it.
Lastly, how do you do the "shutdown" once the limit is hit?
But in the end all is quite doable. Just not trivial. But not super hard either.
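To make the interpolation point concrete, the trigger logic might look something like this (pure sketch, not tied to any provider):

    def should_shut_down(spend_now, burn_rate_per_hour, limit,
                         check_interval_hours=1.0):
        """Decide *before* the limit is crossed.

        Metering lags and checks run on an interval, so waiting for
        spend_now >= limit guarantees overshoot. Instead, extrapolate
        the current burn rate one check interval ahead and act early.
        """
        projected = spend_now + burn_rate_per_hour * check_interval_hours
        return projected >= limit

    # $180 spent, burning $15/h, limit $200, hourly checks:
    print(should_shut_down(180.0, 15.0, 200.0))  # False (195 projected)
    print(should_shut_down(190.0, 15.0, 200.0))  # True  (205 projected)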
I think it is underestimated how complicated it is to deal with cloud services. You can do a lot with minimal training and doc reading, but these holes in one's formal understanding of how everything operates create these kinds of vulnerabilities.
Everything can have side effects in the cloud. You can set up a cheap EC2 T-type fleet and, without managing your CPU usage, be charged a fair amount for unlimited burst credits (which is the default in Terraform, for instance).
You can set up Lambda triggers and quickly do a proof of concept for an app, but forget to correctly dimension your memory usage and be charged more than you need.
Cloud requires careful policy and topology consideration. There are many simple blocks that form a complex mesh, with opaque observability of potential vulnerabilities in both access and billing. Cloud is nice, but it requires time and care. And under the shared responsibility model, you are responsible for that.
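For the T-instance gotcha above, the fix is a one-call flip from "unlimited" to "standard" credits, which throttles instead of billing (a sketch; the instance ID is a placeholder):

    import boto3

    ec2 = boto3.client("ec2")

    # "unlimited" lets a T-type instance burst past its credit balance
    # and bills the surplus; "standard" throttles the CPU instead.
    ec2.modify_instance_credit_specification(
        InstanceCreditSpecifications=[
            {"InstanceId": "i-0123456789abcdef0",  # placeholder
             "CpuCredits": "standard"},
        ]
    )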
Saw the mention of PTSD and running towards danger and immediately thought, “oh nice hopefully I’ve found a kindred spirit”. This matters because I share little in common with my peers in this field. So I read Chris’ About page and...
Do any other (combat) veterans smell something wrong with an Air Force Tech Controller (3C2X1) making statements like ”like back in the old days, when something would go bang or boom, and I’d run towards it” in a civilian venue? You know exactly what I mean, and we see it all the time.
If you aren’t a veteran, especially with a job even remotely related to “running towards things that go boom” please just give us some space on this one. Thanks.
AWS bills are un-auditable. I'm convinced every org is being over-charged for bugs in their billing and tracking software. I've asked on multiple occasions where charges randomly started appearing (despite no infra changes), which weren't there the month before, and no one was able to answer on the AWS support side.
1. The big cloud providers charge enormously for outgoing bandwidth. Most of us know this, but unfortunately it bites people a lot.
2. If you host big files on these clouds with no limits or warnings, it's just a matter of time before this happens to you.
This is why I don't run hobby things on these clouds. A hobby project may have backends and services running on them, but NEVER anything user-accessible such as a webserver, S3/GCS bucket, or similar. It's just too much of a "click here to bankrupt me".
For a business it's a different matter. You are making money, and you're spending money to do so. You still need to have a DDoS plan for your outgoing traffic, but it's much easier to solve these problems if you have revenue. Revenue buys time and people.
Nicely written article. Interesting and one should definitely be aware of how certain services are charged before using them. This would be a good lesson for all of us :)
On a different note: recently I was looking to learn AWS concepts through online courses. After much research I finally found an e-book on Gumroad written by Daniel Vassallo, who worked on the AWS team for 10+ years. I found this e-book very helpful as a beginner.
This book covers most of the topics that you need to learn to get started:
Not directly related to S3 traffic bill, but overall cloud cost management.
Maybe some are unintentional, but still very painful. My experience with AWS & GCP.
- AWS CloudWatch: expensive service, virtually unusable, hard to turn it off.
- AWS overall: finding and cleaning up resources is messy. The order of creation and cleanup is not the same. Closing an account is a painful process. GCP's project structure is way easier.
- AWS EKS: You create a cluster, then a node group. Deleting the cluster fails if there is a node group. You go ahead and delete the node group; it complains because of "dependencies". While you're randomly looking for a "dependency", the $ clock is still ticking. You have to delete the network interface before you can delete the node group, and only then the cluster. This makes no sense: if the network interface was created implicitly by the node group, I should not be responsible for deleting it. There should be symmetry in create/delete operations (see the sketch after this list).
- GCP GKE: You create a cluster, then delete it. The cluster gets deleted - kudos, usability is much better than with AWS EKS. But it turns out lots of LoadBalancers and firewall rules are left over and still appear on the cloud bill. Those are implicitly created and should be cleaned up implicitly by GKE.
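To illustrate the EKS point, the teardown you're forced to sequence by hand is roughly this (a sketch with a hypothetical cluster name; stray ENIs or load balancers created by in-cluster Services can still block the last step):

    import boto3

    eks = boto3.client("eks")
    CLUSTER = "demo-cluster"  # hypothetical

    # Node groups must be gone before the cluster delete will succeed.
    for ng in eks.list_nodegroups(clusterName=CLUSTER)["nodegroups"]:
        eks.delete_nodegroup(clusterName=CLUSTER, nodegroupName=ng)
        eks.get_waiter("nodegroup_deleted").wait(
            clusterName=CLUSTER, nodegroupName=ng)

    eks.delete_cluster(name=CLUSTER)
    eks.get_waiter("cluster_deleted").wait(name=CLUSTER)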
I asked AWS to set a limit on my spending. They said that they did not want to do that so as "not to break my business".
I want them to - I do not care if my site is offline vs. having to pay a huge bill. That should be a choice.
So I moved away from AWS. It is crazy that companies agree to such a racket (not the pricing - but the fact that you cannot set a limit).
I considered using a virtual card with a limit on it; they could not grab more than the limit, and could only sue me from across the pond or close my account. But I refuse to play these games with a company that does not give a shit about billing.
A good alternative to this ever-present risk is to use a dedicated virtual private server that is unmetered. This would make mistakes like this (and yes, it is a mistake - it is his fault he didn't read the Cloudflare details and publicly served a large VM image) impossible.
This also (especially) applies to startups that might suddenly take off at any moment (but don't expect to.) AWS is a ticking time bomb of unexpected charges. You never know what the Internet will bring you. Go for an unmetered VPS and have 1 single well-defined charge that doesn't change. That's what I do on my side projects.
[1] I previously asked Dan, the moderator here, if I can share in this way and he said it's okay. I don't have other affiliation with that company and have found it good. The last time I posted this I got 80 visitors and no complaints (and got upvotes), so I figure it is a good resource for people.
> Moving it back to AWS from GCP bumped the AWS bill to an average of $23/month. Not too bad given the site’s traffic.
I've checked the traffic: it was 2.3k users for the entire month of June, like 75 users per day on average. That is effectively nothing. Why does the author think it's okay to pay 1 cent per user per month to a hosting provider? A $5/mo VPS can handle two orders of magnitude more.
Totally agree. I have alerts for 50% and 120%. If I get the 50% in the first week I'm digging in then. If I get the 120% I'm watching it like a hawk from then on (and letting accounting know).
The article says they would not have helped but it doesn't say why... Maybe because it's delayed by a day?
Having set up alerts would still have reduced the bill.
Still, assuming something like 2h passes before you can react to an email (quite reasonable), that would still have been ~$150 on the big day, which is ~6x the normal _monthly_ cost, in 2h...
And that is assuming the alerts are sent in real time, which they are not.
This is why I'm deathly afraid of using any major cloud provider.
External traffic is effectively unlimited, and a number of possible reasons (popularity, misconfigured script pulling something in a loop, someone intentionally generating traffic to hurt me) have the possibility to throw me into arbitrary amounts of debt, with the only recourse being hope that the cloud provider will be merciful.
Even if I have alerts set up: someone pulling 10 Gbit/s can generate over 100 TB per day, at $80-100 per TB. If I don't check my e-mails for a weekend, I can be $30k in the hole before I notice.
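The arithmetic there checks out, for anyone wondering (quick sketch):

    # 10 Gbit/s of hostile traffic, priced at $80-100 per TB of egress.
    tb_per_day = 10 / 8 * 86_400 / 1_000        # GB/s * s/day / 1000 = 108 TB
    for usd_per_tb in (80, 100):
        weekend = 3 * tb_per_day * usd_per_tb   # ~$26k-$32k over 3 days
        print(f"${weekend:,.0f}")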
I racked up a $120K+ Google Cloud bill via an unsupervised and poorly-coded script which used the geocoding API. It didn't take much to get Google to waive it. This happens all the time, I'm sure.
For a private person it's much better to use something like a private VM or a dedicated server. From Vultr or 1984hosting (Iceland) you can get a VM for just $2.50 (IPv6 only) or $5 (IPv4 and v6), or a dedicated server from Hetzner, OVH, or Scaleway (Arm64) for like $30; some have unlimited traffic (mostly the dedicated servers) on a 1 Gbit connection. NEVER use services where you pay without knowing what's coming (that goes for private use and small businesses).
Yes, AWS is crazy expensive. Their business model is simple yet super effective, and they are swimming in money, power to them: build custom services with custom APIs, make them very cheap/free for low-volume use and crazy expensive when usage goes up while the customer is locked into the services, then push enough money into marketing that it is everyone's first choice.
Not only is AWS very expensive, it's also rather hard to use, and all their forms and services are pretty difficult to navigate as well. It put me off the cloud hype for a very long time, until I discovered reasonably-priced* cloud providers like DigitalOcean (or Linode, Vultr, ...) with very easy-to-use platforms.
* of course still pricier than dedicated hardware/VPS; however, the premium for hourly billing and infrastructure maintenance is reasonable
Can someone argue that the complexity of the cloud does not easily surpass the time and effort of just setting it up yourself? And for once all of that time and effort into setting it up is actually valuable and you get much better insights into your own operations.
And the alternative is paying someone to lock you into their ecosystem.
- usually not even needed on real hardware. Focus on a sound architecture (which will reward you in everything you do). But depending on situation VMs are an excellent choice and it is easy to spin up more
- that is an antifeature
- seems trivial compared to keeping up with all the cloud gotchas
- possible anyway
- sure, but you probably don't need expensive hardware to start, or go with VMs
I had a $190,000 AWS bill for an account only used for static S3 hosting.
And guess what, I didn't write a blog post about it. I just went to support, said remove the charges, they identified the services that created the issue so I could kill them, and they removed the charge.
Look at that, no fanfare. I had no emotion about it whatsoever. Maturity.
I can see some utility in writing about it. His blog post quite literally mentions "heart palpitations", "shock", and "panic" - his words - and tweets, over a $2,000 bill.
I'm not in my ideal financial circumstances, I just wouldn't have freaked out over that. I just would have handled it. And then blown them up in public forums if things didn't go my way. Customer support in the 21st century.
“I had no emotion about it whatsoever. Maturity.”
Doesn’t sit right with me. Wanting to share shouldn’t be condemned as immature. Besides, some of us want to read tech drama :)
The maturity aspect is knowing how to navigate a situation, not a reference to merely sharing. "Heart palpitations", "shock", "panic" - his words - and tweets are about not knowing how to navigate a situation.
I'm not in my ideal financial circumstances, I just wouldn't have freaked out over a surprise $2000 AWS bill or a $190,000 AWS bill, fully intending not to pay it if I felt my activity did not warrant that.
There are no absolutes in guiding behavior, only consequences. And the consequence here is potentially having your AWS account deleted in a month or two, with several remedies in between.
Good reminder to everyone, including myself, to use services that have budget alerts or spike protection. And to stay away from AWS; even $23 for static website hosting is a bit too much in 2020. At a fairly priced host without a fancy name, 30 TB of traffic can cost <$10.
It's amazing that Amazon would let him run that massive instance for months without any successful payment. I would imagine other vendors would suspend the account after a few days of late payment. For example, DO was notorious for suspending accounts and deleting data after just a few days of late payment, and it caught many people off guard while on vacation (I think their late-payment policy is much more relaxed now, after it blew up a while ago).
The description sounds like that person is at fault. They have emails coming in for three months about thousands of dollars in charges and don't read them :(
Copying a 12 GB file and sharing it publicly on S3 (how does AWS make money with S3, anyone care to answer?). I agree it's expensive, maybe outrageously expensive. It's great he got a refund! Not many would be that lucky. And we can all learn.
The article shows again the importance of reaching out to AWS support for issues like this. They would rather have a long-term customer than a one-time score; they are really forgiving of one-time mistakes.
TL;DR: Don't leave 10+ GB VM images open to the world on S3 unless you want to pay everybody's bandwidth bills when they spin up a new instance using them. And set up billing alerts!
It's good customer service and leads to more profit in the long-run.
If they didn't refund him, maybe he'd jump to a different provider, maybe people would see this and not sign up for AWS in the first place, etc. By refunding him, they likely keep a customer, don't scare away other potential customers, etc.
It's a quite small sum in AWS terms; losing a customer could easily cost them more in future revenue. And especially for bandwidth, I'd assume their actual cost is way below the billed cost, so refunding it is cheap.
AWS is actually one of the few organizations I’ve dealt with where the CSRs seem to be empowered to do stuff like this without you having to raise a storm on social media first.
In my anecdotal experience, just submitting a support ticket explaining what got screwed up and you made a mistake and how you’re going to rectify it so it doesn’t happen again is usually enough to get the charges refunded. And even if you don’t submit all that, often they’ll just prompt you for it.
For example, when a legacy account got compromised and rang up like $20k in charges running a bunch of servers to mine some sort of crypto, we asked for a (totally undeserved) refund and they just had us rotate all our access keys and refunded it.
The support is part of the reason why I prefer dealing with AWS. I know from my time managing over a million bucks a year in AdWords spend that the reason Google’s support is non-existent isn’t because you’re not a paying customer...
Because AWS honestly cares more about having a happy loyal customer for the long term than making a few bucks in the short term. That’s why. It’s a rare mindset among large companies these days.
Exactly. It’s amazing how good customer service can be when the profit margins are high enough. It’s really only dumb companies and those with too much power that tend to combine high margins with bad customer service.
He made it to the HN front page. (Maybe after the refund, but still, I guess he has some track record: blog, Twitter, knowing people at AWS, whatever.) I seriously doubt AWS would refund everybody else making stupid mistakes. If they did, they would advertise it as "no billing surprises".