
Especially in a world where so much data now "lives" in the cloud. Between my dropbox, github, google photos, etc. Very little of my data only lives on a hard drive. The stuff that does lives on a Synology NAS and is mirrored to S3 Glacier weekly.



NAS mirrored to S3 Glacier. How much does it cost?


In my experience this is very cheap. I take it the parent is not retrieving from Glacier often/ever, which is where the significant costs go. It's a decent balance for disaster recovery.

I sync my photos to S3 (a mix of raw and jpeg, sidecar rawtherapee files) across a few devices so Glacier is prohibitively expensive in this regard, but I still pay <$100 a year for more stuff than I could ever store locally.


I did the math, and for me Glacier is great for backups where homeowners insurance is likely to be involved in the restoral. It was ferociously expensive for anything less drastic.


I'm trying to figure out the costs. To back up my NAS at full capacity, I need 10TB of storage. Using S3 Glacier Deep Archive, that seems to cost $10/month per full backup image I keep. That's not bad.

What's confusing is that the calculator has "Restore Requests" as the # of requests, "Data Retrievals" as TB/month, but there's also a "Data Transfer" section for the S3 calculator. If I add 1 restore request for 10TB of data (eg: restoring my full backup to the NAS), that adds about $26 for that month. Totally reasonable.

However, if "Data Transfer" is relevant, and I can't tell if it is or isn't, uploading my backup data is free but retrieving 10TB would cost $922! Is that right?

This is what has always deterred me from using AWS. It's so unclear what services and fees will apply to any given use case, and it seems like there's no way to know until Amazon decides that you've incurred them. At $10/month for storage and $26 if I need to restore, I can just set this up and I don't need to plan for disaster recovery expenses. But if it's going to cost me $922 to get my data back, I've got to figure out how to make sure my insurance is going to cover that. This isn't a no-brainer anymore. Also, what assurance do I have that the cost isn't going to be higher when I need the data, or that there won't be other fees tacked on that I've missed?

[1] https://calculator.aws/#/createCalculator/S3
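For what it's worth, the three figures above can be reproduced with back-of-the-envelope arithmetic. Here is a rough Python sketch, assuming the calculator counts 1 TB as 1024 GB and using per-GB rates inferred from the numbers quoted in this comment (about $0.00099 per month for Deep Archive storage, $0.0025 for a bulk restore, $0.09 for transfer out). The rates are assumptions rather than official prices, and the function names are made up for illustration.

```python
GB_PER_TB = 1024  # assumption: the calculator uses binary terabytes

# Per-GB rates inferred from the figures quoted in the thread.
# These are assumptions, not official AWS prices; check the current price list.
STORAGE_PER_GB_MONTH = 0.00099   # roughly $10/month for 10 TB
BULK_RESTORE_PER_GB = 0.0025     # roughly $26 to restore 10 TB
TRANSFER_OUT_PER_GB = 0.09       # roughly $922 to download 10 TB


def monthly_storage_cost(tb):
    """Cost of keeping `tb` terabytes in cold storage for one month."""
    return tb * GB_PER_TB * STORAGE_PER_GB_MONTH


def restore_cost(tb):
    """Break a full restore into the retrieval fee and the transfer-out fee."""
    gb = tb * GB_PER_TB
    retrieval = gb * BULK_RESTORE_PER_GB
    transfer = gb * TRANSFER_OUT_PER_GB
    return {"retrieval": retrieval, "transfer": transfer, "total": retrieval + transfer}


if __name__ == "__main__":
    print(f"storage per month: ${monthly_storage_cost(10):.2f}")
    for name, dollars in restore_cost(10).items():
        print(f"{name}: ${dollars:.2f}")
```

If these assumptions hold, the retrieval fee and the transfer fee are both charged on a full restore, so the two line items add up rather than one replacing the other.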


Glacier pricing can be hard to grok...

With Glacier as it is usually used, you don't read data directly from Glacier storage; it has to be restored to S3, where you then access it. That accounts for the restore charges and the delays: you can pay a low rate for the bulk option, which takes up to 24 hours to restore your data to S3. But the real cost is the bandwidth from S3 back to your NAS/datacenter/etc., which brings it up to about $90 USD/TB.

Other fees would include request pricing, some low amount per 1000 requests. So costs can go up a bit if you store 1 million small files to Glacier vs 1000 large files. There is also a tipping point (IIRC about 170KB) where it is cheaper to store small files on S3 than Glacier.

Depending on your data and patterns, it can be better to use Glacier as a second backup, which is what I do. All my data is backed up to Google Workspace, as that is "unlimited" for now. The most important subset (a few TB) also goes to Glacier. Glacier is pay as you go; there isn't some "unlimited" or "5TB for life" type deal that can change. If Google Workspace ever becomes not "unlimited" or something happens to it, I have the most important data in Glacier, and it's data that I have no qualms paying >$1k to get back.

But for me restoring from Glacier means that my NAS is dead (ZFS RAIDZ2 on good hardware) and Google Workspace has failed me at the same time.


Cool, thank you for the details. None of their marketing or FAQs for Glacier mention that getting the data back means going to S3 first and then paying S3's outgoing bandwidth costs. As deceptive as I expected.

I'll check out Google Workspace; that sounds like the right level of kludge for me, since this is the first time I've ever bothered to try to set up off-site backups. I only started using RAID a couple of years ago.


It makes more sense when you think of Glacier as a tier of S3, like Infrequent Access etc., which it is now. There used to be a standalone Glacier service, but you had to upload a blob and track the UID to file name mapping yourself. That skipped S3 but was far more complex.

Workspace makes more sense for large amounts of data where you can take advantage of the "unlimited".

Backblaze B2 might be what you are looking for as an in-between option.


Are you sure about the "restored to s3" bit? Their SDK seems to fetch directly from Glacier.

Note that the official name is "S3 Glacier", so from AWS's public perspective, it is S3.


> However, if "Data Transfer" is relevant, and I can't tell if it is or isn't, uploading my backup data is free but retrieving 10TB would cost $922! Is that right?

That's right. AWS charges offensive prices for bandwidth.

There are alternate methods to get data out for about half the price, or you can try your luck on using lightsail and if they don't decide it's a ToS violation you could get the transfer costs to around $50.


The $922 sounds about right. That jibes with my estimates.

There's another (unofficial!) calculator at http://liangzan.net/aws-glacier-calculator/ you can toy with.


Thanks, I'll check it out.


What would that process look like in practice, should you need to call the insurance guys? As in, would you claim the cost to retrieve the data, or something else? (This question is general, regardless of the actual country.)


I honestly don't know. I've never had to use it.


I don't have automated mirroring set up, but I have insight.

I use a free Windows tool called FastGlacier. I set up an IAM user on my AWS account for my backups, and use those creds to log in. Then it's drag and drop! You can even use FastGlacier to encrypt/decrypt on the fly as you upload and download.

Glacier is cheap because the retrieval times are very slow - something like 1-12 hours depending on the tier.

I have about 100GB of critical data. Personal documents, photos and some music I don't want to have to search for if the house burns down. It's something like a dollar a month. Less than a cup of coffee.


Deep Archive is super cost effective, $1/TB/mo. For the house-burns-down scenario, I don't mind the 24hr retrieval time


Glacier is about $1 USD/TB/month just for storing data. If you need to retrieve it, that ends up being about $90 USD/TB, most of which is bandwidth charges.


That means that if you store the data for much more than half a year, Glacier becomes more expensive than storing on tapes.

Of course, tapes require a tape drive, and it would take a lot of data to offset its cost, but at such a high cost of retrieval it would not take much data to equal the cost of a tape drive.

Glacier is OK for a couple of TB, but for tens or hundreds it would not be suitable.
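To put rough numbers on that comparison, here is a small Python sketch of the break-even point. The Glacier rates ($1/TB/month storage, $90/TB retrieval) are the ones quoted upthread; the tape drive price and the per-TB media price are purely hypothetical placeholders to be replaced with real quotes, and the function names are invented for this example.

```python
def glacier_cost(tb, months, restores=0,
                 storage_per_tb_month=1.0, restore_per_tb=90.0):
    """Cumulative Glacier cost: monthly storage plus any full restores.

    Default rates are the approximate figures quoted upthread.
    """
    return tb * (storage_per_tb_month * months + restore_per_tb * restores)


def tape_cost(tb, drive_cost, media_per_tb):
    """One-off tape cost: the drive plus enough media for `tb` terabytes.

    `drive_cost` and `media_per_tb` are hypothetical inputs, not real prices.
    """
    return drive_cost + tb * media_per_tb


def breakeven_months(tb, drive_cost, media_per_tb, restores=0,
                     storage_per_tb_month=1.0, restore_per_tb=90.0):
    """Months of storage after which Glacier has cost more than tape."""
    gap = tape_cost(tb, drive_cost, media_per_tb) - tb * restore_per_tb * restores
    return max(0.0, gap / (tb * storage_per_tb_month))


if __name__ == "__main__":
    # Hypothetical example: 100 TB, a $3000 drive, $6/TB of tape media.
    print(breakeven_months(100, drive_cost=3000, media_per_tb=6))
    print(breakeven_months(100, drive_cost=3000, media_per_tb=6, restores=1))
```

With no restores, the break-even is driven by the drive price spread over the amount of data; a single full restore at $90/TB can wipe out the gap entirely, which is the point being made above.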


> Of course, tapes require a tape drive, and it would take a lot of data to offset its cost, but at such a high cost of retrieval it would not take much data to equal the cost of a tape drive.

But the less you expect to use it, the less this matters.

So I'd put the break-even point a bit higher. Tape is good for 100TB or more but for tens it's hard to justify a tape drive.

Also it's important to remember to get those tapes offsite every week!


Few bucks a month. More if I needed to retrieve something.


But Dropbox, Github, Google photos etc rely on massive piles of hard drives.


Sure, but the comment was about personal storage. Fault tolerance at the edge is less important in a cloud world.



