It's still pretty expensive for archival storage without instant access requirements. Can somebody please do a tape based cloud storage service with slow file access but much cheaper per TB storage costs?
My start-up (http://degoo.com) is working on a P2P backup system that I think will match your requirements quite well. We are launching our beta tests within the next couple of months. </shamelessplug>
Half price is of course better, but for my use case, I want something ten times cheaper. I'm willing to trade convenience and latency for that cost saving.
Use case: I want to be able to shoot several minutes of video on my cell phone and keep it backed up in the cloud without feeling like I've incurred a $1/year charge for the rest of my life.
I'm not sure that with today's technology you could go cheaper than 4c/GB for a service like that. Doing it on your own would be $50 for a 3TB cartridge == ~2c/GB.
For some data, I could wait a week. I take pictures and movies that take up a lot of space; I don't want to ever lose them but they are not mission critical in any way, shape or form.
I store everything on a NAS + another NAS as backup; it's a fairly safe setup as long as my house doesn't burn down, isn't flooded or burglarized.
I would love an offsite backup solution that would be cheap because of very slow reads: if my house burns down it'll be some time before I can access those backups anyway.
Yes. If I could pay $10/TB/month, request a list of files via API, and be notified via webhook when I could retrieve them (and therefore, retrieve them while they're cached on spinning media for 24 hours), I'd definitely do it. I've got tons of video, pictures, and music to back up, and I don't need immediate access to backups.
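No such service exists in this thread, so the following is purely a hypothetical sketch of that request-then-webhook workflow, modeled in memory rather than over HTTP. The class name, methods, and instant staging are all invented for illustration:

```python
import time

class ColdArchiveMock:
    """In-memory stand-in for a hypothetical tape-backed archive service.

    Retrieval is asynchronous: you request files, a callback (the webhook
    stand-in) fires once they are staged to spinning disk, and they stay
    readable there for a limited window (24 hours in the comment above).
    """

    CACHE_SECONDS = 24 * 3600  # how long staged files stay on disk

    def __init__(self):
        self._files = {}   # name -> bytes, "on tape"
        self._staged = {}  # name -> expiry timestamp

    def upload(self, name, data):
        self._files[name] = data

    def request_retrieval(self, names, webhook):
        # A real service would take hours to stage from tape; here it is instant.
        now = time.time()
        for name in names:
            if name in self._files:
                self._staged[name] = now + self.CACHE_SECONDS
        webhook({"ready": [n for n in names if n in self._staged]})

    def read(self, name):
        expiry = self._staged.get(name)
        if expiry is None or time.time() > expiry:
            raise KeyError(f"{name} is not staged; request retrieval first")
        return self._files[name]


archive = ColdArchiveMock()
archive.upload("vacation.mp4", b"...video bytes...")

ready = []
archive.request_retrieval(["vacation.mp4"],
                          webhook=lambda event: ready.extend(event["ready"]))
print(ready)                         # ['vacation.mp4']
print(archive.read("vacation.mp4"))  # b'...video bytes...'
```

The point of the mock is the shape of the API: writes are cheap and synchronous, reads are a two-step request/notify dance, and the provider only pays for spinning media during the retrieval window.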
I was looking into your problem. Just out of curiosity: even if you produce 100GB per day, what stops you from purchasing a 3TB (compressed) tape[1] for $54 and do backups on your own and store cassette offsite in your secure deposit box (some banks give those for free, or $30/year)
> what stops you from purchasing a 3TB (compressed) tape[1] for $54 and doing backups on your own
The tape drive costs over $1,000; you would need more than 33TB of (uncompressed) storage before that becomes worth it compared to buying hard disks and hot-plugging them when needed.
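A rough break-even check, using the figures quoted in this thread (a ~$1,000 drive and $54 per 3TB-compressed cartridge; the ~$50/TB hard disk price is an assumption):

```python
# All figures are assumptions pulled from this thread, circa LTO5 pricing.
drive_cost = 1000.0          # the tape drive itself
tape_per_tb = 54.0 / 3.0     # $/TB on tape, assuming 2:1 compression
hdd_per_tb = 150.0 / 3.0     # $/TB for a ~$150 3TB hard disk

savings_per_tb = hdd_per_tb - tape_per_tb  # $32/TB saved by using tape
break_even_tb = drive_cost / savings_per_tb
print(break_even_tb)  # 31.25 -- roughly the ~33TB figure above
```

Below that volume, the drive's fixed cost eats the per-TB savings and plain hard disks win.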
I've seen the tech side work. I worked at Fermilab on the USCMS side of the LHC for data taking; we had 5PB of spinning disk and tens of PB of tape in large automated tape silos with robotic arms on tracks. Data requested for processing from tape would be staged to disk, the job requesting the data would run, and then the data would be purged from disk after X hours/days.
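The stage-then-purge policy described above can be sketched as a simple expiring disk pool; the dataset names and TTL below are made up for illustration:

```python
import heapq

class StagingPool:
    """Toy model of a tape-to-disk staging area: requested datasets are
    copied to disk, jobs read them there, and anything older than a
    configurable TTL is purged to make room (as in the setup described
    above, where staged data expired after X hours/days)."""

    def __init__(self, ttl_hours):
        self.ttl = ttl_hours
        self._expiry = []     # min-heap of (expiry_time, dataset)
        self.on_disk = set()  # what a job could read right now

    def stage(self, dataset, now):
        # "Copy" the dataset from tape into the disk pool.
        self.on_disk.add(dataset)
        heapq.heappush(self._expiry, (now + self.ttl, dataset))

    def purge_expired(self, now):
        # Evict everything whose TTL has elapsed.
        while self._expiry and self._expiry[0][0] <= now:
            _, dataset = heapq.heappop(self._expiry)
            self.on_disk.discard(dataset)


pool = StagingPool(ttl_hours=48)
pool.stage("run2011A", now=0)
pool.stage("run2011B", now=24)
pool.purge_expired(now=50)   # the dataset staged at t=0 has expired
print(sorted(pool.on_disk))  # ['run2011B']
```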
It's easily done. Is there a business strategy behind it? Not sure.
Edit: I didn't answer your question. If I backup to tape and put the tape somewhere offline, I can't get access to it without a physical trip. I'm willing to trade access latency for lower cost, but I still want to move bits and not atoms.
For certain scales and usage patterns.
A 1.5TB* tape is about $50. However, it (1) should last longer than an HDD and (2) uses much less power, cooling, floor space, etc.
* Tape also has really nice compression (essentially you would be using the on-tape compression at least 95% of the time), averaging about 2:1 (and often much higher). But you can also compress files on disk -- so for a fair comparison it's best to stick to native capacity.
Also -- Things like the SL8500 are such beautiful pieces of machinery. Computers + Robots = Awesome!
Editing to Add: LTO5 was released in 2010. LTO6 is likely to arrive this year with 3.2TB Native Capacity. (And the usual speed bumps of 50% Write // 100% Read speed). The roadmap for the tech is here: http://www.spectralogic.com/common/images/products/lto/lto5/...
It is hard to find pricing for the equipment, which makes me think it is expensive, but it could still potentially be cost effective. However, it seems unlikely to come in at 10% of Amazon, i.e. at $150 per TB per year. You might be better off building your own robot for this application too, with more storage space.
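For context on where $150/TB/year comes from: at S3's first-tier price of roughly $0.125/GB-month around this time (an approximation), a TB-year is about $1,500, and 10% of that is $150:

```python
# Approximate S3 first-tier price at the time; treat this as an assumption.
s3_gb_month = 0.125

amazon_tb_year = s3_gb_month * 1000 * 12  # $/TB/year at that rate
print(amazon_tb_year)         # 1500.0
print(0.10 * amazon_tb_year)  # the $150/TB/year target above
```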
Spun down hard drives would be a similar price. Not necessarily easier to work with.
I think spun down HDs would be a lot easier. They have standard interfaces which makes them cheap and flexible. Get a bunch of USB 3 docks[1] and some $10/hour clerks to shuffle the drives around.
Forget the clerks and USB. Get the very cheapest USB to ethernet chipset you can, on 100 mb ethernet, and only power it up when you need drive access. You should be able to do that at $10-20 a disk. No moving parts to maintain.
The big question is scale: tape robot systems are expensive to purchase and somewhat less reliable than HDDs, so you have significant infrastructure costs bringing up the storage system, establishing a system of off-site rotation and redundancy, testing restores on separate drives (one common failure mode is a tape drive slowly drifting out of alignment, writing data that cannot be read by a correctly calibrated drive), etc.
These are great examples of something which could be commoditized so a provider could amortize them across many customers, but the margins are continually tightening.
There are other providers that are definitely cheaper. Especially thanks to OpenStack Object Storage, you can find different services with the same API, so you have an open door to leave at any time.
While Amazon keeps reducing the price of storage, I don't think they have ever reduced the price of requests. :( I would have expected the cost of every component of a request (CPU, bandwidth, etc.) to have gone down over time, but it is still 1,000 PUT/LISTs per $0.01 (or 10k GETs, also per $0.01). :(
(These request costs actually add up quite quickly: I spent something like $8k on PUT/LIST last month. At one point I found out a client library I was using made the horrible decision to verify the bucket existed with a LIST request each time you made a connection; I tracked that down to $70/day I was losing.)
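Back-of-the-envelope on that $70/day leak, at the $0.01-per-1,000 PUT/LIST rate quoted above:

```python
# Request pricing from the comment above: $0.01 per 1,000 PUT/LIST requests.
cost_per_request = 0.01 / 1000
daily_spend = 70.0  # the leak traced to the client library

requests_per_day = daily_spend / cost_per_request
print(round(requests_per_day))          # 7000000 -- seven million LISTs a day
print(round(requests_per_day / 86400))  # 81 -- extra LISTs every second
```

One gratuitous LIST per connection, at ~81 connections a second, is all it takes.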
Amazon started cutting prices long before there was any competition. I don't think this is due to Google, as for most people the difference in network costs depending on whether you use EC2 matters more. It is probably more to do with hard drive prices normalizing after the flooding...
They specifically want to economically encourage you to bring the whole house with you to get the cheapest rate (total commitment), or to build a larger house than you otherwise might. A continuous price reduction per unit rewards everyone as they go; Amazon very specifically doesn't want that (users would gain benefit at every step and be rewarded just for being at any given level, encouraging no particular scale or usage).
The price points are supposed to be mental carrots, in other words. It can of course be debated which approach is better, but I have to suspect Amazon has a lot of data on scaling and pricing from retail behavior.
Yes and no. Correct me if I'm wrong, but tipping over into the next tier doesn't change the price of the previous data, only the data that's in the next tier.
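That's how marginal tiers work, and they can be sketched in a few lines; the tier sizes and prices below are hypothetical, not Amazon's actual rates:

```python
def tiered_cost(tb, tiers):
    """Price storage marginally: each tier's rate applies only to the
    bytes that fall inside that tier, never retroactively to earlier data.

    `tiers` is a list of (tier_size_in_tb, price_per_tb) pairs, with the
    last size set to float('inf') to catch everything beyond the tiers.
    """
    cost, remaining = 0.0, tb
    for size, price in tiers:
        used = min(remaining, size)
        cost += used * price
        remaining -= used
        if remaining <= 0:
            break
    return cost

# Hypothetical schedule: first 1TB at $125/TB, next 49TB at $110/TB, rest at $95/TB.
tiers = [(1, 125.0), (49, 110.0), (float("inf"), 95.0)]
print(tiered_cost(1.0, tiers))  # 125.0
print(tiered_cost(2.0, tiers))  # 235.0 -- the first TB still costs $125
```

Crossing into a cheaper tier only discounts the marginal bytes, which is exactly the "mental carrot" structure described above.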
What I find slightly worrying is that the cost of EC2 instances does not decrease more with time. DRAM prices have fallen quite a bit since EC2 started, and yet the cost of EC2 remains relatively stable.
Look at EC2 pricing: instance pricing is pretty much a function of RAM size. I realize RAM isn't everything, but it is a large variable factor, with other cost factors either being constant or not growing proportionally as fast.
I'd still argue that we should see bigger drops in pricing, especially for the large-memory instances.
They're really using RAM as a proxy for other real costs. The amount of RAM you allocate is a pretty good predictor for server/VPS utilization -- which means their actual power, bandwidth and hardware replacement costs as you wear out hardware faster. Virtually all unmanaged VPS and dedicated hosting companies make RAM the primary factor in determining the monthly service cost even though the price-per-gigabyte is more each month than it'd cost to buy the physical RAM outright. In other words, it's a mistake to think that the actual cost of RAM is the reason pricing is proportional to RAM size.
CPU is a better proxy for utilization than RAM, no?
CPU uses far more energy than RAM, and I can write a low-CPU web cache that uses 12GB of RAM or a high-CPU scientific simulation that fits in L2 cache.
Amazon also does not charge for inbound bandwidth. I'm not sure what you're getting at. They also had to do the comparison with the additional support package from Amazon to inflate the price to make their pricing model look more appealing.
http://willj.net/static/amazon_s3_old_and_new_price_comparis...