The problem is that the size of media has been growing exponentially.
Whenever I wonder why my phone is running out of space, it's always images / videos. Even when you look at an app that is like 400 MB, it's not 400 MB of code, it's more like 350 MB of images and 50 MB of code.
I’d argue media storage usage is starting to level off somewhat because we’re approaching the limits of human perception. For movie content, people with average to good eyesight can’t tell the difference between 4K and 8K.
Environmental regulations also bite: an 8K TV that’s “green” is going to have to use very aggressive auto-dimming. Storage capacity growth, to me, looks like it’s outstripping media size growth pretty handily.
Now this isn’t to say I can’t think of a few ways to use a few yottabytes of data, but I don’t think there is a real need for it in media consumption. You might see media sizes increase anyway, because why not store your movies as 16K RAW files if you have the storage, but such things will become increasingly frivolous.
I would agree with you, but as technology improves we move the goalposts.
iPhones, for example, capture a small burst of images at the same time that can be replayed as a short animation (or loop), called “Live Photos”.
I am certain the future holds video that lets us pan left and right.
Interestingly enough, I've been messing around with ffmpeg recently, and the newest high-end codecs (VVC / H.266) drop HD video sizes by 30% or more. It's pretty crazy.
It'll be very interesting to see where AVIF and similar next-generation image formats go in the near future; hopefully we'll get some relief from the exponential growth.
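If you want to poke at VVC yourself, one route is to decode to raw YUV with ffmpeg and feed that into Fraunhofer's vvencapp encoder. A minimal sketch, assuming the source is already 1080p24 (the flags follow the vvenc README example; file names, resolution and framerate are just placeholders, adjust to your source):

    # decode to raw 4:2:0 YUV first (the intermediate file gets big)
    ffmpeg -i input.mp4 -pix_fmt yuv420p input_1920x1080.yuv
    # encode with vvencapp; -s and -r must match the decoded video
    vvencapp --preset medium -i input_1920x1080.yuv -s 1920x1080 -r 24 -o output.266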
>I’m sure we’ll see it rise, especially in the age of generated textures and materials.
If generative AIs get good enough then I suppose at some point the data transmitted for games and media could be significantly less than now -- you'd 'just' need to transmit the data required for the prompt to generate something within some bounded tolerance.
Imagine a game shipping with no textures, generating whatever it needs on the fly and in real time from some shipped set of priorities/prompts/flavors.
We're not there yet, but it seems like on-the-fly squashing of concepts into 'AI language' is going to be a trend in lossy compression at some point.
There are actually a lot of procedural games out there (I think No Man's Sky uses some of those techniques), and they have definitely been around since the 80s. The thing now is that the fidelity can be much higher, for sure.
I remember being a kid at Babbage's at the mall in the 90s, and some guy told my friend and me that he had just built a system with 8 gigs of storage. My friend and I talked about it endlessly as the coolest thing ever.
While I agree, it's been hard filling up the 2TB drive in my laptop.
My home server has a couple dozen terabytes (on spinning metal) and, with current fill rate, it's predicted it'll need an increase in space only after two of the drives reach retirement according to SMART. It hosts multiple development VMs and stores backups for all computers in the house.
Another aspect is that the total write lifetime is a multiple of the drive capacity. You can treat a 256TB drive as a very durable 16TB drive, able to last 16 times more writes than the 16TB one.
Don't even have to set sail; this landlubber likes to shoot videos with a smartphone, and these days, recording a few minutes of a family event, or even your plane taking off, in decent quality will easily give you a multi-gigabyte video file. And that's for normal videos; $deity help you if you enable HDR.
And yes, this is the universal answer to "how much storage is enough" - use cases will grow to consume generally-available computing resources. Today it's 4k UHD + HDR; tomorrow it'll be 8k UHD + HDR; a few years later it will be 120 FPS lightfield recording with a separate high-resolution radar depth map channel. And as long as progress in display tech keeps pace, the benefits will be apparent, and adoption will be swift.
I'll be curious to see the file sizes for Apple's version of 3D video capture in their Vision goggles. After one, two or three generations, I'm sure the first gen files will look small and lacking.
I've actually found my videos are not increasing as rapidly as I would expect. I've been re-encoding in x265 and the file size difference is shocking. Right now I'm not ditching the existing original files, but I may do that at some point, or just offload them to a cloud service like Glacier.
I’m right up next to a limit on live (easily-accessible, always visible in photo apps) cloud storage, with years of family photos and video taking about 95% of that.
I definitely don’t want to delete any of it, so I have been just hoping for bigger storage to be offered soon, but…
I hadn’t considered that re-encoding could be an option. I take standalone snapshots of everything every few months so if re-encoding would make a significant difference I might have to try this.
Do you have any tips on tools, parameters etc. that work well for you, please?
I use a shell script with ffmpeg. I encourage you to check out what works best for you, but honestly the quality is pretty stellar with just a really simple one like:
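(Roughly this, going from memory; the input/output names are placeholders.)

    # single-pass constant-quality HEVC encode; audio is copied untouched
    ffmpeg -i input.mp4 -c:v libx265 -crf 26 -preset medium -c:a copy output.mp4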
That's a fast single-pass constant-quality encode - a two-pass encode would give better quality for the size, but I find this very acceptable. It knocks down what would be a ~2gb file to somewhere between 800mb and 1200mb with very reasonable quality, sometimes even more: I've seen a 5gb file become a 400mb file (!!). You can experiment with the -crf 26 parameter to get the quality/size tradeoff you like. I run that over every video in the directory as a cron job, basically.
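The batch part is nothing fancy, something in the spirit of this (directory and naming scheme are just examples):

    # re-encode every mp4 in the directory, skipping anything already converted
    for f in /path/to/videos/*.mp4; do
        case "$f" in *.x265.mp4) continue;; esac   # don't re-encode our own outputs
        out="${f%.mp4}.x265.mp4"
        [ -e "$out" ] && continue                  # already converted
        ffmpeg -nostdin -n -i "$f" -c:v libx265 -crf 26 -preset medium -c:a copy "$out"
    done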
I think, for me, it satisfies some kind of hoarding instinct. I have a hard time keeping 'random junk' laying around my apartment, but I have absolutely no problem keeping a copy of a DVD I ripped 15 years ago that I will probably never watch again, and would probably be upset if it disappeared for some reason.
Blu-rays can take up 25gb each, so just a decent collection of those could easily consume most of one of these drives. If you want to do basic model tuning in Stable Diffusion, each model variation can take 7gb; this level of storage would mean you could almost set up a versioning system for those. And finally, any work with uncompressed data, which can just be easier in general, could benefit from it.
Even with brand new 25TB 3.5" drives, it's 10 of them, each holding 1,000 movies, for a total of 20,000 hours of entertainment or, roughly, 2 years of uninterrupted watching.
Oh look at Mr. “I pay legitimate streaming services for all my tv shows and movies” over there. (=
I have a 12 TB NAS that is 99% full at the moment. Should I delete movies I may want to watch later, knowing full well they aren’t easily available on the streaming services I pay for? Ha.
It's interesting to think that, as flash densities surpass hard disks, it'll become cheaper to store data on flash than on spinning metal once you factor in rack space and power consumption.
For the kind of usage a streaming device gets, an SSD is overkill; spinning metal is probably a better choice there. OTOH, 256TB of spinning metal takes up space and is quite noisy.
I vividly remember seeing a 5TB drive at Fry's Electronics sometime around 2010-2013 and thinking to myself, "Who in god's name would ever need that much space?"
But practically, don’t you reach a threshold where storing that much data on one drive makes it a bottleneck and a safety risk until the speed of the surrounding systems catches up?
I do digital histology, and our (research) lab currently has 204TB of image files. They live in a data center, of course, but if my institution decided to spin us off as a company or something and we needed to move the data, it'd be way faster to download it to disk and upload it at the destination center. I'm not really sure we'd do it with just one giant drive instead of a whole lot of 1TB ones, but who knows.
(I'm currently working on sending 100TB of images to some colleagues at the NIH for a study, we're doing it about 500GB a night for the next year or however long it'll take just because there's no hurry on the data, so it's not just some academic thought exercise!)
Exactly, we just gobble up all the storage there is. In diagnostics it's easily 250GB per patient just for HEs. And if stuff like CODEX or light-sheet (or some other 3D) microscopy becomes commonplace, even these drives won't be enough.
That's just five 20TB drives. Some cloud providers will copy your data onto drives you send in, and it's usually cheaper than the bandwidth cost. Same for ingress.
A drive these days is a CPU, memory, and some flash chips. If the CPU and memory are swappable (they aren't in consumer SSDs, no idea about enterprise), then one drive today is really many independent pieces of storage media. Thus, you'd imagine the failure case to be more like the failure case for one entire storage server (pipe drips on it, tornado sucks it up) rather than worrying about the failure case for mechanically-linked hermetically-sealed platters spinning at high speed.
There is always the chance that all the flash chips fail at the same time because of a manufacturing defect. That has always been the gamble with multiple drives as well; there are many documented cases of all the drives in a RAID array failing at the same time. (This happened to me once! Terrible shipping from NewEgg damaged all 3 drives in my 3x RAID-1 array. I manually repaired them by RMA-ing the disks one at a time; fortunately different blocks failed on different drives, and with 3 drives you could do a "best of 2".)
No matter how many independent drives you have, you will always need your data stored in multiple data centers to survive natural disasters. So I don't see 256TB in one device any differently than putting 32 8TB SSDs in one server. If you need that much storage, you spend less of your day plugging it into your computer. Savings!
I bought my first HDD in the mid-90s: 850MB. There was a 1.2GB model, but I thought, "Why would I ever need that much space?" This was before videos, before mp3s, and images were all low quality jpegs.
Back in the summer of 1989, I did an internship at Imprimis (later bought by Seagate). The big thing they launched that summer was the Wren VII hard drive, which was the first consumer hard drive with 1GB of storage. It was massive!
I learned a lot about Statistical Quality Control that summer, and built them some tools for improving their SQC across all their models.
Yes, there are big data applications where direct-attached storage density has a large impact on the economics of working with and analyzing that data. This is mostly due to bandwidth constraints and the fact that many analytical workloads can run efficiently with limited compute/memory relative to storage. Using a vast number of servers when they are not actually required introduces its own challenges. Sensing and telemetry data models sometimes have this property.
Yeah, when you're trying to pack as much storage as possible into a small space, such as data centers with expensive real estate. Or where distance is constrained due to latency, such as with high-end machine learning workloads.
If I have a century’s worth of point-in-time data, and I want to quickly run various tests against it: data marshaling will kill my soul if I’m using an HDD or various incarnations of hot-cold disk schemes.
Granted, I’ll have to read it in 1TB “pages” since motherboard and RAM engineering haven’t gotten us very far.
They build servers now with many smaller SSDs attached; maybe with this large one, all the maintenance processes get encapsulated in one component, so you can think of it as a NAS placed directly inside the server.
Datacenters where you are paying a huge premium on every square inch of physical space used. Also currently every other common way to store this much data eats a lot more power and generates more heat.
I think any data center would want some of these as long as the speeds are enough to reconstruct a disk after a failure in a reasonable time in a RAID configuration.
I’d argue they already are; it’s not infeasible to get ~320TB into an overstuffed NAS currently (two 10-drive arrays, each with two drives as parity), with a few drives running overly hot, and we’re seeing HAMR hit the market now, so density should easily double before too long. We’re at the point where you can build a home movie library the size of a streaming site’s library for less than 10k without needing a rack. If you drop density a bit and use used enterprise SAS drives, you could probably get it done for 5k plus a decent power bill. Still inaccessible to most, but plenty accessible to an enthusiast with some disposable income.
QLC would be nice for such a home application over noise/space/power usage concerns but the cost is still extremely high.
I can absolutely tell the difference, and it’s frankly annoying to collect something that will become obsolete not just within your lifespan, but while you likely still have a lot of lifespan left to go. 4K Blu-rays, to me, hold up against 5K/6K/8K raw footage, to the point that I wouldn’t be too troubled to have an old format on my NAS when an h.267 8K version came out.
You can have a great experience by streaming videos directly off shady streaming sites and using a cheap Hisense TV but there will always be a market for people dropping tens to hundreds of grand on some elite home theatre with inky blacks and perfect sound and maybe a datacenter storage array in their garage. It’s practically a hobby.
You know, I’ve never actually bought a Blu-ray before; maybe I’ll pop one in my PS5 and see if I’m blown away, or, more likely, if the other sources start having things I can’t unsee.
I did recently get back into piracy and did a couple of comparisons, since I figured 4gb h265 is so last decade, and I really wasn't amused by the larger Blu-ray rips. I have 20/10 visual acuity and can also see a broad, vivid color space. I have great monitors and screens chosen for that color space and quality, too.
Honestly, if we compare a 10gb compressed re-encode to a 4K Blu-ray rip or remux, I would have trouble telling the difference, but 4gb is pretty small, especially for certain kinds of movies.
There are certain scenes where I can definitely see compression artifacts or whatever, and others where the quality differences can't be noticed.
My guess: this probably solves the issue of compression artifacts creating annoying blocks of color in dark/black areas of the video, which is increasingly important since, over the past decade, movie and TV show makers have all switched to shooting everything ridiculously dark.