
While my eBook library is pretty something, the storage costs to rip my movie library onto hard drives would be cost prohibitive. I do keep physical discs of the movies I really enjoy, and generally only buy digital during sales. Most of the "value" in my digital movie collection comes from digital copy redeems.

Vudu does $5 sales all the time, and the way I see it, that's less than a movie ticket. So if I see it once, it was like going to see a movie, but with the theoretical hope of long-term persistence. The fact that MA insulates the risk of losing your titles across every major tech company adds to the comfort level now too, and as noted, nobody is losing their UV library when UV shuts down.




How many movies do you have? A 5TB hard drive is slightly over $100 and will hold a minimum of 100 Blu-ray rips, assuming they use the entire disk capacity and you don't reencode them to something smaller.
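
For anyone who wants to rerun that estimate with their own numbers, here is a minimal Python sketch of the arithmetic. It assumes decimal units (1 TB = 1000 GB) and treats 50 GB, a full dual-layer disc, as the worst case per rip; the function name is purely illustrative.

```python
def discs_per_drive(drive_tb: float, disc_gb: float) -> int:
    """How many rips of disc_gb each fit on a drive of drive_tb (decimal units)."""
    return int(drive_tb * 1000 // disc_gb)


# Worst case: every rip fills a dual-layer disc (assumed 50 GB).
print(discs_per_drive(5, 50))
# Single-layer sized rips (assumed 25 GB) double the count.
print(discs_per_drive(5, 25))
```

Swapping in a smaller per-rip size shows how quickly the count grows once the rips are re-encoded.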


Well, for one, hard drives also fail. So I need backups, unless I want to have to sit and re-rip everything when it happens. I'm a big fan of RAID 1 since I don't have horrific rebuilds, and I offsite my data as well. (Most of my data is stored on five hard drives currently.) So take your estimated drive costs and start multiplying.

Second, my Vudu library contains over 600 movies and significantly more television shows than that. And before you calculate the cost of the storage, bear in mind there's another, secondary cost: The much higher cost I'd have spent to have built that library solely on ripped discs. I never paid full price for a digital purchase. (And specifically, never more than it would've cost to acquire it via disc.)

Third, time. I'm also significantly backlogged in organizing my ebooks, which I do store on drives. Blu-ray ripping time just isn't something I want to dedicate a big chunk of my life to.


I totally feel your pain. I have been slowly building a local search engine for navigating the ebooks, PDFs, and other documents I've got stored on my NAS device.

I ended up replacing my NetApp StorVault (6TB (4TB usable), RAID6) with a FreeNAS box from iXSystems (24TB (20TB usable), RAID6) which cost me a bit more than $2500. (It is also much quieter than the StorVault was :-).) And on that 20TB I've stored a bit more than 200 book volumes that I had digitized at 1DollarScan, probably close to 1000 magazines (generally scanned manually with my ScanSnap 1500), and perhaps as many PDF documents that were never printed to begin with (like data sheets). I also have my music collection, and I have recently looked at putting movies on there too, since the streaming services are letting me down here as well.

At the end of the day it is the 21st century version of one's personal library.


I have had a FreeNAS mini for 5 years now. Best 2K on tech I have ever spent. So much better than rolling my own (which I used to do). 4x4TB (2 striped 4TB mirrors). Thinking about replacing the drives with 8TB ones. I had one drive fail in 5 years.


Currently I can only really do metadata search, but I'd really love to set up fulltext at some point. I have this dream where someday I can do things when the Internet is down.

The other thing I'm interested in, and wonder if you have thoughts on: When you pass, your personal library is inherited. Will people know what to do with your digital library? Will it be useful? Will people want it?


My goal is exactly that, usability without me present :-).

The tools for processing PDFs into searchable text have a lot of warts. For a while IBM was offering a free Watson service to do this (now it's part of Watson Discovery), which has some warts of its own. I did manage a set of Perl scripts that would post-process the statements that I downloaded from the bank into CSV files, but I would still like to pull tabular data out of PDF book scans to make the data they provide more useful.

I have a simple frontend based on the Perl Mojolicious module which Blekko had developed as part of another project, but my indexing tools are still quite primitive: simple bi-grams and tri-grams, and a growing synonym index. I don't give it enough queries to use my own traffic for ranking feedback, so everything is nearly equal rank. Basically I am at about the AltaVista level of search capability :-).

The vision is it just runs as a server and anyone on the same network can access it like a web service and pull up documents (and in the future media) of interest.
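
To make the bi-gram/tri-gram idea concrete, here is a rough Python sketch of an unranked n-gram inverted index. It is not the Perl tooling described above, just an illustration: it assumes word-level n-grams (the comment doesn't say whether they are word or character based), whitespace tokenization, and hypothetical file names and function names. Every match is returned with equal weight, which mirrors the "everything is nearly equal rank" situation.

```python
from collections import defaultdict


def ngrams(tokens, n):
    """All runs of n consecutive tokens, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_index(docs, sizes=(2, 3)):
    """Map each bi-gram and tri-gram to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        for n in sizes:
            for gram in ngrams(tokens, n):
                index[gram].add(doc_id)
    return index


def search(index, phrase):
    """Look up a two- or three-word phrase; no ranking, just matching doc ids."""
    key = tuple(phrase.lower().split())
    return sorted(index.get(key, set()))


# Hypothetical documents standing in for extracted PDF text.
docs = {
    "a.pdf": "low noise amplifier data sheet",
    "b.pdf": "amplifier data sheet errata",
}
index = build_index(docs)
print(search(index, "data sheet"))
print(search(index, "low noise amplifier"))
```

A synonym index could sit in front of `search`, expanding the query phrase into alternatives before lookup.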


For my eBooks, I use a Java app called DocFetcher. Its indexing and word search capabilities are pretty nice, and I've had great luck with it, especially as I can set it up portably.


Did you get very far with other enterprise stuff like Sharepoint?


I didn't try SharePoint, although I looked at Elasticsearch on AWS briefly. The goal though was to have all of this stuff on premises both for latency reasons and to maintain a credible defense should someone come after me for copyright infringement.


A bit of a tangent, but if you use a Mac, I've found that Dash is really great for getting things done without internet access. Its main purpose is to pull in local copies of docsets and search those, but it can also pull in sections of Stack Overflow (such as all Python questions, or all Pandas questions), and it integrates with things like Alfred, so you can do scoped searches without any GUI interaction.


So do you have ~19TB still free? It doesn't seem like your items would take too much space.


I do still have a lot free. When I replaced the StorVault it was about 75% full (3TB out of 4TB); I'm up to about 5TB now. But there are more things on there as well: the Postgres data for my local GitLab instance, for example, and Time Machine backups of my MacBook Pro. So it isn't all library stuff.


There are actually great tools to help you build and organize that library... assuming you pirate everything.

I'm just hoping the movie/TV industry gets the picture soon and goes the way the music industry has with streaming.


I don't pirate everything. In fact, I don't pirate anything. For instance, with regards to eBooks, I hit Humble Book Bundles hard. They're bloody incredible for the discerning legitimate DRM-free digital content purchaser.


I was more just making the point that it's actually easier to do library organization for movies and TV as a pirate than it is as a legit owner of Blu-Rays for example.

Sad state for the industry, really.


Meh, it isn't that difficult with ripped blu-rays either. Plex can do it quite well.


I've found Plex great for movies; my ripped television shows, on the other hand, tend to be weird. Since the episode order on DVDs and Blu-rays doesn't always match the airing order, there can be a lot of manual work in cleaning that up. Plex is also not super great with music yet. It's getting better but it's definitely having problems with a lot of my music since I've got a lot of local/indie bands.


You have backups.

You don’t have to throw away the disk after you rip it. The disk itself is a great backup.

Personally, I prefer everything to be in my Plex server, but I’m in no rush and just rip as I go. I’ll probably never actually rip every one. By the sounds of it, I don’t even have anywhere close to the number of disks to rip as you, so I get it not being appealing.


Backblaze is $5 a month for unlimited storage. If your hard drive fails, you can order a hard drive with your data.


You better store encrypted if you're pirating movies/music.

It's rather easy to get TOS'ed even if all you do is match the filename patterns common to pirated releases.


Backblaze insists that your data is encrypted and that they can't see it. It's supposedly not like Box/Dropbox/Google Drive.

Do you have any examples where someone's Backblaze account was terminated for backing up pirated media?


I can comment on this - it's flawed. They encrypt locally and only send encrypted data to their servers, true, but they deduplicate that data between customers meaning they know file hashes or something similar and could remove pirated content if they so desired. Second, in order to get your data back their solution is to go ahead and type your "private" encryption passphrase into their site so their servers can then decrypt it and send you a zip or hard drive full of decrypted data.

I still suggest it for friends and family because it's cheap and damn sure that encryption is better than nothing, but if you have a lot of sensitive data I wouldn't recommend it. If you demand privacy, use Restic and Backblaze's B2 service, you're paying per GB then though.


> I still suggest it for friends and family because it's cheap and damn sure that encryption is better than nothing, but if you have a lot of sensitive data I wouldn't recommend it. If you demand privacy, use Restic and Backblaze's B2 service, you're paying per GB then though.

First, though, find out if your friends or family have an Office365 subscription. A lot of people get one because they want the Office apps, and don't make heavy use of the 1 TB of OneDrive storage that it includes. Restic via rclone should be able to use that for backups.

Backblaze is cheap, but using something you have already paid for is even cheaper. :-)


How many of your friends are going to know how to use rclone?


You'd only need rclone if you wanted to use OneDrive with restic, because restic doesn't have direct OneDrive support.

Several other cloud backup programs do have OneDrive support built in, including Arq, duplicity, duplicati, duplicacy, and GoodSync. There are probably others.


Luckily passive commercial scanners don't require much; tossing it into an encrypted zip with everything set to fastest still runs over everything passive that's not nation-state.


It makes more sense to encrypt the entire data store and just sync the entire data store. This is proof against anyone who isn't willing to hack your computer or break into your house while it's open.


You'll want 2 copies there --- 100 blu-ray rips will take some time to rebuild if that drive dies.


You'll probably want 2 (or more) drives since even if you have the original discs the time needed to rip them again will justify the investment. On top of that you will most likely want a NAS to make the setup practical.

This adds up to a few hundred dollars that you wouldn't need to spend if the DRMs implemented were sane and reasonable with legal owners. And I don't even factor in the cost of the time, electrical power, or computing power involved to make it happen.


You are right, but just recalculate with 8 TB drives (more modern, quite cost effective) and 3 disks at minimum (RAID5) and you reach $600 for ~ 14 TB usable and safe storage. Add a CPU, MB, RAM and you reach $1000.
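
A rough way to sanity-check capacity figures like these is sketched below in Python. It models the RAID levels in the simplest textbook way (two-way mirrors for RAID 1, one drive's worth of parity for RAID 5, two for RAID 6) and ignores filesystem overhead; the function names are hypothetical. The TB-to-TiB conversion is included because that difference is one plausible reason 16 TB of RAID 5 shows up as "~14 TB".

```python
def usable_tb(drives: int, size_tb: float, level: str) -> float:
    """Usable capacity in decimal TB for a simple array of identical drives."""
    if level == "raid1":   # two-way mirrors: half of the raw capacity
        return drives * size_tb / 2
    if level == "raid5":   # one drive's worth of parity
        return (drives - 1) * size_tb
    if level == "raid6":   # two drives' worth of parity
        return (drives - 2) * size_tb
    raise ValueError(f"unsupported level: {level}")


def tb_to_tib(tb: float) -> float:
    """Convert decimal terabytes to binary tebibytes (what many OSes display)."""
    return tb * 1e12 / 2**40


raid5 = usable_tb(3, 8, "raid5")
print(raid5, round(tb_to_tib(raid5), 1))   # 3 x 8 TB in RAID 5
print(usable_tb(6, 8, "raid1"))            # 6 x 8 TB as mirrored pairs
```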


Can't you just plug an external HD into your existing wifi router's USB port to make it into NAS and invest in a paid off-site backup service for failures?


A dedicated NAS is useful, if only for keeping hard drives together. In any good NAS build, most of the cost will be the hard drives.

5TB to 8TB hard drives are cost-efficient in my experience. 6x 8TB hard drives for 24TB of storage (RAID1-like redundancy; use ZFS, btw) is $1200 (~$200 per 8TB hard drive).

> invest in a paid off-site backup service for failures

What's your recovery plan? If you want to recover 5TB of data off of a 100 MBit connection, that's 140 Hours. Local is the only thing that makes sense if you're at the point of filling up hard drives.
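
The restore-time estimate can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: it uses decimal units, and the `efficiency` factor (the fraction of line rate you actually achieve) is an assumption added here, not something stated above. At full line rate the answer comes out closer to 111 hours; assuming about 80% efficiency lands near the 140-hour figure.

```python
def transfer_hours(size_tb: float, link_mbit: float, efficiency: float = 1.0) -> float:
    """Hours to move size_tb (decimal TB) over a link_mbit link.

    efficiency is the assumed fraction of line rate actually achieved.
    """
    bits = size_tb * 1e12 * 8
    seconds = bits / (link_mbit * 1e6 * efficiency)
    return seconds / 3600


print(round(transfer_hours(5, 100)))        # ideal 100 Mbit link
print(round(transfer_hours(5, 100, 0.8)))   # assumed 80% of line rate
print(round(transfer_hours(5, 1000), 1))    # ideal 1 Gbit local link
```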

Having 1Gbps or even 10Gbps (local "fiber" with Direct Attached Copper) connections locally is relatively cheap. And it seems like 10Gbps is getting cheaper.

Accessing a hard drive, even a sped-up RAID set of drives, over 10Gbps is going to have the same bandwidth as a local SATA connection. (SATA is only 6Gbps). Your NAS is practically a "local drive" at that point.


> What's your recovery plan? If you want to recover 5TB of data off of a 100 MBit connection, that's 140 Hours. Local is the only thing that makes sense if you're at the point of filling up hard drives.

It's a media archive. As long as you can recover specific files on demand, it doesn't really matter if it takes two months to download the entire thing.


My $2k NAS holds over 2,000 movies and 20k TV episodes. The cost per title is negligible.


DVDs are usually less than 5GB. 1TB drives cost $40 these days, less if amortized over larger disks.

Which means $100 buys you enough disk space for two redundant copies of 200 movies assuming you vobcopy them; about twice or thrice if you convert to h264 or h265 or av1; and half again if you have BD quality rips.

That’s about 50 cents/movie for two copies, half if you only keep one. Is this really prohibitive?
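
Here is a small Python sketch of that per-movie cost, for plugging in other sizes. It assumes decimal units, the $40/TB price and 5 GB per DVD quoted above, and a hypothetical function name; the 50 GB case is an assumed worst case for an untouched dual-layer Blu-ray rip.

```python
def cost_per_movie(price_per_tb: float, movie_gb: float, copies: int = 2) -> float:
    """Storage cost in dollars for `copies` copies of one movie of movie_gb."""
    return price_per_tb * (movie_gb / 1000) * copies


print(round(cost_per_movie(40, 5), 2))    # DVD image, two copies
print(round(cost_per_movie(40, 50), 2))   # assumed 50 GB Blu-ray rip, two copies
```

With these inputs a DVD comes out at roughly 40 cents for two copies, in the same ballpark as the figure above, while a full-size Blu-ray rip is about ten times that.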


I assume people with Blu-ray want to rip at original quality - isn’t that 10s of gig per disk? (I have never bought Blu-ray so have no idea).

For me the problem was ripping my DVDs just took so long, and wasn’t automatable (weirdly my bottom of the line Mac mini struggles with re-encoding ;) )


Yep, Blu-rays can store something like 25GB per layer; industry standard for movies is dual layer (50GB).

Rips can achieve comparable quality with lower bitrate than used on the original disk. Blu-ray dates to 2006 and (typically) uses inefficient high bit-rate H.262 ("MPEG 2", which DVDs used), H.264 (AVC), or VC-1.

Rips can use high-efficiency H.265 ("HEVC"). Typical for a 1080p H.265 encoding is 9-15 GB. (Lower for animated films.)

UHD Blu-rays do use the more efficient H.265 already, but are much higher resolution as well. I'm not sure how prevalent they are.


The vast majority of newer Blurays use H.264, which is pretty efficient (much more than the other two). In addition, if you're just wanting to back up the film, and not any special features on the disk, at least half of films fit in 25 GB with no reencoding required.

One big downside of reencoding in HEVC is that for a moderate gain in compression (0-50% depending on the film), you considerably increase your playback requirements. I use a Raspberry Pi 3 with Kodi as a playback device, and it's not capable of playing 1080p HEVC videos.

UHD Blurays are getting more common. They usually come on 66 GiB disks, and it's becoming more common to let the video take up most of that, and bundling a second disk of special features if necessary. So even backing up the main feature of a UHD can easily run you 50+ GiB.


> The vast majority of newer Blurays use H.264, which is pretty efficient (much more than the other two).

Right; still quite a bit less efficient than H.265, though.

> One big downside of reencoding in HEVC is that for a moderate gain in compression (0-50% depending on the film), you considerably increase your playback requirements.

Yes, although this is becoming less true as more hardware offload support becomes available.

For example, Nvidia 750, 950-960, and 1030 are all capable of HEVC offload[1]. The latter can be had new for ~$85; the others are probably a bit harder to find but given they're older midrange cards at this point maybe they're a little cheaper.

> So even backing up the main feature of a UHD can easily run you 50+ GiB.

Sure, I don't think there's any economical way to reduce storage requirements of UHD significantly without defeating the point. On the other hand, I don't think the extra pixels really matter as much as higher color depth (10 or 12 bit) and fewer artifacts at lower resolution, unless you've got a home IMAX or something.

[1]: https://developer.nvidia.com/video-encode-decode-gpu-support...


Depends on the movie and if it's 1080p or 4K, but generally I anticipate a ripped Bluray to be around 60GB.

Re-encoding them is a PITA as well. Even on a HEDT gaming desktop with a brand new CPU, the process only proceeds at 8-10fps, if that, so at least 2x movie runtime. Maybe there are some ways to involve the video card which can speed it up, but I haven't tried it.
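
To put the 8-10fps figure in terms of wall-clock time, here is a minimal Python sketch. It assumes a source frame rate of 24 fps (typical for film, though not stated above) and a hypothetical two-hour movie; the function name is illustrative.

```python
def encode_hours(runtime_min: float, encode_fps: float, source_fps: float = 24.0) -> float:
    """Wall-clock hours to re-encode a movie at a given encoder throughput."""
    frames = runtime_min * 60 * source_fps
    return frames / encode_fps / 3600


for fps in (8, 10):
    hours = encode_hours(120, fps)
    # Ratio of encode time to the movie's own runtime (2 hours here).
    print(fps, round(hours, 1), round(hours / 2, 1))
```

Under those assumptions a two-hour film takes between roughly 5 and 6 hours, or 2.4x to 3x its runtime.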

And even if you want to and can keep the originals, you'll want to re-encode it and keep both; most Plex clients (think: Apple TV, shitty laptop, etc) simply can't handle raw 4K bluray playback.


ffmpeg supports a variety of hardware encoders but supposedly the quality will be worse than a CPU-encoded video at the same bitrate


Pirated 4k videos are often 7-15GB. I presume this is optimized for size over maximum quality.


That seems too small IMO. Encoded BD rips average about 20-30 GB; remuxes (no compression/encoding) begin around 50 GB.


How big is your movie library and what would you consider cost prohibitive? Hard drives are pretty cheap these days.


I thought cost was a blocker too till I actually sat down and worked it out. A cheap microserver with about 15TB of disk space came in well under the thousand pound mark - the real blocker is sitting down and actually ripping the things. There's a trade-off between space and encoding quality to be made, but I'm entirely happy with libx265's results at 1080p.

Obviously your mileage may vary as to what's a reasonable cost.


When my kids were little and basically destroyed anything they touched, I started ripping their DVDs. Now there's over 200 of them on my media center PC. I had to upgrade from a 1TB to a 3TB disk at one point, but there's still plenty of room for more.


> The storage costs to rip my movie library onto hard drives would be cost prohibitive

The storage costs as well as streaming transmission costs are not only borne by but also provide profit to the provider. If they can profit off them, you can hopefully at least break even!


You must be kidding. I bought two external 4TB drives for less than $130 CAD each.



