
How do you handle backups of that much data?



A 22TB pool can perhaps be backed up to a single 26TB drive (over USB? Thunderbolt?):

* https://www.techradar.com/news/larger-than-30tb-hard-drives-...

Buy multiple drives and a docking station and you can rotate them:

* https://www.startech.com/en-us/hdd/docking

ZFS send/recv allows for easy snapshotting and replication, even to the cloud:

* https://www.rsync.net/products/zfs.html

* https://arstechnica.com/information-technology/2015/12/rsync...
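
For anyone who hasn't used send/recv, here is a rough sketch of what an incremental replication step looks like. It only builds the command strings and does not run anything. The dataset, snapshot, and host names are made up for illustration, and the receiving side is assumed to be reachable over SSH with ZFS installed.

```python
import shlex

def replication_commands(dataset, prev_snap, new_snap, remote_host, remote_dataset):
    """Return the shell commands for one incremental ZFS replication step.

    All names passed in are hypothetical; nothing is executed here.
    """
    # Take a new snapshot of the local dataset.
    snapshot = ["zfs", "snapshot", f"{dataset}@{new_snap}"]
    # Send only the changes between the previous and the new snapshot...
    send = ["zfs", "send", "-i", f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]
    # ...and pipe them into `zfs recv` on the remote machine.
    recv = ["ssh", remote_host, "zfs", "recv", remote_dataset]
    return [shlex.join(snapshot), f"{shlex.join(send)} | {shlex.join(recv)}"]

if __name__ == "__main__":
    for cmd in replication_commands("tank/data", "mon", "tue", "backuphost", "backup/data"):
        print(cmd)
```

The very first run would need a full send (no `-i`) so the remote side has a base snapshot to apply increments to.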


However, such a drive is getting heavily into diminishing returns territory.

For example, a 20TB drive from Seagate is $500, a 4TB drive is $70, and an 8TB drive is $140. Spending about the same on smaller drives would give you 28TB with seven 4TB drives ($490), or 24TB/32TB with three or four 8TB drives ($80 under/$60 over).

Add in a second to rotate and you're spending $1000 in drives, assuming these 26TB drives replace the 20TB drives at a similar price when they trickle down to consumer hands.
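
The arithmetic above is easy to redo when prices change. A quick sketch, using only the prices quoted in this comment (they are not current market prices) and a $500 budget as the assumed comparison point:

```python
# Capacity in TB -> price per drive in USD, as quoted in the comment above.
PRICES = {20: 500, 8: 140, 4: 70}
BUDGET = 500  # assumed budget: the price of one 20TB drive

def drives_within_budget(capacity_tb, price, budget):
    """Return (drive count, total TB, total spend) without exceeding the budget."""
    count = budget // price
    return count, count * capacity_tb, count * price

if __name__ == "__main__":
    for capacity_tb, price in PRICES.items():
        count, total_tb, spend = drives_within_budget(capacity_tb, price, BUDGET)
        print(f"{capacity_tb}TB @ ${price}: ${price / capacity_tb:.2f}/TB, "
              f"{count} drive(s) = {total_tb}TB for ${spend}")
```

This ignores the cost of extra bays, enclosures, and power for the additional drives, which is where the smaller-drive option starts losing some of its advantage.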


You have to factor in the power usage of having multiple drives spinning. Though I’d agree that smaller drives are better when you have a drive failure, as resilvering is quicker.



I have a nightly restic backup from my main workstation to buckets on Backblaze and Wasabi. It backs up the few local folders I have on my workstation and all the files I care about on my NAS, which the workstation accesses over Samba. I've published my scripts on GitHub.[0]

I don't back up my Blu-Rays or DVDs, so I'm backing up <1 TB of data. The current backups are the original discs themselves, which I keep, but at this point, it would be hundreds of hours of work to re-rip them and thousands of hours of processing time to re-encode them, so I've been considering ways to back them up affordably. It's 11 TiB of data, so it's not easy to find a good host for it.

[0] https://github.com/mtlynch/mtlynch-backup


"CephFS supports asynchronous replication of snapshots to a remote CephFS file system via cephfs-mirror tool. Snapshots are synchronized by mirroring snapshot data followed by creating a snapshot with the same name (for a given directory on the remote file system) as the snapshot being synchronized." ( https://docs.ceph.com/en/latest/dev/cephfs-mirroring/ )

We found ZFS led to maintenance issues, but that was probably unrelated to the filesystem per se. Culling a rack storage node is simply easier than fiddling with degraded RAID arrays.


I use B2 + E2EE. TrueNAS can push and pull pools to many different options but Backblaze is the cheapest I've found.


Buy another, use ZFS send/receive. It's only double the price! Better yet, put it elsewhere (georedundancy). With ZFS encryption, the target system need not know about the data.

For critical data though I use Borg and a Hetzner StorageBox.


As it gets bigger and bigger, to me the only thing that makes sense is getting another NAS and replicating that way.



