Not that it helps you now, but I also keep all our family photos in the cloud (iCloud in my case). At the same time, I have a small ARM machine at home that keeps a mirror of the iCloud data.
That ARM machine also has the responsibility of making backups, local to a USB drive, as well as to another cloud. Not mirrors, but proper versioned backups (as in Restic, Borg, Arq, Duplicacy, Kopia, etc).
I also maintain a couple of USB drives with yearly updated mirrors of the entire photo library. The drives are stored at geographically different locations, and surface scanned, updated and rotated yearly.
And finally, as a "last ditch recovery", I maintain an archive of M-disc Blu-Ray discs that contain a complete copy of our family photo library. Every year I make an identical set of discs containing the past year's photos, and these sets are stored alongside the USB drives.
I don't bother archiving documents as everything that is important is stored on government servers anyway, or exists in hardcopy. Also, if every step in my normal 3-2-1 backup scheme has failed and I need to recover from the archive, I probably have bigger issues than retrieving my budget for this year's finances.
As a fellow small ARM machine owner, what's your strategy for getting the photos from iCloud? Is there a tool one shouldn't feel weird to give their iCloud credentials to?
I was doing just this for a while but stopped because my Mac Mini was too old for the last few OS releases. I then switched to using the Windows iCloud client but a bug from ~2 years ago that consumes tremendous CPU cycles made that less than ideal. (The best you can do is lock it to a single thread, which will then use 100% 24/7)
Now I just don't back up my iCloud, though every New Year's I do move everything older than one year to my home server, which follows a good 3-2-1 backup strategy.
TL;DR: If you go this route, try to get a Mac Mini that can run a supported macOS for some time to come.
I've been using the Windows iCloud client to back up to a VPS. It works okay for files, but for photos it pegged disk usage even when there weren't any new photos to download, and my VPS provider wasn't happy. So far my solution is the Windows iCloud client for files, and then OneDrive on my iPhone for photo backup, with OneDrive again on the VPS.
I like how the backup is outside my house, but I'm about to add a YubiKey to my iCloud account and I'm not sure the Windows iCloud client is going to like that.
I was actually running ESXi on my Mini for a while and successfully installed macOS in a VM on it. The performance was horrendous though; so much of macOS depends on GPU acceleration, which I didn't get. I think I've read newer macOS builds don't even have a software video fallback, though that might just be the Apple Silicon builds, which wouldn't apply to me.
It was definitely a fun project even if not terribly useful.
It doesn’t talk about not being able to access person tags. Not being able to programmatically access the data about who Google thinks is in each of my photos has been an annoying pain point for me for years. Last I checked, the data is also not included in Google Takeout dumps.
i have been using icloudpd for years. there is a docker image that makes it simple to install on my synology. because of 2fa, i have to re-authenticate it from the terminal every couple of months, but that takes one minute.
I tried this but I think my issue is the library is too large. It can pull a few dozen or a hundred or so photos just fine, but then it will time out without quitting, so I have to babysit the process. Maybe there's a flag I missed in the documentation to retry downloads, but it basically was a nonstarter to me in the state it was in. I think it's Apple's fault though; I can't get a big zip file of iCloud data to successfully download from their website either, it will also time out. Likewise, when I try to update the Boot Camp drivers on my Intel Boot Camp machine, I will get 30% of the way there and then it times out. I'd blame my home connection, but from cursory glances at various forums this is apparently a widespread issue with this sort of download from Apple.
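For the "times out but doesn't quit" problem, one workaround that doesn't depend on any flag of the downloader itself is to wrap the command in a watchdog. This is only a rough sketch: `run_with_retries` and its parameters are made-up names for illustration, and it assumes the download tool can safely be killed and restarted (i.e. it skips files it already has).

```python
import subprocess
import time


def run_with_retries(cmd, attempts=5, timeout_s=600, base_delay_s=1.0, sleep=time.sleep):
    """Run cmd, kill it if it runs longer than timeout_s, retry with backoff.

    Returns the 1-based number of the attempt that succeeded.
    Raises RuntimeError if every attempt fails or hangs.
    """
    for attempt in range(1, attempts + 1):
        try:
            result = subprocess.run(cmd, timeout=timeout_s)
            if result.returncode == 0:
                return attempt
        except subprocess.TimeoutExpired:
            # subprocess.run has already killed the hung child at this point.
            pass
        if attempt < attempts:
            # Exponential backoff: base, 2x base, 4x base, ...
            sleep(base_delay_s * 2 ** (attempt - 1))
    raise RuntimeError(f"command failed after {attempts} attempts: {cmd}")
```

Here `cmd` would be whatever download command you already run, passed as a list of arguments. Note the timeout applies to the whole run, not to a stalled connection, so for a big library it needs to be generous or the job will be killed while still making progress.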
I have recently switched to an M1 Mac Mini, and just have each family member sign in to that using Remote Desktop. It brings the added bonus of working as a content cache for anything iCloud.
My only gripe is that it downloads the shared photo album (new in iOS 16) once for each account, and when your photo library is 1.8TB, that suddenly becomes a lot of wasted space. When it comes to backing it up, the backup software deduplicates the data, but that doesn't help with the initial storage.
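To get a feel for how much space the per-account copies actually waste, something like the following could be pointed at the downloaded originals. It is a sketch only: the function names are invented for illustration, it just reports byte-identical files, and it deliberately doesn't delete or hardlink anything, since I assume messing with files inside a managed Photos library is a bad idea.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large videos don't fill RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(roots):
    """Return groups (sorted lists of paths) of byte-identical files under roots."""
    # Pass 1: group by size, which is cheap and rules out most files.
    by_size = defaultdict(list)
    for root in roots:
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                if os.path.isfile(path) and not os.path.islink(path):
                    by_size[os.path.getsize(path)].append(path)
    # Pass 2: hash only files that share a size with at least one other file.
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            by_hash[file_digest(path)].append(path)
    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]
```

Calling `find_duplicates` with one directory per family member returns the groups of identical files; summing the sizes of all but one file per group gives the wasted space.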
I really wish Apple would implement some kind of method for backing up photos stored in the cloud without the need for mirroring them.
Before the M1 I was using iCloud photo downloader ( https://github.com/icloud-photos-downloader/icloud_photos_do... ) on a Raspberry Pi 4, which also worked well, but in the end I got tired of iCloud credentials expiring every ~90 days, requiring each family member to log in again through a console.
Considering the M1 idles at roughly 20% more power than an RPi4 (M1 at 4.5W), it was an easy sell. I just got the cheapest model and added a large USB drive. Using a Mac also gives you the possibility of using something like Backblaze Personal with unlimited backup storage, if that’s your thing :-)
I use Healthchecks.IO ( https://healthchecks.io/ ) to keep an “eye” on the backup status (and other more mundane tasks like monitoring the power state of my summerhouse)
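For anyone who hasn't used it: the idea is that the backup job pings a per-check URL, and Healthchecks alerts you when the pings stop or a failure is reported. A minimal sketch of wrapping a job that way follows. The helper names are my own, the UUID is whatever your check shows in the dashboard, and it assumes the hosted hc-ping.com ping endpoint with its `/start` and `/fail` suffixes.

```python
import urllib.request

PING_BASE = "https://hc-ping.com"  # hosted Healthchecks.io ping endpoint


def ping_url(check_uuid, signal=""):
    """Build the ping URL; signal is "" (success), "start" or "fail"."""
    if signal not in ("", "start", "fail"):
        raise ValueError(f"unknown signal: {signal!r}")
    url = f"{PING_BASE}/{check_uuid}"
    return f"{url}/{signal}" if signal else url


def send_ping(check_uuid, signal="", timeout_s=10):
    """Send one ping; a monitoring hiccup should never break the backup itself."""
    try:
        with urllib.request.urlopen(ping_url(check_uuid, signal), timeout=timeout_s):
            return True
    except OSError:  # covers URLError, HTTPError and socket timeouts
        return False


def monitored(check_uuid, job, ping=send_ping):
    """Run job() between a start ping and a success or fail ping."""
    ping(check_uuid, "start")
    try:
        result = job()
    except Exception:
        ping(check_uuid, "fail")
        raise
    ping(check_uuid)
    return result
```

`job` is any callable that runs the actual backup. If the machine dies mid-run, no success ping arrives and the check goes late on its own, which is the whole point.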
You have to fuck around with the storage format because by default scanning software, even when scanning documents, will basically just take poorly compressed high resolution images. Maybe there's some service which sorts that out for you, but so far I have a little webapp which stores an index of paper copies. Once I figure out how to automate scanning, compressing and OCRing documents, then I think it will be worth storing digital copies. But for now it works.
I do scan some paper but there's increasingly less of it and I'm likely to need future access to so little of it that it's mostly not worth the trouble.
I still do a 3-2-1 backup of documents with 2 versioned backups, one at home and one in another cloud provider, just like with photos.
The archive however is the recovery if I’m not able to retrieve my normal cloud copy (hacked, ransomware, loss of credentials, etc), I cannot access my local mirror copy (ransomware, dead disk, etc), I cannot access my local backup (dead disk, separate from the mirror disk), and I cannot access my cloud backup either.
For all of those things to go wrong at the same time, something major has to happen. Besides, where I live, most required documents (driver's license, passport, birth certificate, tax records, etc) exist in government databases, so all I have at home will be various documents that maybe have sentimental value, but are not exactly needed.
Furthermore, documents change “frequently” where photos tend to be somewhat more static, so I can archive photos and maybe get <10% “duplicates” due to later edits, whereas archiving documents would pretty much produce a lot of duplicates each year.
That being said, I think we have like 1GB of documents in total, so it would be easy to fit in the archive.
I do occasionally want to retrieve older documents. As you say, they're smaller and the effort to put individual docs into a "I might want this someday" folder seems more trouble than it's worth. Of course, having a vast sea of basically write-only docs does make finding things harder especially as I don't put as much effort into organizing things any longer.