Streaming ZFS snapshots to a file and storing that in a remote of your choice is...

gnfargbl · on Oct 4, 2022

Interesting. Who is making that recommendation, and why?

For a while now, my backup process has been essentially to run

    zfs send -L -c -P "${snap}" | zstd - > "${out}"
    gsutil -o GSUtil:parallel_composite_upload_threshold=150M cp "${out}" "${gcs_path}/"

It seems to work well. Am I carrying some unrecognised risk that I wouldn't be carrying if I were backing up into another ZFS filesystem? How does that risk balance against the risk of data loss on the backup server? i.e. I am pretty sure GCS won't lose my data, but not quite so sure about the other services you mention.

7e · on Oct 4, 2022

If your monolithic backup suffers a single bit error, it's unrecoverable. If you zfs recv, though, you can still recover every file that remains pristine. And if you zfs recv to a redundant pool (mirror, RAID-Z), you can periodically run zpool scrub and recover from those errors completely.
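For illustration, the "zfs recv on the other end" setup described above might look roughly like this. It is only a sketch: the function name `replicate_snapshot`, the host, and the dataset names are hypothetical placeholders, and it assumes SSH access to a machine that has its own ZFS pool.

```shell
# Hypothetical helper: send a snapshot straight into a pool on a backup
# host, so that zfs recv processes the stream as it arrives.
replicate_snapshot() {
    snap=$1      # e.g. tank/data@2022-10-04 (placeholder)
    host=$2      # e.g. backup.example.com (placeholder)
    target=$3    # e.g. backuppool/data (placeholder)
    # -L and -c are the same send flags as in the original command;
    # -u tells zfs recv not to mount the received filesystem.
    zfs send -L -c "${snap}" | ssh "${host}" zfs recv -u "${target}"
}
```

On the receiving host, a periodic `zpool scrub` of the backup pool then checks every stored block against its checksum and, given redundancy, repairs what it can.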

gnfargbl · on Oct 4, 2022

Hmm, I wasn't aware of that behaviour. I think this is the correct answer: thank you.

radiowave · on Oct 4, 2022

I've been running ZFS continuously since about 2009, and I currently use its replication as a major part of my employer's disaster recovery strategy. In that time I've been through a number of disk failures, controller failures, etc.

Just chiming in here to say that I endorse every aspect of the answer that 7e has given you.

infogulch · on Oct 4, 2022

Have you restored any data using this system? A backup isn't a backup unless you've practiced a whole cycle.

Snapshots are incremental, so management of your retention period is an active process and may require combining older snapshots together when you drop them.

With a zfs server on the other end you can zfs send / recv the other way to restore, and it correctly manages deleted snapshots.
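A restore drill for the file-based approach could be sketched as below. The function name `restore_stream` and the scratch dataset are hypothetical. It assumes the stream file has already been downloaded again and was compressed with zstd as in the original command.

```shell
# Hypothetical restore drill: decompress a stored stream and receive it
# into a scratch dataset, so the live data is never touched.
restore_stream() {
    stream_file=$1   # the stored stream file, downloaded again
    target=$2        # assumed scratch dataset, e.g. tank/restore-test
    # zstd -dc decompresses to stdout; zfs recv -v reports progress.
    zstd -dc "${stream_file}" | zfs recv -v "${target}"
}
```

If the receive completes, you can mount the scratch dataset and compare a few files against the originals before destroying it again.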

gnfargbl · on Oct 4, 2022

> Snapshots are incremental

They are, but unless the -i flag is specified (it isn't in my example above), zfs send will send a full stream, not an incremental one.
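To make the distinction concrete, here is a rough sketch of a wrapper that picks between the two modes. The name `send_snapshot` and the snapshot names are hypothetical; the flags are the ones already used in this thread, plus `-i`.

```shell
# Hypothetical wrapper: full stream when no previous snapshot is given,
# incremental stream (-i) relative to the previous snapshot otherwise.
send_snapshot() {
    snap=$1          # snapshot to send, e.g. tank/data@b (placeholder)
    prev=${2:-}      # optional earlier snapshot, e.g. tank/data@a
    if [ -n "${prev}" ]; then
        zfs send -L -c -i "${prev}" "${snap}"
    else
        zfs send -L -c "${snap}"
    fi
}
```

A full stream can be received on its own, whereas an incremental stream can only be received on top of the snapshot it was taken against.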

aborsy · on Oct 4, 2022

zfs receive checks the stream to ensure it is correct. If zfs send is used without zfs receive, nothing performs that check, and even a single error is enough to corrupt the stored stream.

The error can occur in transit or on the client side, not necessarily on the remote.
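If you do keep streams as plain files, one partial mitigation is a checksum sidecar, which detects (but cannot repair) corruption picked up in transit or on the client. A minimal sketch, assuming a `sha256sum` that supports `-c` (GNU coreutils does); the function names are hypothetical:

```shell
# Hypothetical helpers for detecting a damaged stream file.

# Record the SHA-256 of a stream file in a sidecar "<file>.sha256".
write_checksum() {
    sha256sum "$1" > "$1.sha256"
}

# Return 0 if the file still matches its sidecar, non-zero otherwise.
verify_checksum() {
    sha256sum -c "$1.sha256" > /dev/null 2>&1
}
```

You would run `write_checksum "${out}"` right after the zfs send, upload both files, and run `verify_checksum` after downloading and before any restore. Unlike a scrub on a redundant pool, this only tells you the file is bad.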

Also managing a chain of incrementals can get unwieldy.

Ask r/zfs.