
> any read error on any of the remaining disks is a lost/corrupted file.

That is the meat of it. Traditional RAID has the same issue, except you never know it happened: as long as the controller reads something, it's happy to replicate that corruption to the other disks. With ZFS, at least you know exactly what was corrupted and can fix it. With traditional RAID, you won't know anything happened until one day you go to use a file and notice it's corrupted.

RAID-Z1 is better than traditional RAID-5 in pretty much every conceivable dimension; it just doesn't hide problems from you.

I have encountered this exact scenario, where someone ran ZFS on top of a RAID-6 (don't do this, use Z2 instead). Two drives failed, the RAID-6 rebuilt and reported everything was 100% good to go. A ZFS scrub then revealed a few hundred corrupted files across 50TB of data. We overwrote the corrupted files from backups and re-scrubbed, and the file system was clean.




You don't need to fix anything.

ZFS automatically self-heals an inconsistent array (for example, if one mirrored drive does not agree with the other, or if the parity disagrees with the data stripe).
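To make the self-healing idea concrete, here is a minimal sketch in Python. It is a toy model, not ZFS's actual code: the function name `read_with_self_heal`, the use of SHA-256, and the in-memory "mirror" are all assumptions made for illustration. The point is only that a checksum stored apart from the data lets a reader tell which copy is good and repair the other, and that with no good copy left the error is detected but cannot be repaired.

```python
import hashlib


def checksum(data: bytes) -> str:
    # Stand-in for a block checksum kept separately from the data itself.
    return hashlib.sha256(data).hexdigest()


def read_with_self_heal(copies: list[bytearray], expected: str) -> bytes:
    """Return the first copy that matches the checksum and repair the rest.

    Hypothetical helper for illustration only.
    """
    good = next(
        (bytes(c) for c in copies if checksum(bytes(c)) == expected), None
    )
    if good is None:
        # Corruption is detected, but there is nothing to repair from.
        raise IOError("no copy matches the checksum")
    for c in copies:
        if bytes(c) != good:
            c[:] = good  # overwrite the bad copy in place
    return good


block = b"important data"
expected = checksum(block)

# Two-way mirror with silent corruption on one side.
mirror = [bytearray(block), bytearray(block)]
mirror[0][0] ^= 0xFF
assert read_with_self_heal(mirror, expected) == block
assert bytes(mirror[0]) == block  # the bad side was repaired

# A single copy with no redundancy: detected, not healed.
single = [bytearray(block)]
single[0][0] ^= 0xFF
try:
    read_with_self_heal(single, expected)
except IOError as exc:
    print("unrecoverable:", exc)
```

The second case is the one that matters for a pool with no redundancy of its own: the checksum still catches the bad block, but the repair has to come from somewhere else, such as a backup.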

ZFS does not suffer data loss if you "suffer a total disk failure."

I have no idea where you're getting any of this from.


If the data on disk (with no redundant copies) is bad, you’ve (usually) lost data with ZFS. It isn’t ZFS’s fault, it’s the nature of the game.

The poster built a (non-redundant) ZFS pool on top of a hardware RAID-6 device. The underlying hardware device had some failed drives, and when it was rebuilt, some of the underlying data was lost.

ZFS helped by detecting it instead of letting the bad data through, as would normally have happened.


The parity cannot be used in the degraded scenario that was under discussion.

See e.g. this article, which explores how growing disk sizes interact with the specified unrecoverable read error rate for exactly this question: https://queue.acm.org/detail.cfm?id=1670144 (in the article, Adam Leventhal from Sun, the makers of ZFS, talks about the need for triple parity).
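For a rough feel of why disk size matters here, a back-of-the-envelope sketch in Python. The numbers are assumptions, not figures taken from the article: an unrecoverable read error rate of 1 per 10^14 bits is a commonly quoted spec-sheet value, and the model treats every bit read as an independent trial, which real drives do not strictly follow. The function name `p_at_least_one_ure` is made up for this example.

```python
import math


def p_at_least_one_ure(bytes_read: float, ure_rate_per_bit: float = 1e-14) -> float:
    """Probability of at least one unrecoverable read error while reading
    `bytes_read` bytes, assuming independent errors at the given per-bit rate.
    """
    bits = bytes_read * 8
    # 1 - (1 - p)^n, computed via log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(bits * math.log1p(-ure_rate_per_bit))


# Amount of data that must be read back from the surviving disks during a
# rebuild (decimal terabytes; sizes chosen arbitrarily for illustration).
for tb in (1, 4, 12, 50):
    p = p_at_least_one_ure(tb * 1e12)
    print(f"{tb:>3} TB read: P(at least one URE) = {p:.1%}")
```

Under these assumptions the chance of hitting an error somewhere during a full rebuild read climbs quickly with capacity, and in a degraded single-parity array there is no parity left to correct it.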

Also, the conclusion "ensure your backups are really working" is an important point irrespective of this question, since you'll also risk losing data due to buggy software, human errors, ransomware, etc.



