Hacker News

Simple enough to be safe, at the cost of performance: decompress the output and compare it to the original.
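
For what it's worth, a minimal sketch of that round trip in Python. It assumes zlib as a stand-in for whatever compressor is under discussion, and the names `compress_verified` and `RoundTripError` are made up for illustration:

```python
import zlib


class RoundTripError(Exception):
    """Raised when the decompressed output does not match the original input."""


def compress_verified(data: bytes, level: int = 6) -> bytes:
    """Compress data, then decompress and compare before trusting the result."""
    compressed = zlib.compress(data, level)
    # The extra decompress pass is the performance cost of the check.
    if zlib.decompress(compressed) != data:
        raise RoundTripError("round trip mismatch; do not keep this output")
    return compressed
```

The check only proves that this build on this machine can read back what it wrote, so it catches encoder bugs that the matching decoder does not share.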


You could have bugs that show up on different hardware or compiler versions. So the round trip is table stakes but not a guarantee.

Edit: someone deleted a response that said that if you can read it back then the data is there. I think in a data recovery sense that’s definitely true, if it’s consistent across inputs. But building a decoder that simulates the undefined behavior or race condition could be pretty tricky, even if the bug is symmetrical between read and write. And you’d have to decode based on the file version number to know whether you need the emulation. So possible but terrible to maintain, and the interim versions released between creating and discovering the bug would still be broken.


And what do you do if it doesn't match?


Isn't it obvious? Warn the user, who can now use something else instead.


That only works if the "user" is an interactive TTY with a human on the other end of it though. What if I tried using this for compressing automatic backups? Do I need an error handling routine that uses something else?


A backup system should be reliable and be able to report errors. No matter what they may be.
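
A rough sketch of what that could look like in an unattended job: try a codec, verify the round trip, log the failure, and move on to the next one. The codec list, the "stored" last resort, and the function name are all assumptions for illustration, not anything the tool in question provides:

```python
import logging
import lzma
import zlib
from typing import Callable, List, Tuple

log = logging.getLogger("backup")

# (name, compress, decompress) triples, tried in order. Hypothetical choices.
Codec = Tuple[str, Callable[[bytes], bytes], Callable[[bytes], bytes]]

CODECS: List[Codec] = [
    ("zlib", zlib.compress, zlib.decompress),
    ("lzma", lzma.compress, lzma.decompress),
    ("stored", bytes, bytes),  # last resort: keep the data uncompressed
]


def compress_with_fallback(data: bytes, codecs: List[Codec] = CODECS) -> Tuple[str, bytes]:
    """Return (codec_name, payload) from the first codec that round-trips."""
    for name, compress, decompress in codecs:
        try:
            payload = compress(data)
            if decompress(payload) == data:
                return name, payload
            log.error("codec %s failed the round-trip check", name)
        except Exception:
            # Report the failure and keep going; nobody is watching a TTY.
            log.exception("codec %s raised during compress/verify", name)
    raise RuntimeError("no codec produced a verifiable payload")
```

The returned codec name has to be stored next to the payload so the restore path knows which decoder to use.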


Your automatic backups may actually be corrupted by random bit flips. Happens quite a lot with ZFS NAS systems where the admin forgot to set up a scrub job and still uses incremental backups.

Any read or write could fail for a multitude of reasons. The chance of an entire file being lost is rather small, but it's still an edge case that can happen in the real world if the FS flips to read-only at just the wrong time. Hell, on platforms like macOS, you can't even assume that fsync returning success means the buffered data has actually reached storage! It can still be sitting in the drive's cache unless you ask for F_FULLFSYNC.
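
To make the fsync point concrete, here is a hedged sketch in Python. It assumes the `fcntl` module exposes `F_FULLFSYNC` on macOS (it is guarded with `hasattr` in case it does not) and falls back to plain `os.fsync` everywhere else; `flush_to_disk` and `write_durably` are made-up names:

```python
import os

try:
    import fcntl  # not available on Windows
except ImportError:
    fcntl = None


def flush_to_disk(fd: int) -> None:
    """Ask the OS to push buffered data for fd out to stable storage."""
    if fcntl is not None and hasattr(fcntl, "F_FULLFSYNC"):
        try:
            # macOS: plain fsync may leave data in the drive's write cache.
            fcntl.fcntl(fd, fcntl.F_FULLFSYNC)
            return
        except OSError:
            pass  # some filesystems reject F_FULLFSYNC; fall back below
    os.fsync(fd)


def write_durably(path: str, data: bytes) -> None:
    """Write data to path and flush it as far as the OS will promise."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()                  # Python's buffer -> OS
        flush_to_disk(f.fileno())  # OS -> storage device
```

Even this is best effort: it does not sync the containing directory, and hardware that lies about flushes can still lose the write.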



