
Compression usually has checksumming already.



Compression and checksumming have nothing to do with each other.

I can't think of even one compression algorithm that implements checksums. That's why archive formats (like ZIP, 7z, etc.) need to implement checksums separately: the deflate and inflate algorithms that ZIP uses don't have any kind of built-in checksum.
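To make the container-versus-algorithm distinction concrete, here is a minimal sketch using Python's standard `zlib` module. It compresses the same bytes as a raw deflate stream, a zlib stream, and a gzip stream, then reads the checksum that each wrapper appends. The sample data and the helper name `deflate` are illustrative assumptions, not anything from the thread.

```python
import struct
import zlib

# Illustrative sample data (an assumption for this sketch).
data = b"compression and checksumming " * 100

def deflate(payload: bytes, wbits: int) -> bytes:
    """Compress with deflate; wbits selects the wrapper around the stream."""
    c = zlib.compressobj(wbits=wbits)
    return c.compress(payload) + c.flush()

raw = deflate(data, -15)         # raw deflate: no header, no checksum
zlib_stream = deflate(data, 15)  # zlib wrapper: header + Adler-32 trailer
gzip_stream = deflate(data, 31)  # gzip wrapper: header + CRC-32 and length trailer

# The zlib wrapper ends with a big-endian Adler-32 of the uncompressed data.
zlib_trailer = struct.unpack(">I", zlib_stream[-4:])[0]

# The gzip wrapper ends with a little-endian CRC-32 and the input length.
gzip_crc, gzip_size = struct.unpack("<II", gzip_stream[-8:])

print(zlib_trailer == zlib.adler32(data))
print(gzip_crc == zlib.crc32(data), gzip_size == len(data))
```

In this sketch the checksums live entirely in the wrapper trailers; the raw stream decompresses fine with `zlib.decompress(raw, wbits=-15)` but carries no integrity field of its own.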


Applications use libraries that implement data compression. Such libraries usually have checksumming in their algorithms, because compression is not very practical without it. LZ4, which VDO uses, has checksumming too.


The LZ4 frame format (called LZ4F), a stream container that wraps LZ4-compressed data, has checksums. But the LZ4 algorithm itself doesn't have any.

A block device layer would use the "raw" algorithm rather than any frame/container format, and would use something like SHA256 for checksums.
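A rough sketch of that arrangement, with the checksum stored separately from the compressed payload. Raw deflate from Python's `zlib` stands in for raw LZ4 here, since LZ4 is not in the standard library; the function names `write_block` and `read_block`, and the choice to hash the stored (compressed) bytes, are assumptions for illustration and not how VDO or any particular block layer works.

```python
import hashlib
import zlib

def write_block(data: bytes) -> tuple[bytes, bytes]:
    """Compress a block with a raw (container-less) stream and return
    the payload plus a SHA-256 digest kept out of band."""
    c = zlib.compressobj(wbits=-15)  # raw deflate, standing in for raw LZ4
    payload = c.compress(data) + c.flush()
    return payload, hashlib.sha256(payload).digest()

def read_block(payload: bytes, digest: bytes) -> bytes:
    """Verify the stored payload against its digest, then decompress."""
    if hashlib.sha256(payload).digest() != digest:
        raise ValueError("checksum mismatch: block is corrupt")
    return zlib.decompress(payload, wbits=-15)

block = b"\x00" * 2048 + b"some block contents" * 100
payload, digest = write_block(block)
print(read_block(payload, digest) == block)

# Flip one bit in the stored payload; the out-of-band digest catches it.
corrupted = bytes([payload[0] ^ 0x01]) + payload[1:]
try:
    read_block(corrupted, digest)
except ValueError as err:
    print(err)
```

Hashing the compressed bytes lets the reader reject a bad block before handing it to the decompressor; hashing the uncompressed data instead would be an equally plausible design.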

Take a look at LZ4 source: https://github.com/lz4/lz4/tree/dev/lib

VDO LZ4 source:

https://github.com/dm-vdo/kvdo/blob/master/vdo/base/lz4.c

No LZ4F frame format or checksum in sight.


Yes, you are right, they use raw LZ4 and don't do checksumming there. And I can only find checksumming in the superblock code and the volume geometry code in the VDO repository. So it doesn't look like they do it on regular data blocks at all; they probably just assume you have a properly working FTL.

(By the way, SHA-256 is a slow cryptographic hash, not a checksum like CRC32.)
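For a sense of the difference, a small sketch with Python's standard `zlib.crc32` and `hashlib.sha256`. It compares digest sizes and shows one reason CRC32 is only an error-detecting code: for equal-length inputs it is affine over XOR, so its output for a combination of messages is predictable from the individual outputs. The sample messages are arbitrary choices for illustration.

```python
import hashlib
import zlib

msg = b"123456789"
crc = zlib.crc32(msg)               # 32-bit checksum, returned as an int
sha = hashlib.sha256(msg).digest()  # 256-bit cryptographic digest

print(f"CRC32:   {crc:#010x} (4 bytes)")
print(f"SHA-256: {sha.hex()} ({len(sha)} bytes)")

def xor_bytes(*parts: bytes) -> bytes:
    """XOR equal-length byte strings together."""
    out = bytearray(len(parts[0]))
    for p in parts:
        for i, b in enumerate(p):
            out[i] ^= b
    return bytes(out)

# Three arbitrary equal-length messages.
a, b, c = b"block-aaaa", b"block-bbbb", b"block-cccc"

# CRC32 of the XOR equals the XOR of the CRC32s, which is useful for
# catching random bit errors but no defense against deliberate tampering.
crc_is_affine = zlib.crc32(xor_bytes(a, b, c)) == (
    zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c)
)
print(crc_is_affine)
```

The first line printed is `CRC32:   0xcbf43926 (4 bytes)`, the standard CRC-32 check value for the ASCII string "123456789".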



