> False positives from SHA-1 hash collisions are detected after object retrieval from the disk by comparison with the requested key.
I was curious why they would bother, but it seems this isn't quite accurate.
What actually happens is that they use 32 bits of the SHA-1 hash to find the hash bucket, then scan that bucket for the full SHA-1 of the key. The key itself is never compared, so an actual SHA-1 collision would go undetected.
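To make that concrete, here is a rough in-memory sketch of the lookup as I understand it. This is not their code: `NUM_BUCKETS`, `bucket_index`, and `Index` are names I made up, and the real thing works against on-disk structures.

```python
import hashlib

NUM_BUCKETS = 1024  # hypothetical table size, chosen for illustration


def sha1(key: bytes) -> bytes:
    """Return the full 20-byte SHA-1 digest of the key."""
    return hashlib.sha1(key).digest()


def bucket_index(digest: bytes) -> int:
    """Pick a bucket from the first 32 bits of the digest."""
    return int.from_bytes(digest[:4], "big") % NUM_BUCKETS


class Index:
    """Toy index that stores (digest, value) pairs, never the key."""

    def __init__(self) -> None:
        self.buckets = [[] for _ in range(NUM_BUCKETS)]

    def put(self, key: bytes, value) -> None:
        digest = sha1(key)
        bucket = self.buckets[bucket_index(digest)]
        for i, (stored, _) in enumerate(bucket):
            if stored == digest:
                bucket[i] = (digest, value)  # overwrite existing entry
                return
        bucket.append((digest, value))

    def get(self, key: bytes):
        digest = sha1(key)
        # Scan the bucket for a full-digest match. Two keys with the
        # same SHA-1 would be indistinguishable here.
        for stored, value in self.buckets[bucket_index(digest)]:
            if stored == digest:
                return value
        return None
```

The 32-bit prefix only narrows the search; the full digest comparison resolves keys that happen to land in the same bucket.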
edit: Also on the subject of hashes, the readme suggests switching to MD5 as a possible way to reduce entry size. That is unnecessary; SHA-1 can be truncated to whatever size you're comfortable with.
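For what it's worth, truncation is a one-liner. A minimal sketch, where `truncated_sha1` is just a helper name I picked and 16 bytes is chosen only because it matches MD5's digest size:

```python
import hashlib


def truncated_sha1(key: bytes, nbytes: int = 16) -> bytes:
    """Return the first nbytes of the SHA-1 digest (1 to 20 bytes)."""
    if not 1 <= nbytes <= 20:
        raise ValueError("SHA-1 digests are 20 bytes long")
    return hashlib.sha1(key).digest()[:nbytes]


# Same entry size as an MD5 digest, without switching hash functions.
entry = truncated_sha1(b"some key")
```

The shorter the prefix, the higher the chance of accidental collisions, so the size is a trade-off rather than a free win.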