CDB is one of my favorite data structures. When a student wants to learn about d...

zura · on May 10, 2014

Any links to theory behind cdb?

netghost · on May 10, 2014

Here's an article I came across that explains the structure: http://www.unixuser.org/~euske/doc/cdbinternals/

I think the theory is mostly that it's really simple and ends up requiring very few disk reads.
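To make the "very few disk reads" point concrete, here's a rough Python sketch of the layout as I understand it from that article: a 2048-byte header of 256 (position, slot count) pairs, then the records, then 256 open-addressed hash tables. The names `cdb_hash`, `build`, and `lookup` are mine, everything is done in memory rather than on disk, and it's an illustration rather than a replacement for the real cdb tools.

```python
import struct

def cdb_hash(key: bytes) -> int:
    """cdb's hash: start at 5381, then h = (h * 33) XOR byte, kept to 32 bits."""
    h = 5381
    for c in key:
        h = (((h << 5) + h) ^ c) & 0xFFFFFFFF
    return h

def build(items: dict) -> bytes:
    """Build an in-memory cdb image from a dict of bytes -> bytes."""
    records = bytearray()
    buckets = [[] for _ in range(256)]
    pos = 2048  # records start right after the fixed-size header
    for key, value in items.items():
        h = cdb_hash(key)
        buckets[h & 255].append((h, pos))  # low 8 bits pick the table
        rec = struct.pack("<II", len(key), len(value)) + key + value
        records += rec
        pos += len(rec)
    header = bytearray()
    tables = bytearray()
    for entries in buckets:
        nslots = 2 * len(entries)  # tables are kept half empty
        slots = [(0, 0)] * nslots
        for h, p in entries:
            i = (h >> 8) % nslots  # remaining bits pick the starting slot
            while slots[i][1] != 0:  # linear probing on collision
                i = (i + 1) % nslots
            slots[i] = (h, p)
        header += struct.pack("<II", pos, nslots)
        for h, p in slots:
            tables += struct.pack("<II", h, p)
        pos += 8 * nslots
    return bytes(header) + bytes(records) + bytes(tables)

def lookup(data: bytes, key: bytes):
    """Return the value for key, or None if it is absent."""
    h = cdb_hash(key)
    # Read 1: the header entry for this key's table.
    table_pos, nslots = struct.unpack_from("<II", data, 8 * (h & 255))
    if nslots == 0:
        return None
    start = (h >> 8) % nslots
    for n in range(nslots):
        # Read 2 (plus any probes): a slot in the chosen table.
        slot = table_pos + 8 * ((start + n) % nslots)
        slot_hash, rec_pos = struct.unpack_from("<II", data, slot)
        if rec_pos == 0:
            return None  # empty slot: the key is not in the database
        if slot_hash == h:
            # Read 3: the record itself, to compare the full key.
            klen, dlen = struct.unpack_from("<II", data, rec_pos)
            key_end = rec_pos + 8 + klen
            if data[rec_pos + 8:key_end] == key:
                return data[key_end:key_end + dlen]
    return None

db = build({b"one": b"1", b"two": b"2"})
print(lookup(db, b"one"), lookup(db, b"missing"))
```

So a hit is typically one header read, one or a few adjacent slot reads, and one record read, and a miss usually stops at the first empty slot without touching the records at all.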

twic · on May 10, 2014

I'd be particularly interested to know why there are 256 hashtables, rather than one, or some other number. I don't think I've seen this hybrid trie-hashtable before.

I wonder if there's any mileage in using a perfect hash function to build a database like this. It seems suited to the operating model of being slow to build but fast to access.

geogriffin · on May 11, 2014

i second that; if anyone knows why there are 256 hashtables rather than just 1, please speak up. my only guess is that it might be a way to prevent 32-bit int overflow in the C code..