
CDB is one of my favorite data structures. When a student wants to learn about databases, I get them to implement cdb.

It's easy to implement and really demonstrates some good system engineering tradeoffs.




Any links to theory behind cdb?


Here's an article I came across that explains the structure: http://www.unixuser.org/~euske/doc/cdbinternals/

I think the theory is mostly that it's really simple and ends up requiring very few disk reads.
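To make the "very few disk reads" point concrete, here is a rough Python sketch of the layout as I understand it from the published cdb format description: a fixed 2048-byte header of 256 (position, length) pairs, then the records, then the hash tables. The names `build_cdb` and `lookup` and the `reads` counter are mine, purely for illustration; this is not djb's code and it skips things a real implementation handles (duplicate keys, the 4 GB limit, actual file I/O).

```python
import struct

def cdb_hash(key: bytes) -> int:
    # Hash from the cdb format description: start at 5381, h = (h * 33) ^ c, 32 bits.
    h = 5381
    for c in key:
        h = (((h << 5) + h) ^ c) & 0xFFFFFFFF
    return h

def build_cdb(items: dict) -> bytes:
    """Serialize a bytes -> bytes mapping into a cdb-style byte string (sketch)."""
    records = bytearray()
    buckets = [[] for _ in range(256)]
    pos = 2048  # records start right after the fixed-size header
    for key, value in items.items():
        records += struct.pack("<II", len(key), len(value)) + key + value
        h = cdb_hash(key)
        buckets[h & 255].append((h, pos))
        pos += 8 + len(key) + len(value)
    header = bytearray()
    tables = bytearray()
    for bucket in buckets:
        nslots = 2 * len(bucket)  # tables are kept half empty
        header += struct.pack("<II", pos, nslots)
        slots = [(0, 0)] * nslots
        for h, rec_pos in bucket:
            i = (h >> 8) % nslots
            while slots[i][1] != 0:  # linear probing on collision
                i = (i + 1) % nslots
            slots[i] = (h, rec_pos)
        for slot in slots:
            tables += struct.pack("<II", *slot)
        pos += 8 * nslots
    return bytes(header + records + tables)

def lookup(data: bytes, key: bytes):
    """Return (value or None, number of separate reads a disk-based reader would do)."""
    reads = 0
    h = cdb_hash(key)
    table_pos, nslots = struct.unpack_from("<II", data, 8 * (h & 255))
    reads += 1  # read 1: header entry for this key's table
    if nslots == 0:
        return None, reads
    start = (h >> 8) % nslots
    for step in range(nslots):
        slot_pos = table_pos + 8 * ((start + step) % nslots)
        slot_hash, rec_pos = struct.unpack_from("<II", data, slot_pos)
        reads += 1  # one read per probed slot
        if rec_pos == 0:  # empty slot: key is absent
            return None, reads
        if slot_hash == h:
            klen, dlen = struct.unpack_from("<II", data, rec_pos)
            reads += 1  # read the record itself
            body = rec_pos + 8
            if data[body:body + klen] == key:
                return data[body + klen:body + klen + dlen], reads
    return None, reads

db = build_cdb({b"one": b"1", b"two": b"22", b"three": b"333"})
value, reads = lookup(db, b"two")
print(value, reads)
```

In the common case a hit costs one header read, one slot read and one record read, and a miss often stops after the header or the first empty slot.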


I'd be particularly interested to know why there are 256 hashtables, rather than one, or some other number. I don't think I've seen this hybrid trie-hashtable before.
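For anyone who hasn't looked at the format, this is roughly how a key gets routed to one of the 256 tables, going by the cdb format description: the low 8 bits of the hash pick the table, and the remaining bits pick the starting slot inside it. It only shows the mechanism, not the reason for choosing 256. The helper name `locate` and the sample keys are made up for illustration.

```python
from collections import Counter

def cdb_hash(key: bytes) -> int:
    # Hash from the cdb format description: start at 5381, h = (h * 33) ^ c, 32 bits.
    h = 5381
    for c in key:
        h = (((h << 5) + h) ^ c) & 0xFFFFFFFF
    return h

def locate(key: bytes, nslots: int):
    """Return (table number, starting slot) for a key in a table with nslots slots."""
    h = cdb_hash(key)
    return h & 255, (h >> 8) % nslots

# Count how many of some sample keys land in each of the 256 tables.
keys = [f"key{i}".encode() for i in range(10000)]
usage = Counter(cdb_hash(k) & 255 for k in keys)
print(len(usage), min(usage.values()), max(usage.values()))
```

Since the table number comes straight from the hash, the 256 header entries behave like the first level of a one-byte radix split, which is probably what gives it the trie-like feel.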

I wonder if there's any mileage in using a perfect hash function to build a database like this. It seems suited to the operating model of being slow to build but fast to access.


I second that; if anyone knows why there are 256 hashtables rather than just 1, please speak up. My only guess is that it might be a way to prevent 32-bit int overflow in the C code.



