CDB is awesome for its use case - slow-changing, read-heavy workloads that are tolerant of stale data. One limitation of the original implementation is the use of 32-bit keys for addressing, which limits the addressable size to 4 GiB. There are 64-bit modifications, but I have not used them. Does anyone have an opinion on any of the 64-bit implementations?
One fix would be to make the pointers notionally 32+k bits long, with the bottom k bits always zero, so they can still be stored in 32 bits. This would mean that records always have to be aligned to a 2^k-byte boundary. If the data has some natural alignment anyway, this could be arranged so as not to waste space, and even if it doesn't, for large keys and/or values the amount of wasted space would be relatively small.
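A rough sketch of that shifted-pointer idea (my own illustration, not anything from CDB itself; k = 3, i.e. 8-byte alignment, is an arbitrary choice - with it a 32-bit stored value addresses up to 32 GiB):

```python
K = 3  # records aligned to 2**K = 8-byte boundaries (illustrative choice)

def encode_offset(offset: int) -> int:
    """Pack a 32+K-bit file offset into 32 bits by dropping the K zero bits."""
    assert offset % (1 << K) == 0, "record must start on a 2**K boundary"
    packed = offset >> K
    assert packed < (1 << 32), "offset exceeds the 32+K-bit address space"
    return packed

def decode_offset(packed: int) -> int:
    """Recover the real file offset from the stored 32-bit value."""
    return packed << K

def pad_to_alignment(pos: int) -> int:
    """Next 2**K-aligned position at or after pos (padding the writer inserts)."""
    return (pos + (1 << K) - 1) & ~((1 << K) - 1)
```

The writer pays for this with up to 2^k - 1 padding bytes per record; the reader just shifts the pointer left before seeking.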
Alternatively, if there were a way to reliably locate the start of a record when scanning through the data, the records could be packed without padding, and the 2^k-aligned pointers could simply be approximate, pointing to somewhere in the 2^k bytes before the start of the record. Retrieval would involve following the pointer, then scanning forward to find the actual record. This would be a bit sketchy, but something a bit like this is done in ATM.
CDB doesn't include CRCs. However, you could think up various schemes to identify records based on what you know about the record format, the key you're looking for, and its hash. It's probably not worth the effort!
I'd be particularly interested to know why there are 256 hashtables, rather than one, or some other number. I don't think I've seen this hybrid trie-hashtable before.
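For reference, here's how the split works as I understand the cdb file-format spec: djb's hash is computed once, the low byte selects one of the 256 tables, and the remaining bits select the starting slot within that table. A sketch (the header-reading part is elided; `table_lengths` stands in for the 256 slot counts stored at the front of the file):

```python
def cdb_hash(key: bytes) -> int:
    """djb's hash as used by cdb: h = ((h << 5) + h) ^ c, starting from 5381."""
    h = 5381
    for c in key:
        h = ((h << 5) + h) ^ c
        h &= 0xFFFFFFFF  # cdb arithmetic is modulo 2**32
    return h

def locate(key: bytes, table_lengths: list) -> tuple:
    """Return (table number, starting slot) for a key.

    table_lengths stands in for the 256 slot counts read from the cdb header.
    """
    h = cdb_hash(key)
    table = h & 0xFF            # low byte picks one of the 256 hash tables
    nslots = table_lengths[table]
    slot = (h >> 8) % nslots    # remaining bits pick the slot within it
    return table, slot
```

So the "trie" layer is just one byte of the hash, which keeps each individual table (and each linear-probe run) small.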
I wonder if there's any mileage in using a perfect hash function to build a database like this. It seems suited to the operating model of being slow to build but fast to access.
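To make the build-slow/read-fast trade concrete, here's a toy sketch (nothing to do with real CDB, and real minimal-perfect-hash constructions like CHD are far cleverer): brute-force a seed for which a cheap hash is collision-free over the fixed key set, so reads need no probing at all.

```python
def toy_hash(seed: int, key: bytes, nslots: int) -> int:
    """A cheap parameterized hash; any mixing function would do here."""
    v = seed
    for c in key:
        v = (v * 31 + c) & 0xFFFFFFFF
    return v % nslots

def find_perfect_seed(keys, nslots, limit=1_000_000):
    """Search (slowly, at build time) for a seed with zero collisions."""
    for seed in range(limit):
        if len({toy_hash(seed, k, nslots) for k in keys}) == len(keys):
            return seed
    raise RuntimeError("no seed found; increase nslots or the search range")

keys = [b"alpha", b"beta", b"gamma", b"delta"]
seed = find_perfect_seed(keys, nslots=8)  # stored in the file header at build time
```

The seed gets written into the database header once; every lookup thereafter is a single hash plus a single record comparison.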
I second that; if anyone knows why there are 256 hashtables rather than just 1, please speak up. My only guess is that it might be a way to prevent 32-bit int overflow in the C code.
I'm put in mind of all the things that take up the most space on a minimal Ubuntu installation: charmaps, mime types, tzdata, geoip, etc. Seems like all of these could do with being packed into a tight, read-only K/V store.
It's extremely fast and it's easy to work with for reading, but it's just a key-value store. Also, it's sort of weird to work with on the write side. You have to do atomic replaces of the read-only data files.
Regarding writes, most likely the expectation is that you'll update your entire data set periodically in a batch job, or export and cache it from another authoritative data store. I understood the atomic replace feature as a convenient way to flip to a new version of the database after a batch job produces and distributes it. It sounds like the "cdbmake" tool is designed to facilitate this.
A lot of data doesn't change very often, and a lot of data can tolerate propagation delays on change. When you have an extremely high read volume against such data sets, it can be economical to cache the entire data set and distribute it periodically to the fleets of machines that require access, as opposed to servicing individual reads over a network. This provides lower cost and latency, while supporting a higher volume of reads and higher availability, at the expense of engineering cost and propagation delay.
I don't know how many people remember this, but this kind of "weird atomic replace of the read-only data file" was very common on Unix systems for a long time.
Take sendmail: people would have their email aliases in a /etc/aliases file, in a simple ASCII format with lines of the form "<key>: <values...>". The "newaliases" command would then convert the content of the ASCII file /etc/aliases into the binary (originally: Berkeley DB) /etc/aliases.db.
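That text format is trivial to parse, which is half of why the pattern works so well. A simplified sketch of the parsing side of a newaliases-style conversion (ignoring line continuations and other real aliases(5) features, and building an in-memory map rather than an actual .db file):

```python
def parse_aliases(text: str) -> dict:
    """Parse sendmail-style '<key>: <values...>' lines into a mapping.

    Comments (#) and blank lines are skipped; values are comma-separated.
    """
    aliases = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, values = line.partition(":")
        aliases[key.strip()] = [v.strip() for v in values.split(",") if v.strip()]
    return aliases

sample = """
# system aliases
postmaster: root
root: alice, bob
"""
```

The real tool would then feed each key/value pair into the binary index writer instead of a dict.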
As far as I know, CDB was initially developed for qmail, DJB's email daemon, for exactly that purpose.
Another example of this is the "Yellow Pages" (yp) / NIS directory services used to distribute usernames/userids/groups/... in a UNIX cluster or network. ASCII files were converted into e.g. /var/yp/DOMAIN/passwd.byname, a .db file where the keys were the usernames from /etc/passwd and the values were the full ASCII lines from /etc/passwd; same goes for passwd.byuid, group.byname, and so on. All for fast indexing, so that the network server didn't have to repeatedly parse /etc/passwd.
It can be reimplemented from scratch in a few hundred lines of code?
It's not really the same thing, though. CDB provides a single key -> value mapping in a single file, with updates by replacing the entire file; SQLite is a relational database with updates, locking, transactions, SQL, etc.
cdb is immutable - when you want to change the contained data set you generally rebuild the file, flip a fs link to the new version, and release file descriptors referencing the old version in applications when it's time to move on. bdb updates in place, but is slower for reads.
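The flip itself is just an atomic rename. A minimal sketch of the rebuild-and-swap pattern (the record format here is a placeholder, not real cdb encoding; cdbmake does essentially this - write to a temp file, then rename into place - for exactly this reason):

```python
import os
import tempfile

def rebuild_and_swap(path: str, records: dict) -> None:
    """Write a fresh data file next to the live one, then atomically rename
    it into place. Readers holding the old file descriptor keep seeing the
    old data until they reopen; rename never exposes a half-written file."""
    dirname = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=dirname)  # same filesystem => atomic rename
    try:
        with os.fdopen(fd, "wb") as f:
            for k, v in records.items():     # stand-in for the real cdb format
                f.write(b"%s=%s\n" % (k, v))
            f.flush()
            os.fsync(f.fileno())             # durable before it becomes visible
        os.replace(tmp, path)                # rename(2): atomic swap
    except BaseException:
        os.unlink(tmp)
        raise
```

The key property is that `path` only ever names a complete file: readers see either the old version or the new one, never a partial write.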
I'd like to see a more up-to-date CDB with the 4 GiB limit removed. Writing could be implemented by using an additional smaller CDB or a text file. It would be an interesting problem to find at what size a flat text file should be converted into CDB to maximize speed overall.
Found out why (or maybe there's more to it) - it's using mmap(), which is not directly available on Windows (though there are MapViewOfFile and CreateFileMapping).
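As an aside, Python's mmap module papers over exactly that difference - it uses mmap() on POSIX and CreateFileMapping/MapViewOfFile on Windows - so a read-only map of the kind cdb relies on can be expressed portably:

```python
import mmap

def map_readonly(path: str) -> mmap.mmap:
    """Map a file read-only. On POSIX this is mmap(); on Windows the mmap
    module uses CreateFileMapping/MapViewOfFile under the hood."""
    with open(path, "rb") as f:
        # length 0 = map the whole file; the mapping stays valid after the
        # file object is closed
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
```

A C port would need the equivalent #ifdef'd pair of code paths, which is presumably the porting work being alluded to.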
Notice cordite said a Github mirror, not the Github mirror. You don't need to have your main development happen on Github to have a mirror there, just someone mirroring from the main server.
To be fair, I assume he's only asking because starring something on GitHub is actually a fairly good way of keeping track of projects in the way he's talking about. I agree that it sounds a bit presumptuous to say "where's the GitHub mirror, mr. unpaid contributor to open source project", but on the flip side it's a fair point in terms of potentially capitalizing on exposure for a project. And it's very true that you can sometimes find the best code in very unassuming places.
I'd be really interested to see statistics on this. I don't work in Silicon Valley, and almost all of the stuff people around me are publishing is on GitHub.