
This note about his other project (which the robin hood hash table depends on) is quite curious:

> This robin hood hashing is implemented using my project Object Persistence In C (OPIC). OPIC is a new general serialization framework I just released. Any in-memory object created with OPIC can be serialized without knowing how it was structured. Deserializing objects from OPIC only requires one mmap syscall. That is to say, this robin hood implementation works not only in a living process; the data it stores can also be used as a key-value store after the process exits.

> Right now, the throughput of the OPIC robin hood hash map on small keys (6 bytes) is 9M (1048576/0.115454). This is way better than most NoSQL key-value stores. The difference might come from write-ahead logs or some other IO? I’m not sure why the performance gain is so huge. My next stop is to benchmark against other embedded key-value stores like rocksdb, leveldb and so forth.
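
For reference, the 9M figure in the quote can be reproduced with a quick calculation. This assumes 1048576 is the number of operations and 0.115454 is the elapsed time in seconds; the quote does not state the units, so treat that reading as a guess.

```python
def throughput(ops, seconds):
    """Operations per second for a benchmark run."""
    return ops / seconds

# Assumption: 1048576 (2**20) operations took 0.115454 seconds.
ops_per_sec = throughput(1048576, 0.115454)
print(f"{ops_per_sec / 1e6:.2f}M ops/s")
```

Under that assumption the result is a little over 9 million operations per second, which matches the rounded 9M in the quote.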



Thanks for bringing this up. I spent far more time on OPIC (almost a year) than on the robin hood hash map (2 weeks to complete the POC). Glad to see people noticing this project :)

Another comment pointed out MDBM, which I didn't know about. I think their performance numbers may help explain why the OPIC robin hood hash map performs so well. https://yahooeng.tumblr.com/post/104861108931/mdbm-high-spee...

MDBM gives users raw access to the mmapped data and pointers, and according to its benchmarks it is 10x faster than rocksdb and leveldb. The design of OPIC has even less overhead than MDBM (which may not be a good thing), and it also works on an mmapped file (or anonymous swap). There's no lock, transaction, or WAL in OPIC. OPIC just gives you the raw performance a hash table can give you.
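
To illustrate the general idea of data that lives directly in an mmapped file and stays usable after the writing process exits, here is a minimal sketch in Python. It is not OPIC's or MDBM's API: the fixed-slot record layout, `RECORD`, `SLOTS`, and the `create_store`/`put`/`get`/`demo` helpers are all hypothetical names made up for this example, and it is a plain slot array rather than a robin hood hash table. Like the design described above, it has no lock, transaction, or WAL.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical on-disk layout: a 6-byte key plus a 32-bit value, no padding.
RECORD = struct.Struct("<6sI")
SLOTS = 8

def create_store(path):
    """Create a zero-filled file large enough for SLOTS records."""
    with open(path, "wb") as f:
        f.truncate(RECORD.size * SLOTS)

def put(path, slot, key, value):
    """Write a record straight into the mapped file (no lock, no WAL)."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as m:
            RECORD.pack_into(m, slot * RECORD.size, key, value)
            m.flush()

def get(path, slot):
    """Map the file read-only and decode one record in place."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return RECORD.unpack_from(m, slot * RECORD.size)

def demo():
    """Write a record, drop the mapping, then read it back from the file."""
    path = os.path.join(tempfile.mkdtemp(), "store.bin")
    create_store(path)
    put(path, 3, b"abcdef", 42)
    return get(path, 3)

print(demo())
```

The reader in `get` only needs to map the file and index into it, since there is no separate parsing step. The trade-off is the one mentioned above: with no locking or logging, nothing protects the file from concurrent writers or a crash in the middle of a write.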



