
It's not really fair to compare a custom implementation against the standard unordered_map implementations, which need to be fully general. See https://news.ycombinator.com/item?id=9675608

Still, I'm shocked that GCC, LLVM, and Boost all assign buckets using modulus, which is slow. I would love to know the reasoning. I had assumed they masked off the high bits (i.e. ANDed the hash with table size minus one).

Fast hashmaps use xor to mix information from low bits (examples below). Fibonacci hashing amounts to running a second multiplicative hash over your input, which is only worthwhile if you're paranoid about your input distribution.

Java SDK: http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/s...

Android (C++): https://android.googlesource.com/platform/system/core/+/mast...



Yes, but it's also important to understand what you lose when using these better-performing hashtables. The first thing that comes to mind is that after a successful lookup on a std::unordered_map, the reference is guaranteed to be valid until either (a) you erase that element or (b) the table itself is destroyed. Almost all faster hashtables around give up that property for better lookup performance. Any insert operation might invalidate all references.

So, in a sense, it's a bit unfair because std::unordered_map, in a real-world scenario, may require fewer lookups, because you can cache the references. With the others, you can't.


Yes, I agree with you that unordered_map should be slower. For context, the author shows off a graph where his custom Fibonacci implementation is the fastest, which proves nothing about Fibonacci hashing.


I had an application several years ago in which the executable was 10x as fast when using khash instead of std::unordered_map.


Hash table design in libraries is an exercise in minimizing the costs of the worst case, and it's made harder by typically only being able to control one side of the equation - the table and not the hash function.

Thus general purpose hash table implementations need to choose whether they will help naive programmers by doing modulus with a prime number of buckets, or if they will leave a potential performance bomb for people who e.g. hash integers to themselves. The article argues for a third way which is somewhere in the middle: faster than prime modulus, but not as dangerous as bit masking.

It's usually easy to beat any general-purpose hash table for these reasons: as long as you control the hash function, you can tweak your implementation to suit.


std::unordered_map is a bit of a red-headed stepchild in the C++ world. I'm not surprised that it hasn't had nearly the amount of tuning that hashmaps have had in other languages.

When I asked some C++ers about it, they warned me off using it, for reasons I didn't fully understand, but I got the impression that there are structural reasons why it can never be really fast, so anyone who needs a really fast hashmap uses some non-standard one anyway.


> It's not really fair to compare a custom implementation against the standard unordered_map implementations, which

But he is comparing the standard unordered_map implementation to his own implementation of the standard unordered_map (same API).


It's the same API, but I highly doubt that his implementation is fully compliant with the C++ standard.


It is, and so is boost::unordered_map; that's the whole point of the comparison. Just check the implementation out.


Source on that claim? Here's a comment chain where the author admits that his library is noncompliant: https://probablydance.com/2017/02/26/i-wrote-the-fastest-has.... Also note the complete lack of a standards test suite.

This library uses Robin Hood hashing for speed. As my original link explains, standard implementations use chained buckets so that users can cache lookups and to have better worst-case performance: https://news.ycombinator.com/item?id=9675964



