Summarizing your results, your hash table implementation is between five and ten times faster than STL and G++ generic maps, because G++, like most C++ compilers, imposes a substantial abstraction penalty on the use of templates — even though in theory that's not necessary. Is that right?
Yes, 5-10x faster. As to the reason, I can only speculate. I doubt it's template-imposed abstraction penalty per se, though it could be a result of having to write the algorithm in a generic way. In other words, if you did a sort of "manual instantiation" of the hash_map template, where you made the types specific but didn't change a thing besides that, I think you'd get the same performance as the templated algorithm.
I can't say for sure the reason for the difference. Previously I thought that hash_map might have been wasting time calculating a hash of the integer key (whereas I just use the key as the hash), but I just added another test that uses hash_map with an identity hash function and the performance was unchanged.
I bet that hash_map is using external chaining, whereas I'm using internal chaining which has better caching behavior. Besides that it's hard to say for sure why mine is so much faster.
Could be. I used internal chaining also in the program I mentioned in the other comment. And probably a power-of-2 size and another tweak or two -- I don't remember details.
> What's wrong with red-black trees for an ordered map?
I was wondering that too, except without the "n ordered" part. The claim that trees are competitive with hash tables in the average case rests on the claim that comparison is much faster than hashing, because you only have to hash once in the average case, whereas you have to compare something like 1.5 lg N times in the average case, which might be between 4 and 22 in common cases. I just ran this microbenchmark to compare hashing integers with comparing them, and hashing seems to be actually slightly faster than comparing on my CPU; this implies that hash tables should be dramatically faster than red-black trees.
In theory, this forces an unpleasant dilemma on code that aspires to be real-time but wants to do a lookup in a finite map: use hash tables with their awful worst-case performance (typically O(N), although you can improve this by hanging trees off your hash buckets, but that would be stupid), or use trees with their awful average-case performance?
In practice, I've never written real-time code, so I don't know if this dilemma is real.
It won't perform allocations, but since the structure has been built from multiple disjoint allocations, you're likely to get worse spatial coherence. (I.e. more cache misses)
And yes, for an ordered map RBL is not a bad choice. "I object to std::map being ordered" would probably have been the better wording.
In my last C++ program, 5 years ago, I did find it worthwhile to code a custom hashtable. Was a bit surprised. This was g++, probably an older version even then.