> Exactly. But also he inadvertently actually touches on this in the post. Talki...

cmrdporcupine · on Nov 28, 2022

Thing is that hashmap usage goes beyond data manipulation / key-value store uses. People use it to construct semi-structured network data models, track counts by indices, etc. Because there's an assumption that the O(1) of a hashtable always beats the O(N) of a linear search. Except that that's really not always the case anymore on a modern arch, with things properly vectorized, etc.

When what we should have is libraries that provide high level relational/datalog style manipulation of in-memory datasets, and you describe what you want to do and the system decides how. Personal beef.

Think of the average "leetcode" question. It almost always boils down to an iterative pass over some sequence of data, manipulating it in place: C-style strings, arrays of numbers, etc. If you tried to answer the question with "I'd use std::sort and then std::blah and so on, on some vectors" they'd show you the door, because they want you to show how clever you are at managing for-loop indices and off-by-one problems and swapping data between two arrays in a nested loop, etc.

So we're actually gating people on this kind of thing. And imho it's doing us a disservice. The code is not as readable. And it doesn't necessarily perform well. It's often a 1988 C programmer's idea of what good code is that is used as the entry bar.
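To make the "std::sort and then std::blah" answer concrete, here is a rough sketch of one such task (deduplicating a list of numbers) done with standard algorithms instead of hand-managed indices. The task, the function name `sorted_unique`, and the choice of `std::unique` as the "std::blah" are all my own picks for illustration, nothing more.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: return the distinct values of v in ascending order.
// Takes v by value so the caller's data is left untouched.
std::vector<int> sorted_unique(std::vector<int> v) {
    // Sorting puts equal values next to each other.
    std::sort(v.begin(), v.end());
    // std::unique moves the first of each run of equal values to the front
    // and returns the new logical end; erase drops the leftover tail.
    v.erase(std::unique(v.begin(), v.end()), v.end());
    return v;
}
```

There is no loop index to get wrong here, which is roughly the readability point. Whether it also performs better than a hand-rolled loop depends on the data and would have to be measured.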

int_19h · on Nov 29, 2022

> Because there's an assumption that the O(1) of a hashtable always beats the O(N) of a linear search. Except that that's really not always the case anymore on a modern arch, with things properly vectorized, etc.

It was never universally true. I remember reading about compiler design in the 1970s; I think it was somewhere in Wirth where he pointed out that, for local symbolic lookups, it was more efficient to use a simple loop over an array because the average number of symbols was so low that the constant management overhead from any more advanced data structure was more expensive than a linear scan.
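As a rough illustration of that kind of lookup (not a reconstruction of anything from Wirth), a small symbol table can be a plain contiguous array searched front to back. The `LocalSymbols` type, its members, and the integer "slot" payload are hypothetical names chosen for this sketch.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical local symbol table: a handful of (name, slot) pairs
// stored contiguously, with no hashing and no per-node allocation.
struct LocalSymbols {
    typedef std::pair<std::string, int> Entry;
    std::vector<Entry> entries;

    void declare(const std::string& name, int slot) {
        entries.push_back(Entry(name, slot));
    }

    // Linear scan: O(N) comparisons, but N is small and the data is
    // laid out sequentially. Returns nullptr when the name is absent.
    const int* lookup(const std::string& name) const {
        std::vector<Entry>::const_iterator it = std::find_if(
            entries.begin(), entries.end(),
            [&name](const Entry& e) { return e.first == name; });
        return it == entries.end() ? nullptr : &it->second;
    }
};
```

Swapping in `std::unordered_map<std::string, int>` would give O(1) average lookups, but every lookup then pays for hashing the string. At what size the hash table starts to win is something to benchmark on the target machine rather than assume.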