Swiss Tables and absl::Hash (abseil.io)
80 points by utopcell on Oct 24, 2018 | 18 comments


See also F14 (https://github.com/facebook/folly/blob/master/folly/containe...), which uses a similar design to SwissTable but was open-sourced a few months earlier.

Interestingly, it is faster than SwissTable on Google's benchmarks: https://groups.google.com/forum/#!topic/hashtable-benchmarks... . I recommend reading the whole thread; the authors of the two implementations discuss the design differences in detail.


Note that the hashtable micro-benchmarks open-sourced by Google cannot be used to figure out which of the two hashtable implementations is faster. See https://groups.google.com/forum/#!topic/hashtable-benchmarks... for the details. If someone ran a high-quality production load test with two implementations (Google definitely can do that), it would give us useful information.

Disclaimer: I wrote the micro-benchmarks referenced above and parts of Swiss Table. I'm no longer with Google.


I believe the design is similar because Google presented the SwissTable design at CppCon more than a year ago [1].

If I'm reading the link you sent correctly, there was an initial bias in the benchmarks that favored F14, but after adjusting for that, the results are similar (not surprising, since the basic idea is the same).

[1] https://www.youtube.com/watch?v=ncHmEUmJZf4


I recently finished a scalable hash table for Node.js, designed to store billions of elements in huge flat buffers without impacting the garbage collector (no pointer chasing), and without causing memory fragmentation.

At first, I tried a native add-on, using SIMD instructions to compare multiple keys per instruction. This makes the use of buckets (multiple keys per slot) compelling, because the branching cost of key comparisons almost disappears.

However, in the end, the 100 ns to 200 ns cost of calling into C from JavaScript using N-API (or V8 directly) wiped out any SIMD gains, so I rewrote the hash table directly in JavaScript. While not as fast as a pure C implementation, it's still fairly fast, and more scalable than JavaScript's native vanilla objects, Sets, or Maps.

The notes on design and optimization decisions taken are here, if anyone is interested: https://github.com/ronomon/hash-table
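To make the flat-buffer idea concrete, here's a minimal sketch (hypothetical layout and names, not the actual @ronomon/hash-table code): fixed-size binary keys live in one contiguous Uint8Array with linear probing, so the GC sees two large allocations instead of millions of small objects to trace.

```javascript
// Hypothetical sketch of a flat-buffer hash table: fixed-size binary keys
// stored back to back in one contiguous Uint8Array, so there is no
// per-entry object for the GC to chase. Not the real @ronomon layout.
const KEY_SIZE = 16;
const SLOTS = 1 << 16; // power of two, so we can mask instead of modulo

const occupied = new Uint8Array(SLOTS);        // 1 if the slot holds a key
const keys = new Uint8Array(SLOTS * KEY_SIZE); // all keys, back to back

// FNV-1a over the key bytes; any decent byte-wise hash works here.
function hash(key, offset) {
  let h = 0x811c9dc5;
  for (let i = 0; i < KEY_SIZE; i++) {
    h ^= key[offset + i];
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

function equals(key, offset, slot) {
  const base = slot * KEY_SIZE;
  for (let i = 0; i < KEY_SIZE; i++) {
    if (keys[base + i] !== key[offset + i]) return false;
  }
  return true;
}

// Linear probing: scan forward from the hash slot until we find the key
// or an empty slot. Returns true if the key was newly inserted.
function insert(key, offset = 0) {
  let slot = hash(key, offset) & (SLOTS - 1);
  while (occupied[slot]) {
    if (equals(key, offset, slot)) return false; // already present
    slot = (slot + 1) & (SLOTS - 1);
  }
  occupied[slot] = 1;
  keys.set(key.subarray(offset, offset + KEY_SIZE), slot * KEY_SIZE);
  return true;
}

function has(key, offset = 0) {
  let slot = hash(key, offset) & (SLOTS - 1);
  while (occupied[slot]) {
    if (equals(key, offset, slot)) return true;
    slot = (slot + 1) & (SLOTS - 1);
  }
  return false;
}
```

A real implementation would also need deletion (tombstones) and resizing, but the core point survives even in this sketch: every probe is arithmetic over typed arrays, with no allocation on the hot path.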


Wow, really nice! I had to analyze the page carefully to find the answer, first reading:

   keySize=16 valueSize=0

    @ronomon/hash-table: Inserting 4000000 elements...
    @ronomon/hash-table: 783ms

            new Set(): Inserting 4000000 elements...
            new Set(): 3695ms
...that the key and value sizes are in bytes (clear only once one reads the API). So, almost 5 times faster set insertion, congratulations. The major advantage is, of course, putting less stress on the GC when the data sets are big.

> the 100ns to 200ns cost of calling into C from Javascript using N-API (or V8)

Do you (or any reader here) happen to know the current cost of calling into WebAssembly, compared to that "calling into C" cost?


Thanks!

I updated the vanilla.js script and README to clarify that the key and value sizes are in bytes.

I haven't tried calling into WebAssembly so I don't know. Perhaps the bridging cost would be similar or slightly better (or who knows, worse)? Does WASM have SIMD? Do you know of any areas where WASM would outdo the current Javascript implementation?

For times like these, I wish server-side JavaScript itself had SIMD, and access to CPU builtins such as popcount.
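In the meantime, popcount can be done at reasonable cost in plain JavaScript with the classic SWAR bit-twiddling routine (a sketch; it compiles to a handful of ALU ops under V8, though not to a single POPCNT instruction):

```javascript
// Plain-JavaScript stand-in for a popcount builtin: count set bits in a
// 32-bit integer by summing bit pairs, then nibbles, then bytes (SWAR).
function popcount32(x) {
  x = x - ((x >>> 1) & 0x55555555);                 // 2-bit sums
  x = (x & 0x33333333) + ((x >>> 2) & 0x33333333);  // 4-bit sums
  x = (x + (x >>> 4)) & 0x0f0f0f0f;                 // 8-bit sums
  return Math.imul(x, 0x01010101) >>> 24;           // add the four bytes
}
```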


WASM has a popcnt instruction, and will eventually have SIMD. I think the bridging cost is less than calling through N-API or a native add-on, because you're still inside the VM.

Great hash table by the way!


Node, Chrome, etc. already support the current WASM SIMD proposal.


Do you know when this landed and/or if I need a flag to enable it in Node? I'm running Node 11 and get an error when trying to run a function with a SIMD instruction in it.

I hacked up a WAT file with this inside a function body and compiled it with wat2wasm --enable-simd:

    (local $2 v128)
    (set_local $2
        (i8x16.splat (i32.const 8))
    )
And I get an error "invalid local type" when I try to run it within Node.


> Do you know of any areas where WASM would outdo the current Javascript implementation?

If you can imagine something that would "work better in C", it's reasonable to expect Wasm to have an edge over JavaScript. JavaScript does eventually get JIT-compiled, but Wasm is effectively "just the code" from the start -- conceptually, once processed, Wasm code doesn't have to contain any checks to "fall back" to some slower implementation.

Here's one of the "minimal" examples of calling a C function compiled to WebAssembly: https://medium.freecodecamp.org/get-started-with-webassembly...

Note the resulting native assembly code in the picture on the right there; it's that minimal. I don't know how big the cost of calling into it from JavaScript is... But seeing how carefully you've measured for your project, I'm quite sure you're well suited to evaluate that too, and to measure the current minimal costs. And since there is a lot of interest in these topics, if you demonstrate an obvious weakness, there's a chance it could be fixed.

It seems that the string has to be copied every time a call from JavaScript to Wasm is made (see stringToUTF8 there):

https://blog.teamemo.com/behind-teamemo-using-a-c-library-fr...
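For anyone who wants to measure the bridging cost directly, here's a sketch: instantiate a minimal hand-encoded Wasm module exporting add(a, b) and time a tight loop of JS-to-Wasm calls (the byte layout follows the WebAssembly binary format; the timing will of course vary by machine and engine):

```javascript
// Minimal hand-encoded Wasm module exporting add(a, b) -> a + b.
// Layout: magic + version, then type, function, export, and code sections.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,        // "\0asm", version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,  // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                                 // function 0 uses type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,  // export "add" = func 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                           // code section, one body
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b                      // get locals 0/1, i32.add, end
]);

// Synchronous instantiation is fine for a module this small.
const instance = new WebAssembly.Instance(new WebAssembly.Module(bytes));
const add = instance.exports.add;

// Time a tight loop of JS -> Wasm calls to estimate per-call overhead.
const N = 1000000;
const start = process.hrtime.bigint();
let sum = 0;
for (let i = 0; i < N; i++) sum = add(sum, 1);
const nsPerCall = Number(process.hrtime.bigint() - start) / N;
console.log(sum, nsPerCall.toFixed(1) + ' ns per call');
```

This measures the trampoline cost for plain i32 arguments only; passing strings or buffers adds the copying cost discussed above on top.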


Great to see Google open-sourcing its internal C++ libraries.


Indeed. It raises the baseline that Google itself had set many years back with dense_hash_map.


It was only about one year between Swiss tables becoming generally available within Google and the absl release, which isn't bad at all.


Can somebody please write this in C99? I'd do it myself, but I'm too busy at the moment.

I'd also love to see the 7-word variant implemented, with the first bit marking a slot as full. It's only mentioned in the CppCon talk, not in the repo.


> We generally recommend that you use absl::flat_hash_map<K, std::unique_ptr<V>> instead of absl::node_hash_map<K, V>.

So what's the point of `node_hash_map` then?


If you need pointer stability in keys, node_hash_map has that. Also, it is easier for large-scale conversions because it has the same pointer stability guarantees as std::unordered_map.


Where does the name come from though?


It was initially developed in Zurich.



