If it's a hash function that only works for this exact data set, I can see the argument. If it's a hash function that works for this kind of data (these particularly formatted, particularly constrained strings), it's fair play. Which one is it? 'It's not a good general-purpose hash function' alone doesn't seem like a valid criticism, especially alongside 'the Rust version uses a general-purpose hash function'. Nobody said you had to use a general-purpose hash function.
Imagine you got a million strings of variable length such that the last 4 bytes are guaranteed to be a unique value between 0 and 999,999. Lopping off the last 4 bytes is a perfectly good hash function for that kind of data.
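To make that concrete, here's a minimal sketch in Rust. The function name, the little-endian encoding, and the example record are all just assumptions about the hypothetical format:

    /// Toy hash for the constrained data described above: byte strings whose
    /// final 4 bytes encode a unique little-endian u32 in 0..=999_999.
    /// Collision-free by construction for that data set, useless elsewhere.
    fn tail_hash(s: &[u8]) -> u64 {
        let n = s.len();
        assert!(n >= 4, "input shorter than the guaranteed 4-byte tail");
        u32::from_le_bytes([s[n - 4], s[n - 3], s[n - 2], s[n - 1]]) as u64
    }

    fn main() {
        // A record whose variable-length payload ends with the id 123_456.
        let mut record = b"some-variable-length-prefix".to_vec();
        record.extend_from_slice(&123_456u32.to_le_bytes());
        assert_eq!(tail_hash(&record), 123_456);
    }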
Sure, but it's not a good benchmark. If you can demonstrate that you have blazing fast performance, but only for the exact input data that the benchmark uses, then you haven't demonstrated anything at all beyond an ability to tailor your code to a very specific input. But in the real world, we don't get to write our code for a very specific input (after all, if we knew what the input was ahead of time, we could just pre-calculate the answers instead of publishing the program).
So yeah, if you can come up with a tailored hash algorithm that works correctly and is DoS-immune for all possible inputs to the program, go ahead and use that tailored hash algorithm. But if your hash algorithm is only correct for the specific input data the benchmark game uses, and would not be collision-resistant under real-world conditions, then you really shouldn't use that hash, as it does not demonstrate anything useful.
Well, here's the thing. I don't think it's for the exact input; it's for a type of input. Custom hash functions for specific types of data are a basic optimization technique, and I find it odd you'd even suggest every hash function should be 'DoS-immune'. There's absolutely nothing 'real world' about this unless you think the world consists entirely of hostile inputs. In the real world, people absolutely optimize.
Your argument seems to be that that's not the intent of the benchmark, which may be true, but it's not at all clear from the rules provided. To me, it looks like the opposite is true: they talk about using a standard hash table, and most standard hash tables allow user-specified hash functions.
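Rust's std HashMap is one example: it's generic over the hasher, so you can swap in your own. A rough sketch, using FNV-1a purely as a stand-in for whatever tailored function fits the data (real code would more likely pull in a crate like fnv):

    use std::collections::HashMap;
    use std::hash::{BuildHasherDefault, Hasher};

    // FNV-1a: a simple, fast, non-DoS-resistant hash, shown here only to
    // demonstrate swapping out the default hasher on std's HashMap.
    struct Fnv1a(u64);

    impl Default for Fnv1a {
        fn default() -> Self {
            Fnv1a(0xcbf29ce484222325) // 64-bit FNV-1a offset basis
        }
    }

    impl Hasher for Fnv1a {
        fn write(&mut self, bytes: &[u8]) {
            for &b in bytes {
                self.0 ^= u64::from(b);
                self.0 = self.0.wrapping_mul(0x100000001b3); // FNV prime
            }
        }
        fn finish(&self) -> u64 {
            self.0
        }
    }

    fn main() {
        // Same standard hash table, user-specified hash function.
        let mut counts: HashMap<&str, u32, BuildHasherDefault<Fnv1a>> =
            HashMap::default();
        *counts.entry("some key").or_insert(0) += 1;
        assert_eq!(counts["some key"], 1);
    }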
Rust's default hashmap hasher is resistant to hash-flooding DoS for any kind of input, which is a perfectly logical default for a language that promotes memory safety and is intended to be used in security-critical areas like operating systems, system libraries, and web browsers.
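To be precise, std doesn't actually guarantee the algorithm (it's currently SipHash-1-3), but the default RandomState is randomly seeded, which is what blocks attackers from precomputing colliding keys. A quick sketch of what that randomization looks like:

    use std::collections::hash_map::RandomState;
    use std::hash::{BuildHasher, Hasher};

    fn hash_with(state: &RandomState, key: &str) -> u64 {
        let mut h = state.build_hasher();
        h.write(key.as_bytes());
        h.finish()
    }

    fn main() {
        // Each RandomState carries its own random keys, so the same input
        // hashes differently across states and across program runs, which
        // is what defeats precomputed collision attacks.
        let (s1, s2) = (RandomState::new(), RandomState::new());
        println!("{:#x} vs {:#x}", hash_with(&s1, "key"), hash_with(&s2, "key"));
    }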
That's really great, but I don't see how it's related unless your argument is truly 'custom hash functions are bad', in which case I don't really know what else to tell you besides 'that's completely wrong'.