In the first place, this is an RTFM kind of error: the "problem" behaviour is literally documented. In the second place, however, this is the wrong solution to the original problem. (The author links to https://andrewlock.net/why-is-string-gethashcode-different-e... )
The original problem is using linked lists to handle hash collisions. Assuming this is what you want (keeping all values rather than discarding old ones, as e.g. Python's dict objects do), you should obviously use a data structure that handles lots of collisions well, or validate your user input.
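To make the collision cost concrete, here is a rough Python sketch of a chained table. Everything in it (`ChainedTable`, `predictable_hash`, `spreading_hash`) is a made-up illustration, not .NET's implementation, and plain lists stand in for the linked lists:

```python
class ChainedTable:
    """Toy hash table with separate chaining; each bucket is a plain list."""

    def __init__(self, n_buckets, hash_fn):
        self.buckets = [[] for _ in range(n_buckets)]
        self.hash_fn = hash_fn
        self.comparisons = 0  # key comparisons made while walking chains

    def put(self, key, value):
        chain = self.buckets[self.hash_fn(key) % len(self.buckets)]
        for i, (existing, _) in enumerate(chain):
            self.comparisons += 1
            if existing == key:
                chain[i] = (key, value)  # same key: overwrite
                return
        chain.append((key, value))  # different key, same bucket: keep both


def predictable_hash(s):
    return len(s)  # hypothetical weak hash: trivial to collide on purpose


def spreading_hash(s):
    return int(s)  # hypothetical hash that happens to spread these keys


keys = [f"{i:04d}" for i in range(100)]  # 100 distinct keys, all of length 4

attacked = ChainedTable(128, predictable_hash)
healthy = ChainedTable(128, spreading_hash)
for k in keys:
    attacked.put(k, None)
    healthy.put(k, None)
```

With every key landing in one bucket, each insert walks the whole chain, so the work grows quadratically in the number of keys; with the keys spread out, no chain is ever walked.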
At the very least the function should have a name that reflects its behaviour, eh? "GetUnguessableHashCode()" or something?
This reminds me of the recent fiasco over in Python land, where somebody noticed that int() called on very long strings can block the interpreter. The obvious solution is to add an optional timeout parameter (the problem is in the time domain, after all). Instead they added a limit to the size of strings you can pass to int()!
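For anyone who hasn't run into it, the limit can be poked at roughly like this. It's a sketch that assumes a CPython new enough to have `sys.get_int_max_str_digits()` (3.11+, or one of the security backports); `try_parse` is just a helper name made up for the example:

```python
import sys

# sys.get_int_max_str_digits() exists in CPython 3.11+ and in the security
# backports; on older interpreters we treat the limit as absent.
get_limit = getattr(sys, "get_int_max_str_digits", None)
limit = get_limit() if get_limit else 0  # 0 means "no limit"


def try_parse(digits: str):
    """Return the parsed int, or None if the interpreter refuses the input."""
    try:
        return int(digits)
    except ValueError:
        return None


small = try_parse("12345")
# One digit over the configured limit is rejected before any conversion work.
too_long = try_parse("1" * (limit + 1)) if limit else None
```

The limit can be raised or disabled with `sys.set_int_max_str_digits()`, so it's a cap on input size rather than on time spent.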
The article seems incomplete without bringing in the code or test that relied on the hash code. Was the test unnecessarily strict? Was code assuming more than a collection guarantees?
I wonder if a type system could track nondeterminism and prevent it. E.g. you couldn't just loop over a set, you'd have to order it, or apply a commutative operation. It would fit well into concurrency too, where nondeterminism is easily introduced.
I'm sure there are languages which do this; I seem to recall seeing it in Mercury.
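The two escape hatches mentioned above (order the set first, or fold it with a commutative operation) look roughly like this in Python. Nothing in Python's type system enforces either one, so this only sketches what such a checker would push you towards:

```python
names = {"carol", "alice", "bob"}

# Option 1: impose an order before iterating, so the result no longer
# depends on the set's internal (hash-dependent) iteration order.
ordered = sorted(names)

# Option 2: reduce with a commutative, associative operation, where the
# iteration order cannot affect the result.
total_length = sum(len(name) for name in names)

# What the hypothetical checker would reject: order-sensitive use of raw
# set iteration, e.g. ", ".join(names), which may differ between runs.
```

Both results are the same on every run, even when string hashes are randomized per process.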
Languages with first-class effects, like Koka[1], require you to declare up front what side effects (sampling randomness, doing I/O, aborting, etc.) your function might cause.
Then from another outer function, you can't call the effectful function unless the outer function also declares those same side-effects (or you provide handlers for them).
This way you can easily isolate non-determinism and I/O to the top-level part of your program, while keeping the interior business logic pure.
The "first-class effects" part is what lets you easily compose effects and inject custom handlers, ideally in a way that's more ergonomic than playing type-system Jenga with monads.
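Python has no effect system, so the closest rough analogy is passing the handler in explicitly. This is only a sketch of the shape of the idea, not of Koka's semantics; `RandomEffect` and `pick_winner` are names invented for the example:

```python
import random
from typing import Callable, Sequence

# Hypothetical "effect" signature: a handler that supplies a float in [0, 1).
RandomEffect = Callable[[], float]


def pick_winner(entrants: Sequence[str], rand: RandomEffect) -> str:
    """Business logic: its only source of nondeterminism is the `rand` handler."""
    index = int(rand() * len(entrants))
    return entrants[index]


def main() -> None:
    # Top level: the real, nondeterministic handler is installed here.
    print(pick_winner(["ann", "ben", "cy"], random.random))


# A test installs a deterministic handler instead.
fixed = pick_winner(["ann", "ben", "cy"], lambda: 0.5)
```

The difference from a real effect system is that nothing here stops `pick_winner` from calling `random.random()` directly; in Koka the compiler would flag that as an undeclared effect.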