`/dev/urandom` isn't a real file or stream. It's part of the 'everything is a file' *nix mantra. Even if two users are reading from `/dev/urandom` simultaneously, they'll each get unique values. The CSPRNG keeps track of a sequence number, so you'll end up with something like [process 0 requests sequence 0, process 1 requests sequence 1, process 1 requests sequence 2, process 0 requests sequence 3...].
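To make the "simultaneous readers get different values" point concrete, here is a minimal sketch, assuming a Linux-like system where `/dev/urandom` exists. The function names (`read_chunks`, `concurrent_reads`) and the chunk sizes are made up for illustration. It only checks that the chunks differ; it says nothing about how the kernel produces them internally.

```python
import threading

def read_chunks(n_chunks, chunk_size, out):
    # Each reader opens its own unbuffered handle, so every read()
    # goes straight to the kernel instead of a userspace buffer.
    with open("/dev/urandom", "rb", buffering=0) as f:
        for _ in range(n_chunks):
            out.append(f.read(chunk_size))

def concurrent_reads(n_threads=2, n_chunks=100, chunk_size=32):
    # One result list per reader thread (hypothetical layout for this sketch).
    results = [[] for _ in range(n_threads)]
    threads = [
        threading.Thread(target=read_chunks, args=(n_chunks, chunk_size, results[i]))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

if __name__ == "__main__":
    chunks = [c for per_reader in concurrent_reads() for c in per_reader]
    # With 32-byte chunks a repeat is possible in principle but
    # astronomically unlikely, so both counts should match.
    print(len(chunks), len(set(chunks)))
```

Threads are used here only to keep the sketch short; separate processes reading the device behave the same way from the reader's point of view.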
Is that strictly true? I know urandom doesn't block if it lacks entropy, but if it has entropy, I was under the impression urandom's output was derived from that instead.
Well, a lot has changed since the article. For one, the test tool now eats more CPU than the RNG.
From my dumb tests (run `dd` in one, then many threads), the 4-thread run has 4x the performance of the single-thread one (I have a 4-core CPU), while the 16-thread one has, predictably, roughly the same total throughput, so if there is any serialization still there, it is not very noticeable.
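For anyone who wants to repeat that kind of test without juggling several `dd` invocations, a rough sketch follows. It assumes `/dev/urandom` is available and uses threads inside one Python process rather than separate `dd` processes, so the numbers will not match `dd` exactly. The names `drain` and `throughput` and the default sizes are hypothetical choices for this example.

```python
import os
import threading
import time

def drain(total_bytes, block_size=1 << 20):
    # Read total_bytes from /dev/urandom and discard them.
    fd = os.open("/dev/urandom", os.O_RDONLY)
    try:
        remaining = total_bytes
        while remaining > 0:
            # os.read may return fewer bytes than requested, so count what arrived.
            remaining -= len(os.read(fd, min(block_size, remaining)))
    finally:
        os.close(fd)

def throughput(n_readers, bytes_per_reader=8 << 20):
    # Start n_readers concurrent readers and return combined throughput in MB/s.
    threads = [
        threading.Thread(target=drain, args=(bytes_per_reader,))
        for _ in range(n_readers)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    return n_readers * bytes_per_reader / elapsed / 1e6

if __name__ == "__main__":
    for n in (1, 4, 16):
        print(f"{n:2d} readers: {throughput(n):.0f} MB/s total")
```

If total throughput scales with the reader count up to the number of cores and then flattens, that would be consistent with the observation above; results will vary with kernel version and hardware.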
(ignoring that they could get the same output by chance)