For anyone interested in playing around with these charts, the various assumptions that underpin them, etc., I've thrown together a Colab notebook as a starting point.
Observation: if you rank via true "skill" and assume that, for a particular instance, the predicted performance and observed performance are independent but both have the true skill as their mean, you don't observe the effect. CC of 0.00332755.
If you rank via observed performance and plot observed vs predicted, the effect is there. CC of -0.38085757.
This assumes very simple Gaussian noise, which is not going to be accurate, especially as most of these tasks have normalised scores.
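A minimal sketch of what I mean, in case it helps (the noise scales here are arbitrary choices of mine, so the exact CCs will differ; the gap predicted - observed is the quantity the DK-style plots show):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000
    skill = rng.normal(size=n)                          # latent "true" skill
    observed = skill + rng.normal(scale=1.0, size=n)    # noisy test score
    predicted = skill + rng.normal(scale=1.0, size=n)   # noisy self-assessment, independent of the test noise

    gap = predicted - observed  # overestimation, as plotted in DK-style charts

    # Rank/condition on true skill: no relationship with the gap.
    print(np.corrcoef(skill, gap)[0, 1])     # ~0

    # Rank/condition on observed performance: a negative relationship appears.
    print(np.corrcoef(observed, gap)[0, 1])  # clearly negative with these noise scales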
What your simulation includes, and the original article didn't (and I didn't touch at all in my article), is the statistical reliability of the tests they administered. Where you got a CC of -0.38, you used equal reliability (/unreliability) for the skill tests and the self-assessments. You can see that as you increase the test reliability, the CC shrinks and the effect disappears.
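For instance, reusing the Gaussian setup from the parent and shrinking only the test noise (a stand-in for higher test reliability; the specific values are made up), the CC heads towards zero:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000
    skill = rng.normal(size=n)

    # Smaller test noise = more reliable skill test; self-assessment noise held fixed.
    for test_noise in [1.0, 0.5, 0.25, 0.1]:
        observed = skill + rng.normal(scale=test_noise, size=n)
        predicted = skill + rng.normal(scale=1.0, size=n)
        gap = predicted - observed
        print(test_noise, round(np.corrcoef(observed, gap)[0, 1], 3))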
I have no idea what the actual reliability of the DK tests is; they do seem to consider that, but maybe not thoroughly enough. In my view it's very fair to criticize DK from that angle. But that would require looking at the actual tests and their data.
My point being that any purely random analysis is based on assumptions that can easily be tweaked to show the same effect, the opposite effect, or no effect at all.
That's a nice spot about the decreasing CC as we increase accuracy!
My hypothesis would be that some of the DK effect in the original paper may be down to an effect like this (as suggested in the original article), but asserting that it is completely incorrect because of this is premature. We'd need access to more data to verify that the level of reliability was sufficiently high.
Right. Just to be clear, "an effect like this" is (comparatively) unreliable tests, not some elusive statistical phenomenon as implied by the original article. I'd have no issue if the author had called the article "the DK effect is due to poor skill tests", spent 5 minutes showing that the DK results are consistent not only with their claims but also with unreliable tests (like you did), and then gone on to show data indicating that the tests are indeed not reliable enough to draw the conclusions that DK did. Instead, the author spends a lot of time digging under the wrong tree and no time at all saying anything about the reliability of the tests.
I agree; the article seems to imply that the plot in the original paper is always an incorrect thing to do, instead of something which can have issues when the tests are inaccurate.
I've gone back and updated the Colab notebook to use orderings exclusively instead of values, and you can see that the autocorrelation plot B from the first article appears when the noise is high enough but disappears when you reduce it, so it is definitely not a statistical law.
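Roughly the kind of thing I mean (a sketch, not the notebook's exact code; the noise levels here are arbitrary):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 10_000
    skill = rng.normal(size=n)

    def percentile(x):
        # Convert raw scores to percentile ranks (0..100), as in the DK-style plots.
        ranks = np.argsort(np.argsort(x))
        return 100.0 * ranks / (len(x) - 1)

    for noise in [1.0, 0.25, 0.05]:
        observed = percentile(skill + rng.normal(scale=noise, size=n))
        predicted = percentile(skill + rng.normal(scale=noise, size=n))
        # Plot-B style summary: mean (predicted - observed) within each observed-score quartile.
        quartile = np.digitize(observed, [25, 50, 75])
        gaps = [round(np.mean((predicted - observed)[quartile == q]), 1) for q in range(4)]
        print(noise, gaps)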
Edit: fixed, I had it the wrong way around.
https://colab.research.google.com/drive/1Vy7JjkywxwEP8nfR6oS...