Yeah, that's a good point that "is description X an accurate representation" is a better question, though it's harder to establish ground truth for, since the truth is a large set of possible 'accurate representations.'
From the PReLU paper, I found this:
> Russakovsky et al. [22] recently reported that human performance yields a 5.1% top-5 error on the ImageNet dataset. This number is achieved by a human annotator who is well trained on the validation images to be better aware of the existence of relevant classes. When annotating the test images, the human annotator is given a special interface, where each class title is accompanied by a row of 13 example training images. The reported human performance is estimated on a random subset of 1500 test images.