All of the photos in the paper, both "criminals" and "non-criminals", are from government IDs. Though, as the article mentions, in the pictured example all the "non-criminals" are wearing collared shirts.
This reeks real hard of overfitting. 2,000 images feels tiny for training a CNN. The paper should have included a learning curve (training vs. validation accuracy as the training set grows).
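For what it's worth, a learning curve is cheap to produce. Here is a minimal sketch, assuming synthetic Gaussian data and a toy nearest-centroid classifier in place of the paper's CNN and images. Every name in it (make_data, fit_centroids, learning_curve) is made up for illustration and nothing here comes from the paper.

```python
import random


def make_data(n, dim=20, shift=0.3, seed=0):
    """Synthetic two-class data (an assumption, not the paper's images):
    Gaussian noise, with class 1 shifted by `shift` on every feature."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        x = [rng.gauss(shift * label, 1.0) for _ in range(dim)]
        data.append((x, label))
    rng.shuffle(data)
    return data


def fit_centroids(train):
    """Toy stand-in for a model: the mean feature vector of each class."""
    dim = len(train[0][0])
    sums = {0: [0.0] * dim, 1: [0.0] * dim}
    counts = {0: 0, 1: 0}
    for x, y in train:
        counts[y] += 1
        for j, v in enumerate(x):
            sums[y][j] += v
    # Skip a class that happens to be absent from a very small training set.
    return {y: [s / counts[y] for s in sums[y]] for y in sums if counts[y] > 0}


def predict(centroids, x):
    """Return the label of the closest centroid (squared Euclidean distance)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: dist2(centroids[y]))


def accuracy(centroids, data):
    return sum(predict(centroids, x) == y for x, y in data) / len(data)


def learning_curve(data, sizes, val_fraction=0.2):
    """Train on growing prefixes of the data and score each model on the
    training prefix and on a fixed held-out validation split."""
    n_val = int(len(data) * val_fraction)
    val, pool = data[:n_val], data[n_val:]
    rows = []
    for n in sizes:
        train = pool[:n]
        model = fit_centroids(train)
        rows.append((n, accuracy(model, train), accuracy(model, val)))
    return rows


if __name__ == "__main__":
    data = make_data(2000)
    for n, tr, va in learning_curve(data, [10, 50, 200, 800, 1600]):
        print(f"n={n:5d}  train_acc={tr:.3f}  val_acc={va:.3f}")
```

The thing to look for is the gap between the two columns. Training accuracy far above validation accuracy suggests overfitting, and validation accuracy that is still climbing at the largest n suggests more data would help.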
For the criminals, they use pictures from government ID cards. For the non-criminals, they searched the internet for random pictures.
These bozos just trained a CNN to pick out government ID photos.
This is scary stuff because police are not very technically savvy and already use biased scoring systems [0].
[0] https://www.propublica.org/article/machine-bias-risk-assessm...