I think you didn't understand my question, and maybe I phrased it poorly. The problem is not whether we should trust any deep learning model (the answer is indeed no). The question is how we can find out whether a model is any good before investing our time in it. Each bad reply we get has a price, because it wastes our time. So, how can we compare models objectively without having to try them out ourselves first?