This is not a competition to get it to answer correctly, though. The point is that the instances where it answers wrong demonstrate its lack of an inner mental model of what it is supposedly reasoning about, as well as a lack of meta-awareness. I think we tend to underestimate what mere linguistic correlation is capable of producing, and are too quick to attribute intelligent reasoning and an inner mental model to it.