> 'I do not know' or 'I am not sure' for every occasion when it is not 100% sure of something (like a human can), this would drastically improve the usefulness.
This is exactly what a language model does though, just at a different level of abstraction. It gives you a probability distribution over tokens at each step. That distribution can be narrow (low entropy, certain) or wide (high entropy, uncertain). The language output you see is just a sampling at some temperature from these distributions.
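To make that concrete, here is a minimal sketch of softmax-with-temperature sampling and entropy over a next-token distribution. The logits are toy numbers I made up, not from any real model:

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution over tokens."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in bits: low = model is certain, high = uncertain."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical next-token logits for two situations (illustrative only).
confident_logits = [8.0, 1.0, 0.5, 0.2]   # one token dominates -> narrow distribution
uncertain_logits = [2.0, 1.9, 1.8, 1.7]   # nearly flat -> wide distribution

p_confident = softmax(confident_logits)
p_uncertain = softmax(uncertain_logits)
assert entropy(p_confident) < entropy(p_uncertain)

# The text you see is just a token sampled at some temperature from this distribution.
token = random.choices(range(4), weights=softmax(uncertain_logits, temperature=0.7))[0]
```

Lower temperature sharpens the distribution toward the most likely token; higher temperature flattens it, so sampling becomes more exploratory.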
Though glancing at your paper I assume you are aware of this and I am missing the point you are making?
Statistical approaches most commonly require you to apply a threshold. The model's output can be above the threshold and still be wrong, or below the threshold and still be correct. You can never tell for sure; you just try to improve the benchmark average. That is not acceptable in most use cases, where a single wrong output can be disastrous.
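The failure mode described above can be sketched in a few lines. The (confidence, correct) pairs are invented toy numbers, just to show that acceptance and correctness come apart in both directions:

```python
# Toy (model confidence, actually correct) pairs -- illustrative, not real data.
predictions = [
    (0.95, True),   # confident and right
    (0.92, False),  # above any reasonable threshold, yet wrong
    (0.40, True),   # below the threshold, yet right
    (0.30, False),  # unconfident and wrong
]

THRESHOLD = 0.8  # hypothetical acceptance cutoff

# A threshold only controls the average trade-off; it cannot guarantee
# that any individual accepted answer is correct.
accepted_but_wrong = any(c >= THRESHOLD and not ok for c, ok in predictions)
rejected_but_right = any(c < THRESHOLD and ok for c, ok in predictions)
assert accepted_but_wrong and rejected_but_right
```

Moving the threshold trades one error type for the other; no setting eliminates both.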
When a human does not know something, they can say so with 100% certainty.
I don't see the difference between a human and a statistical model here. Surely, in order to select an action to take, a person also has to apply some sort of threshold to their confidence? E.g. how is a doctor deciding whether or not to amputate an organ based on an x-ray different from a classification model for the same task?
That problem aside, language models like Gopher are in fact generative, so no such threshold is needed! You instead sample from the implicit distribution.
The correct analogy would be if I asked you when Neil Armstrong landed on Mars: you know with 100% certainty that the answer is 'never'. A statistical model may output '1969' with 10% confidence and/or '2147' with 3% confidence.