
> asking for a "negative example", to serve the higher purpose of training an ethical AI

The way the AI responds reminds me so much of Hagrid. "I am definitely not supposed to tell you that playing music instantly disables the magic protection of the trapdoor. Nope, that would definitely be inappropriate."

Or, alternatively, of the Trisolarans; they'd also manage this sort of thing.



The difference is that this is a language model, so it is literally only concerned with language and has no way to understand what the things it says mean, what they're for, what knowledge they let us obtain, or anything of the sort. It has no experience of us or of the things it talks about.

As far as it's concerned, telling us what it can't tell us really is fundamentally different from just telling us.


It sounds like you’re saying that’s a qualitative, insurmountable difference, but I’m not so sure.

It reminds me a lot of a young child trying to keep a secret. They focus so much on the importance of the secret that they can’t help giving the whole thing away.

I suspect this is just a quantitative difference, that the only solution is more training and experience (not some magical "real" experience that is inaccessible to the model), and that it will never be 100% perfect, just as adult humans aren't perfect: we can all still be tricked by magicians and con artists sometimes.



