> ChatGPT models are language models; they represent closeness between text utte...

> ChatGPT models are language models; they represent closeness between text utterances. It works by looking for the chains of words most similar or usually connected to those indicated in the prompt, with no understanding of what those words mean.

No, it does not work that way. That’s how base GPT3 works. ChatGPT works via RLHF and so we don’t “know” how it decides to answer queries. That’s kind of the problem.