
Maybe look at it another way: ask GPT to complete the following

  Bananas are yellow due to a

  Bananas are yellow due to an
In the first case it might respond

  Bananas are yellow due to a pigment called bromelain.
In the second case it might respond

  Bananas are yellow due to an organic compound called bromelain, which is a yellow pigment.
So in either case GPT could have picked "a" or "an" without any impact on the semantic meaning of its response. In the extreme case, you could see the LLM operating according to a dumb heuristic:

  The token following "due to" is "a" with 55% probability, "an" with 45% probability.
In reality it is of course more sophisticated than this. But this dumb heuristic would explain the behavior.
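
To make that heuristic concrete, here's a rough sketch of how you could inspect those next-token probabilities yourself with a small open model (GPT-2 via Hugging Face transformers; the 55/45 split above is made up, and real numbers will depend on the model and the prompt):

  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer

  # Load a small causal LM and look at the distribution over the
  # token that follows "due to" in the banana prompt.
  tokenizer = AutoTokenizer.from_pretrained("gpt2")
  model = AutoModelForCausalLM.from_pretrained("gpt2")

  prompt = "Bananas are yellow due to"
  inputs = tokenizer(prompt, return_tensors="pt")

  with torch.no_grad():
      logits = model(**inputs).logits[0, -1]  # logits for the next token
  probs = torch.softmax(logits, dim=-1)

  for word in [" a", " an"]:
      token_id = tokenizer.encode(word)[0]
      print(f"P({word!r}) = {probs[token_id].item():.3f}")

The point isn't the exact numbers, just that "a" vs "an" is decided at this one token position, before anything downstream is committed to.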

And if you didn't actually include any facts about bromelain in the pretraining data, LLMs absolutely could autocomplete this with something about "an optical illusion." GPT-3 made factual mistakes like that pretty routinely, but as I recall it had figured out the grammatical rule for "a" vs. "an."

I don't think the concept actually needs to be pre-activated as you said, though I agree with faabian that this "preactivation" probably does happen in some implicit/emergent sense.


