Somehow I don't think this is going to be a problem. I can't exactly articulate why, but I'm going to try.
The success of an LLM is quite subjective. We have metrics that try to quantitatively measure an LLM's performance, but the "real" test is the users the LLM does work for. Those users are ultimately human, even if there are layers and layers of LLMs collaborating beneath a human interface.
I think what ultimately matters is that the output is considered high quality by the end user. I don't think it actually matters whether a training input is AI-generated or human-generated, as long as the LLM continues producing high quality results. I think implicit in your argument is that the _quality_ of the _training set_ is going to deteriorate due to LLM-generated content. But:
1) I don't know how much the quality of the input actually impacts the outcome. An entire corpus of noise almost certainly isn't going to produce signal when passed through an LLM, but what counts as an acceptable signal-to-noise ratio seems to be an open question.
2) AI-generated content isn't necessarily low quality content. In fact, if we find that a high quality training set yields substantially better AI, I'd rather have a training set that is 100% AI-generated content, human-reviewed for quality, than one that is 100% human-generated content but unfiltered for quality.
I don't think this feedback loop, of LLM outputs feeding LLM inputs, is necessarily the problem people say it is. But I might be wrong!
LLM output cannot be higher quality than the input (prompt + training data). The best possible outcome for an LLM is that the output is a correct continuation of the prompt. The output will usually be a less-than-perfect continuation.
With small models, at least, you can watch LLM output degrade in real time as more text is generated, because the prompt's share of the context shrinks with each new token. So the LLM is imitating itself more than it is imitating the prompt. Bigger models can't fix this problem; they can only slow the rate of degradation.
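A back-of-the-envelope sketch of what I mean by the ratio shrinking; the prompt length and generation lengths here are made-up numbers, just for illustration:

```python
# Made-up numbers: show how the prompt's share of the conditioning context
# shrinks as the model keeps generating, so later tokens are conditioned
# mostly on earlier model output.
prompt_tokens = 200

for generated in (0, 200, 1000, 4000):
    context = prompt_tokens + generated
    share = prompt_tokens / context
    print(f"{generated:5d} tokens generated -> prompt is {share:.0%} of the context")
```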
It's bad enough when the model is stuck imitating its own output in the current context, but it'll be much worse if that output is actually fed back in as training data. In that scenario, the bad data poisons all future output from the model, not just the current context.
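A toy sketch of that compounding effect, using a stand-in statistical model rather than an actual LLM: fit a Gaussian to some data, sample from the fit, refit on the samples, and repeat. Each generation is trained only on the previous generation's output, so estimation error accumulates instead of washing out:

```python
import random
import statistics

random.seed(0)

# Generation 0: fit a simple "model" (a Gaussian) to real data.
real_data = [random.gauss(0.0, 1.0) for _ in range(1000)]
mu, sigma = statistics.fmean(real_data), statistics.stdev(real_data)
print(f"gen 0: mu={mu:+.3f}, sigma={sigma:.3f}")

# Later generations: train only on the previous model's own output.
for generation in range(1, 6):
    synthetic = [random.gauss(mu, sigma) for _ in range(200)]
    mu, sigma = statistics.fmean(synthetic), statistics.stdev(synthetic)
    print(f"gen {generation}: mu={mu:+.3f}, sigma={sigma:.3f}")
```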
This is interesting because it's essentially how human bullshitters work. The more they know, the longer they can convince you they know more than they do.