Somehow I don't think this is going to be a problem. I can't exactly articulate why, but I'm going to try.
The success of an LLM is quite subjective. We have metrics that try to quantitatively measure an LLM's performance, but the "real" test is the users the LLM does work for. Those users are ultimately human, even if there are layers and layers of LLMs collaborating beneath a human interface.
I think what ultimately matters is that the output is considered high quality by the end user. I don't think it actually matters whether a training input is AI-generated or human-generated, as long as the LLM continues producing high quality results. I think implicit in your argument is that the _quality_ of the _training set_ is going to deteriorate due to LLM-generated content. But:
1) I don't know how much the quality of the input actually impacts the outcome. An entire corpus of noise almost certainly isn't going to produce signal when passed through an LLM, but what counts as an acceptable signal-to-noise ratio seems to be an open question.
2) AI-generated content isn't necessarily low quality content. In fact, if we find that a high quality training set yields substantially better AI, I'd rather have a training set that is 100% AI-generated content, human-reviewed for quality, than one that is 100% human-generated content but unfiltered for quality.
I don't think this feedback loop, of LLM outputs feeding LLM inputs, is necessarily the problem people say it is. But I might be wrong!
LLM output cannot be higher quality than the input (prompt + training data). The best possible outcome for an LLM is that the output is a correct continuation of the prompt. The output will usually be a less-than-perfect continuation.
With small models, at least, you can watch LLM output degrade in real time as more text is generated, because the prompt's share of the context shrinks with each new token. So the LLM is imitating itself more than it is imitating the prompt. Bigger models can't fix this problem; they can only slow the rate of degradation.
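A back-of-the-envelope sketch of what I mean by the ratio shrinking; the prompt length and generation lengths here are made-up numbers, just for illustration:

```python
# Made-up numbers: show how the prompt's share of the conditioning context
# shrinks as the model keeps generating, so later tokens are conditioned
# mostly on earlier model output.
prompt_tokens = 200

for generated in (0, 200, 1000, 4000):
    context = prompt_tokens + generated
    share = prompt_tokens / context
    print(f"{generated:5d} tokens generated -> prompt is {share:.0%} of the context")
```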
It's bad enough when the model is stuck imitating its own output in the current context, but it'll be much worse if that output is actually fed back in as training data. In that scenario, the bad data poisons all future output from the model, not just the current context.
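A toy sketch of that compounding effect, using a stand-in statistical model rather than an actual LLM: fit a Gaussian to some data, sample from the fit, refit on the samples, and repeat. Each generation is trained only on the previous generation's output, so estimation error accumulates instead of washing out:

```python
import random
import statistics

random.seed(0)

# Generation 0: fit a simple "model" (a Gaussian) to real data.
real_data = [random.gauss(0.0, 1.0) for _ in range(1000)]
mu, sigma = statistics.fmean(real_data), statistics.stdev(real_data)
print(f"gen 0: mu={mu:+.3f}, sigma={sigma:.3f}")

# Later generations: train only on the previous model's own output.
for generation in range(1, 6):
    synthetic = [random.gauss(mu, sigma) for _ in range(200)]
    mu, sigma = statistics.fmean(synthetic), statistics.stdev(synthetic)
    print(f"gen {generation}: mu={mu:+.3f}, sigma={sigma:.3f}")
```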
This is interesting because it's essentially how human bullshitters work. The more they know, the longer they can convince you they know more than they do.