What I mean is, we could program an entity that gathers its own data and updates its own weights, with the code running to tens of thousands of lines.
But OP's point was about people who talk about tiny LLMs while failing to consider the number of parameters used. That has nothing to do with the ability to collect training data autonomously.