The term "AGI" has been loosely used for so many years that it doesn't mean anything very specific. The meaning of words derives from their usage.
To me, Shane Legg's (DeepMind) definition of AGI, meaning human level across the full spectrum of abilities, makes sense.
Being human or super-human level at a small number of specialized things, like math, is the definition of narrow AI: the opposite of general/broad AI.
As long as the only form of AI we have is pre-trained transformers, any notion of rapid self-improvement is not possible (the model can't just commandeer $1B of compute for a 3-month self-improvement run!). Self-improvement would only seem plausible with an AI whose capability is limited by its algorithms rather than by slow/expensive pre-training.
What if it sleeps for 8 hours every 16 hours, and during that sleep period it updates its weights with whatever knowledge it learned that day? Then it doesn't need $1B of compute every 3 months; it would use the $1B of compute for 8 hours every day. Now extrapolate the compute required for this into the future and the costs will come down. I don't know where I was going with that...
These current LLMs are purely pre-trained: there is no way to do incremental learning (other than a small amount of fine-tuning) without disrupting what they were pre-trained on. In any case, even if someone solves incremental learning, it is just a way of growing the dataset, which is happening anyway, and under the much more controlled/curated conditions needed to see much benefit.
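To make the "disrupting what they were pre-trained on" point concrete, here is a toy sketch (my own illustration with a stand-in linear model in PyTorch, not how any lab actually trains) of a naive nightly update versus mixing replayed pre-training samples back in:

```python
# Toy sketch: naive incremental updates drift the weights toward the new
# distribution (catastrophic forgetting); replaying old data is the usual band-aid.
import torch
from torch import nn

torch.manual_seed(0)
model = nn.Linear(128, 10)                        # stand-in for a pre-trained network
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Pretend these are samples from the original pre-training distribution...
old_x, old_y = torch.randn(256, 128), torch.randint(0, 10, (256,))
# ...and these are "today's" new examples from a shifted distribution.
new_x, new_y = torch.randn(64, 128) + 2.0, torch.randint(0, 10, (64,))

def update(x, y):
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()

# Naive nightly update: train only on the new data; performance on old_x degrades.
for _ in range(50):
    update(new_x, new_y)

# Mitigation: interleave a replay slice of the old data with the new data,
# so every update still sees the original distribution.
for _ in range(50):
    idx = torch.randperm(len(old_x))[:64]
    update(torch.cat([new_x, old_x[idx]]), torch.cat([new_y, old_y[idx]]))
```

Replay works in a toy like this, but at LLM scale it means keeping and re-paying for a meaningful slice of the original corpus, which is part of why incremental learning hasn't displaced full pre-training.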
There is very much a recipe (10% of this, 20% of that, curriculum learning, mix of modalities, etc.) for the type of curated dataset creation and training schedule needed to advance model capabilities. There have even been some recent signs of "inverse scaling", where a smaller model performs better in some areas than a larger one due to getting this mix wrong. Throwing more random data at them isn't what is needed.
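To illustrate what I mean by a recipe, here is a minimal sketch (the sources and percentages are made up for illustration): fixed mixture weights per data source, plus a toy curriculum that shifts the mix as training progresses.

```python
# Sketch of a data-mixture "recipe": weighted sampling over sources plus a
# simple curriculum that reweights the mix over the course of training.
import random

mixture = {            # target share of each training batch (illustrative only)
    "web_text": 0.50,
    "code":     0.20,
    "math":     0.10,
    "dialogue": 0.20,
}

def sample_source(step, total_steps, mix):
    # Toy curriculum: linearly shift weight from web_text toward code/math
    # as training progresses.
    t = step / total_steps
    weights = dict(mix)
    weights["web_text"] -= 0.2 * t
    weights["code"]     += 0.1 * t
    weights["math"]     += 0.1 * t
    names, probs = zip(*weights.items())
    return random.choices(names, weights=probs, k=1)[0]

# e.g. which source to draw the next document from at step 60k of 100k:
print(sample_source(60_000, 100_000, mixture))
```

Get those weights or that schedule wrong and you can end up with the "inverse scaling" situation above, where more parameters and more tokens still produce a worse model in some areas.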
I assume we will eventually move beyond pre-trained transformers to better architectures, where architectural advances and learning algorithms may have more potential for AI-designed improvement, but for now the best role for AI seems to be synthetic data generation and developer tools.
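As a rough sketch of the synthetic-data role (generate and quality_score are hypothetical stand-ins for whatever model and filter you actually have): use an existing model to draft candidate examples, then filter them before they ever enter the curated dataset.

```python
# Sketch of AI-as-synthetic-data-generator: draft with an existing model,
# filter aggressively, and only the survivors join the training set.
def generate(prompt: str) -> str:
    # Stand-in: call your existing LLM here.
    return f"draft answer to: {prompt}"

def quality_score(example: str) -> float:
    # Stand-in: a reward model, classifier, or simple heuristic filter.
    return 1.0 if len(example) > 20 else 0.0

def synthesize(prompts, threshold=0.8):
    kept = []
    for p in prompts:
        candidate = generate(p)
        if quality_score(candidate) >= threshold:
            kept.append({"prompt": p, "response": candidate})
    return kept

print(synthesize(["Explain gradient descent in one paragraph."]))
```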