For starters, it means you should not take the success of the math and ascribe it to an advance in the LLM, or whatever phrase is actually being used to describe the new fancy target of hype and investment.
An LLM is at best, a possible future component of the speculative future being sold today.
How might future generations visualize this? I'm imagining some ancient Greeks, who have invented an inefficient reciprocating pump, which they declare is a heart and that means they've basically built a person. (At the time, many believed the brain was just there to cool the blood.) Look! The fluid being pumped can move a lever: It's waving to us.
Interesting metaphor, but I’m not sure you’re fully appreciating the hypothetical. The agent didn’t just seem like it was going to solve a math problem; it actually did.
Before intuitive computing, the best we could do with word problems was Wolfram-esque regex stuff, which I’m guessing we all know was quite error-prone. Now, we have agents that can take quite vague word problems and use any sequence of KB/web searches, Python programs, and further intuitive reasoning steps to arrive at the requested answer. That’s pretty impressive, and I don’t think “well technically it relies on tools” makes it less impressive! Something that wasn’t possible yesterday is possible today; that alone matters.
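To make the point concrete, here’s a minimal toy sketch of the tool-use loop being described (entirely hypothetical: the tool names, the fact table, and the “reasoning” are stand-ins, not any real product’s API):

```python
# Toy sketch of an agent answering a word problem by chaining tools:
# a knowledge-base lookup followed by a Python computation.

def kb_search(query):
    # Stand-in for a KB/web search tool: a tiny hard-coded fact table.
    facts = {"speed of sound m/s": 343}
    return facts.get(query)

def run_python(expr):
    # Stand-in for a sandboxed Python tool: evaluate an arithmetic expression
    # with builtins stripped to keep the toy example restricted.
    return eval(expr, {"__builtins__": {}}, {})

def solve(problem):
    # Toy "reasoning" step: recognize one problem shape and pick the tools.
    if "sound" in problem and "2 seconds" in problem:
        speed = kb_search("speed of sound m/s")  # tool call 1: look up a fact
        return run_python(f"{speed} * 2")        # tool call 2: compute with it
    return None

print(solve("How far does sound travel in 2 seconds?"))  # 686
```

The real systems obviously replace the hard-coded branch with learned behavior, but the shape of the loop (vague question in, a sequence of tool calls, answer out) is the part that wasn’t possible yesterday.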
Re: general skepticism, I’ve given up on convincing people that AGI is close, so all I’ll say is “hedge your bets” ;)