> So what changed? We aren't sure, but the speculation is that in the process of training, GPT-3 found that the best strategy for correctly predicting the continuation of arithmetic expressions was to figure out the rules of basic arithmetic, encode them in some portion of its neural network, and then apply them whenever the prompt suggested doing so.
I saw a lot of basic arithmetic in the thousands range where it failed. If we have to keep scaling the model quadratically just for it to handle numbers with logarithmically more digits, then we're doing it wrong.
I'm surprised you think it learned some basic rules of arithmetic. A lot of simple rules extrapolate very well, into all number ranges. To me it seems like it's just making things up as it goes along. I'll grant you this, though: it can make for a convincing illusion at times.
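To make the extrapolation point concrete: the actual rules of addition are a small, fixed procedure that works identically at any scale. A minimal sketch (not anything GPT-3 does internally, just an illustration of the rules themselves) is grade-school digit-by-digit addition with a carry; the same dozen lines handle 3-digit and 300-digit operands alike, which is the behavior a system that had truly internalized the rules would show:

```python
def add_digits(a: str, b: str) -> str:
    """Grade-school addition on decimal strings: right to left, with carry.

    The procedure is independent of operand length, so it extrapolates
    to arbitrarily large numbers with no additional machinery.
    """
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)  # pad to equal length
    carry = 0
    out = []
    for da, db in zip(reversed(a), reversed(b)):
        s = int(da) + int(db) + carry
        out.append(str(s % 10))  # digit in this column
        carry = s // 10          # carry into the next column
    if carry:
        out.append(str(carry))
    return "".join(reversed(out))
```

The routine gives the right answer for numbers far outside any "training range", e.g. `add_digits("999", "1")` yields `"1000"`, and it works just as well on 50-digit inputs.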