> It's astounding that we DID get the emergent intelligence from just doing this "curve fitting" onto "lines" rather than actual "curves".
In Ye Olden days (the 90’s) we used to approximate non-linear models using splines or separate-slopes models, fit by hand. They were still linear, but with the right choice of splines you could approximate a non-linear model to whatever degree of accuracy you wanted.
Neural networks “just” do this automatically, and faster.
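For illustration, here's a minimal sketch (assuming NumPy) of that hand-fit spline idea: a piecewise-linear "linear spline" with hand-picked knots approximating a non-linear function, where adding knots shrinks the error as much as you like. The function and knot counts are just placeholders.

```python
# Sketch: approximating a non-linear function with a piecewise-linear
# ("linear spline") fit. Knot locations are chosen by hand here;
# more knots -> smaller approximation error, even though each piece is linear.
import numpy as np

def linear_spline_fit(x_knots, f):
    """Sample f at hand-picked knots; between knots the model is a straight line."""
    y_knots = f(x_knots)
    return lambda x: np.interp(x, x_knots, y_knots)

f = np.sin                                  # stand-in for the "true" non-linear model
x = np.linspace(0, 2 * np.pi, 1000)

for n_knots in (4, 8, 16, 32):
    knots = np.linspace(0, 2 * np.pi, n_knots)
    approx = linear_spline_fit(knots, f)
    err = np.max(np.abs(f(x) - approx(x)))
    print(f"{n_knots:3d} knots -> max error {err:.4f}")
```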
In college (BSME) I wrote a computer program to generate cam profiles from Bezier curves. It's really just a programming trick: you build curves out of straight line segments, to any level of accuracy you want, simply by letting the computer take smaller and smaller steps.
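A rough sketch of that trick (not the original cam-profile program, just the same idea): de Casteljau's algorithm evaluates a Bezier curve purely by repeated straight-line interpolation between control points, and taking more steps in t gives an arbitrarily fine polyline. The 2-D control points below are made up for illustration.

```python
# Sketch: de Casteljau's algorithm -- a Bezier curve built from nothing but
# repeated linear interpolation ("lerping") between control points.
# Smaller steps in t give a finer polyline approximation of the curve.
import numpy as np

def de_casteljau(control_points, t):
    """Evaluate a Bezier curve at parameter t by repeatedly lerping adjacent points."""
    pts = np.asarray(control_points, dtype=float)
    while len(pts) > 1:
        pts = (1 - t) * pts[:-1] + t * pts[1:]   # lerp each adjacent pair
    return pts[0]

# Hypothetical 2-D control points, e.g. for a cam-profile-like shape.
ctrl = [(0.0, 0.0), (1.0, 2.0), (3.0, 3.0), (4.0, 0.0)]

steps = 100                                      # more steps -> smoother polyline
curve = np.array([de_casteljau(ctrl, t) for t in np.linspace(0, 1, steps)])
print(curve[:3])                                 # first few points along the curve
```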
It's an interesting concept to think about how NNs might exploit this effect based on straight lines in the weights, because a very small number of points can identify a very precise, smooth curve, and directions along that curve might equate to Semantic Space Vectors.
In fact, now that I think about it, for any 3 or more points in Semantic Space there would necessarily be a "Bezier Path": a smooth, differentiable path through higher-dimensional space that gets from one point to another while being pulled toward all the intermediate points (a Bezier curve hits its endpoints exactly and bends toward the control points in between). This has to have a direct use in LLMs in terms of reasoning.
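Here's a speculative sketch of that "Bezier path in semantic space" idea, reusing the same de Casteljau recursion on high-dimensional vectors. The embeddings are random placeholders standing in for real LLM hidden states, and the width of 768 is just an assumed typical embedding size; the step-to-step differences along the path are what the comment above would call candidate semantic direction vectors.

```python
# Sketch: a Bezier path through "semantic space" -- treat three or more
# embedding vectors as control points and walk a smooth curve between them.
# The curve passes exactly through the first and last control points and is
# only pulled toward the intermediate ones.
import numpy as np

def bezier_path(control_vectors, n_steps=50):
    """Return n_steps points along the Bezier curve defined by high-dim control vectors."""
    pts = np.asarray(control_vectors, dtype=float)

    def point_at(t):
        layer = pts
        while len(layer) > 1:
            layer = (1 - t) * layer[:-1] + t * layer[1:]   # de Casteljau lerp step
        return layer[0]

    return np.stack([point_at(t) for t in np.linspace(0.0, 1.0, n_steps)])

rng = np.random.default_rng(0)
dim = 768                                    # assumed embedding width (placeholder)
concept_a, concept_b, concept_c = rng.normal(size=(3, dim))   # fake "semantic" points

path = bezier_path([concept_a, concept_b, concept_c])
print(path.shape)                            # (50, 768): a smooth walk from a toward c

directions = np.diff(path, axis=0)           # step directions along the curve
print(directions.shape)                      # (49, 768): candidate "semantic direction" vectors
```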