Hacker News new | past | comments | ask | show | jobs | submit login

There are many mathematical functions whose output is hard to predict and requires immense calculations. I just recently read about how they had "discovered" the 9th Dedekind number, or something like that.

Just because we can't predict what the 10th Dedekind number will be does not mean it is somehow 'mysterious". It is just mathematics, logic and programming.




I don't think the Dedekind number relationship is really like what I described though. These are numbers which. have known properties (ie given a number you can test whether or not it is that) but no known closed form solution exists for the generator of the sequence, and probably there is no structure to the intervals between the numbers other than the properties we ascribe to the numbers. I see them as similar to primes for example in that you know one when you see one but not how to make all of them[1].

In my example, the relationship between the fractions in the Tailor expansion and the orbit definitely exists but if you don't have calculus it is not something that is amenable to understanding. There is a fundamental structure but the language to describe it would be missing.

ML is a universal function approximator and in the case of GPT models the functional form of the model consists of linear algebra operations and the parameters are matrices of weights. The mysterious part is "how the model processes information" like the original person said - why a particular mix of model weights corresponds with particular types of outputs. That is genuinely mysterious. We don't know whether or not there really is a structure and if there is, we don't know the "calculus" that would link them.

Now it may be that there isn't a missing piece (ie that the banal truth is we tweak the weights until we see what we want to see and by doing so we create an illusion of structure via the training process and the whole perception that the model is doing any information processing at all is something we make up). I actually have a lot of time for this point of view although I really need to understand the topic much more deeply before I make my own mind up.

[1] I don't know any number theory so could be totally wrong about this in which case I apologise.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: