
We do fully understand how, and why, they work. That’s how we can build and optimize them.

We can and do explicitly explain their inner workings in math.
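
For example, the computation at the heart of every transformer layer is a short closed-form expression. In standard notation (this is the textbook attention formula, not anything model-specific):

  \mathrm{Attention}(Q, K, V) = \operatorname{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V

where Q, K, and V are the query, key, and value matrices and d_k is the key dimension. The rest of the architecture is just as explicit: matrix multiplies, layer norms, and pointwise nonlinearities.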




dartos says >"We do fully understand how, and why, they work. That’s how we can build and optimize them. We can and do explicitly explain their inner workings in math."<

I certainly don't! I don't know of anyone who understands how and why they work. Granted, there are many who write code that "works", but where's the proof that the code actually "works properly"? And if you don't know where you are, it is difficult to chart a path to a better place.

I see few attempts to explicitly explain their inner workings in math. I do see lots of programs/code, but code isn't math.

Similar dramatic moments have likely occurred many times before: the bare facts lie before us, yet we do not understand them. I'm thinking of the late 1800s and early 1900s, just before Einstein published his work on special relativity. We knew Maxwell's equations and their relationship to the speed of light. We even had the Lorentz transformations, but we did not understand what they meant. Einstein interpreted the math and gave us special relativity.

Perhaps there is an Einstein (or two, or three, ...) working on LLMs now, but they are quiet so far.


I think we're talking about entirely different levels of understanding.

We do, of course, understand ML models at a pretty deep level. What you can't do is identify the weight values that encode all the information about squirrels. You can't point to a particular neuron and say "this is why it hallucinates." We do not grok these models, and I seriously doubt it's even possible for a human to grok a 13B-parameter LLM.
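
To make the distinction concrete, here's a minimal sketch (assuming the Hugging Face transformers package is installed; "gpt2" is just an illustrative small public model, not any specific LLM under discussion). Every weight is inspectable, but no individual value is interpretable:

  from transformers import AutoModelForCausalLM

  # Illustrative only: gpt2 is a small, public stand-in for a larger LLM.
  model = AutoModelForCausalLM.from_pretrained("gpt2")

  # We can enumerate every parameter...
  total = sum(p.numel() for p in model.parameters())
  print(f"total parameters: {total:,}")  # ~124M for gpt2

  # ...and read any individual weight we like.
  w = model.transformer.h[0].attn.c_attn.weight
  print(w[0, :5])  # five raw floats from layer 0's attention projection

  # But nothing here tells us which of these values encode "squirrel",
  # or which ones are responsible for a given hallucination.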



