
>No it isn't. Type a question into a base model, one that hasn't been finetuned into being a chatbot, and the predicted continuation will be all sorts of crap, but very often another question, or a framing that positions the original question as rhetorical in order to make a point.....

To be fair, that only happens if you pose the question on its own, with no preceding context. If you want the raw LLM to answer your question(s) reliably, you can prepend other question-answer pairs to the context and it works fine. A raw LLM is already capable of being a chatbot, or anything else, given the right preceding context.
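A rough sketch of what that prepending looks like, in case it helps. The `Q:`/`A:` labels, the example pairs, and the function name `build_few_shot_prompt` are all my own assumptions for illustration; any consistent format should do, and nothing here calls a real model.

```python
from typing import List, Tuple


def build_few_shot_prompt(examples: List[Tuple[str, str]], question: str) -> str:
    """Prepend question-answer pairs so the likeliest continuation is an answer.

    Hypothetical helper: the "Q:"/"A:" layout is an assumed convention,
    not something a base model requires.
    """
    blocks = [f"Q: {q}\nA: {a}" for q, a in examples]
    # End on an open "A:" so the model's next tokens fill in the answer.
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


# Made-up example pairs that establish the question-then-answer pattern.
examples = [
    ("What is the capital of France?", "Paris."),
    ("How many legs does a spider have?", "Eight."),
]
prompt = build_few_shot_prompt(examples, "What gas do plants absorb from the air?")
print(prompt)
```

You would pass `prompt` to whatever completion interface your base model exposes and, presumably, cut the output at the next `Q:` so it doesn't carry on inventing further pairs.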





Right, but that was my point - statistically, answers do not follow questions without some establishing context, and as such, while LLMs are "simply" next word predictors, the chatbots aren't - they are Hofstadterian strange loops that we will into being. The simpler you think language models are, the more that should seem "magic".

They're not simple though. You can understand, in a reductionist sense, the basic principles of how transformers perform function approximation; but that does not grant an intuitive sense of the nature of the specific function they have been trained to approximate, or how they have achieved this approximation. We have little insight into what abstract concepts each of the many billions of parameters maps onto. Progress on introspecting these networks has been a lot slower than trial-and-error improvements. So there is a very real sense in which we have no idea how LLMs work, and they are literally "magic black boxes".

No matter how you slice it - if "magic" is a word which can ever be applied to software, LLM chatbots are sure as shit magic.



