
> These are ML models and they are able to only do the task they have been specifically trained for

Yes, but the models we're talking about have been trained specifically on the task of "complete arbitrary textual input in a way that makes sense to humans", and then further tuned to "complete it as if you were a person having a conversation with a human" - again for arbitrary text input - and trained until they could do so convincingly.

(Or, you could say that with instruct fine-tuning, they were further trained to behave as if they were an AI chatbot - the kind of AI people know from sci-fi. Fake it 'till you make it, via backpropagation.)

In short, they've been trained on an open-ended, general task of communicating with humans using plain text. That's very different from typical ML models, which are tasked with predicting some very specific data in a specialized domain. It's like comparing a Python interpreter to Notepad - both are just regular software, but there's a meaningful difference in capabilities.
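
To make that concrete, here's roughly what the base training objective looks like in code - plain next-token prediction with cross-entropy over whatever text comes in. This is just a minimal sketch: `model` is assumed to be any causal LM mapping token ids to vocabulary logits, and `batch` any tensor of token ids from the training corpus.

    import torch.nn.functional as F

    def training_step(model, batch, optimizer):
        # Predict token t+1 from tokens <= t, for arbitrary text.
        inputs, targets = batch[:, :-1], batch[:, 1:]
        logits = model(inputs)  # (batch, seq_len, vocab_size)
        loss = F.cross_entropy(
            logits.reshape(-1, logits.size(-1)),
            targets.reshape(-1),
        )
        optimizer.zero_grad()
        loss.backward()  # "fake it till you make it", via backpropagation
        optimizer.step()
        return loss.item()

The instruct-tuning stage typically reuses the same loss on curated conversation data (with preference tuning layered on top), so the mechanism barely changes - mostly the data does.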

As for seeing glimpses of understanding in SOTA LLMs - this makes sense under the compression argument: understanding is lossy compression of observations, and this is what the training process is trying to force to happen, squeezing more and more knowledge into a fixed set of model weights.
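
Back-of-the-envelope, the numbers make the point (all figures below are illustrative assumptions, not stats from any particular model):

    # Rough compression argument: the weights are far too small to store the
    # training set verbatim, so training has to squeeze regularities into them.
    # Every number here is an illustrative assumption.
    train_tokens   = 10e12   # assume ~10T training tokens
    bits_per_token = 16      # rough upper bound on raw storage per token
    params         = 70e9    # assume a 70B-parameter model
    bits_per_param = 16      # fp16/bf16 weights

    data_bits   = train_tokens * bits_per_token
    weight_bits = params * bits_per_param

    print(f"training data: ~{data_bits / 8e12:.0f} TB")      # ~20 TB
    print(f"weights:       ~{weight_bits / 8e12:.2f} TB")    # ~0.14 TB
    print(f"compression:   ~{data_bits / weight_bits:.0f}x") # ~143x

Whatever ratio you plug in, verbatim memorization is off the table, so the only way to keep driving the loss down is to extract the regularities - which is where the "glimpses of understanding" come from.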




Yes, this is why I think LLMs and image generation models are still impressive. Knowing they are ML models in the end, and that they still produce results that surprise us, makes you wonder what we are in the end. Could we essentially simulate something similar to us, given enough inputs and parameters in the network, enough memory and computing power, and a training process that aims to simulate a human with emotions? I would imagine the training process alone would need a bunch of other models to teach the final model "concepts" and from there perhaps "reasoning".

The reason I think AI is not the appropriate term is that if it were AI, it would have already figured everything out for us (or for itself). An LLM can only chain text together; it does not really understand the content of the text, and it can't come up with novel solutions (or if it accidentally does, it's due to hallucination). This can easily be confirmed by giving current LLMs some simple puzzles, math problems and so on. Image models have similar issues.
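
For what it's worth, this kind of probe is trivial to run yourself. A quick sketch using the openai Python client - the model name and the arithmetic problem are just placeholders, and any chat-completion endpoint works the same way:

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # Ask for a multi-digit multiplication and check the answer ourselves.
    a, b = 7391, 4826
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; use whatever model you have access to
        messages=[{
            "role": "user",
            "content": f"What is {a} * {b}? Reply with the number only.",
        }],
    )
    answer = resp.choices[0].message.content.strip()
    print(f"model said {answer}, correct answer is {a * b}")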



