
Karpathy's point in the video is that the models don't need to be exhaustively told what they don't know - they already have a good understanding of the extent of their knowledge. Older models just didn't use that understanding; they answered every question confidently because they'd only been trained on confident answers.


i don't think that's what he meant, and i also don't think it's accurate to say they already have an understanding. i'm not even basing my criticism on the anthropomorphization but on the fact that there will be activation constellations that correlate with uncertainty, but you have to train the model to channel those into an actual response expressing uncertainty ... only then does it make sense to speak of understanding uncertainty.
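
for concreteness, here's roughly what reading out that kind of uncertainty signal could look like: fit a linear probe on hidden activations to predict whether the model actually knows the answer. this is a hedged sketch, not anything from the video - the model name, the labeled questions, and the probe setup are all placeholder assumptions.

    # Sketch: probe a causal LM's activations for an "uncertainty" direction.
    # Everything here (model choice, data, layer) is illustrative.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from sklearn.linear_model import LogisticRegression

    model_name = "gpt2"  # stand-in; any HF causal LM with hidden states works
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name).eval()

    def last_token_activation(question: str, layer: int = -1) -> torch.Tensor:
        """Hidden state of the final prompt token at the given layer."""
        inputs = tok(question, return_tensors="pt")
        with torch.no_grad():
            out = model(**inputs, output_hidden_states=True)
        return out.hidden_states[layer][0, -1]  # shape: (hidden_dim,)

    # Hypothetical labels: 1 = the model answers this correctly, 0 = it doesn't.
    questions = ["Who wrote Hamlet?", "Who is Orson Kovacs?"]
    knows = [1, 0]

    X = torch.stack([last_token_activation(q) for q in questions]).numpy()
    probe = LogisticRegression(max_iter=1000).fit(X, knows)
    # If a probe like this separates the classes, the "signal" exists in the
    # activations - but the model still has to be trained to voice it.

the point being: a probe finding a direction is not the same thing as the model saying "i don't know" - that second step is exactly the training i'm describing.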


Sorry, I don't follow what you're disagreeing with. I was summarizing what Karpathy talks about around 1:31:00 - where he talks (I assume notionally) about a specific neuron lighting up to indicate uncertainty, and how empirically this probably turns out to be the case.

Edit: concretely, we can presume that OpenAI didn't specifically train ChatGPT to know that "Orson Kovacs" isn't a famous person, right? That's all I'm saying here - that they trained it to say when it doesn't know things, and it took care of the rest.
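
The recipe he describes would look something like the sketch below: sample the model on questions with known answers, and wherever it reliably misses, add a training pair whose target is a refusal. The helper names and data here are illustrative assumptions on my part, not OpenAI's actual pipeline.

    # Sketch of "teach it to say I don't know" data construction (illustrative;
    # generate() is an assumed model-sampling function, not a real API).
    def build_refusal_examples(qa_pairs, generate, n_samples=5):
        """qa_pairs: list of (question, gold_answer); generate: prompt -> str."""
        examples = []
        for question, gold in qa_pairs:
            answers = [generate(question) for _ in range(n_samples)]
            if not any(gold.lower() in a.lower() for a in answers):
                # The model reliably misses this one: train a refusal instead.
                examples.append({"prompt": question,
                                 "target": "I'm sorry, I don't know."})
        return examples

Fine-tune on a modest number of those and the behavior generalizes to names like "Orson Kovacs" that never appear in the data - at least, that's my reading of the claim.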


i think i misinterpreted your first sentence.


Hrrm, I'm reading back and I may have misinterpreted your first post too. If so, apologies!



