
The insidious part about ChatGPT getting things wrong is that it is a superb bullshitter.

It gives you answers with 100% confidence and believable explanations. But sometimes the answers are still completely wrong.



Knowing little about how ChatGPT actually works, is there perhaps a variable that could be exposed, something that would represent the model's confidence in the solution provided?


I'd say you can't do that, because ChatGPT has no internal model for how the things it is explaining work; so there can't be any measure of closeness to the topic described, as would be the case for classification AIs.

ChatGPT models are language models; they represent closeness between text utterances. They work by looking for the chains of words most similar to, or most often connected with, those in the prompt, with no understanding of what those words mean.

As a metaphor, think of an intern who every morning is asked to buy all the newspapers in paper form, cut out the news sentence by sentence, and put all the pieces of paper in piles grouped according to the words they contain.

Then, the director asks for a news item on the increase in interest rates. The intern goes to the pile where all the snippets about interest rates are placed, randomly grabs a bunch of them, and writes a piece by linking the fragments together.

The intern has a PhD in English, so it is easy for them to adjust the wording to ensure consistency; and the topics more talked about will appear more often in the snippets, so the ones chosen are more likely to deal with popular issues. Yet the ideas expressed are a collection of concepts that might have made sense in their original context, but have been decontextualized and put together pell-mell, so there's no guarantee that they're saying anything useful.
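The "chains of words" idea above can be sketched as a toy bigram model. This is a deliberately crude illustration with a made-up corpus, nothing like a real transformer, but it shows how plausible-sounding word sequences can come out of pure adjacency statistics with no understanding involved:

```python
import random
from collections import defaultdict

# Tiny made-up corpus; a real model trains on billions of words.
corpus = "the bank raised interest rates and the bank cut interest rates".split()

# Record which words followed which in the corpus.
followers = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    followers[prev].append(nxt)

def generate(start, length, seed=0):
    """Chain words together, each chosen only from what followed the last one."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(length - 1):
        options = followers.get(words[-1])
        if not options:
            break
        words.append(rng.choice(options))
    return " ".join(words)

print(generate("the", 6))  # grammatical-looking, but meaning is accidental
```

Every generated sentence is locally fluent (each pair of words occurred together in the corpus), yet the model has no idea whether the bank raised or cut rates, which is the decontextualization problem the metaphor describes.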


> ChatGPT models are language models; they represent closeness between text utterances. They work by looking for the chains of words most similar to, or most often connected with, those in the prompt, with no understanding of what those words mean.

No, it does not work that way. That’s how base GPT3 works. ChatGPT works via RLHF and so we don’t “know” how it decides to answer queries. That’s kind of the problem.


Explainable AI, specifically for language models, will be a very interesting field to follow then.


something something sufficiently advanced markov chains something something GAI


I don't think so. It doesn't understand what it says; it basically interpolates between pieces of text it copy-pastes, in a very impressive manner. Still, it does not "understand" anything, so it cannot have any kind of confidence.

Take Stable Diffusion for instance: it can interpolate a painting from that huge dataset it has, and sometimes output a decent result that may look like what a good artist would do. But it doesn't have any kind of "creative process". If it tells you "I chose this theme because it reflects this deep societal problem", it will just be pretending.

It may not matter if all you want is a nice drawing, but when it's about, say, engineering, that's quite different.


It's not available for ChatGPT but the other GPT models can expose the probability for each generated token, which can serve as a proxy for confidence.

By tuning the temperature and top_p parameters, you can also make the model avoid low-probability completions (useful for less creative use cases where you need exact answers).
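For intuition, here is a self-contained sketch (with made-up logits, not tied to any actual API) of what those two knobs do: temperature rescales the logits before the softmax, sharpening or flattening the next-token distribution, and top-p (nucleus) filtering keeps only the smallest set of tokens whose cumulative probability reaches p:

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; lower temperature = more peaked."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, p=0.9):
    """Keep the smallest high-probability set with cumulative mass >= p,
    zero out the rest, and renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = set(), 0.0
    for i in order:
        kept.add(i)
        cum += probs[i]
        if cum >= p:
            break
    total = sum(probs[i] for i in kept)
    return [probs[i] / total if i in kept else 0.0 for i in range(len(probs))]

logits = [4.0, 2.0, 1.0, 0.5]             # hypothetical scores for 4 tokens
sharp = softmax(logits, temperature=0.5)  # low temperature: top token dominates
flat = softmax(logits, temperature=2.0)   # high temperature: closer to uniform
trimmed = top_p_filter(softmax(logits), p=0.9)  # tail tokens get probability 0
```

With low temperature and a tight top_p, sampling rarely strays from the most probable completion, which is why those settings suit exact-answer use cases.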


> It's not available for ChatGPT but the other GPT models can expose the probability for each generated token, which can serve as a proxy for confidence.

A proxy for confidence in what exactly?

Language models represent closeness of words, so a high probability would only express that those words are put together frequently in the corpus of text; not that their meanings are at all relevant to the problem at hand. Am I wrong?


In cases where you ask GPT-3 questions that have a clear correct answer, I think you can use the probability to judge how correct the answer is. For example, when asking "How tall is Mount Everest?" I would want the completion "Mount Everest is ____ meters above sea level." to have a very high probability for the ____ tokens.

This is because I'm operating under the assumption that sequences of words that appear often in the training set are more likely to represent something correct (otherwise you might as well train on random words). This only holds if the training set is big enough to estimate from correctly (e.g. with a small training set, a rare/wrong phrase may appear disproportionately often).

Maybe confidence was the wrong word, but for this kind of question I would trust a high-probability answer far more than a low-probability one. For questions belonging to very specific subjects, where training material is scarce, the model might have very skewed probabilities, so they become less useful.
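As a sketch of what that proxy could look like (the per-token numbers below are invented; the non-chat GPT-3 completions API can return per-token log-probabilities via its logprobs option): average the log-probabilities over the answer tokens and exponentiate, giving a geometric-mean per-token probability that is higher when the model found its own completion unsurprising.

```python
import math

def mean_token_prob(token_logprobs):
    """Geometric-mean probability per token: exp of the average log-prob.
    Closer to 1.0 means the model was 'sure' of each token it emitted."""
    avg_logprob = sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg_logprob)

# Invented examples: a completion the model found very likely vs. one
# built from rarer, more surprising tokens.
confident_answer = [-0.05, -0.10, -0.02]
surprising_answer = [-1.5, -2.3, -1.9]

print(mean_token_prob(confident_answer))
print(mean_token_prob(surprising_answer))
```

This measures fluency-under-the-model, not truth, which is exactly the caveat raised above: a well-memorized falsehood would score just as high.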


> In cases where you ask GPT-3 questions that have a clear correct answer, I think you can use the probability to judge how correct the answer is. For example, when asking "How tall is Mount Everest?" I would want the completion "Mount Everest is ____ meters above sea level." to have a very high probability for the ____ tokens.

Maybe, as long as you're aware that this is the same kind of correctness that you get from looking at Google's first search results (the old kind of organic pages, not the "knowledge graph", which uses a different process, precisely to avoid being spammed by SEO), i.e. "correctness by popularity".

This means that the content that is more replicated will be considered more true by the system, regardless of its connection to reality or its coherence with the rest of the knowledge in the system. And you know what they say about a big enough lie repeated millions of times.


I agree, and furthermore, a search engine is constrained to pick its responses from what's already out there.

This line of thought is a distraction, anyway. The likelihood that GPT-3 will do as well as a search engine on topics where there is an unambiguous and well-known answer does little to address the more general concern.


> This means that the content that is more replicated will be considered more true by the system, regardless of its connection to reality or its coherence with the rest of the knowledge in the system.

I understand the problem, but what better way do we currently have to measure its connection to reality? At least from a practical point of view, it seems that LLMs have achieved far better performance than other methods in this regard, so repetition doesn't look like that bad a metric. Or rather, it's the best I think we currently have.


> I understand the problem, but what better way do we currently have to measure its connection to reality?

We can consider its responses to a broader range of questions than those having an unambiguous and well-known answer. Its propensity for making up 'facts', and for fabricating 'explanations' that are incoherent or even self-contradictory shows that any apparent understanding of the world being represented in the text is illusory.


This resonates with me. We have all worked with someone who is a superb bullshitter, 100% confident in their responses yet completely wrong. Only now, we have codified that person into ChatGPT.


That might be the problem. Too many bullshitters like posting online, and ChatGPT has been trained on them.


I doubt it. Even if it was trained with 100% accurate information, ChatGPT would still prefer an incorrect but decisive answer to admitting it doesn't know.


TBH, a lot of SEO-optimized results are the same, although I think the conversational format makes people assign even more authority to ChatGPT.


SEO-optimized sites can also be identified and avoided. There are various indicators of a site's quality, to the point where I'm positive most people on HN know to stay away from, or bail out of, one of those sites without even being consciously aware of what gave them that sense of SEO.


General Purpose Bullshitting Technology. I've always found LLMs most useful as assistants when working on things I'm already familiar with, or as don't-trust-always-verify high temperature creatives. I think that attempts to sanitize their outputs to be super safe and "reliable sources" will trend public models towards blandness.



