I asked OpenAI’s ChatGPT some technical questions about Australian drug laws, like what schedule common ADHD medications were on - and it answered them correctly. Then I asked it the same question about LSD - and it told me that LSD was a completely legal drug in Australia - which is 100% wrong.
Sooner or later, someone’s going to try that as a defence - “but your honour, ChatGPT told me it was legal…”
Y’all are using this tool very wrong, and in a way that none of the AI-integrated search engines will. They assume the AI doesn’t know anything about the query, feed it the relevant knowledge from the search index, and ask it to synthesize an answer from that.
There’s still the risk that, if the search results it is given don’t contain the answer to the exact question you asked, it will hallucinate one.
10,000% true, which is why AI can't replace a search engine, only complement it. If you can't surface the documents that contain the answer, then you'll only get garbage.
A GAN-style approach to penalising a generator for generating something that is not supported by its available data would be interesting (and I'm sure some have tried it already; I'm not following the field closely), but for many subjects creating the training sets would be immensely hard (for some subjects you certainly could produce large synthetic training sets).
Look, I know that "user is holding it wrong" is a meme, but this is a case where it's true. The fact that LLMs contain any factual knowledge is a side effect. While it's fun to play with and see what it "knows" (and it can actually be useful as a weird kind of search engine if you keep in mind it will just make stuff up), you don't build an AI search engine by just letting users query the model directly and call it a day.
You shove the most relevant results from your search index into the model as context and then ask it to answer questions from only the provided context.
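A minimal sketch of that pattern, assuming the official openai Python client (1.x style); the search_top_k and answer_from_context names are made up for illustration, and the model name is an assumption:

    # Retrieval-augmented prompting sketch. Assumptions: the `openai` Python
    # client (>= 1.0) and a hypothetical `search_top_k` retrieval function
    # standing in for whatever search index you actually have.
    from openai import OpenAI

    client = OpenAI()  # picks up OPENAI_API_KEY from the environment

    def search_top_k(query: str, k: int = 5) -> list[str]:
        # Placeholder: return the k most relevant documents for the query.
        raise NotImplementedError("plug in your own search index here")

    def answer_from_context(query: str) -> str:
        docs = search_top_k(query)
        context = "\n\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(docs))
        prompt = (
            "Answer the question using ONLY the numbered passages below, and "
            "cite the passage numbers you relied on. If the passages do not "
            "contain the answer, reply exactly: I don't know.\n\n"
            f"Passages:\n{context}\n\nQuestion: {query}"
        )
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",  # model name is an assumption
            messages=[{"role": "user", "content": prompt}],
            temperature=0,  # lowers, but does not eliminate, made-up answers
        )
        return response.choices[0].message.content

Nothing here actually prevents the model from ignoring the instruction, which is why the validation step below matters.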
Can you actually guarantee the model won't make stuff up even with that? Hell no, but you'll do a lot better. And the game now becomes figuring out better context and validating that the response can be traced back to the source material.
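One cheap first pass at that validation is a lexical overlap check: flag any sentence in the answer whose vocabulary barely appears in the retrieved context. A rough sketch (the 0.6 threshold and the crude tokenization are arbitrary choices, not a standard, and this is nowhere near real entailment checking):

    import re

    def _tokens(text: str) -> set[str]:
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def unsupported_sentences(answer: str, context: str,
                              threshold: float = 0.6) -> list[str]:
        # Return answer sentences whose words are mostly absent from the
        # context. Only catches the most blatant drift away from the sources.
        context_tokens = _tokens(context)
        flagged = []
        for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
            words = _tokens(sentence)
            if words and len(words & context_tokens) / len(words) < threshold:
                flagged.append(sentence)
        return flagged

Anything this flags can be stripped, re-queried, or shown to the user with a warning.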
The examples in the article seem to be making the point that even when the AI cites the correct context (i.e. financial reports), it still produces completely hallucinated information.
So even if you were to white-list the context to train the engine against, it would still make up information because that's just what LLMs do. They make stuff up to fit certain patterns.
That’s not correct. You don’t need to take my word for it. Go grab some complete baseball box scores and you can see that ChatGPT will reliably translate them into an entertaining, English paragraph-length outline of the game.
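If you want to try it, the prompt doesn't need to be clever. A hedged sketch, assuming the official openai Python client; the file name and model name are illustrative, and the box score itself is whatever you grabbed, not invented here:

    from openai import OpenAI  # assumption: official client, >= 1.0

    client = OpenAI()
    box_score = open("box_score.txt").read()  # any complete box score you saved

    recap = client.chat.completions.create(
        model="gpt-3.5-turbo",  # model name is an assumption
        messages=[{
            "role": "user",
            "content": "Here is a complete baseball box score. Write an "
                       "entertaining, paragraph-length recap of the game, "
                       "using only facts that appear in the box score.\n\n"
                       + box_score,
        }],
    ).choices[0].message.content
    print(recap)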
This ability to translate has been shown experimentally to depend on the size of the LLM, but for lower-complexity analytic prompts like this it can reliably avoid synthesizing information that isn't in the input.
You don't build an AI search engine by just letting users query the model directly and call it a day.
Have you ever built an AI search engine? Neither have Google or MS. No one knows yet what the final search engine will look like.
However, we have every indication that all of the localization and extra training are fairly "thin": things like prompt engineering and maybe a script filtering the output.
And despite ChatGPT's great popularity, the application is still a monolithic text-prediction machine, so it's hard to see what else could be done.