You can use it for that, but you have to test its claims. It's especially helpful with questions that would take a lot of research to answer but are easy to test. That could be a good thing overall: if people get used to testing everything they're told, they'll be a lot better informed.
I think verifying LLM output is going to be the most important research direction this year. Generative models are worthless without verification. It ties into the broader problem of deciding what's true and dealing with misinformation, synthetic text, and spam. Google hasn't been able to solve that in the last decade, but some people have an idea:
> "Discovering Latent Knowledge in Language Models Without Supervision" Existing techniques for training language models can be misaligned with the truth: if we train models with imitation learning, they may reproduce errors that humans make; if we train them to generate text that humans rate highly, they may output errors that human evaluators can't detect. We propose circumventing this issue by directly finding latent knowledge inside the internal activations of a language model in a purely unsupervised way.
In other words, the model already tracks something like truth internally, because knowing what's true helps with next-token prediction; what we need is a way to detect that 'truth alignment' in its activations.
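To make that concrete, here's a rough sketch of how I understand the approach (my own toy version, not the authors' code, and the names `Probe`, `ccs_loss`, `train_probe` are just mine): take contrast pairs like "X? Yes" / "X? No", grab the hidden activations for each, and train a small probe whose two outputs are consistent (they should sum to ~1) and confident (not stuck at 0.5), with no labels at all:

```python
import torch
import torch.nn as nn

class Probe(nn.Module):
    """Linear probe mapping a hidden-state vector to a probability."""
    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(dim, 1)

    def forward(self, h):
        return torch.sigmoid(self.linear(h))

def ccs_loss(p_plus, p_minus):
    # consistency: p("yes" version) should be ~ 1 - p("no" version)
    consistency = ((p_plus - (1 - p_minus)) ** 2).mean()
    # confidence: penalize the degenerate solution where both sit at 0.5
    confidence = (torch.min(p_plus, p_minus) ** 2).mean()
    return consistency + confidence

def train_probe(h_plus, h_minus, epochs=1000, lr=1e-3):
    # h_plus / h_minus: activations for the "yes" / "no" contrast prompts,
    # shape (n_statements, hidden_dim), taken from some layer of the LLM.
    probe = Probe(h_plus.shape[1])
    opt = torch.optim.Adam(probe.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = ccs_loss(probe(h_plus), probe(h_minus))
        loss.backward()
        opt.step()
    return probe
```

The claim is that the trained probe's output on the "yes" activation can then be read as the model's internal belief about whether the statement is true, without ever supervising on ground-truth labels. (The paper does more than this, e.g. normalizing the activations first, but that's the core idea as I read it.)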