> Binding the LLM to a formal logic and proof engine is one way to stop them hallucinating and make them useful for the real world.
Checking the output does not mean the model does not hallucinate and thus does not help for all other cases in which there is no "formal logic and proof engine".
What if I consider the model to be the llm plus whatever extra components it has that allows it to not hallucinate? In that case then the model doesn't hallucinate, because the model is the llm plus the bolt-ons.
Remember llm truthers claim that no bolt-ons can ever fully mitigate an llm's hallucinations. And yet in this case it does. But saying that it doesn't matter because other llms will still hallucinate is moving the goal post, or at least discounting the utility of this incremental progress. I think it's unfair to do this because there are many many domains where things can indeed be reduced to a formal logic amenable to approve engine.
If they don't care about the actual output of a hybrid system that doesn't hallucinate, because it's math and not speech, then do they care about solving the issue at all, or providing human utility? I get the feeling that they only want to be right, not the benefit anyone.
This shows that in cases where we can build good enough verifiers, hallucinations in a component of the system do not have to poison the entire system.
Our own brains work this way - we have sections of our brains that hallucinate, and sections of the brain that verify. When the sections of the brain that verify our sleep, we end up hallucinating dreams. When the sections of the brain that verify are sick, we end up hallucinating while awake.
I agree with you that the current system does not solve the problem for natural language. However it gives an example of a non hallucinating hybrid llm system.
So the problem is reduced from having to make llms not hallucinate at all, to designing some other systems, potentially not an llm at all, that can reduce the number of hallucinations to a useful amount.
Checking the output does not mean the model does not hallucinate and thus does not help for all other cases in which there is no "formal logic and proof engine".