The training data contains tons of false information, and the training objective is simply to reproduce that information. It's not at all surprising that these models fail to distinguish truth from falsehood, and no incremental improvement will fix that. The problem is paradigmatic. And calling people cynics for pointing out the obvious and serious shortcomings of these models is poor form IMO.
The large corpus of text is only necessary to grasp the structure and nuance of language itself. Answering questions (1) in a friendly manner and (2) truthfully is a matter of fine-tuning, as the latest developments around GPT-3.5 clearly show. And with approaches like indexGPT, using external knowledge bases that can even be corrected later is already a thing; we just need this at scale and with the right fine-tuning. The tech is much further along than those cynics realize.
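Roughly what I have in mind, as a minimal sketch. The names here (KnowledgeBase, llm_complete, the keyword-overlap search) are hypothetical stand-ins for illustration, not indexGPT's actual API:

    # Sketch of retrieval-augmented answering: the model only handles language;
    # facts live in an external, editable knowledge base.
    # All names here are hypothetical placeholders, not a real library's API.
    from dataclasses import dataclass

    @dataclass
    class Document:
        text: str
        source: str

    class KnowledgeBase:
        """Tiny in-memory stand-in for an external, correctable knowledge store."""
        def __init__(self):
            self.docs: list[Document] = []

        def add(self, text: str, source: str) -> None:
            self.docs.append(Document(text, source))

        def correct(self, source: str, new_text: str) -> None:
            # The point: facts can be fixed later without retraining the model.
            for d in self.docs:
                if d.source == source:
                    d.text = new_text

        def search(self, query: str, k: int = 3) -> list[Document]:
            # Naive keyword overlap instead of real embeddings, for illustration only.
            def score(d: Document) -> int:
                return len(set(query.lower().split()) & set(d.text.lower().split()))
            return sorted(self.docs, key=score, reverse=True)[:k]

    def answer(question: str, kb: KnowledgeBase, llm_complete) -> str:
        """Ground the answer in retrieved passages rather than in the model's weights."""
        context = "\n".join(f"[{d.source}] {d.text}" for d in kb.search(question))
        prompt = (
            "Answer using only the context below; say 'I don't know' otherwise.\n"
            f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
        )
        return llm_complete(prompt)  # llm_complete is whatever model API you use

The key property is that correct() edits the knowledge base, not the model, so fixing a wrong fact doesn't require retraining anything.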
I'm sure you can add constraints of some sort to build internally consistent world models. Or add stochastic outputs, as has been done in computer vision, to assign e.g. variances to the output probabilities and determine when the model is out of its depth (and then automatically query external databases to resolve the uncertainty / read up on the topic).
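A crude sketch of that fallback idea: sample the model several times, measure disagreement, and consult an external source when the answers don't agree. Both sample_answer and lookup_external are hypothetical stand-ins for a real model call and a real database query:

    # Sampling-based uncertainty check with an external fallback (sketch only).
    from collections import Counter

    def answer_with_fallback(question: str, sample_answer, lookup_external,
                             n_samples: int = 5, min_agreement: float = 0.6) -> str:
        samples = [sample_answer(question) for _ in range(n_samples)]
        top_answer, count = Counter(samples).most_common(1)[0]
        agreement = count / n_samples  # crude proxy for the model's confidence
        if agreement >= min_agreement:
            return top_answer
        # High variance across samples -> treat the model as out of its depth
        # and query an external, authoritative source instead.
        return lookup_external(question)

Sampling agreement is only a rough proxy for calibrated uncertainty, but it shows the plumbing: detect low confidence, then defer to the external knowledge source.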