
Written language is used, in large part, to express sensory data (e.g., colors, shapes, events, sounds, temperatures). Abstract models are extrapolated from that sensory information through inductive reasoning, so in effect more sensory data should mean more accurate abstract models.

For example, it might take several paragraphs to capture all of the meaningful information in a single image in enough detail that it could be reproduced accurately. Humans, and many animals, process enormous amounts of sensory data before they are even capable of speech.

The data GPT-3 was provided with pales in comparison. It is unclear whether these GPT models are capable of induction; it may be that they simply need more, or better sanitised, data to develop abstract models. They should therefore be scaled up further until they improve only negligibly. If even then they are still incapable of general induction, or still build inaccurate models, then either the transformer architecture is not enough, or we need a more diverse set of data (images, audio, thermosensors, etc.).
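To make "pales in comparison" concrete, here is a rough back-of-envelope sketch. The constants are commonly cited estimates (roughly 300B training tokens for GPT-3, and on the order of 10 Mbit/s of throughput per optic nerve); the waking-hours and bytes-per-token figures are my own assumptions, so treat the result as order-of-magnitude only:

    # Back-of-envelope: GPT-3's training text vs. the visual data
    # a child has seen by age four. All figures are rough estimates.

    GPT3_TOKENS = 300e9          # ~300B training tokens (Brown et al., 2020)
    BYTES_PER_TOKEN = 4          # crude average for English text (assumption)

    OPTIC_NERVE_BPS = 10e6       # ~10 Mbit/s per eye (commonly cited estimate)
    WAKING_SECONDS_PER_DAY = 12 * 3600   # assume 12 waking hours/day
    DAYS = 4 * 365               # first four years of life

    text_bytes = GPT3_TOKENS * BYTES_PER_TOKEN
    visual_bytes = 2 * (OPTIC_NERVE_BPS / 8) * WAKING_SECONDS_PER_DAY * DAYS

    print(f"GPT-3 text data:   ~{text_bytes / 1e12:.1f} TB")    # ~1.2 TB
    print(f"Visual input only: ~{visual_bytes / 1e12:.1f} TB")  # ~158 TB
    print(f"Ratio:             ~{visual_bytes / text_bytes:.0f}x")

Even with these conservative assumptions, and counting vision alone, the sensory stream is two orders of magnitude larger than GPT-3's entire training corpus.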


