
The scaling laws for LLMs depend heavily on the quality of the data. For example, if you add an additional 100 GB of data that contains only the same repeating word, that will hurt the model. If you add 100 GB of completely random words, that will also hurt the model. Between these two extremes (low and high entropy), human language has a certain amount of natural entropy that helps the model gauge the true co-occurrence frequencies of words in a sentence. The scaling laws for LLMs aren't just a reflection of the model, but of the conditional entropy of human-generated text.
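As a rough illustration, here's a minimal Python sketch of the two degenerate extremes. It uses unigram Shannon entropy as a crude proxy; actual scaling-law work looks at per-token conditional entropy under a model, and the vocabulary here is hypothetical:

    import math
    import random
    from collections import Counter

    def unigram_entropy(tokens):
        # Shannon entropy in bits/token of the empirical unigram distribution.
        counts = Counter(tokens)
        n = len(tokens)
        return -sum((c / n) * math.log2(c / n) for c in counts.values())

    vocab = [f"w{i}" for i in range(10_000)]  # hypothetical 10k-word vocabulary

    repeated = ["the"] * 100_000                              # one word, repeated
    uniform = [random.choice(vocab) for _ in range(100_000)]  # words drawn at random

    print(unigram_entropy(repeated))  # ~0 bits/token: nothing to learn
    print(unigram_entropy(uniform))   # ~13.3 bits/token (log2(10000)): pure noise
    # Natural language lands in between, and its *conditional* entropy
    # (entropy given preceding context) is lower still -- that structure
    # is what the model actually learns.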

RL is a different enough field that you can't apply these scaling laws directly. E.g., agents playing tic-tac-toe or checkers would stop scaling at a very low ceiling: once the game is effectively solved, additional parameters and compute buy nothing.

One possible risk I see: with the amount of model-generated text out there, training will at some point inevitably end up feeding the output of one model into another, unless the source of the text is meticulously traced. (My assumption is that this would hurt the model you're trying to train as well.)
