
The novelty is wearing off, and the reality of parsing hundreds of TBs into hundreds-of-GB memory blobs you can query by the KB is setting in.



What's interesting is that each token visits the whole model. Basically, each token touches a synthesis of all of human culture before being fully formed.


To bake a cake, you first have to invent the universe.


That depends on whether the model's weight matrices are sparse or dense. If they're sparse, then a large swath of the path quickly becomes 0 (which could still be counted as "visited", though only trivially).
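The dense-vs-sparse distinction can be sketched with a toy matrix-vector product (a minimal NumPy illustration, not any particular model's architecture; the sizes and the 10% sparsity level are made up for the example). In the dense case every weight contributes to the output, so each token really does "touch" every parameter; in the sparse case most of the multiply-accumulate path contributes exactly nothing:

```python
import numpy as np

rng = np.random.default_rng(0)

d = 8  # toy hidden size; real models use thousands

# Dense layer: every weight participates in producing the output,
# so each token "touches" all d*d parameters.
W_dense = rng.standard_normal((d, d))
x = rng.standard_normal(d)
y_dense = W_dense @ x
touched_dense = np.count_nonzero(W_dense)  # all weights contribute

# Sparse layer: most weights are exactly 0, so most of the
# multiply-accumulate path contributes nothing ("visited" only trivially).
mask = rng.random((d, d)) < 0.1  # keep roughly 10% of weights
W_sparse = W_dense * mask
y_sparse = W_sparse @ x
touched_sparse = np.count_nonzero(W_sparse)

print(f"dense weights contributing:  {touched_dense} / {d * d}")
print(f"sparse weights contributing: {touched_sparse} / {d * d}")
```

In practice sparse models also skip the zero entries entirely (e.g. via `scipy.sparse` formats), so those weights aren't even read, which is the stronger sense in which a token doesn't visit them.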


It’s not only interesting but also necessary. What is language if not a compressed version of all human culture?


And it's probably the closest approximation to what happens in our heads when we utter a word or take an action. We are thin layers of customisation running on top of language.



