What's interesting is that each token passes through the entire model. Basically, each token touches the synthesis of the whole of human culture before being fully formed.
That depends on whether the model's weight matrices are sparse or dense. If they're sparse, then a large swath of the path quickly becomes 0 (which could still be counted as "visited", though in a pretty pathological sense).
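A rough numpy sketch of what that means for a single linear layer, with the sparsity level (95% zero weights) and dimensions chosen arbitrarily for illustration: each product w[i, j] * x[j] is one "path" the token's activation takes through the layer, and under a sparse weight matrix most of those paths carry exactly nothing.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 1024
x = rng.standard_normal(d)                        # one token's activation vector

w_dense = rng.standard_normal((d, d)) / np.sqrt(d)
w_sparse = w_dense * (rng.random((d, d)) < 0.05)  # ~95% of weights zeroed out

# Each entry w[i, j] * x[j] is one "path" through this layer:
# input unit j -> output unit i.
contrib_dense = w_dense * x                       # broadcasts x over rows
contrib_sparse = w_sparse * x

print("zero paths, dense: ", np.mean(contrib_dense == 0.0))   # ~0
print("zero paths, sparse:", np.mean(contrib_sparse == 0.0))  # ~0.95
```

So the token still "visits" every layer, but in the sparse case most of the individual connections it visits contribute zero.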
And it's probably the closest approximation to what happens in our heads when we utter each word or take an action. We are thin layers of customisation running on top of Language.