
Non-determinism is a red herring, and the token layer is the wrong abstraction for this discussion, because determinism is orthogonal to correctness. The model can express the same thing in different ways while being consistently correct or consistently incorrect for whatever vague input you give it, since nothing prevents it from putting 100% of the probability mass on the one correct output for that input. Internally, the model works with ideas, not tokens, and it learns a mapping from ideas to ideas, not tokens to tokens (that's why base64, for example, is essentially just another language it can work with).
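To make the orthogonality concrete, here's a minimal sketch (a hypothetical toy setup, not any particular LLM): greedy decoding is fully deterministic regardless of whether the highest-probability answer happens to be the correct one.

    # Toy next-token distributions for the prompt "2 + 2 =".
    # Keys are candidate answers; values are probabilities.
    good_model = {"4": 0.9, "5": 0.1}   # correct answer most likely
    bad_model  = {"5": 0.9, "4": 0.1}   # wrong answer most likely

    def greedy(dist):
        # argmax decoding: zero temperature, fully deterministic
        return max(dist, key=dist.get)

    # Both models are perfectly deterministic under greedy decoding...
    assert all(greedy(good_model) == "4" for _ in range(100))
    assert all(greedy(bad_model)  == "5" for _ in range(100))
    # ...but only one of them is correct. Determinism != correctness.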


No. Humans think it maps to ideas. That interpretation is done by the observer and then projected onto the state of the system.

The system has no ideas, it just has its state.

Unless you are using ideas as a placeholder for “content” or “most likely tokens”.


That's irrelevant semantics, since terms like ideas, thinking, and knowledge are ill-defined anyway. Sure, call them points in the hidden state space if you prefer, no problem. The fact is, correctness is distinct from determinism, and the forest of what's happening inside doesn't reduce to the trees of most-likely tokens. That is well supported by research, and by basic intuition if you've ever tinkered with LLMs: they can easily express the same thing in a different manner if you perturb the autoregressive sampling a bit by modifying the output distribution or banning some tokens.
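The token-banning experiment is easy to reproduce. Here's a rough sketch using Hugging Face transformers (the model name and prompt are arbitrary placeholders): ban the most obvious continuation, and the model typically routes around the ban and paraphrases the same content.

    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")    # placeholder model
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    prompt = "The capital of France is"
    inputs = tok(prompt, return_tensors="pt")

    # Unconstrained greedy decoding.
    out = model.generate(**inputs, max_new_tokens=8, do_sample=False)
    print(tok.decode(out[0]))

    # Ban the most likely continuation and decode again; the model
    # usually expresses the same idea with different tokens.
    banned = tok([" Paris"], add_special_tokens=False).input_ids
    out = model.generate(**inputs, max_new_tokens=8, do_sample=False,
                         bad_words_ids=banned)
    print(tok.decode(out[0]))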

There are a few models of what's happening inside, each with different predictive power, just as physics has different formalisms for, e.g., classical mechanics. You can probably apply the same models to biological systems and to entire organizations, collectives, and processes that exhibit learning/prediction/compression at a certain scale, regardless of the underlying architecture.



