
> It's clear that we need a paradigm shift on memory to unlock the next level of performance.

I think this points to the next phase of LLMs, or to a different neural network architecture that improves on top of them, alongside continual learning.

Adding memory capabilities would benefit local "reasoning" models more than online ones, since you would be saving tokens to do more tasks rather than generating more tokens to use more "skills" or tools (unless you pay Anthropic or OpenAI extra for memory capabilities).

It's partly why you see LLMs unable to play certain games, or to do hundreds of visual tasks quickly, without adding lots of harnesses and tools or giving them a pre-defined map to help them understand the visual setting.

As I said before [0], the easiest way to understand the memory limitations of LLMs is Claude Plays Pokemon, where the model struggles with basic tasks that a 5-year-old can learn continuously.

[0] https://news.ycombinator.com/item?id=43291895
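To make "harnesses" and "memory" concrete: here is a minimal sketch in plain Python (not Anthropic's or OpenAI's actual tooling; the AgentMemory class and the "viridian_forest_exit" key are made up for illustration) of the kind of persistent store a harness can consult so the model doesn't burn tokens rediscovering the same facts every session:

    # Minimal sketch: a persistent key-value "memory" an agent harness
    # could check before spending tokens re-deriving a fact.
    import json
    from pathlib import Path

    class AgentMemory:
        def __init__(self, path="agent_memory.json"):
            self.path = Path(path)
            self.facts = json.loads(self.path.read_text()) if self.path.exists() else {}

        def remember(self, key, value):
            # Persist a fact discovered during this session.
            self.facts[key] = value
            self.path.write_text(json.dumps(self.facts, indent=2))

        def recall(self, key):
            # Return the stored fact, or None so the harness knows to explore.
            return self.facts.get(key)

    # Usage: check memory before prompting the model to rediscover the answer.
    memory = AgentMemory()
    if memory.recall("viridian_forest_exit") is None:
        memory.remember("viridian_forest_exit", "north")  # learned this session
    print(memory.recall("viridian_forest_exit"))

The point is that the memory lives entirely outside the model and has to be bolted on by the harness author, rather than being something the network learns and updates on its own.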



Continual learning is definitely part of it. Perhaps another part (or something else entirely) is learning much faster from far fewer examples.



