
The question is: How do we get LLMs to have "Eureka!" moments, on their own, when their minds are "at rest," so to speak?

The OP's proposed solution is a constant "daydreaming loop" in which an LLM does the following on its own, "unconsciously," as a background task, without human intervention:

1) The LLM retrieves random facts.

2) The LLM "thinks" (runs a chain-of-thought) on those retrieved facts to see if there are any interesting connections between them.

3) If the LLM finds interesting connections, it promotes them to "consciousness" (a permanent store) and possibly adds them to a dataset used for ongoing incremental training.

It could work.
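The three steps above can be sketched as a loop. This is a toy illustration, not the OP's implementation: `think` and `is_interesting` are stand-ins for real model calls.

```python
import random

def daydream_step(fact_store, think, is_interesting,
                  permanent_store, training_data):
    # 1) Retrieve random facts (here: a random pair).
    a, b = random.sample(fact_store, 2)
    # 2) Run a chain-of-thought over the pair (stubbed out;
    #    a real system would call a model API here).
    thought = think(a, b)
    # 3) Promote interesting connections to "consciousness"
    #    (a permanent store) and to the incremental-training dataset.
    if is_interesting(thought):
        permanent_store.append(thought)
        training_data.append(thought)
```

The whole proposal is just this step run forever as a background task, with the promoted connections feeding back into training.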



Step 3 has been shown not to work over and over again; "find interesting connections" is the hand-wavy magic at this point. LLMs alone don't seem to be particularly adept at it either.


Has this been tried with reinforcement learning (RL)? As the OP notes, it is plausible from an RL perspective that such a bootstrap can work, because it would be (quoting the OP) "exploiting the generator-verifier gap, where it is easier to discriminate than to generate (eg laughing at a pun is easier than making it)." The hit ratio may be tiny, though, so doing this well would be very expensive.
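The bootstrap amounts to rejection sampling: a weak generator proposes candidates, a cheap verifier filters them, and only the rare hits are kept. A toy instance, using factor-finding as the stand-in task (verifying a factor is trivial; guessing one is not):

```python
import random

def generate(n):
    # Weak generator: propose a random candidate factor.
    return random.randint(2, n - 1)

def verify(n, cand):
    # Cheap, reliable verifier: multiplication/division is easy.
    return n % cand == 0

def search(n, tries=10000):
    # Keep only generations the verifier accepts. The hit ratio
    # may be tiny, which is why the loop is expensive -- but the
    # accepted hits could seed further training.
    hits = []
    for _ in range(tries):
        cand = generate(n)
        if verify(n, cand):
            hits.append(cand)
    return hits
```

For "interesting connections" the verifier would itself be an LLM judge, which is exactly the step the sibling comment calls hand-wavy.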


Running ML of any combination and form in a for loop to get higher-order behavior is one of the most obvious avenues. If it worked, you would have heard about it a long time ago.



