
And also the breakthrough that let AlphaGo and AlphaStar make the leaps that they did.

The trouble is that those board games don't translate well to other domains. But if the game space can operate in the realm of language and semantics, then the hope is that we can tap into the same adversarial growth curve for LLMs.

Up until now, everything that we've done has just been imitation learning (even RLHF is only a poor approximation of "true" RL).



