
The RL is done on problems with verifiable answers. I’m not sure how o1 slop would be at all useful in that respect.




