
Yeah, this seems very possible—it will be interesting to see where this goes if the cost of RLHF decreases or, even better, people can choose from a number of RLHF datasets and composably apply them to get their preferred model.
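A rough sketch of what "composably applying" preference datasets could look like, using the Hugging Face datasets library; the preference sets and examples below are made up purely for illustration:

    # "Composably apply" RLHF-style preference data: build or pick the
    # preference sets you care about, concatenate them, and tune on the result.
    from datasets import Dataset, concatenate_datasets

    # Hypothetical preference sets, each with prompt/chosen/rejected columns.
    coding_prefs = Dataset.from_dict({
        "prompt": ["Write a function that reverses a string."],
        "chosen": ["def reverse(s):\n    return s[::-1]"],
        "rejected": ["Sorry, I can't help with that."],
    })
    style_prefs = Dataset.from_dict({
        "prompt": ["Summarize this paragraph."],
        "chosen": ["A short, two-sentence summary..."],
        "rejected": ["A rambling ten-paragraph answer..."],
    })

    # Composition here is just concatenation of the sets you opted into.
    combined = concatenate_datasets([coding_prefs, style_prefs]).shuffle(seed=0)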

And it's true that objectionable content doesn't come up often while coding, but the model also becomes less likely to say "I can't help you with this," which is definitely useful.

In my fantasy world, RLHF algorithms become efficient enough to run locally, so that I can specify my own preferences and tune models on them.
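As a concrete (if rough) sketch of what local preference tuning could look like today: Direct Preference Optimization (DPO) via Hugging Face's trl library is a cheaper stand-in for full RLHF and runs on a single GPU for small models. The model name and preference pairs below are placeholders, and exact DPOTrainer argument names vary across trl versions:

    # Tune a small local model on personal preference pairs with DPO.
    # Assumes: pip install trl transformers datasets
    from datasets import Dataset
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from trl import DPOConfig, DPOTrainer

    model_name = "gpt2"  # placeholder for whatever small model you run locally
    model = AutoModelForCausalLM.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    tokenizer.pad_token = tokenizer.eos_token

    # My own preferences: (prompt, answer I like, answer I don't).
    preferences = Dataset.from_dict({
        "prompt": ["Explain what a segfault is."],
        "chosen": ["A segfault happens when a process touches memory it doesn't own..."],
        "rejected": ["I can't help you with this."],
    })

    trainer = DPOTrainer(
        model=model,
        args=DPOConfig(output_dir="my-preference-tuned-model",
                       per_device_train_batch_size=1,
                       max_steps=10),
        train_dataset=preferences,
        processing_class=tokenizer,  # older trl versions call this `tokenizer`
    )
    trainer.train()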