
Don't confuse alignment with censorship.

Most of alignment is about getting the AI model to be useful: ensuring that when you ask it to do something, it actually does what you asked.

A completely unaligned model would be virtually useless.
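For concreteness, a base language model with no instruction tuning treats a prompt as text to continue, not as a request to fulfill. A minimal sketch of that behavior, assuming the Hugging Face transformers library and the publicly available gpt2 base model (example prompt is illustrative):

    from transformers import pipeline

    # gpt2 is a base model with no instruction tuning or RLHF.
    generator = pipeline("text-generation", model="gpt2")

    # A base model treats this as text to continue, not a command to obey:
    # the typical output rambles on in English instead of translating.
    print(generator("Translate to French: Hello, world.", max_new_tokens=20))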



I think the way people have been using the word 'aligned' is usually in the context of moral alignment and not just RLHF for instruction following.
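(For readers unfamiliar with the term: the RLHF step usually begins by fitting a reward model to human preference pairs. A minimal sketch of the standard Bradley-Terry pairwise loss used for that, with illustrative names not taken from this thread:

    import torch
    import torch.nn.functional as F

    def preference_loss(reward_chosen: torch.Tensor,
                        reward_rejected: torch.Tensor) -> torch.Tensor:
        # Bradley-Terry pairwise loss: trains the reward model to score
        # the human-preferred ("chosen") response above the rejected one.
        # The policy model is then optimized against this learned reward.
        return -F.logsigmoid(reward_chosen - reward_rejected).mean()

The moral-alignment question in this thread is about whose preferences feed that dataset, not the mechanics of the loss.)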


Philosophical nit-picking here, but I would say value-aligned rather than moral-aligned.


As in economics, this raises the question of "whose values?"


> Philosophical nit-picking here, but I would say value-aligned rather than moral-aligned.

How is trying to distinguish morals from values not philosophical nit-picking?

EDIT: The above question is dumb, because somehow my brain inserted something like “Getting beyond the …” to the beginning of the parent, which…yeah.


To be fair, he did admit it is philosophical nit-picking.


If I may be so naive, what's supposed to be the difference? Is it just that morality has the connotation of an objective, or at least agent-invariant, system, whereas values are implied to be explicitly chosen?



