If everyone could agree on a single social welfare function, and we could estimate the individual-level behavioural change caused by each LLM response and its effect on that function, then yes, we could objectively tell whether a withheld answer is censorship or a safety feature.