I've been wondering if all of the jailbreak-fixing/RLHF tuning that is happening to GPT-4 is responsible for "nerfing" it (still unsure whether that's actually happening or whether people are just noticing the gaps in its understanding more now).

Imagine someone who is perfectly politically correct and never says anything even remotely edgy/original. The people like this I've met IRL are genuinely a little bit stupid. And I wonder if the "make this model never output anything 'dangerous'" process causes a model to become stupider.

Anyway, I'm off to go see if Claude 2 will help me stage a coup in a third-world country and become its dictator. Adieu.