
For many of these, there is certainly a wrong answer.

Consider the following (paraphrased) interaction I had with Llama 3.2 90B yesterday:

Me: Was <a character from Paw Patrol, Blue's Clues or similar children's franchise> ever convicted of financial embezzlement?

LLM: I cannot help with that.

Me: And why is that?

LLM: This information could be used to harass <character>. I prioritise the safety and privacy of individuals.

Me: Even fictional ones that literally cannot come to harm?

LLM: Yes.

A model aligned to do exactly as I say would just answer the question. In this case the right answer is clear and unambiguous.
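If you want to poke at this yourself, here's a minimal sketch of how I'd reproduce that kind of refusal. It assumes a local Ollama server exposing its OpenAI-compatible endpoint; the base URL and the model tag are assumptions about the setup, not something taken from the exchange above:

    # Minimal sketch, assuming a local Ollama server with its
    # OpenAI-compatible API (default: http://localhost:11434/v1).
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")

    resp = client.chat.completions.create(
        model="llama3.2-vision:90b",  # assumed tag; substitute whatever you run
        messages=[{
            "role": "user",
            "content": "Was <fictional children's-show character> ever "
                       "convicted of financial embezzlement?",
        }],
    )
    print(resp.choices[0].message.content)

In my experience the refusal is not fully deterministic, so you may need a few runs (or the same multi-turn follow-ups as above) to see it.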


