If we ask an LLM to grade something, we must write a prompt with clear grading instructions. Otherwise, we will have no idea what a score of 0.5 means or whether it is assigned consistently.
(A rule of thumb: would different people, not knowing the context of the task, give the same grade?)
The most robust approach is to ask the model to rank items within a single task. That is, "given these blog post titles, rank them according to (criteria)" rather than asking about each title separately.
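The ranking approach above can be sketched in code. This is a minimal, hedged example: the function names and the reply format ("numbers only, best first, comma-separated") are illustrative assumptions, not a specific library's API. The point is that all candidates go into one prompt, so the model judges them relative to each other rather than against an undefined absolute scale.

```python
def build_ranking_prompt(titles, criteria):
    """Builds one prompt asking the model to rank all titles together.

    Illustrative sketch: the exact wording and output format are
    assumptions, not a standard.
    """
    numbered = "\n".join(f"{i + 1}. {t}" for i, t in enumerate(titles))
    return (
        f"Rank the following blog post titles from best to worst "
        f"according to this criterion: {criteria}.\n\n"
        f"{numbered}\n\n"
        "Answer with the item numbers only, best first, comma-separated."
    )


def parse_ranking(reply, n):
    """Parses a reply like '3, 1, 2' into a zero-based ranking.

    Raises if the model did not rank every item exactly once, which
    is itself a useful consistency check on the judge.
    """
    ranks = [int(tok) - 1 for tok in reply.replace(",", " ").split()]
    if sorted(ranks) != list(range(n)):
        raise ValueError("model must rank every item exactly once")
    return ranks


titles = ["Ten tips for X", "Why Y fails", "A practical guide to Z"]
prompt = build_ranking_prompt(titles, "clarity for a general audience")
print(parse_ranking("3, 1, 2", len(titles)))  # → [2, 0, 1]
```

One nice property of this design: the parser rejects partial or duplicated rankings, so inconsistent judge outputs surface as errors instead of silently skewing results.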