I’m pretty sure it is, except instead of comparing the result to a source image, it is analyzed and scored by a model that was trained to recognize images.
It's kind of the opposite. Stable Diffusion starts with random noise, changes it, asks the model of it's representative of the goal, repeat. But there's much more to it.