"It's generalizing at a higher level than texture / image sampling and it can tween things in latent space to get to visual spaces that haven't been explored by human artists before."
The very fact that the model is interpolating between things in the latent space probably explains why its images haven't been explored by human artists before: there is a disconnect between the latent space of the model and the genuine "latent space" of human artistic endeavor, which is an interplay between the laws of physics and the aesthetic interests of humans. I think these models know very little about either of those things and thus generate some pretty interesting novelty.
I think of artistic endeavour as a bit like the inverse of txt2img, but running in your head, and just projecting to the internal latent space, not all the way to words. It's not just aesthetic, it's about triggering feelings through senses. Images need to connect with the audience through associations with scenes, events, moods and so on from the audience members' lives.
Aesthetic choices like colour and shapes and composition combine with literal representations, facial emotions, symbolic meanings and so on. AI art so far feels quite shallow by this metric, usually only hitting a couple of notes. But sometimes it can play those couple of notes very sweetly.