Würstchen: Fast Diffusion for Image Generation (huggingface.co)
21 points by cmitsakis on Sept 13, 2023 | 2 comments



Some cursory (and shallow) testing shows that the model does reasonably well at representing artist styles compared to SDXL, certainly in terms of genre and color palette.

It actually does a bit better in terms of reproducing an artist's typical texture and technique (e.g. "Art by Virgil Finlay" reproduces that artist's stippling technique without having to specify anything further like a particular medium).

In terms of artistic composition, results often seem a bit simplified or smoothed compared to SDXL (or SD v2.1 for that matter). Almost cartoonish, and reminiscent of Craiyon or other early models.

Interpolating between artist styles is about on par with SDXL, though the results are somewhat less consistent (that is, a series of "By ARTIST_1 and ARTIST_2" images from SDXL will tend to share a common style, while those from Würstchen will be more diverse).


I wonder how this compares to GAN-based text-to-image models like StyleGAN-T. If I remember correctly, GAN models mainly shine at very fast inference, but the same may not be true for training. Also, diffusion-based models generally seem to produce higher-quality images.



