I've been using the HuggingFace diffusers repo[1] with 6GB of VRAM and it works fine.
It's well engineered and maintainable, and it has a decent installation process.
The branch in this PR[2] adds M1 Mac support with a one-line patch, and it runs faster than the CompVis version (1.5 iterations/sec vs 1.4 for CompVis on a 32 GB M1 Max).
I highly recommend that people switch to that version for the improved flexibility.
I wouldn’t recommend using that as-is. MPS doesn’t give deterministic random number generation, which means that seeds become meaningless and you won’t ever be able to reproduce something. You can work around it by generating random numbers on the CPU and then moving them to MPS, but that probably requires a fix in PyTorch.
That’s a fix for the CompVis forks, not the diffusers system we are talking about in this thread.
Also, that’s only a partial fix that doesn't really work properly. It doesn’t affect img2img and it still gets things wrong on the first render. Since txt2img starts from scratch each time, that means you’re always getting an incorrect render, it just happens to be the same incorrect render each time.
Yeah, but with the CompVis derived repos, it’s pretty easy to go in and change all the calls to PyTorch random number generators.
Having said that, the last comment [0] on the PyTorch issue gave me the idea of monkey patching the random functions. The supplied code assumes you’re always passing in a generator, which is not true in this case. But if you monkey patch the three rand/randn/randn_like functions to do nothing but swap out the device parameter for 'cpu' and then call to('mps') on the return value, it’s enough to get stable seed functionality for the CompVis-derived repos without modifying their code, so I’m guessing it will probably work for diffusers as well.
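Roughly what I mean, as a minimal sketch rather than the exact code I'm running. `sample_on_cpu` and `patch_torch_rng` are names I made up for illustration; they are not part of PyTorch or either repo. It assumes the caller is not passing an MPS generator, and I haven't tested it against diffusers.

```python
import functools


def sample_on_cpu(fn, target="mps"):
    """Wrap a random tensor factory so draws meant for `target` happen on the CPU."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        device = kwargs.get("device")
        # randn_like(tensor) takes its device from the tensor when none is given.
        if device is None and args and hasattr(args[0], "device"):
            device = args[0].device
        # Leave CPU/CUDA calls (and calls with no device at all) untouched.
        if device is None or not str(device).startswith(target):
            return fn(*args, **kwargs)
        kwargs["device"] = "cpu"               # sample with the CPU generator
        return fn(*args, **kwargs).to(device)  # then move the result back

    return wrapper


def patch_torch_rng(target="mps"):
    """Hypothetical helper: monkey patch torch.rand, torch.randn and torch.randn_like."""
    import torch  # imported lazily so sample_on_cpu can be used without torch

    for name in ("rand", "randn", "randn_like"):
        setattr(torch, name, sample_on_cpu(getattr(torch, name), target))
```

Call `patch_torch_rng()` once at startup, before anything samples noise. Code that already did `from torch import randn` keeps the unpatched function, so import order matters.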
Also, it’s probably a bug in the CompVis code, but even after you fix the random number generator, the very first run in a session uses an incorrect seed. The workaround is to generate an image once to throw away whenever you start a new session.
(Annoyingly I just went and made similar changes and was about to create a PR for them. But they have a fix for a "warm-up" issue I wasn't aware of too)
Yes, the warm-up issue is what I meant by “the very first run in a session uses an incorrect seed”. It looks like they’ve backed away from that now and expect developers to warm up themselves. Not a big deal, you can just use a single step to do that.
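For the warm-up itself, something like this is probably enough. It's only a sketch: `warm_up` is a hypothetical helper, and `pipe` is assumed to be a diffusers-style pipeline, i.e. a callable that takes a prompt and a `num_inference_steps` argument.

```python
def warm_up(pipe, prompt="warm-up", steps=1):
    """Run one cheap throwaway generation and discard the result."""
    pipe(prompt, num_inference_steps=steps)
    return pipe
```

Run it once right after loading the pipeline and moving it to the device, then do your real seeded renders.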
[1] https://github.com/huggingface/diffusers
[2] https://github.com/huggingface/diffusers/pull/278