Most things require workarounds, some things aren't possible (or we haven't found a workaround yet), and it's not as fast as CUDA. But stable-diffusion inference works, and so does textual inversion training. I was also able to train a T5 model with just a couple of tweaks.
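For context, here's a minimal sketch of the kind of inference that works, assuming the Hugging Face `diffusers` pipeline and an Apple-Silicon MPS device as the non-CUDA backend (the device string and library are my assumptions, not something stated above; swap in whatever backend you're targeting):

```python
import torch
from diffusers import StableDiffusionPipeline

# Assumed device: MPS if available, otherwise fall back to CPU.
device = "mps" if torch.backends.mps.is_available() else "cpu"

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
pipe = pipe.to(device)

# A quick smoke test; fewer steps keeps it fast.
image = pipe("a photo of an astronaut riding a horse",
             num_inference_steps=20).images[0]
image.save("out.png")
```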
I'd stick with PyTorch 1.12.1 for now. 1.13 has problems with backpropagation (I now get NaN gradients when I attempt CLIP-guided diffusion -- I think this applies to training too), and some einsum formulations are 50% slower (there's a patch to fix this, which I expect will be merged soon). That makes the big self-attention matmuls slow and consequently makes stable-diffusion inference ~6% slower.
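To make the einsum point concrete, here's a rough sketch of the q·k similarity matmul inside self-attention that this kind of regression hits. The shapes, subscripts, and the `baddbmm` alternative are my own illustration (a possible way to sidestep the einsum path), not something prescribed above:

```python
import torch

device = "mps" if torch.backends.mps.is_available() else "cpu"

# Roughly SD 1.x-sized attention (assumed shapes).
b_heads, tokens, dim = 8, 4096, 64
q = torch.randn(b_heads, tokens, dim, device=device)
k = torch.randn(b_heads, tokens, dim, device=device)

# The kind of einsum formulation that regressed in 1.13.
sim = torch.einsum('b i d, b j d -> b i j', q, k)

# An equivalent batched-matmul formulation, one possible workaround
# until the einsum fix is merged.
sim_bmm = torch.baddbmm(
    torch.empty(b_heads, tokens, tokens, device=device, dtype=q.dtype),
    q, k.transpose(1, 2), beta=0,
)
```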