Hacker News

Diffusion models are now SOTA in audio and image generation. Has anyone given them a shot on text?

Audio is more similar to language than images are, because of its stronger time dependency.

The paper says the critical step for making diffusion models work on audio was splitting the signal into frequency bands and applying diffusion to each band separately (the full-band model had limitations due to poor modeling of correlations between low-frequency and high-frequency features).
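A minimal sketch of the band-splitting idea, using a crude FFT brick-wall mask as a stand-in for whatever filter bank the actual model uses (the cutoff, sample rate, and toy signal here are all illustrative assumptions):

```python
import numpy as np

# Toy waveform: a low-frequency tone plus a high-frequency tone (hypothetical).
sr = 16_000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 4000 * t)

# Split at an assumed cutoff frequency by zeroing FFT bins on either side.
cutoff_hz = 1000
spec = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(len(signal), d=1 / sr)
low = np.fft.irfft(np.where(freqs <= cutoff_hz, spec, 0), n=len(signal))
high = np.fft.irfft(np.where(freqs > cutoff_hz, spec, 0), n=len(signal))

# The split is linear, so the bands sum back to the original signal;
# separate per-band diffusion models can be recombined by simple addition.
assert np.allclose(low + high, signal, atol=1e-8)
```

The point is only that a lossless split lets each band get its own noise schedule and model, with recombination being trivial.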

I think something could be done on the text side as well.



There are two problems with this. Diffusion models rest on a single rule of thumb: if you keep applying small, noisy Gaussian steps to a "nice" distribution many times, you end up with an isotropic Gaussian.
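That rule of thumb can be demonstrated numerically. A minimal sketch of the forward diffusion process, with an illustrative step count and per-step variance (both hypothetical choices, not taken from any particular paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Start from a decidedly non-Gaussian "nice" distribution: uniform samples.
x = rng.uniform(-1.0, 1.0, size=100_000)

beta = 0.02  # per-step noise variance (illustrative value)
for _ in range(500):
    # Each step shrinks the signal slightly and adds small Gaussian noise;
    # the scaling keeps the variance bounded near 1 in the limit.
    x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * rng.standard_normal(x.shape)

# After many steps the marginal is close to a standard Gaussian:
# mean near 0, standard deviation near 1.
```

The forward process is trivial for any continuous data; the hard part, as below, is saying what the analogous "small noisy step" even means for discrete tokens.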

So, for text: a) what is the equivalent of a small, noisy step? and b) what is the equivalent of an isotropic Gaussian in language space?

If you can solve a and b, you can make diffusion work for text, but there hasn't been any significant progress there afaik.



