With GLIDE I think we've reached something of a plateau, architecture-wise, on the "text-to-image generator S curve". DALL-E 2 is a very similar architecture to GLIDE and has some notable downsides (poorer language understanding).
glid-3 is a relatively small model trained by a single guy on his workstation (aka me), so it's not going to be as good. It's also not fully baked yet, so YMMV, although it really depends on the prompt. The new latent diffusion model is really amazing though, and is much closer to DALL-E 2 for 256px images.
I think the open source community will rapidly catch up with OpenAI in the coming months. The data, code, and compute are all there to train a model of similar size and quality.
glid-3 is trained specifically on photographic-style images, and is a bit better at generalization compared to the latent diffusion model.
e.g. the prompt "half human half Eiffel tower. A human Eiffel tower hybrid" (I get mostly normal Eiffel towers from LDM, but some sensible results from glid-3).
glid-3 will be worse for things that require detailed recall, like a specific person.
With smaller models you kind of have to generate a lot of samples and pick out the best ones.
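That "sample a lot and cherry-pick" loop is easy to automate. A minimal sketch, where `generate_sample` and `score` are hypothetical stand-ins (in practice, `generate_sample` would run the diffusion sampler and `score` would be CLIP text-image similarity):

```python
# Best-of-n sampling: draw many candidates for a prompt, score each one,
# and keep only the top few. Both callables are assumptions, not part of
# any specific model's API.

def best_of_n(prompt, generate_sample, score, n=64, keep=4):
    """Draw n samples for `prompt` and return the `keep` best by score."""
    samples = [generate_sample(prompt) for _ in range(n)]
    # Rank candidates from best to worst by their score against the prompt.
    samples.sort(key=lambda s: score(prompt, s), reverse=True)
    return samples[:keep]
```

With a real reranker the score would typically be the cosine similarity between CLIP's text embedding of the prompt and its image embedding of each decoded sample.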