You can run Stable Diffusion on an iPhone 11 and it completes in a minute or two. Running on a CPU generally takes around 5 minutes. My near-top-of-the-line MacBook runs a batch of four in around 30 seconds on Metal, and I'm sure a mid-range GPU is much faster, considering how unoptimized Metal is with Torch. And yes, go take a look around Reddit and 4chan: the vast majority of those images aren't from Dreambooth/MJ/remote models.
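For reference, here's roughly all it takes locally with Hugging Face diffusers. This is a sketch, not anyone's exact setup; the model ID, prompt, and device choice are placeholders:

    # Local Stable Diffusion sketch; use "cuda" or "cpu" (with float32)
    # instead of "mps" if you're not on Apple Silicon.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("mps")  # Metal on Apple Silicon

    # A batch of four, like the MacBook timing above.
    images = pipe(["a watercolor fox, detailed"] * 4).images
    for i, img in enumerate(images):
        img.save(f"out_{i}.png")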
That's not even taking into account the local LoRAs and scripting that become possible instead of some company's untweakable crap. The open source scene around this is healthy and has pushed past DALL-E, and there's no real roadblock to open source LLMs except, of course, the training cost. Even so, people are getting $200k+ models in their hands for free from various training runs and donated compute, then applying LoRAs and fine-tuning them until they're comparable to the closed-off remote models.
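Layering a LoRA onto a local base model is only a few more lines, assuming a recent diffusers release with load_lora_weights; the LoRA file name below is a placeholder:

    # Apply a community LoRA on top of a local base model.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("mps")
    pipe.load_lora_weights("./loras", weight_name="community_style.safetensors")
    image = pipe("portrait photo, soft window light").images[0]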
Any "cryptographic" scheme with the generations of these will just catch the lazy. The lazy already include the confabulated sources in their papers, and don't try to normalize the Error Level Analysis in generated images (probably the quickest way to determine whether an image is generated), so I don't think it's actually a net benefit. It's a cat and mouse game, and will push the mice further into the walls.
You can't possibly say that generative images like these are "poor quality":
https://www.reddit.com/r/StableDiffusion/comments/131lpks/my...
https://i.imgur.com/3iDf43z.png
Again, poor quality or slow on conventional hardware.
The first of your examples was generated on a desktop with a 2080 Ti, and even then the hands are glaringly uncanny. We don't know how long it took, but I suspect the hands came out that way because it's too slow to generate a dozen attempts in hopes that one gets them right.
The other one I could see being done on any laptop in a few minutes, but it's more primitive, just a monochrome sketch. I'll skip over the obvious issues, e.g. the shape of the glasses.
For both examples you don't need any specialized tools or watermarking to notice this stuff.
Maybe you see what I mean about why indie homegrown AI is not such a big deal ;) Sure, there are people who will invest in hardware, but those people are not, and for now won't be, mainstream enough to matter. Especially if it's licensed; most people don't like to violate laws. Most people will just use ChatGPT or DALL-E.
I don't see what your point is. The regulatory capture going on right now with the attempt to license things is ludicrous, akin to licensing matrix multiplication. It won't stick. Stable Diffusion and approximated functions (neural networks) are not something magical, despite the fear they want to attach to them.
Commercial AI has all of the issues you mentioned, and then some. Midjourney is just a bunch of LoRAs layered on top plus scripting to generate the images, and because of that, Midjourney images have a specific "feel" they can't seem to get rid of. It's nothing out of reach for someone sufficiently motivated to reproduce.
DALL-E is laughable now, and it's only been a year; it has certainly been surpassed by open source and outside competitors. I'm not sure what your motivation is for discounting open source. People are already running LLM inference on their phones.
My point: 99% of people will use ChatGPT etc. because homebrew alternatives are either bad (easy to detect with the naked eye) or slow. Microsoft will probably also make sure no competitor can offer good-enough AI by pushing for regulation. So if those big platforms are required to watermark/detect their own AI output, that's good enough. The remaining 1% of crazy people don't count.
That Midjourney et al. are also detectable to the naked eye.
Why do you need to watermark them, again? The error level analysis is off the charts with generative images; they light up like a Christmas tree. Just because you, uninformed legislators, and journalists don't know how to check the ELA of an image doesn't mean they're undetectable. And the cheaters already include the bogus sources spat out by ChatGPT. The cryptographic qualities will be lost as soon as an editor gets their hands on the image, automated editing or not. It's a cat-and-mouse game.
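For anyone who hasn't tried it, ELA is a few lines with Pillow: re-save the image as a JPEG at a known quality, then amplify the per-pixel difference, and regions with an inconsistent compression history stand out. A minimal sketch; file names are placeholders:

    # Minimal Error Level Analysis sketch with Pillow.
    from PIL import Image, ImageChops, ImageEnhance

    def ela(path, quality=90, scale=20):
        original = Image.open(path).convert("RGB")
        original.save("_resaved.jpg", "JPEG", quality=quality)
        resaved = Image.open("_resaved.jpg")
        diff = ImageChops.difference(original, resaved)
        # Amplify the residual so compression artifacts are visible.
        return ImageEnhance.Brightness(diff).enhance(scale)

    ela("suspect.jpg").save("suspect_ela.png")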
And also, I find it telling that you think someone without high-end hardware will pay $20/mo to OpenAI. For $20/mo ($240 a year) you can buy a mid-range video card and write it off. For an extra $10/mo you can depreciate the cost of a high-end laptop, if you're a professional, and you're not locked into OpenAI. You're also assuming that 1. hardware doesn't get better and 2. techniques don't improve to run them on limited hardware.
> That Midjourney et al. are also detectable to the naked eye.
Again: too slow, requiring outrageous hardware, or obviously noticeable. So far, no examples to the contrary.
Don't forget, the topic is using special measures to detect what's undetectable with the naked eye. When you can simply see the screwed-up hands in a photo, that's not even necessary.
> Why do you need to watermark them, again?
Why do you think I need to watermark them again?
> a mid-range video card
and a PC to put it in, a space to put the PC in, etc. With a laptop, we're back to waiting an hour to see a result.
> hardware doesn't get better and 2. techniques don't improve to run them on limited hardware
We can revisit this if consumer hardware gets good enough...
>With a laptop, we're back to waiting an hour to see a result.
Any laptop from the last five years with decent memory can run Stable Diffusion on the CPU in around 12 minutes. My MacBook Pro runs a batch of four on Metal in around 30 seconds.
>We can revisit this if consumer hardware gets good enough...
I mean, I just showed you a quantized LLaMA running on a Pixel 5 and a Pixel 6. And with all of this hype, I wouldn't discount most of the next generation of hardware shipping ML coprocessors, the way MacBooks, iPhones, and Pixels already do.
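Running a quantized model locally is about this much code with llama-cpp-python; the 4-bit model path below is a placeholder, and you'd quantize the weights beforehand with llama.cpp's tooling:

    # Local inference against a 4-bit quantized LLaMA-family model.
    from llama_cpp import Llama

    llm = Llama(model_path="./models/7B/ggml-model-q4_0.bin")
    out = llm("Q: Name three uses of a local LLM. A:", max_tokens=64)
    print(out["choices"][0]["text"])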