
Looking forward to following this variant of Stable Diffusion, as it's working great on my laptop. Mighty glad I got the 16 GB of RAM, though I find that if I step a canvas dimension down from 512 I get snappier generation… no biggie, anything I got that's usable I'd have to upscale anyway…
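
For anyone who wants the same knobs outside the Mac app, here's a minimal sketch using Hugging Face's diffusers library (the model name and values are just examples; dimensions must be multiples of 8, and the model was trained at 512, so going smaller trades quality for speed):

    import torch
    from diffusers import StableDiffusionPipeline

    # Load SD 1.x weights; float16 halves memory on a GPU,
    # drop torch_dtype for CPU or Apple Silicon.
    pipe = StableDiffusionPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
    ).to("cuda")

    # Stepping a dimension down from 512 (multiples of 8 only)
    # speeds up generation at some cost in quality.
    image = pipe(
        "a lighthouse in ink wash",
        width=448, height=448,
        num_inference_steps=30,
        guidance_scale=6.5,
    ).images[0]
    image.save("out.png")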

Since it's a Mac app, I have to wonder if it could stick the prompt, steps, and guidance into the notes field of Get Info? I find I'm generating a lot of relatively low-guidance images (I'd love a 6.5 option) and iterating on the prompts with an eye to what each word suggests to the algorithm. As it stands I have no way to track which prompt produced a given output, because the prompt changes so often.
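
In the meantime, a workaround I'd try (a sketch, not something the app does): embed the parameters in the PNG itself as text chunks, which travel with the file even where Finder comments don't. With Pillow:

    from PIL import Image
    from PIL.PngImagePlugin import PngInfo

    def save_with_params(image, path, prompt, steps, guidance):
        """Embed generation parameters as PNG tEXt chunks."""
        meta = PngInfo()
        meta.add_text("prompt", prompt)
        meta.add_text("steps", str(steps))
        meta.add_text("guidance_scale", str(guidance))
        image.save(path, pnginfo=meta)

    # Read them back later with Image.open(path).text,
    # which returns a dict like {'prompt': ..., 'steps': ...}.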

I strongly suspect the real merit of this approach is not the crowd-pleasing 'set very high guidance on some artistic trope so it's forced to fake something very impressive' trick, but rather the ability to integrate a bunch of disparate guidances and occasionally hit on a striking image. It's as if the harder you force it into a particular mold, the more derivative and stifled its output becomes, but if you let it free-associate… I'll be experimenting. Getting the occasional black image seems to show you're giving it the freest rein.
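
One way to probe that low-guidance regime is to sweep guidance_scale over a fixed seed, so the guidance value is the only thing that changes (continuing the diffusers sketch above; the seed and values are arbitrary):

    import torch

    prompt = "city street at dusk, disparate guidances, ink wash"
    generator = torch.Generator("cuda")

    # Low guidance lets the model free-associate; high guidance
    # forces the prompt harder and tends toward the derivative.
    for g in (3.0, 5.0, 6.5, 9.0, 12.0):
        generator.manual_seed(1234)  # reset so only guidance varies
        img = pipe(prompt, guidance_scale=g, generator=generator).images[0]
        img.save(f"guidance_{g}.png")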

Looking forward to 'image to image' a lot. I assume the prompt still matters, as it's fundamental to the diffusion denoising? Image to image means iterating on visual 'seeds'.
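
In diffusers the equivalent is StableDiffusionImg2ImgPipeline, where strength sets how much noise is layered over the input image, i.e. how far the output may drift from the visual 'seed' (a sketch; the prompt does still steer every denoising step):

    from PIL import Image
    from diffusers import StableDiffusionImg2ImgPipeline

    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4"
    ).to("cuda")

    init = Image.open("sketch.png").convert("RGB").resize((512, 512))

    # strength near 0 returns the input almost untouched;
    # near 1 it ignores the input entirely.
    out = pipe(
        "finished ink-wash rendering of the sketch",
        image=init,
        strength=0.6,
        guidance_scale=6.5,
    ).images[0]
    out.save("iterated.png")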

I've seen talk of textual inversion training: it would interest me greatly to be able to generate objects and styles and train a personal version of SD in a back-and-forth iteration. The link to language is really important here, but so is the ability to operate as an artist, generating drawings, aesthetics and so on to train the model. I once did 440 episodes of a hand-drawn webcomic with recurring characters and an ink-wash grayscale style I gradually developed. That means I have my own dataset, which is my own property, and I certainly never made it big enough to end up in Stable Diffusion's training data the way, say, Beeple did.
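
Textual inversion learns an embedding for a new placeholder token from a handful of images, and you then use that token in prompts. A sketch of the usage side, assuming an embedding trained with diffusers' textual-inversion example script (the path and token name here are hypothetical):

    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4"
    ).to("cuda")

    # Embedding trained on, say, a few dozen panels of the webcomic;
    # "<ink-wash-style>" is the placeholder token learned in training.
    pipe.load_textual_inversion("./ink_wash_embedding",
                                token="<ink-wash-style>")

    img = pipe(
        "two recurring characters arguing in a diner, <ink-wash-style>"
    ).images[0]
    img.save("in_my_style.png")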

Interesting times for the cybernetic artist. Basically computer-assisted hallucinatory unconscious, plus computer-assisted rendering. You could feed all of Cerebus (Dave Sim and Gerhard) into a model like this, panel by panel, and you'd probably get a hell of a lot of Gerhard out because so much of the panel area is tone and texture from him…
