Hi HN,
I've been playing around with AI art (who hasn't) and wanted to share the results of last night's run: 300 auto-generated plants in a big collage that you can zoom into and explore.
My end-goal with this is to work on some sort of speculative biology project. GANs like the one I used here are great for generating nice-looking images in bulk, which could then be curated and refined into some sort of consistent work.
Happy to answer any questions on how this was generated or on AI art in general :)
To me this recalls the Voynich manuscript; perhaps this AI plant generation could provide an additional clue to the meaning of the fantasy plants in the manuscript!
I'm curious what the state of AI generated music is. I heard a quote once that "music is math for people that don't like numbers" which struck me as particularly insightful. All the things that people appreciate about the instrumental aspect of music are mathematical in nature (rhythm, chords, progressions, etc.), at least at a superficial level.
Apart from vocals and lyrics (yet), it seems that AI ought to be particularly well suited for creating artificial music that actually sounds decent. Sure it's not going to be pushing any boundaries, but I'd be surprised if it wasn't able to churn out some catchy melodies.
Some sounds OK, most sounds like a series of plausible notes that nevertheless don't actually go anywhere. Most generative models tend to do OK at predicting the immediate next note, say, but they can't keep a macro-level structure unless you force that as a constraint. For me what has worked best is generating music and then re-interpreting that with some manual post-processing to get it into a less jarring form.
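To illustrate the "plausible next note, no overall direction" problem, here is a minimal sketch using a first-order Markov chain. This is a stand-in technique chosen for brevity, not what Jukebox or any model mentioned in this thread does. The training melody and the function names are made up for the example.

```python
import random
from collections import defaultdict


def train_bigram_model(melody):
    """Map each note to the list of notes that followed it in the training melody."""
    transitions = defaultdict(list)
    for current, following in zip(melody, melody[1:]):
        transitions[current].append(following)
    return dict(transitions)


def generate(transitions, start, length, rng):
    """Sample a melody one note at a time, looking only at the previous note."""
    notes = [start]
    while len(notes) < length:
        options = transitions.get(notes[-1])
        if not options:
            # Dead end (note never had a successor): fall back to the start note.
            options = [start]
        notes.append(rng.choice(options))
    return notes


# Toy training melody: an assumption for illustration, not real data.
TRAINING_MELODY = ["C", "D", "E", "C", "E", "G", "E", "D", "C", "D", "E", "D", "C"]

model = train_bigram_model(TRAINING_MELODY)
melody = generate(model, start="C", length=16, rng=random.Random(0))
print(" ".join(melody))
```

Every step in the output is a transition that occurred in the training melody, so it sounds locally reasonable, but nothing in the model knows about phrases, repetition, or an ending. Any larger structure has to be imposed from outside, as a constraint or in post-processing.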
A fairly recent project that is close to the state of the art: https://openai.com/blog/jukebox/
One of my own experiments: https://datasciencecastnet.home.blog/2021/05/13/whistlegen-g...
Really? My initial reaction is to disagree, though I do think I understand your point.
While the medium of music is purely physical, and physical things can be abstracted to mathematics, what do you think about the generation/composition of music? Most music isn't really composed of instruments playing single notes; it's composed of things like phrases and expressions. The math is just an obtuse way of representing the phrases, emotions, and expressions.
The result of that is what is played up and down the charts, learned from what sells 'records', and embedded into almost anything new to be more and more refined to the taste of the majority/masses. Also the stuff playing in supermarkets to make you buy more.
This is cool! Thanks for sharing. I've played around with some random Colab notebooks that have surfaced on HN but the results have been underwhelming compared with some of the polished AI Art I've seen in the wild. Some questions that popped into my head:
What's your setup (Cloud/Colab/Custom hardware)? Did you borrow the code in its entirety or is there a secret sauce? How long did it take you to fiddle around with the hyper-parameters until you were happy with the results? How many iterations did you settle on before stopping? Thanks!
re: secret sauce, I tend to be in the 'sharing is caring' camp. The code for this was based on the popular notebook by @RiversHaveWings (VQGAN+CLIP) although I've edited it back and forth a few times.
I usually run for a few hundred iterations (eg 250).
EDIT: here's a Google Colab that replicates my plant generation: https://colab.research.google.com/drive/1b1UfblpdhPJ7f1WRjfC...
My guess is that you've solved this already but I run it in a loop by replacing do_loop() at the bottom of the last frame with the following:
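import random

# tqdm and do_run() are assumed to be defined earlier in the notebook.
prompt = 'crystalline alien spacecraft on a flower'
modifiers = ['', ' macro photography', ' with hyperrealistic 35mm depth of field']
original_prompt = prompt

# Runs until the cell is interrupted.
while True:
    for modifier in modifiers:
        # Vary the prompt suffix and the seed on every run.
        prompt = original_prompt + modifier
        seed = random.randrange(1000)
        tqdm.write(f'Prompt {prompt}, Seed {seed}')
        do_run()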
Dang it! I knew this existed but I couldn't find it for some reason. Thank you!!! This has been a wonderful journey...my poor kids have been getting spammed with my latest incantations for the past week lol.
I keep trying to think of ways to make this more consumable for the less technically inclined, but GPUs are just not cheap. I was on a V100 for probably 36 hours through Colab via my $9 monthly subscription; that'd run $100 on AWS for the same amount of time on a p3. So then I'd need to figure out a tip jar or something to keep the gas tank full.
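For a rough sense of scale, the figures quoted above can be turned into hourly rates. This sketch uses only the numbers from the comment (36 hours, $9, $100). The per-hour Colab figure assumes those 36 hours were the only usage that month, which is a simplification.

```python
# Figures taken from the comment above.
HOURS_USED = 36
COLAB_MONTHLY_USD = 9
AWS_ESTIMATE_USD = 100

# Hourly rate implied by the $100 AWS estimate.
implied_aws_hourly = AWS_ESTIMATE_USD / HOURS_USED

# Effective Colab rate, assuming (simplification) no other usage that month.
colab_hourly = COLAB_MONTHLY_USD / HOURS_USED

# How many times more expensive the AWS estimate is than the subscription.
cost_multiple = AWS_ESTIMATE_USD / COLAB_MONTHLY_USD

print(f"Implied AWS rate:     ${implied_aws_hourly:.2f}/hour")
print(f"Effective Colab rate: ${colab_hourly:.2f}/hour")
print(f"AWS vs Colab:         {cost_multiple:.1f}x")
```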
Had a play the other day - I love it! But it's quite slow, so for most of my experiments (which involve video and motion) I haven't yet found a way to incorporate it into my workflows without slowing things down considerably.
Thanks for sharing though, the work she's doing is so great!
Not dumb at all. This code is at the bottom of the notebook cell that actually generates the image. By default it's a one-shot deal; this allows me to continuously generate images with some perturbations in the input to build in some variety (seed value, some prompt 'modifiers' that tend to have a style transfer effect).
I'm curious, is there an advantage of doing style transfer implicitly via adding text over explicitly by providing a target image to copy the style from?
Thanks for taking the time to reply to my Qs; all useful insights, and much appreciated.
Mostly I'm excited about the direction and innovation velocity that AI art is going, but partly I'm anxious about what the eventual implications will be for human artists.
Packaging is meh, but you can use my repo if you want to generate your images using the CLI, a discord bot, or an IRC bot with your own hardware: https://github.com/luc-leonard/clip_generators. It includes vqgan+clip and guided-diffusion models
It can be hit-and-miss. With a bit of trial and error you start to see which prompts work well to generate pleasing images. Some people have done excellent work exploring ways to change the look, like https://imgur.com/a/SnSIQRu
As for hardware, a lot of my experimenting is done on Google Colab. For this plant generation stuff I rented a GPU via vast.ai for ~$0.20 an hour and set it running overnight.
I tried generating some fruit - you can see some outputs in the notebook I just shared (https://colab.research.google.com/drive/1b1UfblpdhPJ7f1WRjfC...). Not quite as pleasing as the plants but I'm sure with a bit of fiddling with the prompt we could get a few good ones.
I'd be lying if I said I hadn't considered setting up a website at thisplantdoesnotexist.com, haha. Luckily I changed my mind.
I could try some fruits - let me set a smaller run going and see what it comes up with.
To me this is probably the most interesting / delightful.
The one with persons is useful, this one is something I'd like to order as a print or maybe even as a wallpaper for my company headquarters (haha, it is currently <<10m2) sometime in the future :-)
I'm happy to share the original JPEG if you want to print this out :) I have it in A2 on my wall although at that scale the individual pictures are a little small.
Thanks for sharing, this is cool! But what I have seen in most such projects is a lack of continuity in the generated content.
From a bird's-eye view things look natural, but as soon as you observe the content a little closer, things start to fall apart.
Agreed. When I'm using this as part of a larger workflow I'll sometimes use the GAN to generate a starting point and then manually edit things to remove weird inconsistencies. The output is also quite low-res so it needs tweaking. Of course both problems can be avoided by making the individual images so small that the weirdness is hidden - as I did for this project :)
No, I think that's fair in this case. Most of my stuff involves a lot more post-processing, tweaking, editing etc to convey something specific, in which case I claim 'art' more liberally :)
Fantastic stuff! I love the painted over pieces (assuming they are painted over). The pronounced details and refined features almost add another level of "dreaming" on top of the AI nightmare. Gloriously creepy and fun.
I made a prompt generator that creates descriptions like 'A biological illustration of an alien plant near sand, watercolor'. A model called CLIP gives a way to compare an image with a text prompt, and then I can optimise the image to match the prompt.
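A prompt generator along these lines could be sketched as below. This is a guess at the general shape, not the actual generator used for the collage. Only the first entry in each list comes from the example prompt above; the other entries and all names are hypothetical. The CLIP scoring and image optimisation steps are not shown.

```python
import random

# First entry of each list is from the example prompt; the rest are hypothetical.
SUBJECTS = ["an alien plant", "a carnivorous flower", "a bioluminescent fern"]
SETTINGS = ["near sand", "in a swamp", "on a cliff"]
STYLES = ["watercolor", "ink drawing", "botanical engraving"]


def make_prompt(rng):
    """Assemble one prompt by picking a random entry from each list."""
    subject = rng.choice(SUBJECTS)
    setting = rng.choice(SETTINGS)
    style = rng.choice(STYLES)
    return f"A biological illustration of {subject} {setting}, {style}"


rng = random.Random(42)  # fixed seed so a batch of prompts is reproducible
prompts = [make_prompt(rng) for _ in range(5)]
for p in prompts:
    print(p)
```

Each generated string would then be fed to the image generation loop as the text prompt, one image per prompt.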
You've been able to get some interesting output! I've been playing with the following notebook: https://colab.research.google.com/drive/1oA1fZP7N1uPBxwbGIvO... but my results are not quite so well put together as yours. What are some tweaks you've made to the model? I'd like to better understand what is going on under the hood.
The big difference is the model used to generate the images. VQGAN (which I use) tends to look a lot nicer to my eye than some of the other approaches.