The Weird and Wonderful World of AI Art (jxmo.notion.site)
183 points by jxmorris12 on April 22, 2022 | 56 comments



Hi, I'm the author of this post. I hope you all enjoy it! I researched and wrote this back in January, and although the main ideas are still relevant, the landscape of AI art generation has changed quite a bit in just three months. Here are some important new developments:

- DALL-E 2: https://openai.com/dall-e-2/

- Midjourney: https://twitter.com/midjourney

- LAION-5B dataset: https://laion.ai/laion-5b-a-new-era-of-open-large-scale-mult...

- CompVis latent diffusion: https://github.com/CompVis/latent-diffusion

Since the field is moving so quickly, this newsletter is a good way to try to stay on top of things: https://multimodal.art/news

Also I went on Yannic Kilcher's podcast to talk about this! https://www.youtube.com/watch?v=DdkenV-ZdJU&ab_channel=Yanni...


Glad you did this. I started working on something myself in January while I was dabbling in it, but the whole scene was evolving so fast that I figured the piece would seem like a time capsule by the time I finished writing it. The survey/history style is good and educational; maybe this will spur me to put in the effort and do my own.


> this newsletter is a good way

Thank you! ...RSS?

Edit: the same "...RSS?" applies to your blog, at https://jxmo.io/ ...

PS: the article is a very valuable summary!


I think this link works: https://jxmo.io/feed.xml


Thank you!

In case you ever want to increase its use, I suggest putting it at least in the homepage source (that is where we go to check for a literal 'rss', 'feed', or maybe 'xml').
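
A rough sketch of what that check could look like, assuming the homepage advertises its feed with the conventional `<link rel="alternate" type="application/rss+xml">` tag. The sample HTML and the `find_feeds` helper are made up for illustration; they are not taken from jxmo.io.

```python
from html.parser import HTMLParser

# MIME types conventionally used for feed autodiscovery.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collects href values of <link rel="alternate"> tags that point at feeds."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Attribute values can be None (bare attributes), so normalise to "".
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        is_feed = attr.get("type", "").lower() in FEED_TYPES
        if "alternate" in rels and is_feed and attr.get("href"):
            self.feeds.append(attr["href"])


def find_feeds(html):
    """Hypothetical helper: return feed URLs advertised in an HTML document."""
    parser = FeedLinkParser()
    parser.feed(html)
    return parser.feeds


# Made-up homepage source, only for demonstration.
SAMPLE_HOMEPAGE = """<html><head>
<title>Example blog</title>
<link rel="stylesheet" href="/style.css">
<link rel="alternate" type="application/rss+xml" title="Feed" href="/feed.xml">
</head><body></body></html>"""

print(find_feeds(SAMPLE_HOMEPAGE))
```

With a tag like that in the page head, feed readers and people grepping the source can both find the feed.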


You all might enjoy some of the things I've made with it.

Tour of the Sacred Library — A short story illustrated with VQGAN+CLIP https://moultano.wordpress.com/2021/07/20/tour-of-the-sacred...

Doorways — A series of images exploring "semantic symmetry" using CLIP's embeddings to do visual analogy completion. https://moultano.wordpress.com/2021/08/23/doorways/

Depth of Field — Exploring the scale of the Hubble Ultra Deep Field image using CLIP guided diffusion to create visual analogies. https://moultano.wordpress.com/2022/03/24/depth-of-field/
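
For anyone wondering what "visual analogy completion" with embeddings can mean: one common approach is vector arithmetic, where "a is to b as c is to ?" is answered by finding the candidate closest to b - a + c. Below is a toy sketch under that assumption. The 3-dimensional vectors are hand-written stand-ins for real CLIP embeddings, `complete_analogy` is a hypothetical helper, and none of this is the code behind Doorways.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def complete_analogy(a, b, c, candidates):
    """Return the candidate name whose vector is closest to b - a + c."""
    target = [bi - ai + ci for ai, bi, ci in zip(a, b, c)]
    return max(candidates, key=lambda name: cosine(target, candidates[name]))


# Toy embeddings; dimensions are roughly (door-ness, window-ness, ornateness).
# Real CLIP embeddings have hundreds of dimensions and come from a trained model.
EMB = {
    "plain door": [1.0, 0.0, 0.0],
    "ornate door": [1.0, 0.0, 1.0],
    "plain window": [0.0, 1.0, 0.0],
    "ornate window": [0.0, 1.0, 1.0],
    "brick wall": [0.2, 0.2, 0.1],
}

# "plain door" is to "ornate door" as "plain window" is to ...?
inputs = ("plain door", "ornate door", "plain window")
candidates = {k: v for k, v in EMB.items() if k not in inputs}
answer = complete_analogy(*(EMB[k] for k in inputs), candidates)
print(answer)
```

In an image-generation setting the "candidates" would not be a fixed list; the target vector would instead guide the generator.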


Why don't you share the exact code for these experiments so that anybody can reproduce them? (and tweak them!)


Pretty sure Moultano's Tour was made with a hosted version of the original VQGAN+CLIP method https://colab.research.google.com/drive/15UwYDsnNeldJFHJ9Ndg... though that method and implementation are quite old.

If you want an up-to-date list of open implementations, it's here: https://pharmapsychotic.com/tools.html The newest version of Disco Diffusion has been the best around for the past few iterations.


For the first and third, I can't, it isn't my code.

For the second, I have to get approval from my employer to release it, but I stalled out halfway through the paperwork and haven't had the energy to keep pushing it forward.


It’s cool to see the impact Colab has had here too. It seems like a huge enabler for people in the art community who would otherwise face tons of overhead just to play around (getting and setting up the machine and libraries). It’s awesome that Google just gives away this scarce resource (GPU cycles) for free.


I've recently had fun using VQGAN + CLIP, combined with slowly zooming in, for an art project on ecology: https://rybakov.com/project/metamorphosis/


It's crazy how far & fast AI art has come.

I still love the deep dream style tho :) It feels the most like the AI is coming up with its own art by itself for other AIs, which I find just the right amount of unsettling.

Can't wait to try out the colab links later today! Thanks so much for the great article :)


A couple years ago I decided to fork StyleGAN, optimistic to kick the tires and just see if it would run on my desktop. I got bored after an hour or two and scrapped the setup, having felt it was too complex for someone with a casual interest.

Fast-forward to finding the link to NightCafe a half hour ago, where I just had to type a ridiculous phrase for it to spit out something that attempted to match my description after just a few minutes: https://creator.nightcafe.studio/creation/sPsmFbpijhePoRDwqF...

Still, I don't see using this for anything more than gags with friends right now. But it's a solid leap forward.


There's certainly a market for AI-generated art. But all of the AI art that I've looked at lacked "soul". It feels like the writing I've read that was generated by GPT-3: it has a very pleasant way of saying a lot of nothing.


My professional artist friend said the same thing. I don't think anyone could define the term if they wanted to. I see enormous beauty and great "imagination" in what DALL-E 2 is producing, e.g.:

"Oil painting of a sad girl looking out the window of a school bus" https://twitter.com/jennifermarsman/status/15175974527649546...

“avocados dancing, drinking, singing and partying at a Hawaiian luau” https://twitter.com/TheRealAdamG/status/1517906900540657665

Can anyone verbalize what these images might lack, or might gain through "soul"? Objective answers only please.


For myself, I feel fairly confident that if I were shown a work and had to determine whether it was made by a human or by AI, I could not reliably do so.


You may be right. It would be an interesting blind experiment.


"Art is what you can get away with" — A.W.


The linked post suggests that AI art started in 2015 (what it labels as "early forms of AI art"). AI art goes back decades--it didn't start with Google. Perhaps the author might want to look a bit further back than seven years.


Why should he bother with such 'Schmidhubering'? Contemporary AI art owes little to before 2014.


There was lots of digital art before then. I'm not aware of any particularly interesting neural-network-based approaches before then, though.


I've been playing around with the Dall-E 2 website all day. It's simply amazing: https://openai.com/dall-e-2/


Who do I beg to get earlier access?


A year or so ago there was a video posted about an AI-generated video. It was extremely scary, almost like an emotionless psychopath with absolutely zero feelings towards humans. Does anyone remember where to find it?


There are hundreds on YouTube...


> ...the main development is the rise of *multimodal learning*.

> Multimodal learning, in this case, is learning to match up text and images. Our new models are really good at learning to write captions for images, and (more importantly for artistic purposes) to generate images that correspond to a given caption.

Can anyone explain this more? I think I recall Facebook was automatically generating captions for images a good while ago, but this is something different?
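
One hedged way to picture the "matching up text and images" part: models in the CLIP family map both images and captions into the same vector space and score each pair by similarity, so a caption can be picked for an image (or an image judged against a prompt) by comparing vectors. The sketch below uses hand-written 2-d vectors in place of real encoder outputs, and the function names are made up for illustration.

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def similarity_matrix(image_vecs, text_vecs):
    """Rows are images, columns are captions, entries are cosine similarities."""
    imgs = [normalize(v) for v in image_vecs]
    txts = [normalize(v) for v in text_vecs]
    return [[sum(a * b for a, b in zip(i, t)) for t in txts] for i in imgs]


def best_captions(image_vecs, text_vecs, captions):
    """For each image, pick the caption with the highest similarity score."""
    sims = similarity_matrix(image_vecs, text_vecs)
    return [captions[max(range(len(row)), key=row.__getitem__)] for row in sims]


# Stand-ins for encoder outputs; a real model would produce these from
# pixels and text with two neural networks trained to agree on matching pairs.
CAPTIONS = ["a photo of a cat", "a photo of a dog"]
TEXT_VECS = [[0.1, 1.0], [1.0, 0.0]]
IMAGE_VECS = [[0.9, 0.1], [0.2, 0.8]]  # image 0 is dog-like, image 1 is cat-like

print(best_captions(IMAGE_VECS, TEXT_VECS, CAPTIONS))
```

Methods like VQGAN+CLIP run this scoring in the other direction: they repeatedly adjust a generated image so that its similarity to the prompt goes up.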



I wish there were analogous tools for sound design



Thanks. I wish I had the time, experience and the brainpower to dive into that and build some Joe Public tools to play around with AudioCLIP.


Twitter users should consider following https://twitter.com/rpgmakerai and https://twitter.com/ai_curio for examples of AI art in their feeds


I have some links in the post! Here are some more good accounts: https://twitter.com/advadnoun https://twitter.com/RiversHaveWings https://twitter.com/danielrussruss



I'll add another - https://twitter.com/PasanenJenni does some fantastic work by combining AI generation with more traditional digital art; great stuff.


I'm not creating it, but I can't stop retweeting all the cool stuff I find: https://twitter.com/whyboris


Am I alone in thinking that the older AI art looks artistic, whereas the newer art just looks like superficial images which, while potentially technically impressive, aren't really artistically interesting?


The CLIP stuff has its downside. Much of what I see looks like a collage of real images. Not so much art as graphic design.


Interesting article.

But the web page itself: the PageDown key doesn't work for scrolling, though the arrow keys do.


AI is doing fine art now. Wait until it learns to write good programs and we're doomed.


> AI is doing fine art now

(Not really. "Illustration", "graphics", somehow, yes; "fine", maybe; "art", no¹.)

> Wait until it learns to write good programs

...and we will be empowered with an almost unimaginable power.

--

¹Well, I just had to be more explicit for a post nearby, so I'll copy that: «[...] an artistic object is a concretion of a structure of meanings. It takes a proper intelligence to build that - you need conceptual depth, reflection etc.: you have by definition wait for a GAN»


searching the same phrases on google images returns more interesting "pre-generated" results


I really don't think it's art; clouds in the sky are more "creative".


Probably the creativity is in the eye of the beholder in that case. Anyway, there are billions of people who don't even think about whether something is art or not. They look for pretty pictures to put in a frame, and the only thing that matters to them is the price of the picture plus the frame. Best if it costs less than a meal. The painter or the photographer could be unknown or an AI; they couldn't care less. My point is that it opens a market for very low-cost, unique, self-made pictures, and if that happens people won't care whether it's art. Pretty would be enough.


It looks like art. It acts like art. It stimulates the senses and evokes emotion like art.

Whatever "creative" means, either the AI has it, or it isn't really relevant.


You missed the important point: an artistic object is a concretion of a structure of meanings. It takes a proper intelligence to build that - you need conceptual depth, reflection etc.: you have by definition wait for a GAN.

To elicit emotions (per se not quite a demanding endeavor) is not sufficient to be qualified as "art".


> an artistic object is a concretion of a structure of meanings

This is a very narrow definition that discounts a lot of art. I'm also very dubious that proper intelligence [1] is required to build that (GPT3's output could certainly be accurately described that way).

[1] I agree that there is no intelligence at work in these algorithms.


> discounts a lot of art

...Examples?

> could certainly be accurately described that way

Like everything, provided we take an indicative definition for an instruction manual where terms can be used loosely outside original intention.


https://duckduckgo.com/?q=abstract+art&atb=v279-1__&iax=imag...

https://duckduckgo.com/?q=generative+art&atb=v279-1__&iar=im...

https://duckduckgo.com/?q=spirograph+art&atb=v279-1__&iax=im...

https://duckduckgo.com/?q=kaleidoscope+art&atb=v279-1__&iar=...

https://duckduckgo.com/?q=macro+art&atb=v279-1__&iax=images&...

https://duckduckgo.com/?q=fractal+art&atb=v279-1__&iax=image...

https://duckduckgo.com/?q=polyhedron+art&atb=v279-1__&iar=im...

Any significant sentence GPT-3 spits out will have more meaning, structure of meanings, and concretion of structure of meanings than most of the images on those pages. Those images all show up when searching for "art", even though your narrow definition discounts most of them as such, which means your definition is faulty. I don't think you can claim all those thousands/millions of people are wrong about what constitutes "art". I also don't think you can sensibly proclaim a faulty definition of something as general (and text-inclusive) as "an artistic object" to be "the important point" and then get annoyed when someone points out problems with it.

Maybe you're thinking of replying claiming all the images I linked above have meaning in them, but that could only be the case if you used a definition of "meaning" so broad as to be meaningless (e.g. "a person selecting it provides meaning", which to me translates to "it elicits emotion" since that's why it was chosen, and you already argued against that). Yes, an art critic might claim to find meaning in totally abstract artwork, but they'd find more meaning in most artistic images generated with machine learning, especially if you don't tell them the source.


(Sorry for the delay.)

> Any significant sentence GPT-3 spits out will have more meaning, structure of meanings, and concretion of structure of meanings

In the naturally intelligent reader or in the machine?

> narrow definition

/Ostensive/ definition. "Narrowing" because it is a (piece of a) definition. (It excludes shallowness.)

> your definition is faulty

If the reading (interpretation) is.

> I don't think you can claim all those thousands/millions of people are wrong

Well...

> when someone points out problems with it

Apart from that "something in the air" of historic duration and consequences, but curiously unusually present here in the last few days, of heightened communicational struggles, yesterday I thought of other recent exchanges here showing that approach: «Is this place still attended to by people who build things, or is this the turn of consultants who think their job is to shake their heads and present an invoice?». Also with reference to those - present in this club - who use e.g. logic or science to "close paths" instead of furthering them.

> Maybe you're thinking of replying claiming all the images I linked above have meaning in them

No. I am still thinking of replying that either the creator intended to represent a found structure of proper meanings - concepts refined in a path towards maturity - or the product is phony.

If I could kindly ask you to take a brief look at my reply to krapp yesterday: surely a critic may find meaning in a random composition - a critic-in-act would because "we" do, we recognize those structures of meaning we were "internally" developing in all that could represent them (aaand look, that's the Arts - normally, when we reproduce that intentionally) - but then, the art is in the subject, not in the object.

So: before you can have an "artist" AI, you have to have a "philosopher" AI. (Sophistication matters little; "modules" do.) Otherwise, you may have a "mockerer" AI, a "trickster", a "deceiver", if there is no substance, no depth under the surface.


>You missed the important point: an artistic object is a concretion of a structure of meanings. It takes a proper intelligence to build that - you need conceptual depth, reflection etc.: you have by definition wait for a GAN.

I didn't miss the point; I'm arguing that the point is debatable. There are entire artistic movements whose thesis is to deny the necessity of "meaning" in art, apart perhaps from whatever meaning the viewer imparts to it. If you didn't know some of these works were done by an AI, you would see the 'conceptual depth, reflection, etc.' in them, but that emotional, empathetic connection one has with a work of art is entirely fabricated. These AIs respond to natural language requests in ways that can be indistinguishable from human effort, they pass the Turing Test for creativity without intelligence.

So I think it's valid to question whether art necessarily requires any such thing on the part of the "artist" at all, or if we just believe it does because we're afraid that if creativity can be generated by an algorithm, it's just another sign that we're nothing more than deterministic machines ourselves.


> There are entire artistic movements whose thesis is to deny the necessity of "meaning" in art

If such a thesis founds the artistic production, then this will be the «concretion of a structure of meanings» around the «den[ial of] the necessity of "meaning" in art».

There's "a whole world" of bodies of knowledge and "persons" (individual expressions of the possible forms of being in a supposedly organic - consequential - relation to the richness of being) founding an artistic expression. Have a creature experience being and react to it in proper senses, then the expression may start being artistic.

> These AIs respond to natural language requests in ways that can be indistinguishable from human effort, they pass the Turing Test for creativity without intelligence

So the artist - provided that the quality is actually there - in that case is the reader. That is already part of art ("If I put in my books everything people read in them I'd be a genius", someone said - I cannot recall who). It does not mean that the original product is artistic, that "it has what it takes".

Also: Turing Tests are far from objective, with reference to length or time (etc.) measurements. They surely may pass in front of some Turing Test/ers/. Surely some students pass, after some professors. It is even recorded that some professors managed to have a career and be illiterate.

> or if we just believe it does because

Even if the creator were a «deterministic machine», there would be a complexity in play and the presence or absence of modules to make the difference.

Without them, the system will easily prove brittle - take for example the visual classifiers that were at some point shown to "recognize" textures but not shapes, or that change their response when an actually irrelevant pixel changes, etc.

> to question whether ... or

There is no question in that false dichotomy: proper human creation has grounds through modules that current AI does not have. All modules must be in place.

Anything can "create" - "unintentionally", "unawarily" etc.: with limited depth.


That is exceptionally well articulated, fully on your side.


I felt otherwise.


And provided a non-point. Now think of the asymmetry between agreement and disagreement.

(Solution: payload already expressed vs payload to be expressed.)


It's not AI.


In this specific sense of the term, what is called 'intelligent' is the entity that writes the implementation of a function from some input to some intended output; the engineer who creates a writer of such functions is said to have created an "intelligence".

If somebody wears a thicker coat during a "cold war", the problem is not considered terminological.



