How a Stable Diffusion prompt changes its output for the style of 1500 artists (adityashankar.xyz)
375 points by politelemon on Oct 2, 2022 | 191 comments



What's kind of crazy is how the images tend to have similarities in small features that become very apparent when flipping back and forth between images, but which are not obvious on their own.

For example, I flipped back and forth between Beatrix Potter and Paulus Potter. A rounded white bonnet in one picture becomes a couple of blossoms in the other. The roof of a house becomes some shadowy wall with plants in the other. Two flower pots are very similar, just with slightly different coloring.

It makes it more apparent that the algorithm etches the images out of noise, and if the seed is the same for two images with different prompts, you're likely to see traces of that noise represented differently but recognizable in both images.


The positioning seems very consistent, almost to the point where I wonder if that was part of the selection process to demonstrate the differences in style. There are only four per style, where the position of a subject could be a selection factor. Hard to tell if the position similarities are driven by the Stable Diffusion model or by the selection of representative images.


The composition and positioning come from the original seed. If the same seed is used, the same initial noise is the starting point for every image, which is then transformed into each style.

Thus the similarities you see would make sense if the same seed was also used for these tests.
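For anyone who wants to reproduce the effect, here's a minimal sketch using the Hugging Face diffusers library (an assumption about tooling, not necessarily what the site's author used; the model ID, prompt, and seed are illustrative):

    # Sketch: fix the seed so every styled prompt starts from the same noise.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
    ).to("cuda")

    for artist in ["Beatrix Potter", "Paulus Potter"]:
        # Re-seeding before each call reproduces the exact same starting
        # noise, so the composition stays put while only the style changes.
        generator = torch.Generator("cuda").manual_seed(42)
        image = pipe(
            f"a woman with flowers in her hair, in the style of {artist}",
            generator=generator,
        ).images[0]
        image.save(f"{artist.replace(' ', '_')}.png")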


Yes, if you generate a bunch of images of waves or the sea, or other repeating patterns, with the same seed, you can see how all the 'peaks and troughs' of those patterns line up in the same place.


Maybe it's based on img2img? Some are different enough that it's not obvious that's the way it has been done though.


Hey, I'm the developer of this project. No, this is not based on img2img; all the images just have the same seed.


Incredible job! I was just showing my own experiments with SD to a friend and this just takes the cake. Thank you so much for mixing in that artist list!!


If you play with Stable Diffusion enough this behavior becomes very apparent. Changing the seeds will give different results, but even relatively significant changes in the prompt will still find similar themes or layouts.


So interesting, and really seems somehow related to how we dream, hallucinate.. or even experience reality?


Arguably it could be different from our experience. It could even be a superior and more efficient methodology than the one our brain uses to imagine things.


from "is your blue my blue" to "is the AI's blue our blue" :)


The blue woman in Sam Bosma's style looks strikingly similar to the one in Noah Bradley's style. They even have the same 3 patches of pink on their cheeks.

https://static.adityashankar.xyz/gorgeous/hair_flowers_full/...

https://static.adityashankar.xyz/gorgeous/hair_flowers_full/...


Very cool! I recently made a game kind of like AI "pictionary" where the user has to guess the "artist", subject, and description of a piece of art generated by stable diffusion:

https://wcedmisten.fyi/project/paintingGuesser/

I tried to make something more general, but stable diffusion is fairly inconsistent in how well the output matches the semantics of the input.


This is really awesome


I've only tested Craiyon, but I'm sure SD, DALL-E and the others fail the "shirt without stripes" test.


This could be a good candidate for a wordle-style scoring system


It would be so much better to just have all of these 1500 pictures next to each other in a list with the artist's name under it.


Or at the very least with the list of artists alphabetically sorted!


They're sorted by last name; it took me a second too! I just wish I knew more of them.


Most of the names are sorted! Scroll to the very end...


Sorted by last name but listed by first name means you can start typing the artist's name and get to the right place in the list...


You can do it anyway, assuming they used a modern JS framework and took 2 seconds to configure the list order and search order.


I've sorted them in alphabetical order now


I think the idea is for the comparison to remain spatially consistent because that's what it's trying to highlight!


Including a few real sample pictures from the artist, or a link to a google image search of the artist name, so you can check if the generated style matches.


Indeed, what I ended up doing was Inspect Element > Search the name I want > Add the "selected" attribute to the option.


YES! I got mildly irritated using that clunky list.


Funny that the Bob Ross version just makes them look like Bob Ross. Maybe there are more pictures of Bob Ross in the training set than his actual paintings.


Walt Disney makes the pictures look like promotional Disneyland pictures.


The same thing happens for Vincent van Gogh, and presumably others too.


A good measure for whether you're more of a celebrity or an artist is how much of your face a google-image trained AI thinks belongs in your work.


Well what if the artist has a popular self portrait?


Smoke comes out of the computer, and Captain Kirk notches another victory for humanity.


Frida Kahlo comes to mind.


She's a rare exception in that she's mostly known for her self-portraits. Most other famous artists are mostly known for other things.

Again, if the training data was labeled well enough, confusion about this sort of thing shouldn't happen.


Either labelled well, or you have enough data and a good enough algorithm so that the computer can figure it out.


> A good measure for whether you're more of a celebrity or an artist is how much of your face a google-image trained AI thinks belongs in your work.

That this happens at all is evidence that the training data hasn't been curated, cleaned, or labeled well enough.


While it is the case that the dataset isn't well curated or perfectly labeled, it could just mean that grammar is not understood: the labels could make clear to a human whether the image is a picture of Bob Ross or a painting by him, but the training misses that relationship. Even with poorly labeled data, I suspect AI will eventually figure out which labels are more likely to be poor and deal with them appropriately.

In the reverse direction, you can try:

A horse rides an astronaut

And you will probably generate an astronaut riding a horse. It’s not a poor description of what we want; our assumptions about how grammar should work aren’t being honored.


Not a problem with the Banksy images!


Nice experiment! I would only change two things: (1) sort the artists in alphabetical order, and (2) allow users to write the name of the artist and show if it's in the list.

I'm saying that because it's a little bit tedious to search for the artist you're looking for.

Apart from that, I find the idea super interesting :)


You can also just focus the select element and enter the full name to make it jump to the right name. But a real input would be better.


They are in alphabetical order, but by last name. Took me a while to see that though.


Only up to a point: it consists of at least three concatenated, separately sorted lists (the first one being the largest by far).


Though with names like "de Hooch" under "d" and names like "van Gogh" under "v".


I've added a search function now. I'll change it to be in alphabetical order, gimme a sec.


done


I'm not sure what you did, but now the name you select does not match the style of the painting :)


I'm seeing that too. Results don't match the artist at all. Also, the names at the top are alphabetical by first name, and all the Japanese artists are bunched up at the bottom.


I lecture in painting, and to me some of these are truly impressive. Not surprisingly, those artists whose work is not predominantly painting (e.g. Joseph Beuys) or is predominantly linear (Aubrey Beardsley) do not fare so well.

There is a lot of talk in our department as to how we might prepare our students for this technology. It is scary how fast it is growing, and how it is spreading to things like 3D and texturing.

One of my team is already using it in production. It used to take his artists three days to come up with five visual development ideas. Now he can get fifty overnight to choose from.


>Not surprisingly, those artists whose work is not predominantly painting (e.g. Joseph Beuys) or is predominantly linear (Aubrey Beardsley) do not fare so well.

To my untrained eye, these both look pretty good.

https://static.adityashankar.xyz/gorgeous/hair_flowers_full/...

https://static.adityashankar.xyz/gorgeous/hair_flowers_full/...


I agree, they are nice results. But Joseph Beuys is better known for performance art. In his most famous piece, he covered himself in gold foil, and explained art to a dead hare (below). To my mind, it makes little sense to apply this method to his art.

https://uploads4.wikiart.org/images/joseph-beuys/how-to-expl...

The Aubrey Beardsley results are a bit better, but if I wanted Beardsley-ish drawings I would likely be disappointed. Beardsley's lines were very fine... 'filigree', like a spider crawling over the paper, and he almost never made art in colour. Also, a lot of his work was as sexy as hell. NSFW last link (bottom of page) for relatively mild examples.

https://uploads2.wikiart.org/images/aubrey-beardsley/the-dan...

https://www.theparisreview.org/blog/wp-content/uploads/2015/...

https://www.messynessychic.com/2019/11/13/the-world-wasnt-re...


I'd be interested in hearing your take because you are actively involved in this field. From my perspective outside of it, it seems that it's going to be an absolute bloodbath in terms of opportunities for people to actually live as artists (excluding those who work in mediums that can't be represented on a 2D screen).


Yep. A bloodbath is certainly on its way. Our illustration program will be first in the firing line.

We think that there may be some room for our students as 'high class' art directors. What will give them unique merit is their deep knowledge of pictorial formalities. Anyone can give the text hint 'flowers in a vase'. But what about...

'Move the camera down to avoid the strong coincidence line between the edge of the vase and the edge of the table. Change the saturation value of the vase to emphasize background/foreground contrast. Increase the amount of negative spaces around the periphery of the flower mass' etc.

Tech like Stable Diffusion may also lead to a resurgence of interest in natural media, like oil paint, watercolour and suchlike.


I suspect that we'll end up with a split similar to amateur and professional photography, with generative models for the latter trained not on plain English but on something much more stringently structured, with ability to unambiguously specify many important but non-trivial parameters in the prompt, including for specific well-defined areas of the output etc. Probably with a GUI on top where you can literally highlight specific objects etc and adjust parameters, and it'll construct the query for the next iteration.


The concept of "derivative work" is pretty important in copyright law. I wonder if anyone has thoughts on this, in terms of this type of project. Should there be legal implications to this?

I know someone -- a completely unknown artist -- who used to make a fair portion of their living by drawing D&D characters for people. Unfortunately, orders slowed down, because someone can input one of his images into software like this, and generate endless variations in the same style. Should this be allowed?

Are images created "in the style of" a certain artist completely dependent on images created by that artist? If so, should that artist be compensated? Why or why not?


"Should there be legal implications to this? ... Should this be allowed?"

Even if there are laws against it, the cat's out of the bag.

There's no stopping billions of people all over the world making derivative works at the push of a button.


> the cat's out of the bag.

Yeah, just wait until Disney characters get copied and mixed.

I wouldn't be surprised if their lawyers are preparing to change copyright law ... again ...


Seems more like a case of trademark law to me?


A human artist can freely paint something in the style of another artist. It's not considered a derivative work. You can't copyright a style.

A derivative work is an adaptation, translation, or modification of a particular, existing copyrighted work.

If you asked Stable Diffusion for "Vincent van Gogh's Starry Night with a cat looking at the sky", you'd get a derivative work (although Starry Night is in the public domain, so you wouldn't be violating its copyright).


I’ve been idly working on a similar list but with a much more basic prompt.

The results vary wildly, even run by run, but I might put them in a few buckets:

1. Similar enough that someone who doesn’t know much about art could be fooled

2. Amateur knockoff but recognizable style

3. Influence is there if you know what to look for

4. Artist probably not in the training data at all

The last one kinda surprised me, for artists whose work is online and who have unusual names. I would have thought those cases would be really good. Maybe they ran out of disk space with all the porn?

Also interesting that it gets much closer for figurative painters than for abstract painters.


What frustrates me about Stable Diffusion is there doesn't seem to be any documentation as to what artists or vocabulary it understands. Generally people say "look at existing prompts or use various prompt generators" but that doesn't really solve the problem. I don't want to just look at what other people have randomly discovered; I want to know what the program really knows.


You can, essentially, by searching the training image set and its text tags:

https://laion-aesthetic.datasette.io/laion-aesthetic-6pls/im...
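A sketch of querying that instance programmatically, assuming it follows Datasette's standard JSON API with full-text search enabled (the table name "images" is an assumption, since the link above is truncated):

    import requests

    url = "https://laion-aesthetic.datasette.io/laion-aesthetic-6pls/images.json"
    resp = requests.get(url, params={"_search": "beatrix potter", "_size": 5})
    # Each row is one training image's metadata, including its caption.
    for row in resp.json().get("rows", []):
        print(row)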


You can't really debug an AI. The dimensions of its understanding are quite literally beyond human interpretation, which makes it both smarter/more efficient than humans and extremely dumb and context-unaware. Most of our attempts at adding a 'memory' to AI have been hacks thus far, which is why all of these prompts consist of people force-feeding word salad down the AI's throat for generally reproducible results.


They're debuggable, with effort. Here's the finance neuron in CLIP: https://microscope.openai.com/models/contrastive_4x/image_bl...


For memory, check out:

1) Retro, which is essentially attention over large databases, and fast as hell.

2) S4 Layers, explicitly designed for handling long dependencies.

These are orthogonal approaches to memory, and both very effective at what they do.


It's the nature of ML models: nobody is 100% sure what it understands until they try something and get results.

It was given a lot of tagged data: 600 million captioned images from LAION-5B. So if you want to know what it might support, you could try any one of the captions from those 600 million images.


But why isn't the list of words from those captions available anywhere (at least as far as I can tell)? There may be 600 million captions, but the number of unique words would probably be 10 or 20 thousand at most, completely feasible to browse or grep.


It's kind of a weird complaint. If I am having a conversation with someone, I wouldn't be concerned about knowing the set of all possible nouns.


SD isn't a person you can converse with. It's just a program trained on captions and can do no more than what's in them. It's like those old adventure games that would always complain "I don't know that word", except even worse, because SD will happily make a picture with words it doesn't know and not tell you.


I think anyone who has both played an IF game and has played with stable diffusion knows there is a world of difference between the two.

The main difference is that coming up with a word SD doesn't know that's not contrived is really difficult. In an IF game, you are constantly guessing the correct word.


The complete vocabulary is available here: https://huggingface.co/openai/clip-vit-base-patch32/resolve/...

It's a bit less than 50k words, but that includes space-padded duplicates.


I haven't downloaded the database myself, but I imagine if you did it wouldn't be too hard to get that data. Looks like you can get the torrent here https://laion.ai/blog/laion-400-open-dataset/


I don't think the underlying model is word based, but character based. You could download the caption data for LAION and grep that, but it's not strictly 1:1 with what SD was trained against.


No, it's word based.

The vocabulary is here: https://huggingface.co/openai/clip-vit-base-patch32/resolve/...

It is contextual though, so words in different orders mean different things.


Huh, interesting, I had just ... assumed CLIP's tokenizer was character based, like GPT's was. At least, I think GPT's is character based?

Is there any reason it couldn't be character based, besides the (presumably very large) increase in resources needed to train and run inference? This is all way out of my league, but seems like you could get interesting results from this, since (by my caveman understanding) this hypothetical transformer could make some sense of words it had never seen before, so spelling variants or neologisms and such.


I started a proper reply but had to board a plane.

It's actually a byte-pair encoded (BPE is better than character encoding but can do the things you mentioned) list of things that includes words. You can find common English suffixes in it listed separately too.


Thanks for the responses, I really appreciate the help. My only background with ML is playing with LSTMs and simple sequence-to-sequence models back before transformers, and the last few days I've been trying to deep dive as much as I can into the "state-of-the-art". I dislike treating the technology as a magical black box...


Here's the response I half wrote before:

GPT (and many other modern NLP models) use byte-pair encoding. Your summary of the benefits of this is correct: it can deal with novel words much better.

Byte-pair encoding (BPE) is better than character encoding because it can deal with unicode (and emojis).

CLIP uses a BPE encoding of the vocabulary: "The transformer operates on a lower-cased byte pair encoding (BPE) representation of the text with a 49,152 vocab size."

So strictly this vocabulary is NOT (just) words, it is common sequences of byte pairs. You can see this if you examine the vocabulary - you'll find things like "tive" which isn't a word but is a very common English suffix.


Thank you. This is really helpful. Yes, you don't know exactly how SD will respond, but for example you can grep celebrity names and can know whether SD has any chance of drawing a picture with them in it or not rather than just randomly guessing.


It's a word list, so as I'm sure you've already figured out, you have to grep first and last names separately. For example, "jennifer" as a first name is token 19786, while "garner</w>" is token 20340. If you want "james garner" instead, looks like that's tokens 6963 and 20340. Except, since it's a word list, there's still no guarantee that either celebrity is necessarily represented until you try.
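If you'd rather not grep vocab.json by hand, here's a sketch of the same lookups with the Hugging Face transformers CLIP tokenizer (the specific names are just examples):

    from transformers import CLIPTokenizer

    tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")
    vocab = tokenizer.get_vocab()  # token string -> id, ~49k entries

    # Look up both the word-final (</w>) and prefix forms of a name.
    print(vocab.get("garner</w>"), vocab.get("jennifer"))
    # Names absent from the vocabulary get split into smaller BPE pieces.
    print(tokenizer.tokenize("feynman"))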


You can download it here:

https://github.com/rom1504/img2dataset/blob/main/dataset_exa...

You probably would want to stop after getting the metadata, unless you have 240TB available for the images :)

More details and links to dataset explorers here: https://laion.ai/blog/laion-5b/
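Once you have the metadata (it ships as many parquet shards), grepping captions is straightforward; a sketch assuming pandas, a hypothetical shard file name, and the TEXT caption column used in the LAION releases:

    import pandas as pd

    # File name is hypothetical; point it at one of the downloaded shards.
    df = pd.read_parquet("part-00000.parquet", columns=["TEXT"])
    hits = df[df["TEXT"].str.contains("van gogh", case=False, na=False)]
    print(len(hits), "captions mention the artist")
    print(hits["TEXT"].head())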


49,407 tokens, many of which are not useful. It's an arduous process to narrow down, so this link decided to go in the other direction, working from zero up rather than 49,407 down.


The simple answer is that there is no clean cut list of artists that it "understands". The model has no explicitly programmed concept of artist or style -- just the CLIP based text encoding used to train the conditional autoencoding part of the denoiser network, trained on (AFAIK) caption data recorded with the image.

So in practice asking for art "in the style of <x>" is sort of limiting the denoiser to statistical pathways resembling other images captioned "in the style of <x>". At least, that's my understanding. Still trying to grok ML and diffusion models.
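As a concrete illustration of that conditioning step, here's a sketch of how a prompt becomes the embedding sequence the denoiser is conditioned on (using the base CLIP text model from transformers for brevity; Stable Diffusion itself uses a larger CLIP variant, so take this as illustrative):

    import torch
    from transformers import CLIPTextModel, CLIPTokenizer

    tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")
    text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-base-patch32")

    tokens = tokenizer(
        "a portrait in the style of Aubrey Beardsley",
        padding="max_length", truncation=True, return_tensors="pt",
    )
    with torch.no_grad():
        # One embedding per token position; this is what steers the denoiser.
        embeddings = text_encoder(**tokens).last_hidden_state
    print(embeddings.shape)  # (1, 77, 512) for this base model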


You can create (or discover) explicit vocabulary in the model using “textual inversion”, or train more into it using fine tuning.


Honest question, would a solid understanding of the open training data help?

Having the art vocabulary down as well.

In effect, knowing what is present and how it’s tagged so you can « invoke » it more readily in the prompt-result.

Maybe I'm out of my depth. I know the corpus of tagged images used for training is enormous … but I still think that would help the user (a prompt-crafter).


I downloaded the vocab.json mentioned above and I think it helps. For example it explains why I can get SD to make pictures involving Einstein but not Feynman. Feynman simply isn't in the training set, but Einstein is.


Thanks for the follow-up; that was my understanding, but your example is telling.

I will have to check myself.

I tried, with minor success, to have Stable Diffusion draw lesser-known non-US personalities. It always kind of works, but the palette is limited. For instance, I tried Charles de Gaulle and you get something that looks like him. But he's depicted talking on a radio or waving his arms around like a politician.

I tried to make him go grocery shopping or play volleyball; it does not really work. Meanwhile, Michael Jackson or Dennis Rodman get way better treatment.

Edit: that vocab file is smaller than I thought. "De Gaulle" is not in there, but neither is Einstein or Feynman? I think I'm missing something.

I can find "obama", Trump or Macron. Unclear about Michael Jackson. No Beyonce or Dennis Rodman. Hmm, weird, I had great results with all of them. Like... recognizable details like tattoos or silly glasses.


"It" doesn't understand anything.

It's just a very advanced madlibs engine based on a database of a billion already-known images.


Some people would probably argue that humans are essentially the same.


Whatever else this emergent "creative AI" phenomenon may or may not do, it's definitely touching nerves in people who still believe there's something ineffable and transcendent about the creative experience.


Stable Diffusion is literally copy-pasting existing artwork. It's not creating anything new.

If anything, this sort of "AI" only makes human ineffable creative experiences even more valuable, because without it there would be no training set and no "AI".


Whether or not it's creating something "new" is debatable, but it's clearly not copy-pasting. It bases the artwork on a combination of other sources; nothing is created de novo, but that is very different from copy & pasting.


> Stable Diffusion is literally copy-pasting existing artwork. It's not creating anything new.

Do you have a source for that? Somehow I doubt there would be this much interest in a tool that literally just copies existing artwork.


The training set for human artists is the entirety of human art that they have been exposed to in their lifetime, even accidentally.


I wonder how constant everything else is kept, e.g. the seed. It's interesting that all the poses seem to align.


I noticed that too! I'm guessing they used the same seeds, which does a great job of providing a comparison in style.


Yeah, they use the same seed, which is used to generate the initial random noise. Then the algorithm takes it from there.
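A tiny sketch of what "same seed, same starting point" means in practice (the latent shape here is Stable Diffusion's for 512x512 output):

    import torch

    gen_a = torch.Generator().manual_seed(42)
    gen_b = torch.Generator().manual_seed(42)

    # The seed deterministically fixes the initial noise tensor.
    latents_a = torch.randn((1, 4, 64, 64), generator=gen_a)
    latents_b = torch.randn((1, 4, 64, 64), generator=gen_b)
    assert torch.equal(latents_a, latents_b)  # identical starting noise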


If you want to experiment with something similar by yourself and you don't have the patience to wait for Stable Diffusion to crunch through thousands of images on your laptop or in a Colab notebook, here's how you can parallelize processing relatively easily on AWS Batch or Kubernetes: https://outerbounds.com/blog/parallelizing-stable-diffusion-...


Hate to say it, but when I see stuff like this it only reminds me of what we could have achieved if this ingenuity had been applied in another domain.

Can't help feeling that this accidentally harms creative types and risks swamping us with visual junk.

The technical achievement is astounding, but no one would seriously claim that crafting an image via a short prompt is creative except in the most cursory way.

I'm probably missing some life-changing use case, but aping art in random styles can't be it.


People said this about cameras. About digital cameras. About digital photo editing software. The next generation will normalize these tools and find incredible ways to be creative within their new cutting edge medium.

The post-art world is here! Just think about how history books will remember this period! The styles that will be borne of necessity, of the need to break down art and find what makes it tick.


"People said this about cameras. About digital cameras. About digital photo editing software."

Also about desktop publishing.

Remember all the printers (i.e. people working in the printing industry operating printing machines) that were put out of a job when you could just buy an (electronic) printer for your home computer and print whatever you wanted yourself?

People were wringing their hands about that too back then... now we take it for granted that we can instantly print whatever we want whenever we want, without having to pay an expensive professional to do it for us (something most people couldn't afford).

Has it resulted in more junk being printed? Absolutely. But it also let people print all sorts of fantastic not to mention useful things that would almost never have seen the light of day without cheap and easy access to home printers.

The xerox copier was similarly revolutionary... as was the printing press itself, which put a lot of scribes out of business.

Photoshop put a lot of airbrush artists out of business, and who does copy and paste with physical glue and paper anymore?

As with photography, printers, copiers and photoshop, artists who embrace this technology will be able to use it to enhance their creativity and speed up their creative process.

There'll be a lot more competition, a lot more junk but also a lot more fantastic art that we can't even dream of yet.


> [...] and who does copy and paste with physical glue and paper anymore?

When I was finishing high school in the early 2000s, we still had teachers who made worksheets that way.

I remember one history teacher in particular. She used a photocopier to copy sections from books, cut and pasted them together, and then used the photocopier again to make the final sheets to distribute to the students.

I was very surprised at the time, but also admired the ingenuity. The process is much more physical than using a PC.


Yep, I see this as a start, and I'm very curious to see the ways in which it'll get used with a human in the loop, and also the ways human artists will be pushed to create art that's out of distribution for these models.


Even then, few artists got famous on technical skill alone, surely fewer than those who got famous primarily for their message regardless of their skill. And this is before getting into the endless pit of defining what art is.

Besides, having an AI do the legwork is not much different from Veronese handing off large swaths of his paintings to his novices while focusing on the two or three major parts.


I think we agree then - if new technologies allow an artist to more rapidly explore, iterate, and refine a particular message, then those artists should still have something to create beyond what is possible with these images.


Yes! Was expanding and building a little.


> People said this about cameras. About digital cameras. About digital photo editing software.

Did they? Because I don't think they did. I think most people were amazed by all these technologies.


https://daily.jstor.org/when-photography-was-not-art/

Research beats idle speculation.

Google "photography skeptics early history"


From the article you linked:

> As long as “invention and feeling constitute essential qualities in a work of Art,” the writer argued, “Photography can never assume a higher rank than engraving.”

Ha, dissing engraving at the same time as photography.

Though I wonder if the 'writer' meant engraving of a design someone else already produced, or any engraving work at all.


Both reactions always happen. With basically anything new, people will select some points via happenstance or bias, draw one of a few basic trend lines [1], and give a hot take. Because they generally think only about first-order effects and don't imagine other things that could happen, the hot takes are often of the utopia/dystopia variety.

These hot takes generally tell you more about the opiner (or the audience they're playing to) than the reality to come. It turns out it's hard to model an entire universe using 3 pounds of meat.

[1] Heinlein listed some of them way back in 1952: https://archive.org/details/galaxymagazine-1952-02/page/n19/...


Most people haven't heard about recent advancements in image generation. When they do, I expect they will be amazed.


True, but then we've essentially had limitless image generation capabilities since we've had the tools to make marks. I guess this is faster, and in other ways it offers promising new opportunities for people who can't / don't want to learn to create stuff directly.

Others are interpreting my original comment as "this is not art", but I'm not really trying to make that argument. Art is entirely subjective and I don't presume to define what is or isn't art.

I guess my point is more specifically "what itch does this scratch"?

It's really cool, and that may well be the answer tbh.


> people who can't / don't want to learn to create stuff directly.

That's 99% of the people. I mean, even learning prompt engineering will probably be too much for the majority of those 99%, but it's a huge step forward in user-friendliness compared to, say, Photoshop.

”what itch does this scratch"?

How many people post pictures on social media? Many of those pictures are not personal, they show something pretty, cool, or interesting in some way. All of those people can potentially use image generators to achieve the same effect.


> only reminds me of what we could have achieved if this ingenuity had been applied in another domain.

I hate arguments like this. Even ignoring how dismissive it is of the achievement at hand, why would you assume ingenuity is transferable like that? Someone who makes a breakthrough in physics is by no means likely to have made an equivalently groundbreaking advance in biology if they had decided to study that field instead.


I think this phrase actually means "I don't want to seem like a Luddite, but now that AI is disrupting something that I personally care about, I'm no longer enthusiastic about progress"


Aside from the fact that I explicitly praised the achievement, my point actually relies on said appreciation.

I guess my musing was hypothetical, but I was careless in communicating that. I get that we can't centrally plan innovation or human effort, and I certainly wouldn't want to live in a society where this was the case.


Respectfully, you can praise the achievement while still being dismissive. The two aren't mutually exclusive.


I am a filmmaker and most films are essentially crafted this way. Beyond hiring and securing resources, directors essentially create by communicating ideas in short prompts because there isn’t enough time to do anything more.

I could absolutely see an AI model doing the job of an entire film crew. I have issues with this, but only with respect to the longer-term aggregate effects on culture in the broad sense. I cannot honestly believe that much would be lost from the perspective of one project or another.


I'm of the opposite opinion. AI assisted art is simply the natural next chapter for "art" as a whole. It will finally kickstart the public discourse about what being an artist means in the perspective of artistic vision vs execution.

Most artists spend their lives not refining their brush stroke, but rather their eyes. The way I see it, the impact of curation and artistic direction will matter more and more in the future.


As with programming, an AI model cannot replace the key parts but can help automate the monotony.

For me it's exciting to use as placeholder art and then have a 'real' artist review it.


I find myself mentally unable to comprehend people who believe that the drawing part of drawing is monotony. Discovering that this mindset not only exists but is widespread has been as disturbing as any AI advancement.


Not me, I enjoy drawing; perhaps I should've used the word 'manual' instead


I would look at it as more of "allowing normal people access to previously undreamed-of levels of draftsmanship" vs some comment on capital-A Art. It allows non-artists to express themselves visually.

I'm biased: I've been working on an image generation app. But the beta users I've had so far will generate fifty or a hundred images in a day. That isn't a use case traditional artists support.


Well, robots have been better than humans at playing chess for decades now, and we still have chess players.


More impressively, chess has not only survived as a human pastime, but we still have professional human chess players.


> reminds me of what we could have achieved if this ingenuity had been applied in another domain

It will; I think the reason we're seeing diffusion models applied to image generation first, is that it's a task that meshes well with the models. But also in general I think people will still be guided by the principle "use the right tool for the job" - this is just another tool. I doubt that the set of paths toward realization for any given needed creative imagery collapses to just "use a model"


I don't know why everyone assumes that ML researchers have some big map of the future where they can make decisions like "yeah, let's choose this branch over here, where AI gets good at generating art first, rather than that other one where it cures cancer a decade earlier." The breakthroughs come where they come, and no one knows where some model architecture will have an application in the future.


What other domain would you suggest that requires roughly the same skill set?


Music composition?


I fail to see why that would be more beneficial than to generate art.


Who said it was?


Bosch is marvelous. Mucha and Monet are good. Michelangelo tries to be more like his sculptures. Not sure about Walt Disney and Roy Lichtenstein, and the Rembrandt version is especially ugly. Maybe it would be worth getting rid of "van Rijn" in that sample.

Overall: the impression improves with the artist's popularity. I think if we trained the model only on a well-tagged, filtered dataset, the results might be much better, but we'd effectively get a five-year-old Prisma app.


It is sort of funny/interesting -- I only tried a few, but famous anime or manga artists (try "Junji Ito" or "Hayao Miyazaki") seem to have at least one picture that is clearly the result of the algorithm picking up on their fans' art.


It's really interesting how good some of the results are, but the styles seem to be completely mixed up. Just looking at Claude Monet, there isn't anything impressionist. Leonardo da Vinci, on the other hand, gives results that rather look like British impressionist paintings.

I've seen paintings in the style of Donato Giancola that really looked like his style, but in the examples on this site none of the results do. Maybe the prompts need some adjustment?


The Jackson Pollock one amuses me. He didn't really draw such symbolic pictures afaik, so applying the style makes little sense.


Interesting how, with billions of nodes and supposed "intelligence", the network hasn't been able to deduce the simple concept of symmetry in human faces. All of the eyes and all of the lips in all of the pictures are asymmetrical, which easily gives AI-generated images away.


"Facial symmetry === beauty" is not that old of a scientific concept, relatively new if you compared to how long humanity has existed before someone started to really study it.

And even so, too symmetrical faces will look just as un-human as a face that is too asymmetric. You need a face that is just the right amount of symmetrical in order for it to actually look good.

I think you make it sounds simpler than what it is.

It's also not a model that is trained to make as realistic people as possible, it's trained on a lot of different things, so obviously it won't excel at making realistic people. But one can easily imagine that some future models will be heavily trained on making realistic people rather than semi-realistic everything, like Stable Diffusion is trained to do today.


Many of the artists in this list specifically try *not* to have this symmetry present in the faces. It's what makes many styles of art separate from just taking a photo or simply going outside.

The system used here is actually astoundingly good at producing many artists' styles because it's not going for symmetry.


It's true not only for these artist sets but for photo sets as well; I first noticed that with ThisPersonDoesNotExist.


Midjourney is way better than this, though it too can produce some weird results... but at its best it's absolutely indistinguishable from photographs or art made by humans.


It feels like they really only track local continuity without any meaningful knowledge. Hands are wonky and wrong in ways that don't look like any drawing I've ever seen. Horses with five or six legs.

Humans also have a hilariously hard time drawing bicycles, but at least we pretty much always nail the number of appendages.


Nice, list could be sorted though


It’s sorted by last name


Interesting that it seems to have no concept of the filmmakers they tried to include here - Tim Burton and Walt Disney didn't produce anything recognizable and look to me like the default stuff you get without providing a style.


We need to solve more captchas


It doesn't really work. I just tried "Vincent Van Gogh", and the images have absolutely nothing to do with his style. Same for Gauguin.

If it doesn't work for such famous painters, it's hard to trust it for the other ones.



Oh OK, looks much better indeed.


Honestly I don't think it's very good at emulating style. It picks up some things but oftentimes misses the heart of the matter.

Did some spot checking with some of my favorite artists. Rockwell's paintings are all about storytelling, clearly not present in the work. The emulation of Frazetta doesn't look like Frazetta's work at all. The HR Giger emulation is a joke. David Finch at least gets a penciler's style but misses the use of solid blacks and dynamic posing. Frank Miller doesn't look like Miller's work at all. Etc etc etc.

This list goes on and on. Personally, while I understand using 'in the style of' as a way to change the image results, I think in many, many cases the results just don't look like the art of that artist.


UI should allow you to choose from the images and get the artist name.


Truly impressive, though it gives me some great fear for all the people whose career is art. This will likely push the Pareto ratio to a new extreme: instead of 20/80, maybe even 1/99.


Their career isn't art anymore.

Just in case anyone needed to see this spelled out.

The people making fliers have been replaced by AI prompting overnight.

The people doing contemporary fine art with their audience are unaffected.


> The people doing contemporary fine art with their audience are unaffected.

It's not a particularly well-hidden secret that contemporary fine art is really not about the art or the artist.

https://www.youtube.com/watch?v=ZZ3F3zWiEmc


This seems exactly backwards.

SD isn't outputting clean graphic designs with sensible content

And haven't there already been people winning contemporary fine art contests with SD? lol

https://www.reddit.com/r/StableDiffusion/comments/x2n0r1/aig...


> The people making fliers have been replaced by AI prompting overnight.

I haven't seen good examples of that yet but I'm curious how push-button you can make this. Flyers, web design and UI design require the copy, layout, information hierarchy, colours, illustrations, branding etc. to be cohesive so it's a different problem space with way more constraints compared to generating a single image.

If getting the final design requires a lot of rounds of prompting and tweaks, busy people are going to outsource this still (in the hope the prompting and feedback needed to the person doing the work will be less).


A lot of designers are artists that needed an income. Design jobs will likely be dramatically impacted by this.


> The people making fliers have been replaced by AI prompting overnight.

Hmm, no? Do go and try to make a flyer with any of the AI generators. I’m not saying it can’t happen one day, but the current tech is not there.


It's a post about how artists monetize.

That group will never be selling their own work as contemporary fine art but wants that prestige, and the one way they had to make table scraps with that skillset is now gone.

A different person is doing their own flyer art with AI and adding words around it themselves, as evidenced by the last month's worth of fliers that have reached me. Promotion companies have always been up on trendy tools for differentiation.


Flyers are an extremely low bar that AI sails over.


show an example?


I used to draw the flyers when my band played shows. People still went to the shows. If I had iterated a few times with one of these models to draw the thing that I wanted to draw, rather than deluding myself, and spent 5 minutes polishing it in Photoshop, the product would have been orders of magnitude better imo.


So when you say flyers, you're talking about the kind of flyer zero people are employed to make. We're talking about paid artists here.


Art is rarely about the product. Some color on a canvas has been traded for millions, while anyone could just have dropped a bucket of paint to create a similar result.


After checking some 19th century artists I see it failed hard. All of the responses look the same, there isn't enough data in the training set to differentiate actual styles beyond "vaguely realistic".


Just read "Profession" by Isaac Asimov :)


Stable Diffusion now makes art history a desired line of study.


Wish there was some AI tool that could do proper ligne claire. It seems to be quite impossible. The same goes for Escher, but I do understand that one.


Great job, always great to follow SD news! Are we missing Picasso? Not sure about Salvador Dali; maybe I was expecting something different.


I like how the algorithm sometimes confuses the style of the artist with the facial features of the artist. Try Bob Ross


This website should include a google-image-search link to the artist, so you can compare the styles more easily.


How a Stable Diffusion prompt makes working artists sad and grumpy about huge abuses of “fair use” in 1500 ways.


I sure would love to see this systematically improved.

Ansel Adams, for instance, clearly wasn't too present in the dataset.


From an artist to a brand identity.


How come the facial features and body parts are almost always in the same place in the image?


I suspect they used the same seed for every iteration, which is great because it shows the same context in different styles.


Likely generated from the same random seed.


The result for H. R. Giger didn’t quite live up to my expectations.


Got to love how Bob Ross style gives everyone a huge afro.


David Hockney was a big disappointment.

Edward Hopper was pretty impressive.


I wonder who is analyzing the weights of the model. At what level is the dimension of "artist name" represented? What is above it, and what is below?


... boy I was hoping for Barbara Kruger.


Features John Wayne Gacy...


Fix the list: Hunter Biden is missing


[flagged]


> What is this list?! No Picasso, Renoir or Vrubel present

Comments like this are why people think HN is nothing but a place to receive a shallow dismissal of your efforts.

It's a list. Make your own.


“Be the change you want to see in the world” has kept me from posting many unhelpful comments.


> Comments like this are why people think HN is nothing but a place to receive a shallow dismissal of your efforts.

whatever


What is this?! One word comments?!


Yes, it’s been like this for a while.


We are going to need a new patent system.


Patents have nothing to do with protecting art: they are supposed to protect inventions.


How can we set up Stable Diffusion on our own Linux servers, so we can generate NFTs or whatever?



