Hacker News

The first time I saw an AI creating images, around 2018 or so, I remember the scientists called them "dreams", and I thought it was so appropriate: the constantly morphing landscapes, the way things blend together to form new things, and how almost everything is wrong when looked at closely. It was an interesting description, and I didn't care much more than that.

Then the AI got better, and still the dreamlike qualities never went away: the constant problems with hands, words never properly formed, clocks that always look wrong, etc. But now it is more cohesive, more "solid", yet still malleable. Like a lucid dream. Still straddling the border between consciousness and unconsciousness, but there is a hint of control and direction now.

As someone who lucid dreams, I think the AI images and the LLMs are just like a person dreaming right now. They sort of have control, but not really. It's hard to explain, but even when I lucid dream and know it is a dream and can bend it to my will, at the same time I can't actually control my thoughts. If I try too hard to assert control, I wake up. So it is still a state of unconsciousness for me, and not at all comparable to the "me" when I am fully awake.

Of course, the AIs can't wake up if we use that analogy. They are not capable of anything more than this state right now. But to me, lucid dreaming is already a step above the total unconsciousness of just dreaming, or just nothing at all. And wakefulness always follows shortly after I lucid dream.



A more conservative, but less entertaining and less poetic model is:

You lucid dream when your "explainers" and "predictors" are operating on the random feedback and subtle noise that's present while your mind is defragging and moving data from short-term to long-term memory.

Random noise, with low signal, fed into something trained to fill in details: that kinda explains how gen AI works, to a decent approximation.
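Not how a real diffusion model works internally, but the "noise fed into a trained detail-filler" framing can be sketched as a toy in a few lines of Python. The `target` pattern here is a made-up stand-in for whatever a trained network has learned from its data:

```python
# Toy sketch of "low-signal noise + trained detail-filler" (NOT a real
# diffusion model): start from pure noise and repeatedly nudge it toward
# a pattern the "model" has memorized, with a little fresh noise each
# step so the output stays varied.
import random

random.seed(0)

target = [0.0, 1.0, 0.0, 1.0]             # hypothetical "learned" pattern
x = [random.gauss(0, 1) for _ in target]  # start from pure noise

def denoise_step(x, target, strength=0.2):
    """One 'fill in details' step: move each value toward the learned
    pattern while small residual noise keeps things from collapsing."""
    return [xi + strength * (ti - xi) + random.gauss(0, 0.02)
            for xi, ti in zip(x, target)]

for _ in range(50):  # iterative refinement, loosely like diffusion sampling
    x = denoise_step(x, target)

# After many steps, the initial noise has been "explained into" the pattern.
print([round(v, 2) for v in x])
```

The point of the analogy: the final image is whatever the filler's priors say the noise "should have been", which is also why the outputs feel dreamlike where the priors are weak.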

Your mind has a lot of these predictors and detail fillers, all trained on your experience. I think dreaming is either the process of tweaking those predictors while you get to watch, or it's in fact a junk signal you're not supposed to be able to see, but you happen to be awake enough to form memories from it.

Using words like "wake up" implies deeper and darker depths to these networks that I doubt exist.


Descriptions of psychological phenomena organized around made-up nouns are not necessarily more grounded in reality than ones that make creative use of existing verbs.


You'll notice we're all in lala land here. You can have a meaningful conversation at a speculative level.

But the nouns are analogies for regular old predictors and detail fillers we know exist. For example: point to your visual blind spot, quick. Or watch your eyes move in a mirror. Or do any of the really neat experiments in the Coursera course "Buddhism and Modern Psychology".

I'm extrapolating, and yeah, landing right in the middle of "who knows".


I feel like lucid dreaming is in a way the baseline and that when we're awake, reality constantly knocks us back into groundedness. In a lucid dream there is no error correcting signal to keep our fabulations in check. Things tend to drift away like the silhouettes of an infinity mirror. Reality is like the flash of solid pixels when switching away from the hysteresis stream in a shared window Zoom call.


As someone who lucid dreams frequently, and has done a good deal of experimentation in that state, the fundamental difference between your sleeping state and your waking state is that some of your brain is just 'off'.

The parts of your brain that are on do their best to match your experiences (or just replay experience), but for things that concern the off parts (counting, reading, reasoning...), those elements are just missing. When you look at a clock, it doesn't look like nonsense; it doesn't look like anything, as the part of your brain that interprets that information, and would therefore be necessary to simulate it, just isn't present. You 'see' all of the visual qualities of a clock kind of jumbled together, the same way a lot of generative AI produces them.

The feeling I get from messing around in those states a lot is that there is no 'baseline', just some things that are on, and some that are off. When you dream, it's the visual cortex that is on, so things concerning light and shape and texture exist and you experience a slimmed down consciousness composed of those pieces. When memory turns on, you experience references to other things, and chain thoughts together (I think this is what separates normal and lucid dreams).

So far, those are the only two things I can turn on without waking up. Anything that uses reason, and bam, I'm awake. If I count, read, or recognize an inconsistency, it's over.

Consistency is possible, but only forward-facing. You can string together thoughts and events with purpose, and a lot of them will be consistent, but the moment you look backward and try to ask "did y follow from x?", the lights come on.


> but for things that concern the off parts (counting, reading, reasoning...) those elements are just missing

> Anything that uses reason and bam I'm awake.

Tangent: I can reliably wake up from nightmares / bad dreams by going "oh, that doesn't make any sense, that can't be". There's a moment of the bad dream "trying to adjust", but then it feels like something just gives up.


Wow, absence of error correction explains why dreams are the way they are... No wonder my dreams always start out normal and mundane, but end up with extremely exotic events in the end (even though I don't realize it during the dream).


I did notice that AI 'art' basically got less interesting as it got 'better' (read: higher fidelity). The collages of conceptual spaces in the article are infinitely more interesting than any of the prompt-driven attempts from the popular models. They actually communicate something about the thing they're depicting by showing the ways in which the model deviates from it. That's a real artistic expression.


Less interesting, yes.

Less appealing and useful? Absolutely not. Look at how the Midjourney sub grew. It was ultra-niche when it was v3 'dream scenery'. It's mega popular with v5 'ultra-realism'. Art involves technical execution, significantly so, not just 'artistic expression'.


The explosion in popularity of AI imagery isn't because the results became more artistic, but because they made it easier for people to mimic technical execution without understanding or caring about making art.

Put differently, it made it possible for them to have the technical execution without being able to understand or change why certain things are appealing to them. That's why most of it looks the same and why the vast majority of the results don't have any appeal beyond satisfying the ego of the person who generated it.

In a way, the appeal of 'modern' AI generated imagery to its users is similar to someone who thinks chess is cool, doesn't care enough to understand it, but wants to play anyway, so they use a mediocre chess engine to generate their moves against other players and rely on playing a massive number of games to make their win count look good.


> The explosion in popularity of AI imagery isn't because the results became more artistic, but because they made it easier for people to mimic technical execution without understanding or caring about making art.

It's almost like when people first started buying really fancy DSLRs with no other training, and took a lot of magazine-quality amateur photos.

SD raises the bar a little, though: it trained on a large body of professional images, so generating anything non-fantastic is probably going to be framed and composed half-decently.

...which is the problem. It's all too perfect. Wabi-sabi is dead; everybody can impersonate a professional artist/photographer now, so everything they're producing is the equivalent of photorealistic, award-winning, 8k motel art [...by Greg Rutkowski].


I disagree on the "all too perfect" part. It's pretty much exactly like when people first started buying fancy DSLRs and so they took a lot of photos that were technically great but weren't interesting content wise.

The images generated by SD etc are technically great, but the content is all 'same-y' with no individuality, to the point that people used to seeing images of that kind of art can very easily recognize when they're looking at AI generated images, even if they don't have any of the telltale technical errors (like poorly defined fingers).

The difference is just that getting a fancy DSLR was still on the path to potentially developing photography skills, while using image generators isn't. So while those interested enough in photography might've eventually developed their skills further, those using image generators just become abusive and disrespectful towards actual artists for not 'acknowledging' them as artists.


It's perfect until you want something specific, then it's a pain in the ass. Even with LoRA models and ControlNet and everything else, the lack of any essential control is always going to be limiting.


To play devil's advocate though, there's nothing wrong with creating something just to satisfy your own tastes. The problem comes with trying to pass yourself off as a professional artist using these tools.


Yeah, I agree with that, nothing wrong with people generating images for their own satisfaction and even sharing them as long as they're clearly tagged as AI generated. Just like how computer assisted chess is allowed to be its own competitive thing called 'Advanced Chess'.

However, just yesterday I came across someone trying really hard to play at being a pro artist, showing recordings of their drawing and painting process when they were obviously using AI (roughly tracing over the AI image, masking it, then recording themselves erasing the mask to reveal the AI image underneath).

In certain artistic sub-fields I'm interested in, like with anime art, I've seen a lot of vitriol thrown around by AI 'artists' because they haven't been through the learning process through which they could empathize with actual artists and the basic courtesies that exist in the community. This has been largely responsible for my negative opinion on the social value of recent developments in AI generated content.

To give some examples, there was an incident where an artist had been streaming themselves drawing something, only for someone to take the incomplete image from the stream, have an AI 'finish' it and post it as their own before the stream even finished. Then there was a more serious incident where someone finetuned a model on a specific popular artist's work and started to sell generated images under the original artist's name. This one ended with the site hosting the content having to change their rules to ban monetization of AI art.


> Art involves technical execution

I think it's a stretch to call Midjourney output art and technical execution. I think it's more "content" and "regurgitation". And most, not all, of what trends on the subreddit is pop culture mashups. It grew because people love Star Wars and Disney and Game of Thrones and so on and so forth.


> Less appealing and useful

Further instrumentalization of art to produce a median state of being appealing is IMO not a laudable goal. Everything Adorno predicted about the general course of the culture industry in Capitalist economies was more or less correct and AI art is just accelerating these trends.

Map this same logic to how we use language in creative works: would it ever be desirable to reduce all of language’s diversity and variability to a similar state? How is that distinguishable from the kind of Orwellian horrors that Americans love to decry?


The dream-like appearance of AI generations is really interesting. Humans, when creating art, use our representations of objects to control muscles that move a brush or pencil across (digital) paper. What we see from AI is what you get when you remove the "muscle module" and apply the representations directly to the paper. There's no considering how to fill in a pixel; the pixel is filled directly from the latent space.

It's intriguing. Also makes me wonder if we need to add a module in between the representational output and the pixel output. Something that mimics how we actually use a brush. How we consider what to paint where.


>Then the AI got better, and still the dreamlike things never go away. The constant problem with the hands, words never properly formed, clocks always look wrong, etc.

You might be a bit behind the times; Midjourney seems to have completely solved the hand issue from what I can tell.


Read the next sentence after your quote.


> Of course, the AIs can't wake up if we use that analogy.

One day in the next decade or two, they will. And it will be terrifying and glorious.


As I see it, Stable Diffusion is a camera you can tune to take pictures from alternate realities. The images it creates are of real people and subjects, just in some parallel universes.


Photo-surrealism is one of the most interesting facets of AI artwork in my opinion, and the closest thing to legitimate artistic expression since by definition you're trying not to replicate something. It's a shame all anyone wants to do is anime porn.




