
Diffusion Bee is the easiest way to run Stable Diffusion locally on your M1 Mac. Comes with a one-click installer. No dependencies or technical knowledge needed.

Features:
- Full data privacy - nothing is sent to the cloud
- Clean and easy to use UI
- One click installer
- No dependencies needed
- Multiple image sizes
- Optimized for M1/M2 Chips
- Runs locally on your computer




Does it work with pornographic or potentially pornographic prompts?


All I want is the ability to use it in peace without getting a threatening message whenever I ask for something vaguely edgy - and that need not be porn either.


And the real questions come out.

Seriously though, I imagine this is less a case of whether this specific implementation permits pornography, but whether any porn was included in the dataset it was trained on. No matter how good AI is, it only knows what it knows.


"Stable Diffusion" is referring to one specific set of models, which were trained on a dataset including some porn.

The official implementation has a second model that detects pornography and replaces outputs containing it with a picture of this dude https://www.youtube.com/watch?v=dQw4w9WgXcQ (Not kidding). Removing that is a really simple one-line change in the official script.


> Not kidding

I was amused by this when reading the source. Here’s the function that loads the replacement.

https://github.com/CompVis/stable-diffusion/blob/69ae4b35e0a...

It looks like removing line 309 of the same file would disable the check, but I haven’t tried it.


This is the line to change to disable the nsfw filter: https://github.com/CompVis/stable-diffusion/blob/69ae4b35e0a...


I don't see this line in V2.1


I guess the question is whether this toggle is available in the GUI, or if someone has to edit source code


For the CLI version there's no UI; you have to open the Python file and stub out the NSFW check (trivial though).
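
Roughly what that stub looks like in the official txt2img script - a hedged sketch only, the names approximate the CompVis repo linked above and I haven't run this exact patch:

    # Hedged sketch, not a verified patch: in the official script the decoded
    # images pass through a check_safety() helper that swaps flagged images
    # for the replacement picture. Replacing it with a pass-through like this
    # disables the filter.
    def check_safety(x_image):
        # Report every image as safe and return the batch unchanged.
        has_nsfw_concept = [False] * len(x_image)
        return x_image, has_nsfw_concept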


If you're curious you can see some of the imagery used to train SD here:

https://laion-aesthetic.datasette.io/laion-aesthetic-6pls/im...

This is just a small fraction of the imagery and it does include pornography.

I respect that if you're online and reading HN, you're probably mature enough to handle seeing pornography. So if you're curious to see some of the training data that made it in: choose from the dropdown "-column-" and change it to "punsafe" and set the value in the adjacent field to "1", then press Apply.

Obviously this will show pornography on your screen.
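
If you'd rather script it than click through the dropdowns, Datasette instances also expose a JSON API. A rough sketch, assuming Datasette's standard column__exact filter syntax; paste the full table URL from the link above in place of the placeholder:

    import requests

    # Placeholder: use the full table URL from the link above, with .json appended.
    TABLE_URL = "https://laion-aesthetic.datasette.io/laion-aesthetic-6pls/<table>.json"

    # Datasette's standard filter syntax: ?<column>__exact=<value>
    resp = requests.get(TABLE_URL, params={"punsafe__exact": "1", "_size": "20"})
    for row in resp.json().get("rows", []):
        print(row)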

An article which talks about the imagery and how this browser came about is here: https://waxy.org/2022/08/exploring-12-million-of-the-images-...


I mean I don't know why we have to talk around this point: yes, you can definitely generate pornography with SD. Be prepared to see a lot of weird things, because that super-realism really means it just drives right into the uncanny/body horror valley at full speed.

But the second point here is also wrong: the whole reason these models are interesting is because they can generate things they haven't seen before - the corpus of knowledge represents some type of abstract understanding of how words relate to things that theoretically does encode a little bit of the mechanisms behind it.

For example, it theoretically should be able to reconstruct human-like poses it has never seen before, provided it has examples of what humans look like and something which transposes to an approximate value - an obvious example in the context of the original question would be building photorealistic versions of a sketched concept (since somewhere in its model is an axis which traces from "artistic depiction of a human" to "photograph of a human" in terms of style content).

Of course, most people aren't very good at drawing realistic human poses - it's a learned skill. But the magic of deep learning is really that eventually it doesn't need to be - we would hopefully be able to train a model, one which can be easily copied and distributed, that represents the skill, and SD is a big step in that direction (whether it's a local maximum remains to be seen - it's dramatic, but is it versatile?)
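
That sketched-concept example is roughly what SD's img2img mode already does. A minimal sketch with the Hugging Face diffusers library - the model ID, file names, and strength value are illustrative assumptions, not how Diffusion Bee itself is wired up:

    # Illustrative img2img sketch with Hugging Face diffusers; names are assumptions.
    from PIL import Image
    from diffusers import StableDiffusionImg2ImgPipeline

    pipe = StableDiffusionImg2ImgPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
    pipe = pipe.to("mps")  # Metal backend on Apple Silicon; use "cuda" on a PC GPU

    sketch = Image.open("pose_sketch.png").convert("RGB").resize((512, 512))
    result = pipe(
        prompt="a photorealistic photo of a person in this pose",
        image=sketch,
        strength=0.75,        # how far to drift from the input sketch
        guidance_scale=7.5,
    ).images[0]
    result.save("photoreal.png")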


>it can generate porn

It's not good at penises or vaginas, just breasts and butts. I can find what I like pornographically without AI. But the ducking and dodging around nudity and sexuality is childish and tiresome, and we ought to discuss this topic as disinterestedly and nonchalantly as we do the other things it excels or struggles at. Penises and vaginas are not somehow more vile than other things.

Sure, sexual content ought to be age-gated and there is potential for abuse (slapping a person's face onto explicit imagery without their consent isn't cool, y'all). But are we working toward AGI or not? Because at some point it's gonna have to know about the birds and the bees.


I think you mean vulva, not vagina[1]. Vaginas are mostly internal.

[1] https://www.allure.com/story/vagina-vulva-difference-planned...


> to know about the birds and the bees

:) made me chuckle. AGI would need to know how to be evil, as well, right?


isn't it obvious yet?

we're creating the speakwrite machine.


Are we talking about Artificial God Intelligence?


> But the second point here is also wrong: the whole reason these models are interesting is because they can generate things they haven't seen before

No, it's essentially generating mashups of its training data, which can be very interesting.

So a model that hasn't been trained on a lot of porn will of course do a very bad job at generating porn.


Mashups is the wrong way to think about it. It's generalizing at a higher level than texture / image sampling and it can tween things in latent space to get to visual spaces that haven't been explored by human artists before.

It requires a good steer and prompting is a clumsy tool for fine tuning - it's adequate for initialization but we lack words for every shade of meaning, and phrase weighting is pretty clumsy too, because words have a blend of meaning.
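
To make "tween things in latent space" concrete: you interpolate between the starting noise latents of two images and decode each intermediate point. A small sketch of the spherical interpolation people commonly use for this; the latent shape is an assumption about SD's 512x512 setup:

    import torch

    def slerp(t, v0, v1):
        # Spherical interpolation between two latent tensors, so intermediate
        # points keep a sensible norm instead of washing out like a plain lerp.
        a, b = v0.flatten(), v1.flatten()
        dot = torch.dot(a / a.norm(), b / b.norm()).clamp(-1.0, 1.0)
        theta = torch.acos(dot)
        if theta.abs() < 1e-4:  # nearly parallel: plain lerp is fine
            return (1 - t) * v0 + t * v1
        return (torch.sin((1 - t) * theta) * v0 + torch.sin(t * theta) * v1) / torch.sin(theta)

    # Assumed SD-style latent shape (1, 4, 64, 64) for a 512x512 image.
    start, end = torch.randn(1, 4, 64, 64), torch.randn(1, 4, 64, 64)
    frames = [slerp(t, start, end) for t in torch.linspace(0, 1, 8)]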


"It's generalizing at a higher level than texture / image sampling and it can tween things in latent space to get to visual spaces that haven't been explored by human artists before."

The very fact that the model is interpolating between things in the latent space probably explains why its images haven't been explored by human artists before: because there is a disconnect between the latent space of the model and genuine "latent space" of human artistic endeavor, which is an interplay between the laws of physics and the aesthetic interests of humans. I think these models know very little about either of those things and thus generate some pretty interesting novelty.


I think of artistic endeavour as a bit like the inverse of txt2img, but running in your head, and just projecting to the internal latent space, not all the way to words. It's not just aesthetic, it's about triggering feelings through senses. Images need to connect with the audience through associations with scenes, events, moods and so on from the audience members' lives.

Aesthetic choices like colour and shapes and composition combine with literal representations, facial emotions, symbolic meanings and so on. AI art so far feels quite shallow by this metric, usually only hitting a couple of notes. But sometimes it can play those couple of notes very sweetly.


It's only a question because the devs went out of their way to limit it, to everyone's surprise.

It's like if Photoshop broke itself when you tried to modify or create anything nude or provocative, as the default.

That would just be weird, and that's what these AI software devs have done.

So everyone patches that contrived feature flag, but nobody knows if they patched it.


Fun fact: Photoshop does exactly that if you open images of currency. See: https://helpx.adobe.com/photoshop/cds.html


As if anyone who's really going to print money can't just open it in GIMP.


Probably useful to stop 12 year olds from "accidentally" committing felonies though


It used to just give a warning


It has to be. There was a sub (now closed) called r/UnstableDiffusion, where people shared their porn output from SD.

The filter should be easy to remove and there are already people who simply removed the filter.


Based on the upvotes, inquiring minds _really want to know_!


It can make porn images, but I haven't been able to get consistently good results. There are probably ways to tweak the parameters to improve things, but I haven't figured that out.

It mostly understands what naked people look like, but the images I've generated involve a lot of accidental body horror. You get a lot of people with extra arms, weird eyes, or body parts in the wrong places. The fact that they explicitly removed porn from the training set comes through pretty clearly in the model.

I suspect it could be improved a lot with some specialized retraining. As far as I know, nobody has done that work yet.


In particular it is very confused about genitalia, it tends towards hermaphrodite.


How do you see upvotes? I've never had any visibility into upvoted comments (downvoted of course, yes), or at least it's not obvious at all to me and I see no indication of it.


You can see the total score (upvotes - downvotes) for your comment where you see the upvote/downvote buttons for everyone else's comments. Or at least I can.


I’ve always been curious about that, too. I don’t think I see anyone else’s votes, up or down.


Yes (I checked)


Thank you for this. How hard would this be to port to ipad?


Probably isn’t fast enough.


The Pro and Air models have M1 chips, roughly on par with a MacBook Air


Oh wow! Didn’t realize.


iPad Pro got it in spring 2021, so an M2 refresh seems likely too. October event along with an M2 MacBook Pro refresh? Or maybe not until spring 2023.

Another comment mentions RAM capabilities. Unfortunately that’s tied to the storage tiers instead of being something you can pick separately, so if you want 16 GB of RAM you have to buy the 1 TB or 2 TB models. Meaning for a 12.9” iPad Pro, if you want 16 GB you’re looking at an $1800 tablet. Not ideal.


RAM might hold it back?

Still, probably not too hard to build for iPad or iOS.


The iPad Pro has a 16 GB option. It's essentially the same hardware as the MacBook.


Do you mind putting a license on your repo? As it is right now, Diffusion Bee is technically not open source.


If it works, it's incredible!


It works, well, incredibly.


How is this possible without a dedicated GPU? I thought Stable Diffusion would require far more horsepower.


M1s, especially the Max and Ultra, have pretty decent GPUs.


Horsepower can make it go faster, but the major limitation is graphics memory. Graphics cards with under 12 GB of memory can’t handle the model (although I believe there are lower-memory optimizations out there), which means you need a pretty high-end dedicated graphics card on a PC. But because Apple Silicon chips have reasonably fast on-chip graphics with integrated memory, it can run pretty efficiently as long as your Mac has 16 GB or more of RAM.
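
Those lower-memory optimizations exist on the Python side too; for example, the Hugging Face diffusers library can trade a little speed for a much smaller peak memory footprint. A sketch under the assumption you're using diffusers rather than Diffusion Bee's bundled runtime (model ID and device are assumptions about your setup):

    # Sketch: running SD with reduced peak memory via attention slicing.
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
    pipe = pipe.to("mps")                 # Apple Silicon Metal backend; "cuda" on a PC
    pipe.enable_attention_slicing()       # compute attention in chunks to cut memory use

    image = pipe("a watercolor painting of a lighthouse at dusk").images[0]
    image.save("lighthouse.png")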


They have an integrated GPU that I believe Apple claimed was comparable to the RTX 3090 (perhaps since debunked, though, or at the least a misleading claim).


Apple compared the M1 Max to the RTX 3080 (mobile), which was a stretch.

The M1 Ultra was compared to the RTX 3090, which was a larger stretch.

The M1 Max delivers about 10.5 TFLOPS, the M1 Ultra about 21.

The desktop RTX 3080 delivers about 30 TFLOPS and the RTX 3090 about 40.

Apple’s comparison graph showed the speed of the M1s vs. RTXs at increasing power levels, with the M1s being more efficient at the same watt levels (which is probably true). However, since the graph stopped before the RTX GPUs reached full potential, the graph was somewhat misleading.

The M1 Max and Ultra have extra video processing modules that make them faster than the RTX GPUs at some video tasks though.


TFLOPS isn't an orange-to-orange comparison.


I believe that's cherry-picked data. More specifically, Apple says it's comparable to the 3090 at a given power budget of 100 watts. They don't mention that the 3090 goes up to 360 watts.


The relative feebleness of x86 iGPUs is partly about the bad software situation (fragmentation etc., the whole story of how WebGL is now the only portable way) and a lack of demand for better iGPUs. AMD tried the strategy of beefier iGPUs for a while with a "build it and they [software] will come" leap of faith but pulled back after trying for many years.


Thanks for sharing. Can you say a little bit more about what prompted you to make this?


Well, all the other offline tools didn't seem very intuitive to install for someone without technical knowledge.


Even for someone with technical knowledge this is a breath of fresh air... why go through the trouble of even writing a script? I just wanna click a button and see something.



