
No, almost certainly not - it's very heavily censored. So far I haven't seen anyone successfully generate anything you could call erotica. (One interesting consequence: the anime is not very good and it can be a struggle to generate anime at all https://www.reddit.com/r/AnimeResearch/comments/txvu3a/anime... )

If you just mean, could such a model architecture be trained to generate furry erotica? Yes. Erotica, and furry art in general, in Tensorfork's experience, tends to be somewhat harder than regular images because of the more chaotic placement of everything, such as limbs, but not that much harder. You might need half again as much compute (about 1.5x) to get equivalent-quality results, perhaps, but probably not, like, 10 times as much.




If this is the same old CLIP model, I tried making anime with CLIP+GANs (usual prompt "chitanda from the anime hyouka") and it got the hair color right but nothing else. It seemed to have first-page-of-Google-Image-Search-level knowledge.

There have to be a lot of limits once you get far enough out of its training data. Even if you could guide it by image prompts, it has to know what that image is meant to represent.

Also what most people mean by "anime style" isn't anime style, of course - they actually want one of Avatar-type cartoons, advertising key visuals, game characters, or pixiv/DeviantArt fanart. None of that is drawn the same way as TV anime.


They trained a new noise-aware CLIP on the same dataset, I believe. The CLIP is frozen, though, in the second stage, where they train the image generator on a cleaner dataset; generation is done via diffusion conditioned on a CLIP image embed.

They get the image embed from another model that predicts image embeds from text embeds, to encourage generalization when sampling.
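
To make the data flow described above concrete, here is a toy sketch in Python. Everything in it is a made-up stand-in: `clip_text_embed`, `prior`, `decoder`, and `EMBED_DIM` are hypothetical names, and the arithmetic inside them is placeholder math, not anything from the actual model. It only illustrates the order of the stages: caption, then text embed, then predicted image embed, then iterative denoising conditioned on that embed.

```python
import random

EMBED_DIM = 8  # toy size (assumption); real embeddings are far larger


def clip_text_embed(caption):
    """Hypothetical stand-in for the frozen CLIP text encoder."""
    rng = random.Random(caption)  # string seed keeps the toy deterministic
    return [rng.gauss(0, 1) for _ in range(EMBED_DIM)]


def prior(text_embed):
    """Hypothetical stand-in for the model that predicts an image embed
    from a text embed. Here it is just a fixed affine map."""
    return [0.5 * x + 0.1 for x in text_embed]


def decoder(image_embed, steps=4, size=4, seed=0):
    """Hypothetical stand-in for the diffusion decoder: start from noise
    and repeatedly nudge it toward something determined by the embed."""
    rng = random.Random(seed)
    pixels = [rng.gauss(0, 1) for _ in range(size * size)]
    target = sum(image_embed) / len(image_embed)  # toy conditioning signal
    for _ in range(steps):
        # each step halves the distance between every pixel and the target
        pixels = [p + 0.5 * (target - p) for p in pixels]
    return pixels


def generate(caption):
    """Caption -> text embed -> predicted image embed -> 'image'."""
    return decoder(prior(clip_text_embed(caption)))


image = generate("chitanda from the anime hyouka")
print(len(image))  # a flat 4x4 toy "image"
```

The point of the split is that the decoder only ever sees image embeds, so the text side can be swapped or improved without retraining it.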

It does suffer from some of the same issues with binding attributes and counting stuff, as far as I can tell. But that's not necessarily related to its ability to compose concepts.


I'm curious, how expensive is it to train these models? I wonder if it would be possible to crowdfund the training, especially if targeted at a group that is (at least stereotypically) as spendthrift as furries.


AFAIK the costs have not been published, but assuming you'd have access to the data (which is tricky, and getting a different, equivalent dataset may involve lots of time/effort/legal issues/cleanup work), my estimate is that the compute for this would cost in the ballpark of $1m-$10m.

But if you don't want a universal model but just a specific niche (e.g. furries) then you should be able to train a much smaller model on much less data and get something interesting much cheaper.



