After using SD heavily for a week, I half agree with this. It is incredibly disruptive, and it's wild how much it accelerates the creative process. I'll give you that.
But two things I've noticed:
First, artists will still have a massive advantage over non-artists with this tool. A photographer who intimately knows the different lenses and cameras and industry terms will get to a representation of their idea much faster than someone without that experience. Without that depth of knowledge, someone might have to rely instead on random luck to create what's in their head. Art curators might be well-positioned here since having a wide breadth of knowledge and point of reference is their advantage.
Second, we need the ability to persist a design. If I create a character using SD, I need to be able to persist that character across different scenarios, poses, emotions, lighting, etc. Based on what I know about the methods SD/Midjourney/DALL-E are using, I'm not sure how easy this will be to implement, or if it's even possible at all. There will always be subtle differences, and that's where an artist who can use SD for inspiration instead of mere creation will retain their advantage over a non-artist.
That said, holy crap. This tech is insane.
For every tool, a better output will always be created by people who specialize in a craft. Just as Photoshop revolutionized photography, to this day you can easily tell the difference between good and bad 'shops.
In video games, everyone is bad upon release and it's a level playing field. But as people practice and gain experience, they improve their usage and refine their approaches. You eventually see metas and best practices develop.
This will all play out exactly the same. Just give it some time and watch professionals create professional outputs, compared to paying for base-level work on Fiverr or whatever.
I agree, this would be an incredible tool. I can see how some of the outputs may help me improve a piece I'm working on, even if I would never use the model's output for my final product.
It looks like a decent way to produce concept sketches.
Currently, working with creatives (including programmers) is a very iterative process for non-creatives.
"I want this."
"No, I meant this."
"Can we try making that line longer?"
"Eh, I'm not feeling it. Why don't we try brighter colors?"
"Ugh. That looks obnoxious. Can we tone down the red?"
etc.
If the non-creative can use this system to reduce that iteration, it will make life easier for them (maybe not for the creative, if they are banking on the hours doing the iterations).
Does SD have the ability to take this kind of micromanagement input though? From what I've seen, it works off of a general descriptive prompt. Will adding a very specific "but put the duck 5 pixels to the left" or a very vague "give it more pop" to the prompt actually have the intended effect?
honestly i feel the opposite. one of the worst situations to run into as a designer is a client who is simply too married to a bad design. i would much rather work with something you scribbled on the back of a napkin than some highly rendered ai vomit that is just fundamentally shit.
There's one huge difference between Copilot and stuff like this. Art that's 98% correct is awesome. Code that's 98% correct is completely useless.
I think Copilot is going to live off hype for a while then tank and be looked back on as a failed experiment. Whereas I think that this kind of AI will eventually get to a point where it's extremely useful and could change up certain industries (game assets, marketing materials etc).
As a new user of Copilot for the last three months, I can't disagree more. I was initially skeptical, and I have noticed it often produces code that looks good but is wrong. However, it still saves me enough time in a single day to pay for the monthly subscription. That's a 30x ROI. I imagine it only gets better from here. I won't go back to programming without it.
Can I ask what kind of programming you do for it to be so helpful?
I mostly do maintenance of legacy codebases (also known as codebases, lol) where a lot of the work is figuring out where the changes need to be made and actually making the changes is frequently just a few lines here and there.
When I do have to figure out how to use some API, it's often not an open source one, so Copilot would not have it in its corpus.
I think these kinds of conditions are really common since software tends to last for maintenance longer than it is in initial greenfield development.
So I'm confused what kind of work benefits from Copilot. Just pumping out greenfield development of new websites/webapps that don't use much legacy or closed source code or services, just using existing popular open source libraries in commonplace ways?
The other thing I wonder about is code quality. When I look up API docs and stackoverflow examples, I get to read them all, maybe test some examples out in a CLI/REPL, and then decide carefully exactly what to do myself, what special cases to worry about or not, what errors to handle, etc.
Maybe what I end up writing is even the same as Copilot would have written. But in the process, I learn about finer details of the library and make detailed decisions about how to deal with rare edge cases. Might even end up writing a comment calling out a common pitfall I realized might exist.
My question is -- in order to save so much time with Copilot, are you still able to do all this extra thinking and deciding and learning (in cases that warrant it)? Or would doing that just end up consuming most of the time Copilot "saved"?
In other words, do you end up producing code much more rapidly, but at the expense of code that looks like a junior rather than a senior wrote it, because it is most concerned with working at all and doesn't worry about finer details? At the expense of not being as deeply familiar with the foibles of the API you're working with?
Honest questions as I haven't tried Copilot, and these are the thoughts that make me imagine it won't be of value. A lot of what I know I learned from doing the parts of the work that Copilot would be automating. Sure, Copilot would save me time when initially writing it. But would I then have less deep knowledge available when there's a fire in production because I never explored the fine details of my dependencies as much?
It's really more like a smarter autocomplete. I haven't tried it on a third-party API yet; we don't use many at work. I work at a startup on a Python and TypeScript codebase. To give an example, last night I was creating a unit test and Copilot filled in the assertions. It missed one it couldn't know about, and it got two wrong. But it was a lot faster. The most amazing case to me was with a function that transforms URLs for an image resize service. There was a bug in the function: it needed to return URLs ending in .svg as-is. I went to fix the bug by typing "if" and Copilot filled in `if url.lower().endswith(".svg"): return url`. It knew about the bug before I did. Too bad it couldn't do a code review when I originally wrote the function.
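The function itself isn't shown above, so here's a hypothetical sketch of what an image-resize URL transform with that fix might look like (the service URL and parameter names are made up for illustration):

```python
# Hypothetical image-resize URL transform; RESIZE_SERVICE and the query
# parameters are invented, not the parent commenter's actual code.
from urllib.parse import quote

RESIZE_SERVICE = "https://img.example.com/resize"

def resized_url(url: str, width: int) -> str:
    # The fix Copilot suggested: SVGs are vector images, so bypass the
    # resize service and return the URL unchanged.
    if url.lower().endswith(".svg"):
        return url
    return f"{RESIZE_SERVICE}?w={width}&src={quote(url, safe='')}"

print(resized_url("https://example.com/logo.svg", 300))   # returned as-is
print(resized_url("https://example.com/photo.jpg", 300))  # routed through the service
```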
They have a 60 day free trial. Try it out, it's one of the most interesting changes in developer tools in a while. I feel like I'm living in the future sometimes when using it.
I haven't tried copilot either, but one of the things I'd be curious about is how well it can conform to a company's coding style guidelines and/or match its coding style with the existing legacy code that's being modified.
One of the major annoyances of working as a team with legacy code is when someone forgets to, or deliberately avoids, conforming their code to the style and techniques of the surrounding code. Nothing grinds my gears like working in a 500-line C++ file, where_every_function_uses_underbars, has consistent 4-space indentation, avoids exceptions, and passes by reference, but right in the middle is that functionThatMiltonWrote that uses camelCase, has 8-space indentation, throws exceptions, and passes by pointer.
It works great in our codebase; it uses the current text in the file from above your cursor as reference.
So if you're creating a new file, it isn't always perfect, but once it catches on to your style it's seamless.
I haven't tried Copilot in languages less opinionated about style, so I'm not sure how well it handles that case. In Python, TypeScript, and Rust it seems to work well. In any case, I use auto-format on save, so that fixes some of the potential formatting issues.
It seems like you haven't used Copilot. Yeah, some of the harder bits I may have to code myself, but the amount of boilerplate it reduces is incredibly liberating.
Whoa! Really? Seems like modern software teams would be thrilled to achieve 98%, given the incredible number of bug fixes/patches released very quickly after the massive beta test known as release day.
Code has to be 100% correct or else it’s considered a bug (assuming it’s syntactically valid).
Code that is 98% correct is actually much worse than no code at all. That’s the kind of code that will introduce subtle, systemic faults, and years later result in catastrophic failure when the company realizes they’ve been calculating millions of payments without sales tax, or a clever hacker discovers they can get escalated permissions by passing a well-crafted request etc.
What's your metric for the "percentage the code is wrong"? Is it how many lines of code were wrong, or how many test cases the code fails?
Presumably, if AI-generated code passes every test case but would fail on edge cases that the human programmer(s) did not anticipate in their test suite, the humans might well have made similar coding mistakes had they personally written the code.
No, it didn't, but people who took only one semester of art classes may remain under the delusion that art is about making pretty pictures.
People who need illustration or graphics with no particular style can meet their needs with this tool but that is far from art. This replaces commercial illustration, not artists.
Exactly, this replaces general illustrative commercial art. This is great for illustrations at the head of fiction stories. However, fine artists produce art where the created object is merely the totem for their actual art: the intellectual conceptualization of an idea, often concerning the finality of life and the immature behaviors we play out within it. That type of Fine Art is completely out of reach of modern AIs. I'm sure there are attempts by AI developers to mimic such art, but that mimicking will be gibberish. It requires a sentient being to create Fine Art, because Fine Art is a 360-degree expression of existence, not some pretty picture.
I can see this replacing the clip art and misappropriated images in slide presentations that no one is paying for currently with a $20/month service that lets you do "line art angry man in toga at computer" to put into your talk on Kubernetes.
It might replace the "nice pic, do you have it at {size} so I can use it as a desktop image" comments on various social media sites.
I don't see it replacing an actual photograph as a wall hanging (or a painting) because they are subtly wrong in some ways. The reflection in the lake doesn't match the landscape... that one cloud has its lighting at slightly different angle than the rest.
Possibly... but I'm skeptical. For the type of photography that I do, these are things that come from an understanding of the world and its implications. You aren't going to see certain types of clouds in certain landscapes - they just don't form there. For example, having a cloud that indicates fast moving wind in a mountain environment in a plains landscape with a glassy smooth pond.
It's not that you can't paint that picture... but you'll never be able to capture that scene in camera. If someone was presenting that as a photograph, it would feel wrong to me because of an understanding of the meteorological criteria for the scene.
Yet, I am doubtful that I'll have a generated image at 11x17 that holds up to the same scrutiny that I apply to my own photographs.
I am absolutely certain that it will be able to generate images that are completely appropriate for contexts where you don't look at them for more than a minute at a time, or where they're used as complementary material for other content.
All that said, I am not concerned that I will get more or less sales of my photographs with AI generated art competing. The people who are going to pay for a photograph are going to pay for a photograph. Those who aren't - weren't going to in the first place.
Of course not, it just gained a new "advanced stable diffusion usage" track. Photography didn't displace painting from art school, and even outside art there are plenty of examples to be found. CNC machines are awesome, but every machining class in existence still teaches manual lathing and milling.
Not yet, but I can definitely imagine a future where these tools get more capable and refined, to the point where all the shortcomings listed above are overcome. Knowledge about cameras and scene composition is already encoded in the networks to some degree; it just needs to become more accessible. There's probably also a better way to seed new images than starting with random noise, so we could get similar variations more easily. We have already made the big step towards creativity and real-world understanding of objects and their lighting. The remaining issues are more technical, and unless we are incredibly unlucky and run into a true show-stopper, we'll probably all have access to a high-quality digital artist that can reduce production times dramatically.
You need to give some information about the scene to the network.
Camera settings are just shorthand to describe the field of view and depth of focus (at the very least). If you made that implicit, you'd still need to give the network the steradians, focal length, circle of confusion, etc. that you want your image to use.
You'd need to understand everything in Hecht's Optics to tweak all the parameters of an AI generated image.
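To give a feel for how much physics hides behind a phrase like "85mm at f/1.8", here's a minimal depth-of-field calculation using the standard thin-lens approximations (the numbers are arbitrary examples):

```python
# Standard hyperfocal/depth-of-field approximation; example values only.
def depth_of_field(focal_mm: float, f_number: float,
                   subject_m: float, coc_mm: float = 0.03):
    """Return (near, far) limits of acceptable focus, in meters."""
    s = subject_m * 1000  # work in millimeters
    hyperfocal = focal_mm ** 2 / (f_number * coc_mm) + focal_mm
    near = s * (hyperfocal - focal_mm) / (hyperfocal + s - 2 * focal_mm)
    far = (s * (hyperfocal - focal_mm) / (hyperfocal - s)
           if s < hyperfocal else float("inf"))
    return near / 1000, far / 1000

# An 85mm portrait lens at f/1.8, subject 2m away, full-frame circle of confusion:
near, far = depth_of_field(85, 1.8, 2.0)
print(f"in focus from {near:.2f}m to {far:.2f}m")  # roughly 1.97m to 2.03m
```

A prompt fragment like "85mm, f/1.8" is shorthand for all of this, which is the point: making the camera implicit doesn't make the parameters go away.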
That's an implementation problem, not a technical or conceptual one. Diffusion models have shown that they can learn practically all of these things if you make them sufficiently big.
i don't think it would be a massive stretch of the imagination to be able to train a robotic arm to hold a brush and reproduce something that is on screen, and do it really well.
Machines like that already exist, but specifically with oil painting there is a lot of important information in how the paint was applied. So it is really, really hard: too many decisions in the tools used, the angles of the strokes, the pressure, the thickness of the paint, etc.
You might make a painting that looks identical from far away, but when you come close it will be completely different.
Nobody gets paid to be a telephone operator or a cobbler or a steeplejack anymore either. If new tools come along that automate and further commoditize the production of art, I don't see the issue. Automatic driving for trucking, public transport, and taxis would economically impact vastly more people, and everyone here seems to be cheering for that.
There's nothing special about art. People will still do it as a hobby, like they do with a lot of other dead trades and professions. In many ways I think the personal aspect of the creation and sharing with people you know is the special thing, rather than the creation or performance of a commercial success (not that I've had any experience with the latter, mind you). And I'm not against the mass consumption of art; it just needn't be produced by people, and if it has the same entertainment value then that's great.
I grew up in a car dependent area and moved to a walkable big city in my teenage years, and only then became aware that cobblers were still a thing -- in fact, in NYC, they're not an obscure thing, but as common as bank branches.
I think that's why there's a lot of people online who think cobblers aren't a thing anymore. They're from car dependent areas. If you drive everywhere, shoe soles don't wear out much faster than shoe uppers anyway, so it doesn't make sense to care. But if you move to a walkable city, you'll suddenly find it quite economical, since the soles wear out far faster, and the cost of a sole replacement is less than a new pair of shoes, so you might replace the sole a couple times before discarding the shoe.
Well, you got me. I guess chimney sweeps and blacksmiths and loom weavers and carriage drivers still exist too; not sure about steeplejacks since the passing of Fred Dibnah. But you get what I'm trying to say, hopefully.
You included “cobbler” only because it sounds old-timey to you, and I was pointing out that it’s not.
Repairing expensive shoes is not an automated process. It’s more like fixing a roof leak, landscaping, or changing a flat tire. Jobs for those things still exist and aren’t going anywhere.
You're nitpicking one of the given examples without engaging the user on the point they were trying to make.
To be nuanced, maybe they might have said, "cobblers are less in demand now that many people have moved from owning fewer pairs of shoes they make last through repair to owning more pairs of shoes that they tend to get rid of when they are worn out due to changes in construction materials used in production," but if people have to write like that to make points, nobody will ever make a point.
It's a nitpick, but a little bigger than that. It's as bad as including "bus driver" in the list. Cobblers just shouldn't be included in the category at all.
Cobblers are in just as much demand in most of the world as they always have been. They only fell out of demand in car-dependent areas, which are a small minority of the world population (but a vast majority of the HN commenting population, since most of the USA outside of a few cities is car-dependent).
I don't know if it has anything to do with construction but doubt it. If you actually walk everywhere shoes don't last very long these days, especially shoes under $100.
There are a lot of other urban services that exist in almost the entire populated world, but that most Americans think quaint because they are not relevant to a car dependent highway world.
All that said, this really is a nitpick and the original point stands very well. Some of us just don't like it when car-dependent people forget that they are a small minority worldwide and instead treat urban walkable people as the insignificant minority! Or rather, HN being a forum for all things interesting, we find it interesting to make it a teachable moment. What could be more interesting than finding out that something that has always seemed obvious to you is actually backwards?
> I don't know if it has anything to do with construction but doubt it. If you actually walk everywhere shoes don't last very long these days, especially shoes under $100.
By construction, I mean the material and design of shoes people tend to wear. I can't say I've ever met someone who takes sneakers or running shoes to a cobbler and these shoes are more common nowadays.
Those kinds of shoes you mention tend not to last very long at all and are not resoleable. If you walk a lot, you find yourself throwing away the $80 shoe after just 6 months.
These kinds of shoes are most of the market because most people don't walk much in the USA. If you walk a lot, you might still not change anything and keep buying the disposable sneakers, throwing away $160 a year.
But if you walk a lot AND are disposed to think critically about the situation, you find that if you pay a bit more for shoes you can make them last many years as long as you resole them periodically. And as I recall, a good $50 sole on a good $150 shoe costs half as much and lasts three times as long as a disposable $80 sneaker.
Not only do you save money (not really a ton) but it actually is more convenient, since even counting resolings, you get more miles between having to go repair or replace your shoe. And you don't have to wear in your leather uppers again. It is truly a luxurious feeling when you come back from the cobbler and have shoes that are worn in and fit your foot just perfectly like a glove... yet the soles are brand new and strong and comfortable and ready for another thousand miles.
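Putting the numbers from this thread together (taking the $80 sneaker lasting six months, and the $150 shoe with a $50 resole lasting three times as long, at face value):

```python
# Rough three-year cost comparison using the figures quoted above.
months = 36
sneakers = (months / 6) * 80              # a new $80 pair every 6 months
resoled = 150 + (months / 18 - 1) * 50    # $150 shoe; $50 resole every 18 months
print(f"disposable: ${sneakers:.0f}, resoled: ${resoled:.0f}")
# disposable: $480, resoled: $200
```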
Why do you think there's a stereotype of leather boots being popular in NYC? I'm sure the resoleability and longevity in the face of large amounts of daily walking have a lot to do with it.
No, I included it because it used to be a common profession and now it is an extremely niche one. I could have also put chimney sweep in there, knowing that it's still a thing.
Someone is still going to pay for a person to perform or create art for them. Some professional driving jobs will continue to exist long after most are automated. When I said "nobody", that's what's known as hyperbole.
Yes, but the consequence for society is that professionalization is coming to every single field, and the blue-collar jobs are evaporating, and we had better figure out how to get all those people either white-collar jobs or some other form of income if we want to avoid significant social instability.
It used to be that you could find work as a political cartoonist and draw a picture of some satirizable politician each day - maybe Dr. Oz with a funny-looking vegetable platter, or Biden falling off a bike labeled "build back better," or something - and that would be a career. Now it seems you can just type "Dr. Oz holding a vegetable platter, political cartoon" into one of these AIs and the bulk of the work is automated. Sure, you could spend the rest of the day refining it, but nobody's really looking for perfection or transcendent skill in their political cartoons.
You can still find work making political videos of Dr. Oz and vegetables (e.g. https://twitter.com/JohnFetterman/status/1564432981841907713). Today's image generation AIs cannot do videos like that. Tomorrow's will, before we know it, and again nobody's looking for more than a baseline level of quality there either.
And even the local newspaper that might have been employing a political cartoonist is being swallowed by high-capital-ownership companies that can replace a lot of the writing with AIs. The expectations of quality are a bit higher - though the AIs have mostly gotten the hang of sports writing - but again people don't expect The Smalltown Gazette to have the standards of The New Yorker.
Sure, the highest-quality drawings, feature films, and longform journalism will get a lot better. And that's great! But most people don't work on such things. What are they going to do? If nothing else, how will they remain an audience with money to spend on the highest-quality works?
(I am not advocating for stopping this process, to be clear. Smashing the AIs isn't a coherent proposal, and smashing the physical machines of automation didn't actually work when the Luddites tried. I'm advocating for admitting that this process is happening and drying up employment, admitting that keeping as much of humanity as possible under a good standard of living is important for humanity, and figuring out what to do about it.)
For face generation, I think there are deep neural networks that can generate multiple views of the same face [1], [2]. Stable Diffusion already provides the ability to generate variations. So I don't think it is a stretch to imagine that these existing capabilities will only get better and/or be applied to SD.
[1]: Multi-View 3D Face Reconstruction with Deep Recurrent Neural Networks
[2]: Deep Neural Network Augmentation: Generating Faces for Affect Analysis
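For reference, "variations" in today's tooling mostly means re-running the same prompt with different seeds; a minimal sketch using the Hugging Face diffusers library (the model ID and prompt are just examples):

```python
import torch
from diffusers import StableDiffusionPipeline

# Minimal variations sketch with diffusers: same prompt, different seeds.
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

prompt = "portrait of a red-haired knight, oil painting"
for seed in range(4):
    generator = torch.Generator("cuda").manual_seed(seed)
    image = pipe(prompt, generator=generator).images[0]
    image.save(f"knight_{seed}.png")  # same concept, subtly different face each time
```

The cross-view consistency that [1] and [2] aim for is exactly what this loop does not give you yet.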
I haven't had a chance to try it yet, I'm optimistic but skeptical that it can have true persistence like I'm referring to. Especially since it requires training your own model which requires multiple images of the same asset/character/object/etc.
For what I think you have in mind, I suspect it will eventually not be "image to image" but "<ai thing> to <ai thing> + image" for the results to be remotely repeatable. That "ai thing" is probably the persistence you're talking about.
I think that thing will necessarily contain representations of dimension, behavior (physics/bones), and "style". Without the "ai thing", using only an image or text, the character would have to be impossibly well represented in the model for it to guess all of these things predictably. For example: what does that character look like in side profile, or from behind? What if it's an alien, and its arms should always bend backwards? Could a text representation ever completely describe this, with good reproducibility? Probably not. But I assume some non-human representation would have a better chance.
As is, if something known is required, I think the behavior of these models can be considered "destructive" to the input image, more often than not. For this reason, I think artists are safe, for the time being. :)
I think this can be seen a bit like the invention of an "index fund" for art. Active investors are still needed to generate the market signals that an ETF can aggregate, but for the majority of people an ETF is preferable to the cost of hiring an active investor yourself. And similarly SD needs artists to generate the signal that it aggregates, but for the majority of people it might be (or might soon be) preferable to use SD to get a "generic"/"average" result instead of hiring an artist yourself.
There is bound to be a smart kid who has already turned this idea into a shitcoin (meaning a pump-and-dump money grab, not an actual attempt to make an art index and tokenize it).
It seems someone indeed did this (the p&d, not the index): https://www.coingecko.com/en/coins/artonline
What about using this tech for ideation and artists for production?
You could use Stable Diffusion et al to create new characters based on a prompt, then farm the concept out to artists to produce individual works. Kind of like hiring a super expensive agency to design your new logo or brand identity, then using a stable of in-house designers to translate the concept into UI, ads, etc.
Honestly, I don’t even know if we’ll need humanity for the final product. It’s like using a chess computer to get an idea of a good move… and then a human to approve it? Adjust it? Be inspired by it?
Any signal humans give (painting x is better than y) is just another signal to encode. Take billions of such ratings and improve the AI's taste to superhuman levels.
In short, anything that humans would add to artistically improve the outcome is just another signal to be encoded. It’s weird to write it, but artistic creativity is deciding what new pixels go where, which is a search problem (in a large search space) which AIs are apparently doing great at.
We have a bias: we’re humans, we must be important somehow! But it comes down to a bigger neural net eventually outperforming the one in our heads.
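For what it's worth, "encode the ratings" is concrete and simple at its core. Here's a minimal sketch of a Bradley-Terry-style preference model in PyTorch, assuming pairwise ratings and image features from some frozen extractor (all names and shapes are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Learn a scalar "taste" score such that preferred images score higher.
# Features are assumed to come from a frozen extractor (e.g. an image encoder).
class TasteModel(nn.Module):
    def __init__(self, feat_dim: int = 512):
        super().__init__()
        self.score = nn.Linear(feat_dim, 1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        return self.score(feats).squeeze(-1)

model = TasteModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

# Stand-in batch: features of images rated "better" (x) and "worse" (y).
x, y = torch.randn(32, 512), torch.randn(32, 512)

# Maximize P(x preferred over y) = sigmoid(score(x) - score(y)).
loss = -F.logsigmoid(model(x) - model(y)).mean()
loss.backward()
opt.step()
```

This is the standard way pairwise ratings get turned into a differentiable training signal.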
The difference is that chess rules are extrinsic to the players at the board, whereas art is intended to communicate (emotions, narrative, ideas) to humans.
A neural net that can communicate with intent to human minds as well as a biological human is nothing short of strong AI. We’ll get there, but not with generative models.
> It’s like using a chess computer to get an idea of a good move
They did this for Backgammon and found styles of play humanity hasn’t discovered in millennia. Now humans can use those when playing each other and it makes an old boring game feel fresh and exciting.
The datasets these tools use don’t include any context. There’s no sense of what the images in the data might mean to the viewer, or how they relate to the time and place they were made. I would argue that means the tools will struggle to produce meaningful works, even if they become great at making beautiful works.
Useless? One can spend days/weeks/months enumerating different concepts for a design because of the roundtrip between "maybe we try <a prompt>" and an image. Now it's literally minutes, and a designer can draw you some ideas right in your office.
I've tried it too, and honestly I'll hire artists again (I have, several times). It's easy to come up with nonsense, so in the end I think this is a tool in the hands of artists more than anything else.
To actually threaten artists it would have to be much better. Maybe it's not the model, maybe it's the human-computer interface that fails, but in the end it's what we have.
Maybe someone with a lot of time on their hands can iterate enough times to come up with something nice that depicts what they wanted. Not for me.
That's possible, but my gut feeling is that this turns out to be the same as the DARPA Grand Challenge moment for self-driving cars.
It was an absolutely amazing accomplishment. I legitimately thought I could hold out long enough that my next car would be fully self-driving. Truckers were an endangered species.
But here we are 20 years later, and we’re still almost there.
We’ve made amazing progress and I love the self driving features I do have on my car, but how many jobs have been replaced by self driving cars?
I always wondered if we massively overestimate human creativity. Maybe the belief in it is ingrained in our culture and our very being; I've never heard the counterargument made, that humans are not that creative.
The creativity demonstrated by the AlphaZero chess engine blows Magnus Carlsen's mind (from his recent interview with Lex Fridman). I wonder if at some point in the future we'll finally throw in the towel and get out of the denial phase.
"AlphaZero would sacrifice a knight or sometimes two pawns, three pawns, you can see that it's looking for some sort of positional domination, but it's hard to understand. It was really fascinating to see. "
No, apparently we just massively underestimate how important studying the humanities is.
This is illustration, not art. I'm flabbergasted how many people are so quick to confuse or conflate the two.
For the record, my significant other is an artist and my entire life has been surrounded by creative people. I’m questioning human creativity, say, 50 years from now, and how much machine creativity we currently underestimate. Sure these artworks lack imagination and fidelity today. Appreciating art and humanities has nothing to do with projecting AI capabilities in the future, I’m flabbergasted how many people are so quick to confuse or conflate the two. Maybe we should require people to study logic/reasoning and how important it is.
50 years from now, I have little doubt that will be the case, barring serious civilizational decline. But that isn't the sentiment on this page.
Ditto for me regarding being surrounded by creative people. This isn't going to lose them any business any time soon, not even the younger ones five years from now.
AlphaZero was trained by playing against itself, with a clear win condition. You'd need to quantify what it means for one image to be "more creative" than another to get a similar result.
Without models trained on human creativity, what can AI do?
These AIs emerged because of human creativity.
Picasso created his new art form from his own creativity. He created something no one had ever thought of before. Now AIs are fed with Picasso's drawings and can maybe produce art that looks like his.
But what about creating something entirely new that has never been fed into the engine?
Could the AI invent something I'm about to dream tomorrow, and have it be an exact copy?
I think it's undeniable that AI can create novel things; the question is whether AI can create novel things that are also interesting. A randomized 600x600 PNG is novel, but it isn't at all interesting, and much of what makes a piece of art interesting is not a quantifiable or well-defined goal. That's not to say that AI is better than humans; just the opposite. Art is a deeply human object, and I do not know if we would appreciate art made and developed by AI in the same way we appreciate absorbing and creating art in response to each other.
I see your point.
We agree that for a 600x600 PNG there is a finite number of possibilities, given a set number of colors.
In principle you could "brute force" all possible images, which would be the equivalent of trying all combinations of letters to write a book. So creating things is not really the problem; creating relevant things with a purpose is.
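To see why "possible in principle" never becomes possible in practice, count the images (assuming 24-bit color, which the comment doesn't specify):

```python
import math

# Distinct 600x600 images at 24 bits per pixel = (2**24) ** (600*600).
# Compute the order of magnitude rather than the number itself.
pixels = 600 * 600                 # 360,000 pixels
bits = pixels * 24                 # 8,640,000 bits of freedom
digits = bits * math.log10(2)
print(f"about 10^{digits:,.0f} possible images")  # a number with ~2.6 million digits
```

Enumerating that space is hopeless, which is exactly why creating relevant things with a purpose is the hard part.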
The fact is, we know no other intelligence except ours. We are the only known beings that appreciate art or books, from our subjective consciousness.
Can AI create its own art, understood and appreciated by itself?
Does intelligence have any meaning without "humans"? It's part of the same thing. The AI is modeled after our intelligence, so it's not its own thing; it cannot understand what it creates, or whether what it creates is relevant. AI is just an amazingly powerful tool.
I believe creativity is primarily the process, something we enjoy doing, not a method to _produce_ or _generate_ something. Sort of like how dogs like playing fetch. Some creative processes do not create permanent artifacts at all, like all the performing arts.
I think there will likely be better ways to bias the training of specific instances - Something like a 'training library' with biased training data that you can plug in for your use case.
For example let's say you're a well known designer with a distinct style, and you train your own instance on your lifetime's body of work. Now you can generate whatever you want in 'your style' (just like you can now ask for a painting in Dali's style).
Now you've turned your style and design process into a factory: for every new client you can create whatever they want, in your style, with multiple examples, at the press of a button. Perhaps you could even sell that 'training library' to other designers?
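A rough sketch of what cashing in that 'training library' might look like with current tooling, assuming the fine-tuning itself (e.g. DreamBooth or textual inversion on your portfolio) has already been done; the checkpoint path and style token here are hypothetical:

```python
import torch
from diffusers import StableDiffusionPipeline

# Hypothetical: "./my-style-finetune" is an SD checkpoint fine-tuned on one
# designer's body of work; "<my-style>" is a learned placeholder token.
pipe = StableDiffusionPipeline.from_pretrained(
    "./my-style-finetune", torch_dtype=torch.float16
).to("cuda")

image = pipe("poster for a jazz festival, in the style of <my-style>").images[0]
image.save("client_concept.png")  # your style, applied to a new brief, in one call
```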
> A photographer who intimately knows the different lenses and cameras and industry terms will get to a representation of their idea much faster than someone without that experience
There are already websites which sell tailored "professional" prompts for DALL-E, GPT-3, etc.
I've seen an inpainting technique where you put in an existing character, then describe what their twin is doing, then crop out the original. It seems to persist the character, at the cost of fewer pixels to work with.
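A sketch of that workflow using diffusers' inpainting pipeline (an assumption on my part; the exact technique described above may differ):

```python
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

# "Twin" trick: keep the existing character on the left half, repaint the
# right half as their twin, then crop so only the new twin remains.
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

base = Image.open("character.png").convert("RGB").resize((512, 512))
mask = Image.new("L", (512, 512), 0)   # black = keep
mask.paste(255, (256, 0, 512, 512))    # white = repaint (the twin's half)

out = pipe(prompt="identical twins standing side by side, one waving",
           image=base, mask_image=mask).images[0]
out.crop((256, 0, 512, 512)).save("twin.png")  # fewer pixels, same character
```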
Well, it won't replace creativity, it just makes starting out easier.
SQL was designed so that business people could describe queries in English. How did that work out?
Now, of course, we have ORMs, which will get you data, sure, but often in egregiously inefficient ways if not used correctly. If you want to get it right, you still need to pop the hood and adjust things, and you have to know what you are doing.
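The canonical example is the N+1 query problem; a minimal self-contained SQLAlchemy sketch (the models are invented for illustration):

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, joinedload, relationship

Base = declarative_base()

class Author(Base):
    __tablename__ = "authors"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    books = relationship("Book", back_populates="author")

class Book(Base):
    __tablename__ = "books"
    id = Column(Integer, primary_key=True)
    title = Column(String)
    author_id = Column(Integer, ForeignKey("authors.id"))
    author = relationship("Author", back_populates="books")

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    # Naive ORM usage: one query for authors, then one more per author when
    # .books lazy-loads -- the classic "N+1 queries" problem.
    for author in session.query(Author).all():
        print(author.name, [b.title for b in author.books])

    # Popping the hood: eager-load the relationship in a single JOINed query.
    for author in session.query(Author).options(joinedload(Author.books)).all():
        print(author.name, [b.title for b in author.books])
```

Both loops print the same thing; only the person who knows what SQL is being emitted notices the difference.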
> A photographer who intimately knows the different lenses and cameras and industry terms will get to a representation of their idea much faster than someone without that experience.
Sure. Only now the gap has been shrunk by a factor of infinity, because the time to representation was infinite for someone without that experience before.
Ultimately software engineering isn’t about code, it’s about describing in sufficient detail exactly what you want. Of course, this just leads to business analysts and ux designers being the next ones to be replaced after coders… :P
Is it possible to provide another image as a prompt for SD? For example, could you provide a simple drawing of a house and expect it to render a house?
You can, using the img2img method built on top of SD. Another option is to create an embedding of a concept via textual inversion and then use the embedding to guide SD generation. Both methods are possible using this: https://github.com/hlky/stable-diffusion-webui
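For the house example, the diffusers version of img2img looks roughly like this (a sketch; the webui linked above wraps the same idea, and older diffusers releases name the image argument init_image):

```python
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

# img2img: start the denoising from your drawing instead of pure noise.
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

sketch = Image.open("house_drawing.png").convert("RGB").resize((512, 512))
result = pipe(
    prompt="a cozy cottage, detailed architecture photograph",
    image=sketch,       # called init_image in older diffusers versions
    strength=0.75,      # how far to stray from the drawing (0 = keep, 1 = ignore)
    guidance_scale=7.5,
).images[0]
result.save("house_render.png")
```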
Thank you for this! I've been playing with the optimizedSD for a while but couldn't get what I wanted out of it. This guide makes sense to me. Gonna give it another shot on the weekend!
It'd have to get itself into the dataset, as if it was 'Snoop Dogg' or some other identifiable person that the AI can reproduce. The degree to which you can generate stuff that is deepfake 'Elon Musk' doing whatever you say, is the degree to which you can invent a character and have the AI generate images (or video) of it following (sort of) your script and directions.