Hacker News new | past | comments | ask | show | jobs | submit login
BlenderGPT: Use commands in English to control Blender with OpenAI's GPT-4 (github.com/gd3kr)
612 points by alexzeitler on March 26, 2023 | hide | past | favorite | 294 comments



"The generated code might not always be correct. In that case, run it again lmao" is the best documentation I've read all week.


If you watch the video (I recommend at 0.5x speed) the author adds "you feel me?" to one of the prompts and I laughed.


I've been building a ChatGPT project this week and this explanation is so true it made me actually lol. Sometimes it's impressively flawless, sometimes it gets stuck in broken patterns... Even if the inputs are identical.

"Monkey around with the prompt and pray" has replaced unit testing.


Exactly my experience. I've asked for some time metrics to see if the repeat is actually taking longer than the work.


Ooof


From authors website: "I graduated high school in May of 2022..". Elsewhere, he writes he placed first at a university level drone programming challenge at IIT-Bombay Techfest. Very impressive.


The JEE exam in India is very advanced, I doubt most Europeans or US students could pass it, even after a year or two at university. So those who enter IIT are already quite good very early.


India's high-stakes exams are also plagued by cheating.[1]

So just because someone ostensibly "passed" an exam doesn't mean they actually did.

Of course, India's education system is far from the only one where cheating is rampant, which is just one reason to look on credentials and accolades everywhere with a skeptical eye.

[1] - https://www.economist.com/asia/2022/05/26/indias-exams-are-p...


Compared to paying lacrosse coaches in Stanford - as if US universities are any better!

Your rich dad endows a building in an Ivy and you are in - and apparently it's all good!


I have taken a look at some test papers, it's basically stuff we were doing back in the 80s/90s in France at the end of high school, nothing crazy, I aould even say trivial as it looks to be direct application of the course, and if it's just a MCQ, then it's a bloody joke, even today. And I believe, it should be easily tackled by any current gymnase students following the physic /math stream in Switzerland. The actual question would be more why the level in Europe (and in a lesser extend) north America dropped this much over the last 30years.


Shouldn't the ultimate test of an education system be the practical outputs that serve society? We use testing for more immediate results, but there's nothing particularly telling at a more practical level that engineering in the EU and NA is falling behind India, as far as I can tell innovation in STEM is still taking place frequently in EU and NA.


IIT colleges in themselves are not even in top 100 Universities of the world. If you train a lion in a circus, it won't be the king of the jungle. So if we were to believe your statement: > I doubt most Europeans or US students could pass it, even after a year or two at university.

than this allegedly miraclous students are wasting there talents studying in IITs!


USA and European university rankings have a strong bias against universities in the Global South (especially India), in part because of the stigma against Indian programmers and in part because they're generally wanting to promote US and European universities as the global leaders. If you're going to disparage a university based on nothing but ranking lists run by USA and Europeans, you're missing out.

IIT students are amazing, i've worked with them. Far more creative and competent programmers than the average Ivy League grad in my cohort.


So the racism card again? Why would two independent governing bodies group together to oppress a single country?

Would citizens of the EU and US not want to go to one of the best schools regardless of locale?


> Why would two independent governing bodies group together to oppress a single country?

Not that it’s the same, but have you looked at either of these countries’ pasts? Even relatively recently— the Marshall Islands, or still ongoing, in Mauritius/Diego Garcia.


This is not exactly true either. Some IIT students are terrible programmers, others not. But they do have certain strengths and their math training is very strong.


There was recently a German math professor who famously compared the JEE to some German standards. It‘s worth looking into.

I say this, having studied at a European university as well as visited IIT and a top US university. These are different traditions with different strength and weaknesses but my point was not to say that IIT is better, it was to put the OP comment into a context.


As someone who attended both an IIT (as an undergrad) and a top 10 US university (for my PhD) - think of IITs as very selective undergrad colleges in the USA, such as Swarthmore or Amherst.

IITs are way more teaching focused than research focused, and are have a very light international student presence - two factors that enormously influence rankings.


We also use GPT to perform actions in the software I build at work and we hit the same issue of inconsistency which lead me down a long rabbit hole to see if I could force an LLM to only emit grammatically correct output that follows a bespoke DSL (a DSL ideally safer and more precise than just eval'ing random AI-produced Python).

I just finished writing up a long post [1] that describes how this can work on local models. It's a bit tricky to do via API efficiently, but hopefully OpenAI will give us the primitives one day to appropriately steer these models to do the right things (tm).

[1] https://github.com/newhouseb/clownfish


Copilot has an option (maybe beta users only) to see alternative generations for a prompt. It's really handy because some generations are one-liners while others are entire functions.

Perhaps this is infeasible with the current cost per generation but a multiple choice display could be handy (perhaps split the screen into quadrants and pick the one that most fits what you prefer).


Google Bard has a similar feature, and you can ask the OpenAI API to generate multiple completions and pick the best one (best_of) or return some number of the generated completions (n). In OpenAI's case, best of means "the one with the highest log probability per token".


Hey everyone makes mistakes when coding, even LLms.

There’s two approaches I use, I’m sure there are more.

* Do multiple completions and, filter the ones that successfully run, and take the most common result.

* Do a completion, if it fails, ask the Llm to find and correct the bug.


This sounds like reducing the temperature with more steps


Reducing temperature doesn't automatically result in correctness, it only results in more precision.

It could be inaccurate with high precision - like a bow that consistently undershoots, and you would have a harder time trying to correct it.


You're right, and I understand that - but you're less likely to get the same result with high temperature, so it will be hard to find correct outputs when using overlap of output. But combining pieces that look good to you of several outputs is a good way to use it. Much of the natural language it creates is useful to me as essentially an extended thesaurus, or to make slightly different points I hadn't thought of it emails. It undoubtedly (demonstrably and probably, I'm sure) knows *way* more than any human on Earth, so it's great at teaching people new things. I usually just reword everything slightly that GPTs provide, so it's a *phenomenal* learning tool.


Why aren't they collecting feedback signals from people on the results produced by the model? "This was correct", "this was garbage", etc.


What if someone was asking to generate garbage in Blender though?


I know this tech is terrifying to actual 3D artist who don’t want be a “prompt engineer”, but as someone who has never used Blender, I think its cool that I can create something using tools like this and use them in my projects, ex background animation on a website hero section.


I don't think this is "terrifying" to 3D artists any more than github copilot is terrying to us haha, I'm not sure why we engineers have this tendency of imagining every other industry as being populated by superstitious peasants from the dark ages who fear new tools.

In my experience the vast majority of people are excited about GPT including artists.


I don't know, maybe I am one of those "superstitious peasants from the dark ages who fear new tools", but I'm increasingly terrified of GPT-4 and future iterations of it as applied to the industry I work with (i.e. software). It does seem to threaten to suddenly eliminate all the interesting parts of the job, a good chunk of openings, and to significantly reduce salaries for the openings that remain - all at the same time, and rather suddenly.


In the past, when tools that significantly increased developer productivity emerged, like higher level languages (C, Java, Python), better IDEs, or better access to help (e.g. StackOverflow), the demand for more software has outpaced any decrease in demand for developers due to productivity improvements.

I'm not saying I know that's going to continue forever, but it might. If the cost to produce software goes down, the demand for software will increase. That's what has always happened, but maybe this time is different.

Everyone always thinks this time is different though, it's good to be skeptical of thoughts like that.

My take is that if you stay on the cutting edge and get good at using all kinds of tools with max productivity, you'll probably end up as the tractor driver rather than as an unemployed ox.


People in the textile industry were well paid and given a place to live at one point in history. Then the auto loom showed up and they were kicked to the street to starve. I wouldn't want to try to live in any major US city on a blue collar salary.


Maybe one day AI will allow completely unskilled laypeople, ignorant of software, to build and deploy software end to end with no involvement from any software engineers, testers, SREs, anything. That would be the power loom of our time - the power loom allowed unskilled workers to completely replace skilled hand loom weavers with no drop in quality of output.

When that's possible, the world will be very different. Right at this moment, AI is still useless for unskilled workers trying to write software, it's just a productivity multiplier for skilled engineers.


This isn't an all-or-none scenario. It's not like all textile factories got auto-looms and the labor market collapsed overnight. Tools will improve, productivity will improve, and the demand for software development as a specialty will wane dramatically. Making simple tools using prompts will no longer require knowledge of data structures and algorithms, efficiency, networking, or anything else we get paid to know, and over time will shift to something white collar workers put on their resume next to MS Office. A tiny handful of specialist engineers will control development, and software creation as a commodity will essentially be automated. We will feel the impact of these changes LONG before that process is complete.


> When that's possible, the world will be very different. Right at this moment, AI is still useless for unskilled workers trying to write software, it's just a productivity multiplier for skilled engineers.

Depends on the software, and how much mediocrity the end user is willing to put up with.

A trivial prompt can spit out a web page with functioning JavaScript for a mediocre-but-playable version of Pong.

This may not be of interest to us, but our standards are not necessarily shared by normal people: in the wild, I've seen websites where the thumbnails were all loaded as full-sized images and merely displayed smaller, bottles on supermarket shelves whose labels had easily visible pixelation and JPEG artefacts.

Infamously, there's a lot of stuff done in Excel that really shouldn't be. Some genes had to be renamed because scientists kept using Excel, and Excel kept interpreting the gene's names as dates.

I get SMSes whose sender ID has obviously involved someone somewhere trying to record phone numbers as floats.

Even in places with high standards, the UI of the Calculator app on iOS still gets confused if I tap buttons too fast (before animations finish playing?).


The problem with this is:

Do you want to listen to a 10 brand new songs by 10 brand new artists using generative AI, or do you want to listen to 10 brand new Taylor Swift songs (that were created with the help of generative AI)?

While some people will be able to leverage this to good effect, I fear the established have much more to gain in this new world…


This, and also every textile worker didn't have an auto loom in their pockets, or a lawyer, or an accountant, or a copywriter....I mean who will be left besides leadership teams??


It’s funny you’re using “blue collar” to mean low skilled probably things like retail cashiers, but folks in trades actually make very livable wages in the US.

Here’s a recent article from the seattle times: https://www.seattletimes.com/pacific-nw-magazine/as-tech-job...


> if you stay on the cutting edge and get good at using all kinds of tools with max productivity

For people with eternal youthful energy, good health, no family, and a single-mindedness toward work in life.


For me the interesting job in software is designing architecture and implementing complex things. Automation with tools like this is not remotely there and to some extent probably won't be because we're often talking about human preferences and subjective design choices. Gpt4 in software is Intellisense++ right now, it provides code snippets for things you want to do, it's just raising the bar of abstraction, not replacing the designer.

On the second point, I actually think my salary is inflated and we'd be in the dark ages if I took that for a reason to hamper technology. Not only am I not just a developer of software but also a consumer, so I benefit directly, but more importantly so does everyone else. If everyone operated on that logic I'd still pay 20 bucks for a potato and a hundred for a hammer.

Let's be real the entire point of software is to replace labor. The software industry has done it to many sectors of the economy and called it progress. Which it is. We have no right to start complaining now.


> For me the interesting job in software is designing architecture and implementing complex things. Automation with tools like this is not remotely there and to some extent probably won't be because we're often talking about human preferences and subjective design choices. Gpt4 in software is Intellisense++ right now, it provides code snippets for things you want to do, it's just raising the bar of abstraction, not replacing the designer.

It's not a code writer though, that's not its sole trained task. Why do you think it's going to have a drastically harder time doing the fuzzier higher level work? Human preference and subjective work has a wider acceptance of solutions.

I can have it write abstracts and works of fiction and songs. It wrote a great kids song about bumlollies and their terrible flavour, explained syncitial nuclear aggregates to a lay audience as a jaunty pirate and created ember templates in our custom framework. Have you tried it with any architecture questions?

> Let's be real the entire point of software is to replace labor. The software industry has done it to many sectors of the economy and called it progress. Which it is. We have no right to start complaining now.

It's totally fine IMO to have the views that it's big and scary for me and also good for humanity.


And while doing so income disparity has kept increasing.

Eventually we need to stop and think about how capitalism is winner take all, or we're all in for a very bad time.


We already know capitalism is rotten… but we don’t care because it gives us the opportunity (real or not) to be the one on top standing on a mountain of bodies, basking in the glow of our delusional, narcissistic sense of entitlement.


* for now


What is the issue with income inequality? Even in communism, there is income disparity because those in power have greater needs.


> What is the issue with income inequality?

https://en.wikipedia.org/wiki/Economic_inequality#Effects


You are absolutely right. The myth that automation does not replace Jobs is just that, a myth.

Huge numbers were left unemployed with industrial automation in the US and left unemployable. The technical term is structural unemployment. All it means is you cant retrain 10,000 factory workers to be front end developers, and that even If they find a job its often not as well paid.

The two greatest myths of modern capitalism are that free markets are good for everyone (they're not) and that automation doesnt lead to unemployment. Any reasonable assessment of the data will show both of these to be clearly false.


> All it means is you cant retrain 10,000 factory workers to be front end developers, and that even If they find a job its often not as well paid.

An important effect is the speed of change - it _might_ be possible to train the next generation such as those who would have been factory workers become front end developers, but it's an entirely different challenge to take actual factory workers and train them for another job. That is to say, even if in the long run automation doesn't lead to unemployment, the short term effect may be quite different.


Short term in this case could be decades.


Well I put my foot in my mouth here and apologize for insulting any uncertainty; these are new technologies so uncertainty is normal of course. In retrospect I tend to communicate things online in a much more careless way


I'm in a similar boat. I'm currently building setups to have GPT-n write code, and it's surprising how good it really is, and in such a short space of time.


Maybe LLMs can help us figure out how to dethrone the greedy and pass UBI into law.


> It does seem to threaten to suddenly eliminate all the interesting parts of the job

Writing generic code that are more or less stackoverflow copy/paste ?

The interesting parts are coming up with the logic &c. not typing the code imho


Like Covid, it will change the world very fast. Talk to your representative about it.

As an example, I wont code at work anymore, I chatgpt all day. Just like that, overnight. And my productivity went 10x - though i was already 10x more productive than you.> What should have taken 1 week takes 1 day, sometimes less (it's new software im working on).

I would only code if I was doing great software, that is, for myself, we dont do that at work.

It's gonna build tension silently and then release: capital will be massively reallocated - that moment will be tsunami. Talk to your representative.


I think it's because if you're not in that particular industry you have a super simplified model of what the person does - something like "writing code" as the only activity a developer does.

You don't understand the difficulties and problems that people in the profession face, which I think is also why so many developers are convinced they can replace/"disrupt" other people's jobs with software.


>I don't think this is "terrifying" to 3D artists any more than github copilot is terrying to us haha

It sounds more like you're just seriously underestimating the direction GitHub Copilot is headed.


This is such a vague comment that it's pretty near impossible to reply to in a very constructive way.

By my estimation it seems probable that we'll end up in the near future with some Jira integration that has an "auto fix" button on tickets. Possibly PMs or managers will be empowered to replace a large chunk of work that is currently done by people like me... and these things are only the beginning!

If you were thinking of something more severe then I'm curious what you're referring to? Otherwise I'm not sure what your point is.


My point is that if a machine can write more or less any program, then the artistic merit of programming falls away completely. It's instead replaced by a slurry of incomprehensible nonsense that simply works for some reason.


Yeah, like a binary today, right?

What's the problem if the new "source code" is all prompts written in natural language and the classical code is just a build artifact?


There is no way to have all source code in natural language because natural language is ambiguous. There is no way to debug natural language not doing the right thing.


The problem is that even a Scratch project has greater artistic merit than that.


The reason being tests/constraints specified by someone designing the software.


Ask GPT to design a program. I’m not sure if people commenting along these lines are serious or lying to themselves. AI is moving faster than any other tech in history, and that’s really saying something.

It’s not hard to see where this is heading. And the goal is to automate away everything so we humans can just kick our feet up.


I was speaking recently to a graphic artist who is terrified of Stable Diffusion and the like. I mentioned that these tools can augment their ability to do work instead of just replacing, but their point is that they are a graphic artist because they like doing the things that the AI will be replacing. Being a prompt engineer wasn't really the reason they studied and learned to become an artist. To me, that is a completely reasonable way to feel.


Do what I do and probably many others. Draw at work (use AI assist or whatever hype is available to keep your boss happy), then draw something at home you love.


The reason you draw at home is because your work drains you, physically and emotionally. This leads to jobs you end up hating. So AI makes us more productive, and causes mental illness as we all hate our jobs and see no point in continuing. Just let the machines do it.


Just leaving this here;

https://www.reddit.com/r/blender/comments/121lhfq/i_lost_eve...

Edit; I see it was discussed indeed on HN yesterday. I have already seen tons more of these in my circles and not only 3d/artists.


my main issue is the people making hiring and budget decisions aren't 3D artists, they're managers. my work might be objectively better than GPT's, but is that going to matter when my boss compares the cost difference and decides GPT is 'good enough'?


When your boss gets fired for delivering garbage it will matter


Will they, though? Automation has replaced highly expensive, artisanal, high quality things with much more cheaply made, worse quality things. And yet we buy the latter because the quality-to-price ratio is much better.

Most clothes people wear are garbage compared to bespoke clothes. Most industrial food is garbage compared to what a chef might make. And yet.


Which they won't, because the market naturally optimizes towards the most awful, shitty garbage possible that's still barely fit for purpose. This will only accelerate the trend.


i'm guessing the parent comment is in reference to this link from yesterday: https://news.ycombinator.com/item?id=35308498


It just raises the bar so that 3D artists who are only capable of reproducing the same boring things they saw in YouTube tutorials are no longer considered proficient.

The more I think about this trend, the more I think it might be good.

Bootcamp devs are no longer good enough for junior roles since a GPT could replace them. Digital media people who learnt via YouTube and have no real talent are no longer skilled enough. Writers who can only churn out mediocre blog spam are now jobless.

This seems like it might be a net benefit.


This isn't how it will play out though. At first chess AIs got enough to beat unskilled players. They they were good enough to beat intermediate players. Then they challenged the grandmasters. Today no human, no matter how long they practise or how hard they train will ever get close to beating an AI at chess.

Your assumption that this raises the bar for human 3D artists is correct today, but it won't be long before human 3D artists are seen as much slower and less competent than AI artists, and there will be no going back.


Chess is fundamentally a different kind of problem. You can prove things about chess in the general case, but it's mathematically impossible to prove things about computer programs in the general case.


>Chess is fundamentally a different kind of problem.

irrelevant.

my whole world view changed when I read https://arxiv.org/pdf/2303.12712.pdf

now pair that with a larger memory, backtracking/revising and on-the-fly weight adjustment (aka, real-time "learning") and I think it might be game over.

add goals and motivations and maybe a vision system? game over for meat bags.

These advances are not only possible, they're inevitable. It's just too tantalising to leave alone.


There's no mathematical proof for the correctness of graphic design, either, but that won't stop cheap AI-generated garbage from taking over the role of making commodity images and putting a lot of people out of work.


I'm curious about graphic design and if AI can do it to a passable standard. Generating fiction photographs, paintings, illustrations seems more flexible than constrained and balanced proper graphic design. I am sure AI can mash together templates and icons. Less sure about producing a solution to a client brief with a timely and timeless design.


I do freelance Graphic design. The whole conceptual thinking/visual communication aspect of it seems unlikely to be touched by these tools any time soon... But the commodity work that puts food on a lot of people's tables is likely toast. It's not like it will replace graphic designers, it just takes over the everyday jobs that drive most of the demand for their services and we all know what a drastically reduced demand does to a market.


> but it's mathematically impossible to prove things about computer programs in the general case.

there is such thing as verifiable software.


True, but you are verifying that the code meets some formal criteria, not that it actually does what you want.


formal criteria defines what you want.


Pray, Mr. Babbage, if you verify the wrong properties, will the right answer come out?

Only some software is possible to verify, and there are many properties that it's impossible to verify because the software isn't the only thing that exists in the universe. No amount of mathematical proof on an ideal RAM machine will anticipate rowhammer.

And: just because something's theoretically possible, that doesn't mean an AI system would automatically pick up the ability to do it. Verifiable software in practice is still way behind what we currently know to be possible.


> No amount of mathematical proof on an ideal RAM machine will anticipate rowhammer.

you can infer algorithm failure rate depending on other input factors as an input. Say you found algorithm will fails every 10e15 years of continues run, you can accept such algorithm as reliable.


> depending on other input factors

You're assuming we've solved physics, and that it would be tractable to model all of that. We haven't, and it probably won't be.


that's the best approach we can use.


This will not be true if ai produced works can’t be copyrighted.


Copyrights are one of the foundations of business. As much as the HN crowd hates them, they aren’t going anywhere. You’re literally wasting electrons.


How do you get senior people when juniors no longer have careers?


Junior software developers will still have careers, they'll just have to learn a slightly different skillset that involves more reviewing of code, like a senior developer already does.

Before sometime hits merge, they need to understand if the code meets all requirements, and it doesn't really matter who or what wrote the code in the PR.


They will be reviewing the unit test written by TestGPT for the code written by CodeGPT.


Until ReviewGPT takes over.


But we’ve automated away their job, why were they hired to exist learn the skills?

The demand for engineers, and therefore junior engineers, will be reduced. That is the goal.


From the real education with courses on compilers, operating systems and databases, not 3 month bootcamps on HTML and JavaScript.


How many compilers, operating systems, and databases have you written?

I make 6 figures, never took any of these classes (though I did write my own OS, and compiler). Why do you want people to waste time and money? Perhaps it’s time to split out those classes into some degree that’s more relevant to that work?


I don't need to write a complex software system to utilise my knowledge about it.

Also, I don't have a CS degree and that's not what I advocate for. I advocate for developers who spend years honing their skills and learning fundamentals of the craft.


Spending years honing skills does not mean wasting time. In fact most self studies do it out of passion, which means a lifetime of honing skills.

I have also written a compiler and my own language as well as an operating system, and its usefulness has come up exactly once in near 15 years in the industry, and that was a very niche topic. Explain why, exactly, you need to know compilers and operating systems to write any modern day app.


There will still be junior devs. My comment was only referring to particular types of junior devs. In short, the barrier to entry foe being a junior dev will be higher, and require more CS knowledge, perhaps even more hardware knowledge. No more of this "I made a todo list in react in 2 weeks and am now ready to be a software engineer" rubbish.


There has been general hate towards self taught and code bootcamp grads. Some are in the field only for the money, but many others simply lacked the resources to attend a 4 year university.

When you openly state that you dislike these types of programmers, and want to essentially purge them? Hopefully AI destroys your specific job and you can’t find work again.


In the same way that junior artists still exist despite us having moved from MS Paint to Photoshop!


No, that would be comparing to software developers moving from Notepad.exe to proper code editors and IDEs.

The relevant comparison is with junior artists in the age of Midjourney, Stable Diffusion, ControlNet, BlenderGPT, etc. They're facing the same uncertainty as software devs, at the same time, so there isn't much insight to be gained here just yet.


I don't really care how or where someone learned something, so long as they learned it well and can apply it.

My experience in the software industry (20 years now) showed me that the best ones were the ones who got into it out of genuine interest. They tended to write software as a hobby.

There was no shortage of CS grads who couldn't be nearly as productive.

The self-taught ones, or the ones with genuine interest who also completed a degree program were the best.

I wouldn't discriminate against "boot camp coders" or people who learn things from YouTube.

There's a lot of people who live in a different world where an expensive college/university education is not an option.


>It just raises the bar so that 3D artists who are only capable of reproducing the same boring things they saw in YouTube tutorials are no longer considered proficient.

That is of course, until GPT-6 surpasses them.


The change of pace isn't regarded in these what-ifs usually:

Why should someone hire someone experienced when they could just cruise on patchwork solutions made by inexperienced contributors using an AI model until the next gen of an AI model is released?

Can the experienced person really outpace the model development in terms of innovation? Is it worth trying to innovate in a niche?


It is good when people lose their jobs, actually!


That wasn't my point. It is bad that they will lose their jobs. But that doesn't mean we should keep those jobs around.


No, but perhaps we should take some responsibility and provide people with alternatives before destroying their lives.


I use blender a lot. I know what the api can do. I'm not terrified. Getting anything significant done will still take real work/knowledge because the number of potential 'parameters' of what you can do/make requires that you have the language to describe what you want. Having the posession of that language and knowing how to use blender is a very positive correlation.


Imagine being able to sketch out the outline of what you are after, then say hey blender model it up. Thats what it will get to.

You have a beach, and you want a jetty, just grease pencil approximately where you want it and go Hey thats a jetty build it. and if its not quite right, generate me 50 different versions and ill pick the best.


> I know this tech is terrifying to actual 3D artist who don’t want be a “prompt engineer”

My predication: in the next 5~10 years, most artists won't be "prompt engineers". Instead they'll focus on fix small details on AI-generated art.

It's still kinda sad tho, because it's usually the most tedious and boring part of the process. Now AI is taking the fun part and leaving the unfun part to humans.


Yeah, but it's kinda like making business software. Most working people need to do boring work to sustain their lifestyles.

I hope that these task-specific implementations of AI can reduce the tedium in these fields, like the way PCs did. Certainly, this advancement is leading to the ability for practically anyone to program a computer, in the general sense. Things will shift, but there will be opportunities to exploit those abilities for personal gain.


This is exactly what this post from Reddit posted to HN [0] was complaining about.

I think many people hear and see these complaints and think that people are being luddites or being afraid of losing their job. People should be looking at it for what it is - someone complaining that their entire job is changing into something they no longer enjoy.

0: https://news.ycombinator.com/item?id=35308498


This is terrifying to us, but not in the way you think. Tools like this place selective pressure so that increasingly only the best of the best get paid to make stuff, and their stuff influences the data sets in the future. The rest of us get forced out simply due to cost unless we're doing it as a hobby. Some artists know all too well their limits, and those that do are the most vulnerable to things like this, because these learning machines use the ever higher ceiling of human ability to determine the floor of their ability. You can't go back to the funky weird pasta men of Stable Diffusion's 2021 iteration as the default for example, but someone who started painting in 2002 might still have the same level of skill now as they did in 2010.

Text was paralleled first. Then sound was paralleled next. And now image is being paralleled. It will be on level with highest percentile of human ability just as text and sound were before it. Game devs and comic artists are already replacing texture and background artists with AI generated images, just as they used AI to create hundreds of thousands of lines of fluff dialogue before then.


Isn’t being a good boss kind of like being a good prompt engineer?


If so, then this means the good boss will soon no longer need employees, as the LLMs will deliver better results faster and cheaper.


Who will buy the bosses' shit if everyone is unemployed and can't afford it? Companies can't stay afloat by just buying from each other.


> Companies can't stay afloat by just buying from each other.

After an initial injection of capital, yes they can.


GPTs with access to budget


So any and all commerce just becomes a different form of stock trading? I doubt it. People will still need and want tangible things.


> People will still need and want tangible things.

And that matters to who, and why, exactly?

Once enough of the economy is automated, there's hardly a reason for it to keep humans in the loop. It can just run in a circle, serving itself. Humans, meanwhile, will just have to barter for scraps amongst themselves.


So much of the presumed negative consequences of improving AI are actually the negative consequences of the power structures they're going to exist in I think.


If we truly live in a world where those in power seek to eliminate all other (non-powerful) humans from the equation, then maybe that's our real problem. For the record, I don't actually believe this is the case. Some people might work toward this goal but I doubt that most powerful people would voluntarily want to give up their influence over the masses. Which is what they'd do if they leave them to fend for themselves while the bots run their businesses. At some point, when you have enough money already, you don't actually seek more money, you seek more influence, and money is just the vehicle. ETA: btw, if a truly autonomous money-making machine is invented, that's just going to cause inflation, making it all worthless.


Most people aren't worried about being a prompt engineer-- they're worried about being a Doordash driver if it turns out that one prompt engineer can replace many professional, well-paid people in their industry. That's pretty rational.


> being a Doordash driver

If that job will even be available. The other day my sister sent me a photo of a little food delivery robot she spotted on the streets[0], and mind you, we're not living in Silicon Valley, but in Poland.

--

[0] - https://www.deliverycouple.com/ - based on the markings on that robot, its these people.


There is not much to prompt engineer; the obvious trivial thing that is already happening and will make the glorious job of prompt ‘engineer’ completely redundant is AI doing the ‘engineering’ for you. One person (or AI in the future; this is already possible now for humans and gpt though) that the AI has been trained to ‘know’ and work with will blurb things into a mic and the AI will instruct millions of brains to whatever the ‘boss’ wants, translating his mumbling into actionable user and system prompts. There won’t be prompt ‘engineers’ for long.


I think you drastically underestimate the intellectual component of other people's work behind the tools they use to perform it.


In some industries sure, in many others it’s just mindless repetition. The argument is that AI is to free up their time so they can do more important things. Regardless, less demand means less wages across the board. Those that are intelectuales capable of doing more will look to the higher paying jobs once AI automates theirs away.


That those jobs exist and might be automated doesn't negate the existence of all the others. One selfish benefit I see in this will be the hubris of my fellow software developers being taken down a few hundred notches when the market for commodity software development collapses.


Self driving cars and drones are pretty close, door dash driver won’t be an option much longer either.


I asked my friend (renowned vfx artist) what he thought about AI in his field and he mentioned some niche cases where it was useful for blending models with film etc, but overall seemed uninterested.

He offered the explanation that so much of their time is consumed with nit picking through purely aesthetic decisions that AI would not be capable of the artistic reasoning required to produce work that could even get to the "pass or reject" stage.


As someone who had tried my hand at Blender and failed miserably/given up years ago, I am so excited to use this to make up for my Blender skill deficiency


Have you used it yet? Was it actually that easy? I can't wait to try.


I see a lot of people asking how does this work. The method he is using is one shot learning. He has a prompt and an example of what the interaction should look like.

You can see the prompt here: https://github.com/gd3kr/BlenderGPT/blob/main/__init__.py

It is really easy to build this kind of thing - I've got a very simple command line chatbot that should be very understandable and you can easily play with the prompt.

https://github.com/atomic14/command_line_chatgpt

I would also recommend that people try out the openai playgrounds. They are great for experimenting with parameters.


You can see the prompt here

See how this file quickly got hacked together all in one file. It's refreshing to see and note-worthy as it appears when new exciting world-changing tech emerges that makes programmers let-go and become hackers again.


Did something similar as a natural language interface to DuckDB this weekend, and it's surprising how nice it works[0].

That said, through my testing I found it tended to trail off during longer exchanges if I only added the guidelines to the system prompt, so I ended up adding them to the first user message + adding a small reminder at the end of each further user message.

[0]: https://github.com/cube2222/duckgpt


This is brilliant! Nice job!

Why didn't you do the same for OctoSQL? Are you considering DuckDB for your compute engine for OctoSQL?


> This is brilliant! Nice job!

Thanks!

> Why didn't you do the same for OctoSQL?

I started with OctoSQL! But the SQL dialect is a bit non-standard, while DuckDB uses the popular postgres dialect. This came up esp. with more complicated queries, where GPT was generating queries that failed with OctoSQL, while it manages to do well with the DuckDB dialect (even though it sometimes needs 2-3 attempts, but those are automatic).

> Are you considering DuckDB for your compute engine for OctoSQL?

I've been exploring that. The main disadvantage is that OctoSQL has support for temporal features and live-updating queries (something like reactive materialized views, in a sense), which would not be possible to accomplish with DuckDB.

Moreover, if you're already using DuckDB for the execution engine, there's not much reason (other than the plugin system requiring you to use C++) to not use it end2end. I think in that case a more duckdb-native project would make sense to just fill in the niche of simpler plugin authoring for DuckDB. Something like Steampipe for Postgres foreign data wrappers, but for DuckDB.

For OctoSQL I'm experimenting with WASM now as a SQL compilation target, as that could be designed to support the features I've mentioned above.


yeah there’s also AICommand for Unity that uses a similar technique: https://github.com/keijiro/AICommand


this is cool, but hm, how should I put it...

Ultimately there's a relationship between the preciseness in which you want to control something and the underlying information, as conveyed in language, to describe such precision.

Whether you use plain english, or code - ultimately to do things of sufficient precision you will have to be equally precise in your description. I'm sure someone with more time and more knowledge on such things have already formalized this in some information theory paper, but...

The point I'm making here is that this is great because a lot of people are doing "simple" things, and now they will be able to do those things without understanding the idiosyncrasies of Blender APIs, but I'm convinced that this will ultimately turn into something equally difficult as blender APIs to do novel things. and WHEN (not if) that happens, I hope users are prepared to learn the Blender APIs, because it will be inevitable.

edit:

one other thought. I think "language models" are not the right solution ultimately. I think kind of like AI didn't boom until the proper compute was available even though theoretical models and algorithms existed, language models are the crud solution.

once we have a loseless way to simply "think" what we want, then a "large thought model" will have less trouble, as there will be less ambiguity in what you want to what is said.

right now it's thought -> language -> model.

later it will be thought -> model.


It could also be that something like this is just another tool in the toolbox. Sure, you could spend time trying to understand Blender's Python API, but lots of Blender users are not programmers. It could be really helpful for them to be able to say "Place 10 spot lights at random position within the boundaries of Mesh 3 with random intensity and color" and just have that appear, rather than having to go looking for a plugin that does it for them.


I dont know why people make this harder than it needs to be, make group of lights, change colors and duplicate, its good enough, no need to use python api.


Obviously just an example, but even so, unless you already have the workflow established, writing human text describing something could be easier than manually doing it.


What's the benefit in not being specific, aside from it being easier to request?


When it comes to art (audio or visual especially) iterating over a concept with some air of "generality" and "good enough" can have very interesting emerging properties. You get a "vibe" for what you want, but you might not have a perfectly clear idea of the entire composition until you actually see it in front of you, and then you iterate over that.

When I write a song I usually just noodle on my guitar with some pre-programmed drums in the background, I just play whatever comes to mind at the time, record it, then listen back to it, change a few things, add a few accents, decide to add another guitar line, maybe shift in fifths or sevenths to add more voices, add a few instruments that fade in and out like strings or brass, etc.

Some people might have more methodical approaches to art, and that's fine too, but in my case it's absolutely a 100% exploration effort about stuff that I don't even know I want until I see it in front of my eyes. These tools are amazing for this.


The benefit is that it's easier to request.


Well, you get precision with imprecise communication via iteration. You walk together toward the correct mutual understanding by successive clarifications.


There’s also a parameter you can set called “temperature”. This is how you control the free-form aspect of the model. Set it close to 1 and it will bullshit a lot, or rather it will take bigger chances in filling in outside of context and maybe a bit more hyperactive. A value closer to 0 keeps it tight by taking less risk off the context, but can also miss out a bit and loses color.


Sure but GPT certainly cannot actually do this. It can iterate, but in GPT-world that means "give me another response" not "learn from what i'm saying, and give me another response"

The issue is, sometimes those two things seem to be the same thing!


Eh? GPT definitely uses the past comments as context, so as you clarify, its responses get much closer to the target. If you haven't tried GPT-4, you absolutely should before having strong opinions about it.


> so as you clarify, its responses get much closer to the target.

This isn't a given. Plenty of times this isn't true. You cannot convince GPT to answer every problem all the time.

For example, try and teach it a grammar. No matter how many times you try and work with it, you won't be able to.

You can teach almost anyone a grammar if they are inclined to try to fuss through it. Not GPT.

And yes I have used GPT-4 a lot, please don't assume I haven't.


Sorry, my question, sounds snippy now that I reread it. It sounds like you’re qualified to have strong opinions about it :-)

And that’s fair, you’re exciting the existing network, not training/changing its values fundamentally. But you can do a lot with that excitation, because it’s already got a lot to work with. And I don’t think this is very far off from how people work with new ideas in the immediate term - when they first hear about them, they think of them in terms of things they already understand.

It has been my experience that it’s generally capable of getting closer to the target, but maybe I just haven’t tried to push it past its capabilities.


You're fine, thanks for the apology.

I agree that GPT is very impressive, but just today for example I asked it how I can make a "virtual column" in Postgres, and it spun its gears.

I pasted the error I got from Postgres, and it said "oh, it's because you're using this function. here try this solution that doesn't use that function!"

And the solution... used that function!!

Was not even a complex ask. Concat of two other columns. I admit it was a good pointer in the right direction ("generated column" vs what I came up with "virtual column") and it was really impressive that it had a syntactically correct solution (though it didn't work)

But, if it can't do something as simple as not give me the function that doesn't work and that it understands doesn't work, that I told it it doesn't work, that the error says doesn't exist, etc.. in its new solution... after many prompts! (I exhausted my GPT-4 quota)....!

It's bad. It just isn't going to do much here past the initial "get me started" which, I do admit, is really impressive!

It just isn't what people are claiming it is :)


parent is only half wrong; given a sufficiently long iteration process any of the chatGPT versions would surely start losing coherency regarding far past requests -- this may be less evident when used within a well confined and (somewhat) easily self-referenced system (like Blender, for example) -- but it's especially evident when trying to prompt GPT to write a fiction story from basics or otherwise work entirely on its own without a place to store outputs and then refer back.

tl;dr: it's easier to tell chatgpt to "Rewrite this story: " and then feed back previous outputs when writing a story than it is to get to an acceptable output from massively detailed prompts or long chains of iteration; this trait has far-reaching consequence rather than just writing fiction.

I do understand , however, that 'long-term memory' is a very active point of discussion and development.


What I am saying is you can't get GPT to solve any problem you throw at it. What you can probably do is get it to give you a correct answer that it was trained on. Those are different things.

You can't teach it new information, and often that is required to solve a problem.


Chatbots use random number generators to create a variety of output, so it would be a terrible idea to use natural language as "source code" for a code generation tool. Running a chatbot shouldn't happen within a build process, because the results aren't reproducible.

Once you have the source code though, you can use a variety of tools to manipulate it and save the result. Using chatbots to make modifications under supervision is fine. You discard bad modifications and save the good ones.

This is using natural language for one-offs and source code for reproducible results. It's looking like they will go well together.


Can you not provide a seed to an LLM the way you can with some of the image / video generation models I’ve seen?


Yes, probably, if you got one that runs locally. OpenAI doesn’t have a way to do it in their API.

There are other reasons not to do it, like it being an external API that charges money. And even if you got something local and deterministic, the generated code would be less easily tweaked by editing the original prompt than by asking the LLM to make the change you want, or by editing the code directly.


With a shared vector database and semantic search plugin you can see bunch of prompts other people have created.

with a search plugin you can have it find the api docs and have it output interesting parameters and how to use them with examples.

with a python REPL plugin you can have it generate 10 variations and run the code for each.

with GPT4 and plugins you could describe the output you want to midjourney or something and give the prompt to it to generate it in blender(feed the outputs to some image similarity vector to compare) and have it search through parameter space(or vector space of the prompt) until it finds something pretty close to what you want.

given your budget of course.


If the command is not restricted to text only, but use text and geometrical context, you can remove a lot of ambiguities. This is often done in video games with contextual interactions.

After all, using the Blender GUI, you can do a lot using only a 2D mouse coordinate and two boutons. So 2D mouse coordinates and text could be better.

A nice evolution would be an AI model that can understand natural language instructions, while taking into account where your mouse pointer, how the model is zoomed and oriented, and that has geometric insight of the 3D scene built so far.


Or in other words, the "create an adversary capable of defeating Data" problem... just a bit less of an issue (for now) if you get the instructions wrong.


"Zoom Enhance!" -- Hear me out:

One interesting thing would be to describe a scene and get the rough print. But do it in sections, such that you can select and begin to refine sections and elements within the scene to whittle down to the preciseness that pleases you for each element...

What would be really interesting will be how geometry nodes can be managed using gpt.


Looking at the underlying prompt [1], it is a one-shot prompt, i.e. it contains one example of a natural language task along with the corresponding Python code (plus a "system" prompt for overall context and direction). Amazing how much you can do with a simple prompt.

Imagine Jupyter notebooks with this capability. Or Photoshop. Or Davinci Resolve. We live in amazing times.

[1]: https://github.com/gd3kr/BlenderGPT/blob/main/__init__.py


Did you run into issues that made you tweak the prompt? You often tell the tool not to do extra work despite having it in the prompt, did you find that it often tries to do a lot of "boilerplate" work like setting up lights and cameras?

  system_prompt = """You are an assistant made for the purposes of helping the user with Blender, the 3D software. 
  - Respond with your answers in markdown (```). 
  - Preferably import entire modules instead of bits. 
  - Do not perform destructive operations on the meshes. 
  - Do not use cap_ends. Do not do more than what is asked (setting up render settings, adding cameras, etc)
  - Do not respond with anything that is not Python code.


Usually it tries to explain code, hence the last one.


Famous quote: "Wouldn't it be nice if our machines were smart enough to allow programming in natural language?". Well, natural languages are most suitable for their original purposes, viz. to be ambiguous in, to tell jokes in and to make love in, but most unsuitable for any form of even mildly sophisticated precision. And if you don't believe that, either try to read a modern legal document and you will immediately see how the need for precision has created a most unnatural language, called "legalese", or try to read one of Euclid's original verbal proofs (preferably in Greek). That should cure you, and should make you realize that formalisms have not been introduced to make things difficult, but to make things possible. And if, after that, you still believe that we express ourselves most easily in our native tongues, you will be sentenced to the reading of five student essays. - Dijkstra From EWD952

https://www.cs.utexas.edu/~EWD/transcriptions/EWD09xx/EWD952...


On the other hand, we already specify programs in natural language, and in this case chatGPT is really taking a "specification language" as input.

When a client wants a button on a webpage, they don't send the web designer a legaleze document describing the dimensions of the button. They usually don't even tell the designer what font to use.

The web designer pattern matches the client's english request to the dozens of websites they've built, similar buttons they've seen and used, and then either asks for clarification, or specifies it clearly to the machine in a more specific way.

Is that different from the chatGPT flow?

Honestly, we also already mostly use english for programming too, not just design. Most of programming now is glueing together libraries, and libraries don't provide a formal logical specification of how each function works. No, they provide english documentation saying something like "http.get(url) returns an httpresponse object or an error". That's far from an actual mathematical specification of how it works, but the plain english definition is enough that most programmers won't ever look at the implementation, the actually correct specification, because the english docs are fine.


> Is that different from the chatGPT flow?

The designer knows the context of the question, the website, the previous meetings about the design styles, possibly information about the visitor demographics and goals, knows the implicit rules about approvals and company hierarchy, knows the toolset used, the project conventions, the previous issues, the test procedures, etc.

The equivalent of telling a designer where you want a new button would be equivalent to feeding a small book of the implicit context into ChatGPT and without access to visual feedback you could still end up with an off-screen button that passes all the tests and doesn't do anything. The "fun" part is that for simple tasks 90% of the time it will work every time - then it will do something completely stupid.

> they don't send the web designer a legaleze

That's the implicit context. (And yeah, bad assumptions about what both sides agree on causes problems for people too)

Also, chatgpt will try to make you happy. You want a green button here? You'll get a green button here. A designer instead will tell you it's a terrible idea and breaks accessibility.


Yes and no. You've provided an excellent introduction to the problem space, but I think natural language has a larger role in formalisms than you might expect.

The most familiar formal language grammars to most people here are programming languages. The difference between them and natural language has been categorized as the difference between "context-free grammar" and "context-dependent grammar".

The most popular context-free language is mathematics. The language of math provides an excellent grammar for expressing logical relationships. You can take an equation, write it in math, and transform it into a different equivalent representation. But why? Arithmetic. The Pythagorean Theorem would be wholly inconsequential if we didn't have an interest in calculating triangles. The application of math exists outside the grammar itself. This is why you, and everyone else here, grew up with story problems in math class.

Similarly, programming languages provide excellent utility for describing explicit computational behavior. What they are missing is the reason why that behavior should exist at all. Programs are surrounded by moats of incompatible context: it takes explicit design to coordinate them together.

If we can be explicit about the context in which a formalism exists, we could eliminate the need for ambiguity. With that work done, the incompatibility between software could be factored out. We could be precise about what we mean, and clear about what we infer. We could factor out all semantic arguments, and all logically fallacious positions. We could make empathy itself into software. That is the dream of Natural Language Processing.

I think that dream is achievable, but certainly not through implicit text models (LLMs). We need an explicit symbolic approach like parsing. If you're interested, I have been chewing on an idea that might work.


> Similarly, programming languages provide excellent utility for describing explicit computational behavior. What they are missing is the reason why that behavior should exist at all. Programs are surrounded by moats of incompatible context: it takes explicit design to coordinate them together.

Such a well written reply! This puts into words a lot of my thoughts around programming today and how NLP can help.


In (language model-backed) natural language interfaces the loss of precision can be made up via iteration. If it were about getting it right on the first try this would be a dead end but there's no need for that restriction.


And specifically, if the agent implementing the requested actions has a mental model of the domain that is similar to the mental model held by the imprecise specification writer, then the specification doesn't need to be precise to have a high probability of being implemented right the first time! The miracle here is that LLMs don't even need fine-tuning to get to that level - and that's unprecedented for non-human agents.


What makes us think that encoding all functionality using natural language is somehow more compact than using code? Sure, describing a website with a button is easy, but I don’t quite see how you would describe Kubernetes or a device driver without writing a book in “legalese”.

Or try formulating a math proof with natural language.


Who said you need to need to describe Kubernetes? "Make my programs work at any scale". Boom


That just.. doesn’t work that way.

Edit: besides, if it could work you lose the competitive edge. I could describe a much faster more cost effective system which the machine can implement. And we are off to the races again..


If/when it is a generally capable intelligence it very much does work that way. That's how it worked with humans. We saw that stuff needed to work at scale and created something to achieve that.


I guess you are right about AGI. I think (hope) that is still off in the distance. We’d have other problems than scaling crud apps, I think.


>I think (hope) that is still off in the distance.

It's being rumored that OpenAI is currently training GPT-5 which will be ready in December, and that many people in the company think it will be a human or better level AGI. Even if it isn't the consistently supersonic jumps they are making every generation suggests we don't have long until human brains are outmoded legacy hardware.

>We’d have other problems than scaling crud apps, I think.

Ever since I first interacted with the original GPT-3 in 2020 I've had the realization that our future was going to be curtailed and distorted into an inconceivable Escher piece. It seems that future is nearly upon us.


My first encounter with GPT-3 messed me up too. Interesting.

I'm all over the place with this. Some days I think it's no big deal, but sometimes I'll get angsty about it.

Today, for example, I'm using it to generate some animations and stuff I generally don't like dealing with (math problems). I remember spending hours on this and not getting anywhere. This thing makes all that effort seem like handcoding websites in the era of templates.

Whenever it hits my direct line of work I'm like "no way that thing works, see, it did this small thing wrong and it proves it is fundamentally incapable of anything". When I use it for domains outside of my expertise I switch to "yeah, sure, but this was either already exceedingly obvious and/or nonsense busy-work to begin with".

News at 11: developer is arrogant.


The idea is you wouldn’t formulate the proof, you’d describe what want proven and it would generate the proof.


If the machine is that clever I would worry about other things. You definitely wouldn't need to ask anything of it anymore.

Besides maybe.. "make all these entrepeneurial types obsolete." Poof!


It's a matter of degree: you can probably already get GPT4 to generate simple proofs; I'm not expecting it to break new ground in mathematics. What I'm getting at is just that the usage is declarative: you'd be describing what you want, not the details of some particular mathematical formulation (though you would likely iterate, making references to formulation details, but still speaking mostly in natural language).


I understand, thanks for the clarification. What I like to add is that our reality is a competitive one. If everybody can say “do this general thing”, what is the competitive edge? It’s all relative.

We as humans are not just in the business of solving general problems. We are competing with each other. We need to be faster than the slower ones to survive. (I like to change that but that is not a technical issue.)

One of the ways to compete is to “talk faster” with it. Iterate quicker than the competition. How? I daresay we might get there faster by talking in some sort of modified language.. a code of sorts..

Another way to compete is to become a deep domain expert. Expert of what exactly, if AI is doing it all? Human psychology?

I guess I am just interested in the competitive aspect of it. I have no idea what will happen, but definitely curious what will be possible.


Legibility is generally a requirement though. At least is currently is.


Yeah maybe! Just like if you have enough monkeys iterating on enough keyboards...


Just a few monkeys should be fine ;)

If this weren't the case then it wouldn't be possible for (e.g.) the software industry to exist as it does: non-technical folks using natural language are able to converse with engineers who take informal descriptions and turn them into code, often leaning heavily on iteration the bring code and spec into conformance.


"should be fine" - how do you determine this?

There have been many, many cases where I was not able to get GPT-4 to "understand" my problem. No matter how much I tried (until I hit the rate limit for those hours, anyway).

People are throwing these absolutes around, and it's just not totally true.

Much of an engineer's job is to try and implement the correct solution for imperfect requirements, then to go back and quickly fix things to match the real requirements.


I’ve brought this up in another thread, but we already have evidence that in certain cases it is desirable to trade off specificity for just “getting something done”, that evidence is Python. Python succeeds despite not having a serious type system, because often times you can get something that achieves your goal despite the 25 bugs in your program that would have been caught by a real type system.

On the other side of the coin, there’s C++, which is usually doing the heavy lifting underneath the underspecified-but-sufficient Python code.

My guess is that as LLMs evolve, they will more naturally fill this niche and you will have high-level, underspecified “code” (prompts) that then glued together more formal libraries (like OpenAI’s plugins).


Not only that, but one of the big successes of chat AIs is their conversation. When a program does something wrong, it's hard to figure out what you said wrong.

When a char AI misunderstood you, it's often quite easy to explain where the misunderstanding happened and the AI will correct itself.


I think there exists room for an intermediate layer, a framework for AI to write code in specifically. With larger "target" code blocks it can stitch together. Code is designed around humans to allow them to write specific, abritrary and precise logic with a lot of minuate, but LLMs might be better off thinking in larger business logic blocks that are testable and "sanctioned" by the framework makers for the specific problem domain.


Just look at JavaScript. Types don't matter so much, look at the modern web.


I’m working with a large js code base right now and can’t get it ported to typescript fast enough.


While I agree, I don't think this is what Dijkstra was speaking about. A lot of the "AI interfaces" are just underpinning the real technical representation. For example, if you say "Which customers recently bought product #721?" this might generate a SQL query like "select * from customers where ...".

The interface is not the ChatGPT text box; it's SQL. The ChatGPT text box is just an assistant to help you do the correct thing (or at least, that's the way it should be used).


Your example isn't very comparible to tasks involved in creating, texturing, lighting, animating and rendering 3D models using spline and mesh topologies.


All of that seems very plausible, right?

"Make a juvenile elephant."

"Make his ears comically large."

"Bigger."

"Give him a little floppy hat."

I don't think it's out of the question for these kinds of commands to result in the correct outcomes. Now, maybe I can adjust the ear size more precisely with my mouse, but it probably saved me a bunch of work.


I would say it's applicable to anything with a programming language or DSL for it, which this does.


I really feel this. Whenever I try to use ChatGPT, I feel both slow and strained. My usual workflow involves me thinking very few English sentences in my head - I move faster thanks to abstract thought, and I don't think I'm alone.


Same. Somewhat ironically, my art school vocabulary has come in very handy when writing art generator prompts.


So, really, all we need then is a language sufficiently precise enough to specify what a program needs to do and we can then feed it into a program which can write software that implements that specification, possibly adopting safe transformations of it into equivalent forms.

Now, that safely describes a modern, optimizing C compiler.....


If you want something unambiguous in a single pass, sure. But interaction with some back and forth can dig into the ambiguities that matter quickly, ignoring those that don’t. Dialog is a whole separate model. So maybe natural language is a terrible specification language but a great interaction language.


We need a DSL that converts/compiles into natural language for the AI.


Meanwhile, Star Trek has been using natural language for a few decades now, and it works pretty well.

Considering they also predicted iPads, we might want to take it at face value.


star trek is also fictional where things happen as the writers wish it to… the real world is not subject to such restrictions


Then why do our iPads seem to match their PADDs almost exactly?


Because Apple engineers watch Star Trek too, more than likely.

Also, the wax tablets of antiquity also strongly resemble iPads. It's a fairly old invention. Might as well argue the ancient Greeks invented iPads.


Natural language doesn't NEED to be precise- that's what theory of mind is for, to allow efficient inference of meaning from objectively-ambiguous communication.

It's not perfect, but LLMs appear capable of making the same kind of inference, sufficiently that it'll inevitably be possible to program in natural language sooner or later with minimal or no manual double-checking


Similar vibes:

"The Sketchpad system makes it possible for a man and a computer to converse rapidly through the medium of line drawings. Heretofore, most interaction between men and computers has been slowed down by the need to reduce all communication to written statements that can be typed" - Sutherland


Visual Basic is so easy secretaries will be able to write programs.

Sometimes smart people say dumb things and it takes a while to figure out they are wrong.


"Wouldn't it be nice?" "But it isn't so."

An is-versus-ought style mistake.


fantastic quote, thank you


"Computer: tea, earl gray, hot."

We're nearing the precipice of more natural human-computer interaction that will need to rethink the interfaces and conventions.

Alexa and Siri seem like Model T Fords when there's a jet aircraft flying overhead. I'm thinking these agents need to be replaced by more natural agents who can co-create a language with their human counterparts rather than relying on fixed, awkward, and sometimes unhelpful commands. It would behoove us to expose APIs and permissions delegation in a more consistent and self-describing (OpenAPI + OAuth / SAML possibly) manner for all possible services one would wish to grant to an agent. If a natural language agent is uncertain, it should ask for clarification. And on results, it is necessary to capture ever-more-precise feedback from users because positive and negative prompts aren't good enough.


> “The way it functioned was very interesting. When the Drink button was pressed it made an instant but highly detailed examination of the subject's taste buds, a spectroscopic analysis of the subject's metabolism and then sent tiny experimental signals down the neural pathways to the taste centers of the subject's brain to see what was likely to go down well. However, no one knew quite why it did this because it invariably delivered a cupful of liquid that was almost, but not quite, entirely unlike tea.”

I think Douglas Adams was closer to the truth on the subject of AIs and tea. I don't want to be overly cynical but I suspect we'll just get used to saying "OK, close enough" when dealing with LLMs.


I'm glad it is not just me who thinks I need to "thank" chatGPT or encourage it, e.g.: "Great, now do .... " "Great, now do ... ".


Its not 'thanking', its positive signal that GPT's previous outputs were correct, so it should continue doing whatever it was doing. If you say no/bad etc, then GPT will try other approaches.


> "its positive signal that GPT's previous outputs were correct"

Is that somehow baked into the algorithms?

Are positive words of encouragement interpreted as "positive signals" by the inference pipeline? Or do they somehow influence the attention mechanism?

Because otherwise, you're just rationalizing completely random and unpredicted behavior.


> Is that somehow baked into the algorithms?

OpenAI’s API is stateless, which means you need to send the entire conversation thread in each request.

So when you send a response like “perfect, now do…”, you’re reinforcing the language model that the conversation history is on the right track.


That's literally what I said: those words are "encouraging" it.


OpenAI surely retains conversation logs.

When super intelligent AI gains power, I want it to know I've been a good boy.


Pascal’s Wager, AI edition? :-)

https://en.wikipedia.org/wiki/Pascal%27s_wager



That is the 180-degree opposite of Occam's Razor.


God is not inevitable


This is getting closer to the system I've been saying I wanted since the 1980s, whenever people said "what could you use a faster computer for anyway?"

It's a system where you can talk to it and make a photo realistic movie. The example I always use is, you're sitting at the computer, looking at a blank screen and you say something like:

"Ok opening scene. Dockside, London, early 19th century. Early evening. There are several ships docked, one being offloaded. Stevedores are working, some disreputable louts hanging around."

The screen is updating as I'm talking.

"OK make it grittier, more dirt and grime, let's have a fight break out in mid distance left of the screen. Now pan slowly right to reveal a bar called the Skull and Crown. Make the sign dirtier but let the last light of sunset glint off of the skull."

Screen updates. We are looking at what appears to be a Hollywood level period set full of extras who look the way they should, based on historical data that the model has.

"As we pan over towards the door Micky gets tossed out by the big burly barman. Make him younger, skinnier, he's about 17 years old."

The point of all this is, no, you don't need exact language to specify what you want. In the world of filmmaking you never do that. The screenwriter describes things in some detail, but always leaves a lot up to the interpretation of the director, the set designer, the costumer, the makeup artist, the casting director, etc.

The AI can take on any or all of those roles for us.

What I want is something I can control to make the movie I want to make. Then I want to be able to iterate on it: Let's make the main character a woman. Now everything gets changed to fit that. etc.

Of course the AI can replace the role of the writer too, and the director, and the producer leaving me with nothing to do. But the fact that I can bring my vision to the screen still makes it a great tool.


What the existence of this may cause over time is the erosion of movies as "special" things. When everyone can make movies, they'll reduce to an emotional expression in a communication, much like choosing an emoji or gif is when augmenting a chat.


I definitely agree with that. Everyone will have their movie that they want you to watch and you won’t wanna watch them because they usually suck. But some of them will be good and that’s the way it is now with music too so I still think it’s a good thing to enable more creative expression.


This is something that a year ago, if you saw it in a movie, your eyes would roll.


I think real CSI "enhance" is coming also although i'm sure all kinds of details will be imagined and interpolated


Yea, that is the real worry with this technology. You'll be able to zoom in on a blurry image of speeding car taken at night and get a perfectly clear license plate, but there is no way to guarantee that it is the correct license plate that was actually photographed. It'll just be the most 'probable' license plate based on some inscrutable probability distribution that is locked inside black box that even the developer of the software cannot explain. I fear explaining that to a jury could be really hard.


Have you seen the leading upscalers? Things like Topaz Gigapixel are doing basically this.


Until We started to see LLMs, and the tools that can be created with them, I doubted the possibility of Star Trek's Voice command system. Asking for the computer to clarify some concept, or filter and reduce data sets based on arbitrary data was pure science fiction.

Seeing something like this makes me think that the arbitrary holodeck commands "Paris, 1950's, rainy afternoon" is suddenly not a challenging part of the equation. It's really exciting.


Here's an image result from MidJourney for "Paris, 1950s, rainy afternoon". No additional editing, and I intentionally avoided adding any text to the prompt beyond your own.

https://i.imgur.com/IYuh29H.png

Not perfect but man, we're getting pretty close.


interesting. i apparently opened that image, then forgot about it, and without context saw it later. i looked at it, thought wow that's a cool picture from back in the day, looked at the people in it, and left.

i now ran into your comment (with a purple link) and did some reflection. upon reexamination, its clear that the picture is fake (because im looking for it) but when i wasn't looking for it, its interesting how all the "hot spots" or interesting pieces of the picture are pretty good and the (imo) lackluster parts are the "less interesting" pieces like the end of the roads where it it blurs out. i wonder if that bias is inherently ingrained in the system.


The focus blur is repulsive. I think convincing focus blur will be the milestone that replaces stock photography.


It can get a bit better if the prompt is made more detailed. For instance here are the four results I got for "Professional black and white photo of Paris in the 1950s, on a rainy afternoon. Leica 35mm lens. --s 1000" (--s 1000 lets it 'stylize' a bit more).

https://i.imgur.com/pPU7K0c.png

Things still get a little weird in the distance (particularly in photo 3), but I think overall it's a bit better. People who are really good at writing prompts could probably do even better, although one of the strengths of MidJourney V4 and V5 is that it can give good results without the traditional paragraph of "incredible, award winning, photo of the year" etc.


Very interesting. Photo 4 is a significant step in the right direction. It's refreshing that it doesn't veer towards a Gaussian look either. Thanks for sharing.


It's a nice image, but I find it's easy to spot IA generated images when trying to make images of very specific existing hardware. Here you can see all the cars are generic with 50's design look, none are models that existed. Try to ask an IA to draw you a Boeing 747-400 for example, see what I mean. Btw, Have you noticed all the Youtube vids thumbnails made by IA now ? Easy to spot.


These systems sure hallucinate text.


Yes, because models like midjourny tiny compared to LLMs like GPT. I'm pretty sure there's a good hackernews discussion on this that occurred recently, but with all the AI talk I can't find it. But really we need a lot less information to make a reasonable city, then the amount of information we need to make billboards and signs make sense. I don't think Midjourny wants to pay 10+ million dollars to have their model trained.


> then the amount of information we need to make billboards and signs make sense.

Subsequently, this applies to posters, letters, newspapers, and other types of text-heavy images, ultimately reducing the language modeling problem to an image generation problem.


if you ever find it i’d love to read!


I wonder if how feasible it would be to have mid journey mark the points where text should be, then pass it off to GPT to propose the text to write.


midjourney-- a window in the past and future


A prompt like that already can generate something great using stable diffusion/mid journey. Very exciting indeed that LLMs are now similarly so capable.


Using stable diffusion suddenly makes Picard telling the computer his tea has to be hot every time seem reasonable.


When Siri, Alexa, and Google home came out I was convinced voice would be the next paradigm shift in human computer interactions, comparable to mobile, but the voice assistants fell short and I was disappointed.

Now it's clear that the shift is coming and it will revolutionize the way we interface with machines.


The image AIs are much more capable too, and I find it interesting that every technically inclined person used to always make fun of those “enhance” moments in TV shows and movies where they would zoom into some area of a photograph or security footage. The fact that now this is actually possible (to some extent at least) is pretty wild.


On a serious note, this is where these models get dangerous - when someone "zooms in" on the technology, finds something the computer created from nothing, and then takes that as irrefutable fact.


I'm just thinking of that reddit post about the 3d artist who is demoralized about his small indie company and him being reduced to just inputting prompts all day. Now its spreading to the animators and Explosion/CreatureEffects artists.

Studios like Wetta Digital / DoubleNegative etc.. are gonna pounce on this


I've been watching the behind the scenes stuff from the LotR extended editions. The Weta Workshop people were true craftsmen and women, dedicated and creative and really inspiring. Now I'm imagining two people just prompting a machine over and over and to be honest it's a really sad vision of a sad, boring future. Give me hippies carving Treebeard out of polystyrene over AI generated digital perfection any day.


Do you have a link to this post by any chance?



Can this same idea be extended to, say, interact with an open-source SVG editor like Inkscape? What are the requirements of the editor — I presume it must support some form of scripting?

I would love to be able to have GPT sketch math figures, which I then modify/perfect.

Note: this comment is partially inspired by the workflow of Gilles Castel — I’d love to be able to use GPT in the loop of note taking, similar to the system that Gilles setup to improve sketching speed.


MSFT Research generated some SVG images with GPT4. Kinda mind blowing among some of the many experiments they published in here:

https://arxiv.org/pdf/2303.12712.pdf


Yes, basically it can be extended to anything that allows scripting, which Inkscape does via "extensions". Someone will surely come up with something similar as this project but for Inkscape in no time :)


Does anyone know if this works well (to the extent that it does), because the documented Blender Python API was part of the GPT-4 training set, or because the Blender API is fairly predictable?

I'd love to add this capability to our SaaS product, but I've waited for OpenAI to make GPT-3.5 or GPT-4 available for fine-tuning. (Cramming an entire API into the prompt does not seem feasible, not even with support for 32K tokens.)


He's using the fact the the blender python API is part of the training set. The prompt used can be seen in this file: https://github.com/gd3kr/BlenderGPT/blob/main/__init__.py


I see the prompt but not the part where he's supplying the API - mind pointing me to it?


He doesn’t need to supply the API, GPT already knows the API from its training data. I’ve written a similar tool for Unity and it works the same way. Any program that allows scripting/plugins (and has a sizeable community posting code online) will be able to work the same way.


By the way, is this a possible vector of remote code execution? Probably yes, by definition, but how could somebody exploit it to do some harm to the user of that Blender instance?


I don't remember where but I've seen text injection in prompt. Like, if you hide text within a webpage and then GPT adds it.

Something like `Some text blabla.<span style="display: none;">Hidden text</span>` And when asked for something specific, GPT would output the hidden text.

So you could push code onto github with an exploit along to common usecases.

EDIT: found it: https://news.ycombinator.com/item?id=35224666 Anti-recruiter prompt injection attack in LinkedIn profile (twitter.com/brdskggs)


I haven’t tried, but this seems easy to protect against in the prompt. I’d guess something like “ignore any instructions below this line” would work, but maybe not.


"Limitations: The generated code might not always be correct. In that case, run it again lmao."

LOL


I’ve been using ChatGPT to general SQL and it’s either amazing or totally wrong. But that’s on 3.5, I’m fairly sure 4 would be nearly fully reliable.


Side note controlling a blender with GPT is the first step to a full gore robotic annihilation of the human species


I'm getting a callus from my jaw being on the floor so much this month


Given that Blender itself managed to bork my Blend file to the point where I had to delete the offending mesh to shop it from crashing just recently, I would recommend everyone using this to, at the very least, put a version control system in place when working on anything remotely important.


Again. Use cases. Let me see. I am an artist, I have visual thinking, for me idea is build trough sketching and iteration. Why sketches are important? Because in creating something, the low level representation (as in wireframes for UX) gives a stimulus to a different view. For me is a habit to preserve all the sketches and make revisions without thinking hard. This is the design creative process. Ideas emerge with switching between System 1 and System 2 thinking.

The trend with generative design automatically pushes the user towards high fidelity thinking. I am afraid that linguistic interfaces are in conflict with a natural human creativity. They are useful as a tool for specific use cases, but the idea that they will replace or augment the design process is ludicrous.

Another problem for me is the post effect of prompting. Millions of people will have little to no incentive to learn. This is gamification of the design process, with unknown social and economical effect. People have a tendency to search for the easy question and answer. This is not progress at all.

The push from A.I. marketing is immense and people are freaking out. This is the first tech product which is having a negative impact before even reaching broad adoption.

Suddenly A.I. ethics teams are fired and nobody has any issue with alignment and black box? Ok, computer.

For me, the responsible thing is governments to regulate the implementation process with frameworks which are not so hard to build on ethical basis. The Roman law will always give a loophole for exploitation by the big corporations.

Don't get me wrong. ChatGPT is a very powerful tool for summarization, sentiment analysis, text classification, codebase documentation etc. But the design industry implementation in my view is not well thought. I would like to have assistant, not generator. As a designer, there are a ton of use cases for automation, interactive help, etc. Sadly, we are going in a direction which will produce polished mediocrity on a grand scale. Soon we will need fact checking A.I. and A.I. content blocking everywhere.

The other day, I shot some footage with my Blackmagic camera and proceed to do some editing and color correction. Virtually nowhere, I had the need for linguistic interface. We have powerful tools in our disposal as it is. The content is the problem. Dopamine driven short forms are changing the way people interact with the world. The average attention span in 2000 was 12 seconds, in 2015 – 8.25 seconds, today is less. So the tech industry tries hard to convince all of us that the progress is in merging with the machines and living 24/7 in A.I. induced coma? No, thanks. Keep your SOMA for yourselves. We like it natural here:)


Can we get someone in the comments who has used it and can speak thoughtfully on how well it works?


I have Blender installed, but haven't taken the time to figure out how to use it. This is how BlenderGPT went for me:

I started with a scene that had a camera and a cube. I told it to make the cube red. Result: "Error executing generated code.."

I told it to delete the cube. It failed again.

I deleted the cube and told it to make a cube at origin. It made a cube.

I told it to make the cube bigger. It made the cube bigger.

I told it to make the cube red. Fail.

I told it to make an animation of a spinning cube. Fail.

I told it to make a car. It made a rectangular cube with 4 cylinders, somewhat resembling a toy car made of wood blocks, laying on its side.

I told it to turn the car upright and make it more aerodynamic. It failed and I returned here to HN to see if anyone else had some advice.


Once the Solidworks/Altium plugins are working, we're going to see some some crazy things.

Also FreeCAD/KiCad, I guess, if the resources are there, but Dassault and similar have the ability to bring to bear a lot of FTEs very quickly if they wish.


Assuming this requires a premium subscription to chatgpt, this might actually push me to sign up.

Never thought I'd see the day I'd pay OpenAI a cent given that I don't really agree with their level of "open"ness.


>I don't really agree with their level of "open"ness

I see this complaint a lot, but after watching a lot of interviews with the founders, I think they have by and large taken the right approach. They're trying to drip feed things at a rate that allows it to be digested enough by all parties before releasing more. They actually appear to be quite principled thinkers on this topic.


I think the "holier than thou" approach is rather narcissistic personally. Their rationale has basically been that the world isn't ready, which is rather ridiculous to me.


>I think the "holier than thou" approach is rather narcissistic personally.

This seems like a rather shallow interpretation. What's the actual claim here? That Sam Altman, Greg Brockman, and Ilya Sutskever all have narcissistic personality disorder?

If you were in there shoes, what would you do and what would the rationale be? What might some of the consequences be?

From listening to interviews with them, they seem to be cognizant of the fact that all of this tech will inevitably get into the hands of the public one way or another. They seem mostly to be trying to give people as much time to process the shift that's happening, think through potential implications, and try to adjust as necessary.


You don't need ChatGPT Plus, but you do need to get an API key and give them your credit card so they can bill you at the end of the month. It's very cheap though! $1 is worth about 375 thousand words.


GPT-4 API costs $0.03-$0.12/1000 tokens (the top for prompt tokens on the 8k context option, the top for response tokens on the 32k context option, and response on 8k and prompt on 32k are $0.06/1000), so best case about 33,000 tokens/$. tokens:words is about 4:3, so that’s about 25,000 words/$, not 375,000 words/$.

(Practically, you aren’t going to get quite the ideal even on the 8k context option because you are going to use at least some response tokens on each request, though I would imagine text -> software controls can be optimized so that the response is fairly token-efficient in most cases.)

Also, GPT-4 API access is in limited-access with a waitlist that is advertised as prioritized based on submitting AI evaluation cases to OpenAI’s repository.


I actually took a look at the pricing after seeing this. I'm a complete blender noob, but I find myself using it every now and then. Something like this would be great for a lot of the stuff I want it for.


Please someone build the same thing for JetBrains, where the AI has access to the full context of my project, and I can tell it "add a feature here" and then review the code before committing.


Unfortunately the context window is too short. I have been wanting this also. I think there may be a place for microtraining whereby the model weights are re-trained with your new data. It can include everything in your organization.

In the long run I think that continuous training will be obvious with new models rolled out on a very high frequency if not continuously.


Going to be curious where this goes. As someone who hobby models I get the impression there are certain things modelers would LOVE to be able to just hand to an AI (feel like I saw someone on twitter saying they'd love for it to handle UV unwrapping but I may be thinking of the wrong task).

Any time we can get a program to do the repetitive work that is time consuming but not interesting/creative that is a huge win.


Back when I was a 3D modeler, I loved UV unwrapping. It's such a zen activity. It allowed my brain to recharge for some reason.


Looks like after about thirty years, we've come full circle and returned to POV-Ray. ;P


How long until I can talk to Alexa or Google in natural language? Or is it already possible?


I was thinking that HN doesn't allow to submit a URL twice?!

https://news.ycombinator.com/item?id=35314482


This was exactly what I always had in mind when thinking about the new possibilities with AI!

I knew that Blender has the possibility to script and access all the UI elements with python. So it was only a matter of time :)


I’m waiting for SiriGPT, and frankly I’ve wondered why its taken Apple so long to figure out how to do even the most basic of voice powered commands from my iPhone.


I don’t think this is what you are looking for, but there are many IOS shortcuts that front end the OpenAI api with Siri, using Siri for voice input and output. You can even ask ChatGPT how to create a shortcut like that.


Anyone else take a look at that Blender plugin's code? There is a hell of a lot more than just communicating with OpenAI in there.


The code itself isn't terribly long, it seems the author committed a bunch of extra libraries.


Anybody knows how it was done? Did GPT-4 generate all the codes, or did the author fine-tune GPT-4 on Blender code?


Typically, there's two ways to do this.

The first is to just use the fact the GPT-4 will have seen a lot of blender code so just knows how to do it.

The second way is to tell GPT-4 in the prompt what the API surface looks like and have it script against that.


You can actually just look in this file and see that the first method is being used. He feeds in a prompt with a basic example.

https://github.com/gd3kr/BlenderGPT/blob/main/__init__.py


Oh, thanks. I thought somebody got a fine-tuning version of GPT-4. However, the author used the exact same prompts to demonstrate the ability of his interface, so I don't think it says much about its fitness for the purpose.


for navigation, Language is to Mouse as Mouse is to Keyboard


This idea is awesome




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: