LLMs and Programming in the first days of 2024 (antirez.com)
461 points by nalgeon 9 months ago | 288 comments



Salient point:

> Would I have been able to do it without ChatGPT? Certainly yes, but the most interesting thing is not the fact that it would have taken me longer: the truth is that I wouldn't even have tried, because it wouldn't have been worth it.

This is the true enabling power of LLMs for code assistance -- reducing the activation energy of new tasks enough that they are tackled (and finished) when they otherwise would have been left on the pile of future projects indefinitely.

I think the internet and the open source movement had a similar effect, in that if you did not attempt a project you had some small interest in, it was only a matter of time before someone else solved enough of a similar problem for you to reuse or repurpose their work, and this led to an explosion of (often useful, or at least usable) applications and libraries.

I agree with the author that LLMs are not by themselves very capable but provide a force multiplier for those with the basic skills and motivation.


I find that even beyond "activation energy", a lot of my exploration with ChatGPT is around things I don't necessarily even intend to take forward initially, but am just curious about, and then realise I can do with much less effort than I expected.

You can often get much of the same effect by bouncing ideas off someone, who doesn't necessarily need to know the problem space well enough to solve things, just well enough to give meaningful input. But people with the right skills aren't available at the click of a button 24/7.


I feel very left out of all this LLM hype. It's helped me with a couple of things, but usually by the time I'm at a point where I don't know what I'm doing, the model doesn't know any better than I do. Otherwise, I have a hard time formulating prompts faster than I can just write the damn code myself.

Am I just bad at using these tools?


I'll give you an example. I took advantage of some free time these days to finally implement some small services on my home server. ChatGPT (3.5 in my case) has read the documentation of every language, framework and API out there. I asked it to start with Python3 http.server (because I know it's already on my little server) and write some code that would respond to a couple of HTTP calls and do this and that. It created an example that customized the do_GET and do_POST methods of http.server, which I didn't even know existed. It also did well when I asked it to write a simple web form. It did less well when things got more complicated, but at that point I already knew how to proceed. I finished everything in three hours.
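To give an idea, the skeleton it produced looked roughly like this (the routes and responses here are my own illustration, not the exact code it wrote):

    # Minimal sketch of a handler overriding do_GET / do_POST on the stdlib server.
    from http.server import BaseHTTPRequestHandler, HTTPServer
    import json

    class MyHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path == "/status":
                body = json.dumps({"ok": True}).encode()
                self.send_response(200)
                self.send_header("Content-Type", "application/json")
                self.end_headers()
                self.wfile.write(body)
            else:
                self.send_error(404)

        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            payload = self.rfile.read(length)  # raw request body
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"received %d bytes" % len(payload))

    if __name__ == "__main__":
        HTTPServer(("0.0.0.0", 8080), MyHandler).serve_forever()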

What did it save me?

First of all, the time to discover the do_GET and do_POST methods. I know I should have read the docs, but it's like asking a colleague "how do I do that in Python?" and getting the correct answer. It happens all the time; sometimes I'm the one asking, sometimes the one answering.

Second, the time to write the first working code. It was by no means complete but it worked and it was good enough to be the first prototype. It's easier to build on that code.

What didn't it save me? All the years spent learning to recognize what the code written by ChatGPT does and how to go on from there. Without those years on my own I would have been lost anyway, and maybe I wouldn't have been able to ask it the right questions to get the code.


Regarding your last point about what it didn't save you, it reminded me of this blog post: https://overbring.com/articles/2023-06-23-on-using-llm-to-ge...


I've been learning boring old SQL over the last few months, and I've found the AIs quite helpful at pointing out some things that are perhaps too obvious for the tutorials to call out.

I don't mind taking suggestions about code from an AI because I can immediately verify the AI's suggestion by running the code, making small edits, and testing it.


But this is my pet peeve when people claim this: the only reason it works is that the AI code is small and constrained in scope. Otherwise the claim that humans can easily and quickly verify AI code would, like, run into Rice's Theorem.


That's underusing complexity theory. Since not all verification problems are about arbitrary Turing-complete programs, there are better theorems to apply than Rice's Theorem, such as:

* IP = PSPACE (a polynomial-time verifier can check the correctness of any PSPACE computation via an interactive proof)

* MIP = NEXPTIME (you can verify the correctness of any NEXPTIME computation with two provers who cannot communicate with each other)

* NP = PCP(O(log n), O(1)) (you can verify the correctness of any NP statement with O(log n) bits of randomness by sampling just O(1) bits from a proof)

What these mean is that a human is indeed able to verify the correctness of output from a machine with stronger computational abilities than the human's own.


I'd reframe that slightly: it's not that you are bad at using these tools, it's that these tools are deceptively difficult to use effectively and you haven't yet achieved the level of mastery required to get great results out of them.

The only way to get there is to spend a ton of time playing with them, trying out new things and building an intuition for what they can do and how best to prompt them.

Here's my most recent example of how I use them for code: https://til.simonwillison.net/github-actions/daily-planner - specifically this transcript: https://gist.github.com/simonw/d189b737911317c2b9f970342e9fa...


I've developed a workflow that's working pretty well for me. I treat the LLM as a junior developer that I'm pair programming with and mentoring. I explain to it what I plan to work on, run ideas by it, show it code snippets I'm working on and ask it to explain what I'm doing, and whether it sees any bugs, flaws, or limitations. When I ask it to generate code, I read carefully and try to correct its mistakes. Sometimes it has good advice and helps me figure out something that's better than what I would have done on my own. What I end up with is like a living lab notebook that documents the thought processes I go through as I develop something. Like you, for individual tasks, a lot of times I could do it faster if I just wrote the damn code myself, and sometimes I fall back to that. In the longer term I feel like this pair programming approach gives me a higher average velocity. Like others are sharing, it also lowers the activation energy needed for me to get started on something, and has generally been a pretty fun way to work.


Here's what I find extremely useful:

1 - very hit or miss -- I need to fiddle with the AWS API in some way. I do this roughly every other month and never remember anything about it between sessions. ChatGPT is very confused by the multiple versions of the APIs that exist, but I can normally talk it into giving me a basic working example that is much easier to modify into exactly what I want than starting from scratch. Because of the multiple API versions, it is extremely prone to hallucinating endpoints. But if you persist, it will eventually get it right enough.

2 - I have a ton of bash automations to do various things. Just like with AWS, I touch these infrequently enough that I can never remember the syntax. ChatGPT is amazing and replaces piles of time spent googling and swearing.

3 - snippets of utility python to do various tasks. I could write these, but chatgpt just speeds this up.

4 - working first draft examples of various js libs, rails gems, etc.

What I've found has extremely poor coverage in ChatGPT is stuff where there are basically no Stack Overflow articles explaining it or GitHub code using it. There, you're likely to be disappointed by the results.


As the article says, it helps to develop an intuition for what the models are good or bad at answering. I can often copy-paste some logs, tracebacks, and images of the issue and demand a solution without a long manual prompt - but it takes some time to learn when it will likely work and when it's doomed to fail.


This is likely the biggest disconnect between people that enjoy using them and those that don’t. Recognizing when GPT-4’s about to output nonsense and stopping it in the first few sentences before it wastes your time is a skill that won’t develop until you stop using them as if they’re intended to be infallible.

At least for now, you have to treat them like cheap metal detectors and not heat-seeking missiles.


I had good results by writing my requirements like they were very high-level code. I told it specifically what to do, like formal specifications but with no math or logic. I usually defined the classes or data structures, too. I'd also tell it which libraries to use after getting their names from a previous, exploratory question.

From there, I'd ask it to do one modification at a time to the code. I'd be very precise. I'd give it only my definitions and just the function I wanted it to modify. It would screw things up, and I'd tell it so. It would fix its errors, break working code with hallucinations, and so on. You need to be able to spot these problems to know when to stop asking it about a given function.
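To make that concrete, here is a made-up example of the kind of definitions I'd hand it up front, before asking for one change at a time against exactly these types (not my actual project):

    # Hypothetical definitions given to the model verbatim; every request then
    # references these names so it has no room to invent its own structures.
    from dataclasses import dataclass, field
    from typing import Optional

    @dataclass
    class Story:
        id: int
        title: str
        url: Optional[str] = None
        points: int = 0

    @dataclass
    class Page:
        stories: list[Story] = field(default_factory=list)

        def top(self, n: int) -> list[Story]:
            """Return the n highest-scoring stories."""
            return sorted(self.stories, key=lambda s: s.points, reverse=True)[:n]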

I was able to use ChatGPT 3.5 for most development. GPT-4 was better for work that needed high creativity or fewer hallucinations. I wrote whole programs with it that were immensely useful, including a HN proxy for mobile. Eventually, ChatGPT got really dumb while outputting less and less code. It even told me to hire someone several times (?!). That GPT-3-Davinci helped a lot suggests it's their fine-tuning and system prompt causing problems (e.g. for safety).

The original methods I suggested should work, though. You want to use a huge, code-optimized model for creativity or hard stuff, though. Those for iteration, review, etc can be cheaper.


It's useful for writing generic code/templates/boilerplate, then customizing the result by inserting your own code. For something you already know better, there isn't a magic prompt to express it, since the code is not generic enough for an LLM to produce from a prompt.

Its best use case is when you're not a domain expert and need to quickly run some unknown API/library inside your program, inserting code like "write a function for loading X with Y in language Z" when you barely have an idea what X, Y, and Z are. It's possible in theory to break everything down into "write me a function for N", but the quality of such functions is not worth the prompting in most situations, and you're better off asking it to explain how to write that function step by step.


This is exactly where I get the most value. For example, I now write a bunch of custom Chrome extensions. I'm not much of a JavaScript guy, but I can get by and validate the code. Usually what I want to do is simple, but requires making an API call and basic parsing. All stuff I could probably figure out myself in an hour or two. But instead I can get an initial version done in 2 minutes and spend 5-10 minutes debugging. I probably wouldn't even try otherwise.


I felt the same every time I tried to get some help in a subject matter where my knowledge was quite advanced and/or when the subject matter was obscure/niche.

Whenever I tried it on something more common, or on stuff I had absolutely zero familiarity with, it did help me bootstrap quicker than reading some documentation would have.

That says a lot about how hard it is to write or find documentation that is tailored exactly to you and your needs.


You can use LLMs as documentation lookups for widely used libraries, eg: the Python stdlib. Just place a one line comment of what you want the AI to do and let it autocomplete the next line. It’s much better than previous documentation tools because it will interpolate your variables and match your function’s return type.
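For example, the pattern looks something like this, where the comment is the prompt and the next line is the completion (a small stdlib illustration):

    import datetime

    release_date = datetime.date(2024, 6, 1)

    # number of whole days from today until release_date
    days_until_release = (release_date - datetime.date.today()).days

The completion reuses the release_date variable already in scope and produces an int, which is what makes it feel like documentation that already knows your code.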


Yeah, I’m not sure how often these tools will really help me with the things that end up destroying my time when programming, which are stuff like:

1) Shit’s broken. Officially supported thing kinda isn’t and should be regarded as alpha-quality, bugs in libraries, server responses not conforming to spec and I can’t change it, major programming language tooling and/or whatever CI we’re using is simply bad. About the only thing here I can think of that it might help with is generating tooling config files for the bog standard simplest use case, which can sometimes be weirdly hard to track down.

2) Our codebase is bad and trying to do things the “right” way will actually break it.


One thing you might find useful is using it to write unit tests for legacy code that more easily allow you to refactor the crappy codebase. See the sketch below.
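For instance, a minimal sketch of that kind of generated test: a characterization test that pins down current behaviour before any refactoring (the function here is a made-up stand-in for real legacy code):

    import unittest

    def quote(items: int, vip: bool) -> int:
        """Made-up stand-in for an untested legacy pricing function."""
        return items * 9 if vip else items * 10

    class QuoteCharacterization(unittest.TestCase):
        # Expected values are captured from current behaviour, not from a spec,
        # so a later refactor can be checked against them.
        def test_known_inputs_keep_known_outputs(self):
            self.assertEqual(quote(3, vip=False), 30)
            self.assertEqual(quote(3, vip=True), 27)

    if __name__ == "__main__":
        unittest.main()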


No, you're likely just a better programmer than those relying on these tools.


That could be the case, and likely is in the areas where they are strongest, just like the article's example of how it's not as useful for systems programming because he is an expert.

If you ask it about things you don't know, things it was likely trained on high-quality data for, and still get bad answers, you likely need to improve your writing/prompting.


Except it's most dangerous when used in areas that you're weakest because it will confidently spit out subtly wrong answers to everything. It is not a fact engine.


That only matters if you use it for things where failure actually causes damage.


However, could it be that the 'entertainment effect' of using a new (and trendy) technology like LLMs provides the activation energy for otherwise mundane tasks?


In my experience, it genuinely lowers the activation (and total) energy for certain tasks, since LLMs are great at writing repetitive code that would otherwise be tedious to write by hand. For instance, writing a bunch of similar test cases.
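Something like this table of near-identical cases, which an LLM will happily enumerate and which is tedious to type by hand (the helper and the cases are made up for illustration):

    import pytest

    def slugify(title: str) -> str:
        """Made-up helper: lowercase, collapse whitespace to dashes."""
        return "-".join(title.lower().split())

    @pytest.mark.parametrize("title,expected", [
        ("Hello World", "hello-world"),
        ("  Leading and trailing  ", "leading-and-trailing"),
        ("Already-slugged", "already-slugged"),
        ("MiXeD CaSe Words", "mixed-case-words"),
    ])
    def test_slugify(title, expected):
        assert slugify(title) == expected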


This is true. I write way more documentation now since the LLM does all the formatting, structure, diagrams, etc. I just guide it at a high level.


Do you use a specialized LLM for diagrams?


I'm not OP, but I just ask GPT to turn code or process or whatever else into a mermaid diagram. Most of the time I don't even need to few-shot prompt it with examples. Then you dump the resulting text into something like https://mermaid.live/ and voilà.


I just ask ChatGPT to generate text diagrams that I can embed in my markdown.


No. I've completed projects that have been on the back burner for years in hours. They weren't on the back burner for lack of interest but mainly for lack of expertise in a specific stupid area.

It's not an exaggeration to say that I now do two weeks of programming in a night. Of course, a lot of the time the result gets thrown away because the fundamental idea was flawed in a non-obvious way. But learning that is also worthwhile.


It's revolutionized our in house tooling at work.

No longer do I need to give PR feedback more than a couple times, because we can just ask chatgpt to come up with a lint rule that detects and sometimes auto-fixes the issue. I use it to write or change Jenkins jobs, scaffold out tests, diagram ideas from a monologue brain dump, write alerting and monitoring code, write and clean up documentation.

Most recently I wanted to get some "end of year" stats for the teams, normally it would never have happened because I don't have half a day to dedicate to relearning the git commands, and how to count the lines of code and attribute changes to teams and script the whole process to work across 20 repos.

20 minutes later with chatgpt I had results I could share within the company.
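For the curious, it was in the spirit of this sketch (simplified, with hypothetical repo paths; the real script also mapped author emails to teams):

    # Lines added/removed per author across several repos via `git log --numstat`.
    import subprocess
    from collections import Counter
    from pathlib import Path

    REPOS = [Path("~/src/repo-a").expanduser(), Path("~/src/repo-b").expanduser()]

    def lines_changed(repo: Path, since: str = "2023-01-01") -> Counter:
        out = subprocess.run(
            ["git", "log", f"--since={since}", "--numstat", "--format=@%ae"],
            cwd=repo, capture_output=True, text=True, check=True,
        ).stdout
        totals, author = Counter(), None
        for line in out.splitlines():
            if line.startswith("@"):
                author = line[1:]          # commit author email
            elif line.strip():
                added, removed, _path = line.split("\t", 2)
                if added != "-":           # binary files report "-"
                    totals[author] += int(added) + int(removed)
        return totals

    if __name__ == "__main__":
        grand = Counter()
        for repo in REPOS:
            grand.update(lines_changed(repo))
        for author, n in grand.most_common(10):
            print(f"{author:40} {n:>10} lines changed")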

It's just allowed me to skip almost all of the boring and time-consuming parts of handling small things like that, and instead turns me into a code reviewer who makes a few changes to get it good enough and then pushes it out.


Does your company have any concerns about feeding ChatGPT source code? Would it not be safer to use a local LLM?


Not any more than feeding source code to Github. (I personally feel "source code" is very rarely the "secret sauce" of a company anyway). But where I work we not only have the blessing of the company, we are encouraged to use it because they've clearly seen the benefits it brings.

A local LLM would be preferable, all things equal, but in my experience for this kind of stuff, GPT-4 is just so much better than anything else available, let alone any local LLMs.


It's safe if you use the OpenAI API, or you have an enterprise chatgpt account - they have the same privacy guarantees as other cloud providers there.


Wait what?

ChatGPT does diagrams for you? Writes documentation?


Absolutely!

I found a plugin a while back called "AI Diagrams" that generates whimsical.com diagrams for me. Combined with the speech-to-text system in ChatGPT, that means I can just start babbling about some topic and let it write it all down, collect it into documentation, and even spit out a few diagrams from it.

I generally have to spend like 10 minutes cleaning them up and rearranging them to look a bit more sane, but it's been a godsend!

Similarly I sometimes paste a bunch of code in and tell it to write some starter docs for this code, then I can go from there and clean it up manually, or just tell it what changes need to be made. (technically I tend to use editor plugins these days not copy+paste, but the idea is the same)

Other times I'll paste in docs and have it reformat them into something better. Like I recently took our ~3 year old README in a monorepo that goes over all the build and lint commands and had it rearrange everything into sets of markdown tables which displayed the data in a much easier to understand format.


Same here - I feel as though it's turned me into a super programmer. One of my favorite uses is converting a C# model with a bunch of properties into a SQL table along with their corresponding stored procedures. I used to have boilerplate code and would have to copy/paste every property, along with their SQL datatypes. One model might take me 10 to 20 minutes to get translated to a table, and the stored procedures would take me another 20 minutes. Now it's all done in 5 minutes tops, and I'm not having to nitpick datatype issues I may have screwed up in the manual process.


I had a project recently: building an advanced Java class file analyzer. I knew a lot about ow2 asm libraries, but it saved me a lot of digging time remembering exact descriptor formats. Also it helped me understand why other static analysis libraries weren't good enough for me for stateful reasons.

For me, ChatGPT is doing two things: 1) saving trivial Stack Overflow searches and library code-walking to answer specific questions, and 2) helping in the initial project research stage to grasp the feasibility of approaches I may take before starting.


To add to that, I've heard a lot of people with things like ADHD saying LLMs are life-changing, more so than neurotypical people do. For those of us with bad ADHD, doing simple things is the hardest; just taking out the trash or opening the mail is nearly impossible. Your internal dialog is screaming at you for hours to just do the thing, but your body won't listen. They call it executive dysfunction, but it's the bane of my life.

LLMs helping to just start the thing is actually a huge deal.


Also, just throwing a bunch of disorganised thoughts into it, and letting it organise everything.


Agreed, I've now got a sizable project going that I would probably procrastinate on attempting for years without GPT 4 laying the initial groundwork, along with informing me of a few libraries I had never heard of that reduced wheel reinventing quite a bit.


First automation acts as a powerful lever and enabler, then it replaces you.

In 5 years' time you may well be more productive than ever. In 15 years I doubt there'll be many programming jobs in a form recognisable today.


Historically, automation has always made society richer and better off as a result.

There are two guys outside of my window at this very moment getting rid of a huge pile of dirt. One is in an excavator, the other in a truck. There's a bunch of piles that they've taken care of today. Two centuries ago this work would've taken a dozen people, a few beasts of burden, and a lot more time.

Where are the other ten hypothetical people? I don’t know, but chances are they’ve been absorbed by the rest of the economy doing something else worthwhile that they are getting paid for.


There are two problems with this argument. The first, and easier to accept, is that while society might be better off in the long run as a result, the affected individuals probably will not be. We tend to generalize from a single historical example, the industrial revolution and, more specifically, the automatic loom, and in that case the displaced workers ended up doing worse. Better jobs and opportunities only got created later.

The other problem, of course, is that the historical examples (the data) are too few to generalize from, while we can see how these examples differ from each other. As technological evolution progresses, automation gets more and more sophisticated; it can replace jobs that require more and more skill and talent. In other words, jobs that fewer and fewer people were able to do in the first place. This means that the bar for successfully competing in the labor market gets higher and higher, and it will get to a point where a substantial number of people will just be plain uncompetitive for any job.

Or at least that was one of the models until LLMs were invented. (Mostly everyone thought that automation would take over the opportunities from the bottom up.) Now it seems that white collar jobs are actually more in danger for now. But I digress.

The point here is that past examples are false analogies because AI (and I mostly mean future AI) is fundamentally different from past inventions. Its capabilities seem to improve quickly, but we're mostly stuck with what evolution gave us. (We, as a species, are evolving, but it's very slow compared to the rate of technological evolution, and also we, as individuals, are stuck with whatever we were born with.)


I don't speak for the parent, but what they said seems true - Henry Hazlitt covers this phenomenon pretty well if you are ever interested. Your two points I think are also true. It won't be nice to everyone... Such is life. That being said, my practical mind is telling me to get ahead of it, whether that be learning new skills or whatever it takes to stay competitive. If that means picking another industry/profession entirely, so be it. You do what you gotta do.


The other problem with the excavator argument is that all that amazing productivity isn't being pulled from thin air by cleverness -- it's burning (limited, polluting) fossilized solar power from millions of years ago. We haven't avoided the work of moving the dirt around, but only transferred the burden from ourselves and our horses to our unsustainable energy sector. AI (and all software, really) is similar in that it tries to do the same to intellectual activities: instead of requiring a cheese sandwich and an afternoon, some tasks can instead be done with a few cents' worth of electricity and a few hundred milliseconds, but I still eat the cheese sandwich for lunch, so in the final analysis we've saved time but no other resource.

We of course wouldn't have to make this explicit calculation if we could incorporate all this knowledge about fossil fuels directly into the prices of goods and services, but this is a very difficult thing to do, and so far nobody has managed to do it in a global way.

It may still make material sense to use ChatGPT to create slides for middle management meetings, but that is not at all certain in a world with a significant price on emissions (though, to be fair, almost no human activity from the past fifty years stands up to this test either).


IIRC, Microsoft, which is hosting OpenAI, already has its server farms on carbon-neutral electricity, and is heading towards full carbon neutrality over the lifecycle of those server farms.

So doing stuff yourself is less carbon efficient than letting ChatGPT do its job.

As for calculating CO2 pollution into prices - we're slowly doing that; e.g., the EU is setting up its carbon tax that applies to companies abroad.

The issue is that if we were to instantly include the true costs of carbon removal in everything, the economy might collapse at worst, and at the least poor people might no longer be able to afford food, heating, or other basic necessities. It will take time to do it sensibly.


I'm skeptical that industrial civilization can exist at all (at least on anything like a billions-of-people scale) with priced-in emissions, but I guess we'll see. Widespread famine and suffering are in store one way or the other (either because of climate change, or because of what we will need to do to fight climate change).


So far we've been exceeding the predictions for switching to a carbon-neutral economy, so I wouldn't be too sure of that.


I think the other thing people miss is the impact of our existing infrastructure on the speed at which these new technologies can be deployed.

Society had a lot of time to get used to the printing press, the advances of the Industrial Revolution, and the internet. This is because the knowledge had to spread, and also because a ton of equipment had to be manufactured and/or shipped all over the world. We had to make printing presses, design and build factories, and get internet-capable computers to a critical mass of people.

AI is fundamentally different, in that the knowledge of AI can spread instantly because of the internet and because the vast majority of the world already has access to all of the hardware they need to access the most powerful AI models.

Soon we'll have humanoid robots coming, and while they obviously have to be built, we are much more capable now than we were 50 years ago at building giant factories. We also have an efficient system of capital allocation that means that as soon as someone demonstrates a generally useful humanoid robot and only needs to scale production, they'll have access to basically infinite investor money.


Those ten hypothetical people would likely have actually been zero because the task wouldn't have been worth hiring ten people.


Important, and often ignored! Technological improvements change the cost structure of activities. Consider texting. We send billions of texts every day. This does not replace billions of couriers. Rather, more people work to maintain our phone supply and telecom than we ever had running around with letters. It's not a promise but it is a pattern.


I agree, on a societal level it's a great benefit.

On an individual level, however, I'd suggest keeping an eye out for opportunities to retrain.

As an analogy, I'd rather be like the coal miners in the 80s who could read the writing on the wall and quietly retrained into something else rather than those who spent their better years striking over cuts to little avail.

It's a very daunting prospect seeing a path to unemployability, though.

Depending how quickly the change happens, it could be a gentle transition, or it could upset a lot of people.


Yeah, but what do you retrain to? I can't think of any industry that isn't being threatened with massive automation.


Well, the AI job market is pretty hot, so that's an option. And I expect as things mature it'll only create even more opportunity. Projects that nobody previously would have considered, because they'd have taken too long, required too much training for too many people? Now they can happen! And for each of those things, jobs are created, not taken away.

Imagine yourself as CEO. What do you think is your most likely train of thought? A/ "I can fully replace my labor force with bots" or B/ "My employees now have superpowers to do things we couldn't even conceive of two years ago". While there are certainly some scenarios where the first choice is appropriate, the latter sounds far far more likely in most scenarios to me. Why would you contract when there's suddenly so much opportunity to expand?


What AI jobs?


This comment and the grandparent can both be right. Society gets richer because automation replaces jobs, leaving us more time to do other jobs.


also, two centuries ago, we didn't have trucks and the concomitant infrastructure, jobs, wealth, etc. that machines brought along.


> Where are the other ten hypothetical people?

Zoned out somewhere in the backstreets of SF


Or dead because Trump cut off their welfare.


Same as looking back 15-20 years ago from now, though.

Very few jobs now where you put together basic webpages with HTML and CSS (Frontpage, then Wix/Wordpress etc replaced you). Very few jobs where you spend 100% of your time dealing with database backups and replication and managing the hardware (the cloud replaced you). Very few jobs where you spend all your time planning hardware capacity and physically inserting hard disks into racks (the cloud replaced you, too).


I share a similar intuition, but am skeptical that my imagination is in the right ballpark of what it will look like 15 years hence.

What do you think programming will be like in 15 years and where is the high-value work by human programmers?


I think AI will shift the value from those who know how to program toward those who know which programs are most useful to write. In other words, we’ll all become product engineers.


>I think the internet and the open source movement had a similar effect, in that if you did not attempt a project that you had some small interest in, it would only be a matter of time before someone else did enough of a similar problem for you to reuse or repurpose their work, and this led to an explosion of (often useful, or at least usable) applications and libraries.

I've been developing since back when the Mac had a b&w screen (System 6?). I had so many good ideas I just didn't work on for whatever reason, and eventually someone else did them. Maybe it's just that tech people run into the same problems, or a collective tech unconscious; if you've been around long enough, eventually people in the industry will try to solve the same problems. These occurrences go to show me that there really aren't many truly original ideas out there; a lot of it comes down to implementation, PR, funding, geopolitics, your users, and luck.


When it comes to programming, I agree completely. The sweet spot for any use of LLMs is that you already know enough about the subject to verify the work - at least the output - and know how to describe in detail (ideally only the salient details) what you want. Huge +1 to it helping me do things faster, do things that I wouldn't have otherwise done, or using it for throwaway, mostly inconsequential yet valuable programs.

But another area I have found it extremely helpful in is exploring a new topic entirely, programming or otherwise. Telling it that I don't know what I'm talking about and don't necessarily need specifics, but here is what I want to talk about, and I want it to help me think it through.

Especially if you are that person who is willing to take what you hear and do more research or ask more questions. The entrance to so many fields and subjects is just understanding the basic jargon, listening for the distinctions being made and understanding why, and knowing who the authorities are on the subject.


And it's equally and inversely harmful to junior developers who keep prodding it until it generates an abomination they don't understand that manages to pass the build. People who are learning need help, but the kind of help that LLMs in copilot form provide aren't the right fit.

It would be interesting to train a copilot model that is specifically intended to ask clarifying questions and be a partner in determining a solution, rather than doing its best to generate code for a vague or incorrectly specified question from a junior.


Have you tried prompting it to ask clarifying questions and be that partner? Perhaps no (extra) training required.


In my opinion the junior developers are not equipped to guide their teacher. If they knew they were asking to turn an incorrect assumption into code in the first place, they already wouldn’t believe the confident hallucination they get in reply.


> And it's equally and inversely harmful to junior developers who keep prodding it until it generates an abomination they don't understand that manages to pass the build.

This sounds like shotgun debugging.


While on LSD in this case.


It might not be good for juniors, but, as Antirez points out, is great for adapting to new frameworks or APIs, especially with search results having become so much more noisy lately


Last month I tried to use LLMs for things I didn’t know and couldn’t easily find. Every time they were either subtly wrong or outright hallucinated premises which led me to waste time until I realized they were wrong.

If not for the unwarranted confidence in incorrect responses, I could say they were at least not much worse than what I could piece together from what I knew. As it stands, they are OK filling in for a rubber duck and as autocomplete.


Surely a bad LLM, not chatGPT 4


I can't tell if you are joking or not.


No


I'll add another thought here - what I really want many times is a custom LLM like GPT, but trained on a particular language or framework or topic. I would love to go to a website for a new language and be able to talk about its documentation and ask questions of an LLM to help me understand. Huge bonus points if it was trained on real-world code examples of that language or framework and I could have it help me write a new program or function right there. More bonus points if it's tied in with an online REPL where it can help me right inline.


Ease of retraining/refinement is something I'm really hoping for.

There are an endless number of projects to make a "cleaner, revised X", where the coding itself is rote and has already been done at some point; it's just shoved into slightly different semantics that will be a bit more optimal or secure or configurable. It's something that an LLM feels like it's "tip of the tongue" capable of, and in more trivial cases you really can tell GPT to "rewrite this from JS to Python" and it works. But it's limited by just interpolating what's in the training set, when what you want is "port all these standard libraries to my experimental language, and also make a build system for them".


The One and Six Pagers I have it write based on loose criteria help me refine the criteria and in some cases, uncover methods that would not have been evident otherwise.


I think the most under-appreciated aspect of LLMs, one the article touched on but didn't directly address, is the "developer that knows everything" aspect.

No matter how senior of a programmer you are, you're eventually going to encounter a technology you know very little about. You're always going to be a junior at something. Maybe you're the God of Win32, C++ and COM, but you get stuck on obscure NSIS scripts when packaging your software. Maybe you've been writing web apps for the last 25 years and sit on the PHP language committee, but then you're asked to implement some obscure ISO standard for communicating with credit card networks, and you've never communicated with credit card networks on that level before. Maybe you've been writing iOS apps since the first iPhone and Mac apps before that, spent a few years at Apple, know most iOS APIs by heart and designed quite a few yourself, but then you're asked to implement CalDAV support in your app and you don't know what CalDAV is, much less how to use it. An LLM can help you out in these situations. Maybe it won't write all the code for you, but it'll at least put you on the right track.


>"No matter how senior of a programmer you are, you're eventually going to encounter a technology you know very little about"

Or worse, you've filled your head with different tech and now you need to rehash and brush up on stuff you learned before but swept under the rug for the new stuff. It's a strange sensation. Naturally you just go with the median of whatever the company you work for is doing - then find yourself in this situation where it's "been a while" since you worked on CSS. Or it might take you a weekend of study to bring back those Python dataclass skills.


I've found LLMs great for vague questions about functions and APIs whose details I've long forgotten. Recognising the right answer when it appears is often faster than digging through random results on google.


At its heart GPT is the world's best googler. As long as you can find it on Google, an LLM can probably do a better and faster job of finding and curating the information for you.


errr, I don't know about that. I like to ask the AIs about products I built and know everything about as a test, and while they often get the top level facts correct, there is always some crazy hallucination that you won't find googling.


Today it won't, tomorrow it certainly will, and it is time to look for something else as a job.

Thankfully I will most likely already be retired.


The code was written mostly by doing cut & paste on ChatGPT…

I am constantly shocked by how many people put up with such a painful workflow. OP is clearly an experienced engineer, not a novice using GPT to code above their knowledge. I assume OP usually cares about ergonomics and efficiency in their coding workflow and tools. But so many folks put up with cutting and pasting code back and forth between GPT and their local files.

This frustrating workflow was what initially led me to create aider. It lets you share your local git repo with GPT, so that new code and edits are applied directly into your files. Aider also shares related code context with GPT, so that it can write code that is integrated with your project. This lets GPT make more sophisticated contributions, not just isolated code that is easy to copy & paste.

The result is a seamless “pair programming” workflow, where you and GPT are editing the files together as you chat.

https://github.com/paul-gauthier/aider


I like aider. But is there a way to use it to just chat about the code?

I use LLMs to chat about pros and cons of various approaches or rubber duck out problems. I need to copy code over for that, but I've not found aider good for these kinds of things, because it's all about applying changes.

I usually have several back and forths about the right way to do things and then maybe apply some change.


Glad to hear you're finding aider useful!

Sure, there's a few things you could keep in mind if you just want to chat about code (not modify it):

1. You can tell GPT that at the start of the chat. "I don't want you to change the code, just answer my questions during this conversation."

2. You can run `aider --dry-run` which will prevent any modification of your files. Even if GPT specifies edits, they will just be displayed in the chat and not applied to your files.

3. It's safe to interrupt GPT with CONTROL-C during the chat in aider. If you see GPT is going down a wrong path, or starting to specify an edit that you don't like... just stop it. The conversation history will reflect that you interrupted GPT with ^C, so it will get the implication that you stopped it.

4. You can use the `/undo` command inside the chat to revert the last changes that GPT made to your files. So if you decide it did something wrong, it's easy to undo.

5. You can work on a new git branch, allow GPT to muck with your files during the conversation and then simply discard the branch afterwards.


Can I recommend an additional option? I would enjoy being able to enable a “confirm required” mode which presents the patch that will be applied, and offers me the chance to accept/reject it, possibly with a comment explaining the rejection.


Awesome, these might help.

What I feel like I want is /chat where it still sends the context, but the prompt is maybe changed a little, to be closer to a chatgpt experience.

I haven't dug into the prompt aider is using though, so I could be wrong.

Great tool for refactoring changes though! Keep up the good work.


> OP is clearly an experienced engineer

You think? He's the creator of Redis.


For one thing the ChatGPT web interface is useful for much more than just programming. If you're already paying for a sub, it makes sense to cut and paste instead of making additional payments for the API. On top of that people have different thresholds for the efficiency gains that warrant becoming dependent on someone else's project, which is liable to become paid or abandoned.


Yeah, I can ask ChatGPT to "do some web research, and validate the approach/library/interface/whatever", which is a useful feature to me.


I really like the idea of aider but when I tried it, it didn't work. The first real life file I tried it on was too big and it just blew up. The second real life file I tried was still too big. I was surprised that aider doesn't seem to have the ability to break down a large file to fit into the token limit. GPT's token limit isn't a very big source file. If I have to both choose the files to operate on and do surgery on them so GPT doesn't barf, am I saving time vs. using Copilot in my IDE? Going into it, I had thought that coping with the "code size ≫ token limit" problem was aider's main contribution to the solution space but I seem to have been wrong about that.

I hope to try aider again but it's in the unfortunate category of "I have to find a problem and a codebase simple enough that aider can handle it" whereas Copilot and ChatGPT come to me where I am. Copilot and ChatGPT help me with my actual job on my real life codebase, warts and all, every day.


I'm sorry to hear you had a rough experience trying aider. Have you tried it since GPT-4 Turbo came out with the 128k context window? Running `aider --4-turbo` will use that and be able to handle larger individual source code files.

Aider helps a lot when your codebase is larger than the GPT context window, but the files that need to be edited do have to fit into the window. This is a fairly common situation, where your whole git repo is quite large but most/all of the individual files are reasonably sized.

Aider summarizes the relevant context of the whole repo [0] and shares it along with the files that need to be edited.

The plan is absolutely to solve the problem you describe, and allow GPT to work with individual files which won't fit into the context window. This is less pressing with 128k context now available in GPT 4 Turbo, but there are other benefits to not "over sharing" with GPT. Selective sharing will decrease token costs and likely help GPT focus on the task at hand and not become distracted/confused by a mountain of irrelevant code. Aider already does this sort of contextually aware selective sharing with the "repo map" [0], so the needed work is to extend that concept to a sub-file granularity.

[0] https://aider.chat/docs/repomap.html#using-a-repo-map-to-pro...


Try again since the token limit increased in November by a factor of 16 (128000 now for GPT-4 Turbo 1106 preview instead of 8000 for GPT 4).


Mind the cost though! A single request with a fully loaded 128k token context window to GPT-4 Turbo costs $1.28.
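(That's the input side alone: at GPT-4 Turbo's $0.01 per 1K input tokens, 128,000 tokens × $0.01 / 1,000 = $1.28 per request, before paying for any output tokens.)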


Like others, wanted to say thank you for writing Aider.

I think you've done a fantastic job of covering chat and confirmation use cases with the current features. Comments on here may not reflect the high satisfaction levels of most of your software users :)

Aider helps put into practice the use cases that antirez refers to in their article, especially as someone gets better at "asking LLMs the right questions", as antirez puts it.


I've given Aider and Mentat a go multiple times and for existing projects I've found those tools to easily make a mess of my code base (especially larger projects). Checkpoints aren't so useful if you have to keep rolling back and re-prompting, especially once it starts making massive (slow token output) changes. I'm always using `gpt-4` so I feel like there will need to be an upgrade to the model capabilities before it can be reliably useful. I have tried Bloop, Copilot, Cody, and Cursor (w/ a preference towards the latter two), but inevitably, I end up with a chat window open a fair amount - while I know things will get better, I also find that LLM code generation for me is currently most useful on very specific bounded tasks, and that the pain of giving `gpt-4` free-reign on my codebase is in practice, worse atm.


There is a bit of learning curve to figuring out the most effective ways to collaboratively code with GPT, either through aider or other UXs. My best piece of advice is taken from aider's tips list and applies broadly to coding with LLMs or solo:

Large changes are best performed as a sequence of thoughtful bite sized steps, where you plan out the approach and overall design. Walk GPT through changes like you might with a junior dev. Ask for a refactor to prepare, then ask for the actual change. Spend the time to ask for code quality/structure improvements.

https://github.com/paul-gauthier/aider#tips


Shameless plug: https://github.com/codespin-ai/codespin-cli

It's similar to aider (which is a great tool btw) in goals, but with a different recipe.


I currently use gptel which inserts into my buffer directly and has less friction than copy paste.

Aider seems super cool, will check it out. What kind of context from the git repo does it share?


Glad to hear you'll give aider a try. Here's some background on the "repo map" context that aider sends to GPT:

https://aider.chat/docs/repomap.html


If I was doing it all the time, I might care. As it is I don't really find that workflow painful. It reminds me of the arguments about how much it helps to be able to touch type or type very fast. Actually inputting code is a minor part of development IME.


I edit cell-sized code that uses Colab GPUs, asking a lot of questions to ChatGPT, so copying and pasting is not a problem for me.


Thanks for making aider. I use it all the time. It's amazing.


For the past few days, I have been trying to fix a bug in a closed-source Mac app. I otherwise love the app, but this bug has been driving me crazy for years.

I was pretty sure I knew which Objective-C method was broadly responsible for the bug, but I didn't know what that method did, and the decompiled version was a nonsensical mess. I felt like I'd hit a wall.

Then I thought to feed the decompiler babble to GPT-4 and ask for a clean version. The result wasn't perfect, but I was able to clean it up. I swizzled the result into the app, and I'm pretty sure the bug is gone. (I never found reproduction steps, but the problem would usually have occurred by now.)

I never could have done this without GPT-4.


This sounds rather like the junior/bad developer who makes a bug disappear (at least for time being) by changing the order of functions in a source file or some such.

Admittedly a complete rewrite of a piece of code, even without understanding what you are doing (e.g by using an LLM), is unlikely to have the same bugs as the original implementation (but may have different bugs), but hopefully no-one is doing this for code where bugs have any significant consequence (e.g. system downtime, cost to customers).


Just to be clear, I do understand the new version of the method. I don't entirely know how it fits into the larger system, but that's to be expected when I literally don't have the source code.

When I cleaned it up, I took out some complexity which I believe was responsible for the bug, at the cost of some performance. According to GPT-4, the original version was checking file descriptors to decide when to do work. My version just does the work every 5ms.


So the parent commenter tried to tell you that they (and I too) have heard this story from junior and bad programmers all the time over the past 20 years, and they didn't use LLMs. It doesn't matter whether you use generative AIs or not; it's a bad way of thinking, and long term it's not beneficial to anybody. The real problem is that you didn't dig deeper when you figured out that that code change fixed the problem.


But I actually think this is a reasonable way to fix a hard-to-pin-down bug in any context, at least temporarily. (In my case, I don't intend to go back because it's not my app and mostly for personal use, but that's beside the point.)

There was a tradeoff between performance and complexity. The high-performance, high-complexity version was buggy, so I switched to a simpler option at the cost of some performance.

This isn't where the LLM was significant. The LLM was able to make sense of unreadable decompiled code, similar to how the author had ChatGPT translate from compiled assembly code back to C. (Giving GPT-4 the actual assembly never occurred to me, in hindsight I should have tried that first.)


My job is exactly to fix code which was not understood by its creators. And this “I have no idea why, but it works” (until it doesn’t) is the main cause of most of the problems which I work on.

For example, at my current company the developer who introduced a "clever" navigation system didn't know how HTML forms should be used, or why servers still allowed what they did. It worked. Now, 20+ years later, that sole developer's stupid decision and lack of HTML best practices will cost my company a few million dollars (and has, by the way, probably already cost a few million). A missing day of learning (and, by the way, a clear sign that that developer should never have been trusted with this task).

Senior developers learn this, and I've never seen better developers be satisfied and say "yeah, I fixed it" when they don't completely understand the what and the how, even when it's not strictly necessary. They've burned themselves enough times.


You've got me curious, how did the developer's misunderstanding of forms cost millions? Was it submitting duplicate orders? Blocking submission of valid orders causing lost business?

I think it's an error to bring that up here though, where we're talking about someone patching a closed source app for their personal use. Is it worth the cost/benefit of decompiling and studying the app's code sufficiently long to be highly confident of the fix?

Sloppiness and "good enough" has its place. So does full effort correctness.


Maybe you’re right. I hardly disagree, but those are just opinions.

The main culprit is that the backend is Java EE, and they used a single form for multiple things. This is 20+ year old software, so they had to use request parameters and attributes everywhere. It's impossible to locate where a given parameter is used or where attributes are created and used exactly. That's the first problem. Second is this single-form-per-page thing. They use that HTML form for everything, even page navigation; they just discard unnecessary fields on the server side. This means they change that form from JavaScript all the time. Sometimes the generated action is not used at all; it's overwritten with every event. And a single input can be used for multiple things. Third, they used a very flexible framework which was outdated 10 years ago, so it definitely needs to be replaced.

Add these together and you have terrible spaghetti code across 350+ pages with 1000+ endpoints. There is no separation of code even at the HTML level. I know where this "use only one form" came from. I found the ancient doc of that framework where it's mentioned. The problem is that they meant one form object on the server side per page (and not per endpoint), not on the client side. And they fucked up completely, because they started to use multiple form objects per page and single client-side HTML forms.

So now, replacing that old framework starts with a hefty refactoring, creating individual HTML forms for example. The refactoring alone will take at least half a year for a team, because it's very difficult to split those forms, since request attributes and parameters are used everywhere.


This web dev? If so, it must be some special stuff for it to still be in prod after all these years.


This post is absolutely devastating to me. Salvatore is surely one of the most capable software engineers working today. He can lucidly see that this supposed tool is completely useless to him within his area of expertise. Then, rather than cast it off as the ill-fitting, bent screwdriver that it is, he accepts the boosters' premise that he must find some use for it.

Just as any introductory macroeconomics class teaches, if one island has superior skill in producing widget A, it doesn't matter how terrible the other island's skill at producing widget B is; we'll still see specialization where island A leverages island B. So of course antirez's relative ability in systems programming would relegate the LLM to other programming tasks.

However! We do not exist in isolation. There is a multitude of human beings around us, hungry for technical challenges and food. Many of them have or could obtain skills complementary to our own. In working together, our cooperative efforts could be more than the sum of their parts.

Perhaps the LLM is better at writing PyTorch code than antirez. Just because we have an old bent screwdriver in the garage doesn't mean we should try to use it. Perhaps we'd be better off heading to the hardware store today.


If the LLM is better than me at writing Torch code, it is a great idea for me to use an LLM to write my model definition, since the exact syntax or the reshaping of the tensors is not so important to me. If I want to create a convnet and train it on my images, for my own usage, I don't need to bother some Torch expert to do it for me. I can do it myself, if I understand enough about convnets themselves and not enough about Torch syntax / methods. The alternative would be to study the details of Torch in a manual, and the end result would be the same: the important thing in this task is to master the ML concepts, not the details of MLX, Keras or PyTorch.
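For instance, the kind of model definition I mean, which the LLM can write and I only need to sanity-check (a generic small convnet as an illustration, not any specific model of mine):

    # A small image classifier of the sort an LLM will write on request;
    # the layer sizes, input resolution and number of classes are placeholders.
    import torch
    import torch.nn as nn

    class SmallConvNet(nn.Module):
        def __init__(self, num_classes: int = 10):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),   # 64x64 -> 32x32
                nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),   # 32x32 -> 16x16
            )
            self.head = nn.Linear(64 * 16 * 16, num_classes)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            x = self.features(x)              # (N, 64, 16, 16)
            return self.head(x.flatten(1))    # (N, num_classes)

    # The sanity check that matters here: do the shapes work out?
    model = SmallConvNet()
    print(model(torch.randn(2, 3, 64, 64)).shape)  # torch.Size([2, 10])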


> I don't need to bother some Torch expert to do it for me.

This is, indeed, the core of our disagreement. You seem confident it would be a bother to others to ask for help. I'm confident that there are many who would value the opportunity to collaborate with you. I feel sure that whatever analysis you're doing could benefit from the sounding board of a domain expert, and you'd both benefit from the exchange.

EDIT: to clarify, being "famous" has nothing to do with it. Each of us has worth and we all would gain by working with others.


Hi Couchand.

I'm a mediocre programmer who uses GPT for a ton.

Are you volunteering to answer my questions on all the obscure stuff I ask it? Because I don't really know anybody else who will.

Anyway, my email is in my profile, write to me if this is something you're up for!

Edit: Here's a list of the stuff I asked it over the weekend:

- Discuss the pros and cons of using Standardized-audio-context instead of just relying on browser defaults. Consider build size and other issues.

- How to get github actions to cache node_modules (not the npm cache built in to the actions/setup-node action).

- How to get the current git hash in a GitHub action?

- Rewrite a react class-based component in functional style

- How to test that certain JSX elements were generated without directly-comparing React elements for my ANSI color to HTML parser?

- Does it make more sense to keep a copy of the original text in an editor, or just hold on to something like a crc32 to mark a document dirty?

- Can you set focus to a window you create with window.open? (You sure can!)

- Rewrite the rollup.config.js for a library of mine to produce a separate rollup config per audio worklet bundle

- Turn this tutorial for backing up a Mastodon instance into a script

- Refactor a standalone class to split it in half so each class manages precisely one thing.

- I have some code I'm writing for turn-by-turn directions. I have the data structures already, let's write code to narrate them.

- What's with this weird type error around custom CSS properties with React?


If only there was a website that allowed people to ask an absurd amount of questions, for free, and get people's insight within a semi-reasonable amount of time.


That's a pretty big exception to the general case though if your argument is "you are a famous person and people would love to collab" - at 2am your time, exactly when you are motivated? And what if you are not a world famous programmer, what then?

To me it's like saying you shouldn't play solitaire because you are a world class poker player, and there's plenty of people who want to game with you. They are orthogonal concepts - just communicating with people can be more work than just reading on your own.


You're right, but not always. Some people work differently than others and would rather headbutt against a wall on their own for hours than work with other people.

In the long term, it is beneficial to have experts as your collaborators. From my experience though, true collaboration is unlocked once you have established a personal relationship with someone, which takes time and repeated effort. Until then, the collaboration is no better than searching the internet or asking chatGPT.

Establishing relationships with people is hard and takes a lot of work, and frequently doesn't work out like you hope. ChatGPT is a close enough approximation for smaller tasks like the OP describes.


It's hard, and many of us (myself included) are not great at it. That makes it seem easier to reach for the simulacrum. But at what cost?

Homo sapiens's superpower is social cooperation. My concern is that these systems will abet the existing social forces which seem to be causing unprecedented levels of isolation of adults, which will continue to drive smart people away from collaboration and towards solitude, at a level far beyond what simple preferences would suggest.

We already have enough trouble hearing each other through the noise, and understanding what each other has to say. I don't have the answers but I'm looking for them and I do hope other humans will, too.


You said it yourself: they are "social forces". In other words, problems that people created, not technology problems.

I think there is an incorrect worldview that tries to blame human problems on technology.

It's quite true that isolation is an increasing problem. But the idea that instead of using an AI that can spit out a comprehensive answer in seconds, we should all pretend that such tools don't exist, and start constantly asking for help with every idea or request instead, while waiting 3-100 times longer for a less thorough response, is ludicrous.

It's a great idea to collaborate more and try to avoid isolation. But those are societal problems. They are not caused by the latest tools.

Also, as for humanity's "superpower" being collaboration, this is quite a shortsighted comment. I believe that well before AI achieves "super" level IQ, it will vastly outperform humans due to other advantages. One of those advantages is speed. Another is the ability to communicate and collaborate much, much more rapidly and effectively than humans.

One type of digital life that may take over control of the planet (possibly within decades rather than centuries) would be a type of swarm intelligence with the ability to actually "rsync" mental models to directly transfer knowledge.


I think people tend to cooperate only when there are tangible benefits such as:

- survival

- making money

- sex

- enjoyment via social interactions (like parties, hangouts, etc)

It just so happens that for the majority of our civilization, to get those things, we've had to cooperate, but as we develop technology, our ability to get those things increases and our reliance on others decreases (though in a weird way it increases, since technology is complexity, so society becomes larger, more complex, and more interdependent).

We are still in an unprecedented technological boom of computing so we are adjusting on the fly to it. Like the OP says, AI can greatly accelerate independent learning, but eventually that learning plateaus. Once it does, we have to go back to collaboration, but until we find that limit, I think it's human nature to push on.


No - predatory behavior in groups is completely missing from this list. To extend that idea, IMHO much of "business", and some government, falls into this category easily. This is true for all large language groups worldwide, irrespective of political details.


Those fall under making money and survival (and honestly sex and enjoyment as well).


The argument here seems to be that there is a free supply of software developers available for the taking. Software developers are quite well paid, which suggests this is not true.


It's more about the scale at which you do your thing. For a quick and dirty task, I'm going to write a Torch model within the hour - where would I find a collaborator who is willing to start immediately within that time?


Local inference of LLMs is essentially free. There doesn’t exist a sufficiently deep pool of experts of all knowledge spaces who are also willing to work for free, who can be trivially identified, contacted, and scheduled.


I'm not going to ask a human - even a coworker - for every intellisense suggestion. Neither of us would get value from that exchange.


At least the LLM is in the same time zone...


If the pytorch piece is for something low priority, it's probably not worth reaching out to someone else.


For small tasks this is fine, but for larger projects you have to be careful. The key to being productive with an LLM for coding is to be able to understand the code being generated, to avoid rather nasty bugs that may crop up if the model hallucinates. The worst part is that an LLM will create these bugs in very subtle ways, since it excels at writing convincing code (whether or not it actually works).


The funny thing about LLMs is that there's no rush in adopting them. They're semi-useful, but not hugely so right now, and you're not gonna be 'left behind' if you don't make use of them. Everyone involved is working their hardest to make them more capable, so when that day comes, you'll just use them to prompt for what you want. But there's no rush to try to squeeze anything out of the current generation, which mostly lowers productivity rather than increasing it.


My thinking exactly! There's FOMO going on (and being fueled by people most of which seem to hope to somehow make money off it), but the barriers to entry for using LLMs are just not that high. When the tools are good enough, I'll happily use them. Today, for the work I do, I found that not to be the case yet. I wouldn't advocate for not trying them, but I see no reason to force yourself to use them.


Yea. TBH the tooling is the bigger issue for me, personally. I come across a ton of times where an LLM might work, or could - potentially. Usually refactors. But it requires access to basically all my files, both for context and to find where references to that class/struct/etc are.

Furthermore, i'd vastly prefer a workflow most of the time where i don't even have to ask. Or so i imagine. Ie i think i'd prefer a Clippy style "Want me to write a test for this? Want me to write some documentation for this?" etc helpers. I don't want to have to ask, i want it to intuit my needs - just like any programmer could if pair programming with you.

And most of all i want it to have access to all files. To know everything about the code possible. Don't just look at the func name in isolation, attempt to understand how it's used in the project.

If i have to baby sit an LLM for a simple function refactor to give it all files where the function is used or w/e, i'd rather do it myself with tools like AST Grep or even my LSP in many cases.

I'm very interested in LLMs for simple tasks today, but the tooling feels like my primary blocker. Also possibly context length, but i think there's lots of ways around that.


Just to throw my 2c here, since I also want the models to access the whole codebase of (at least) the current project.

I had a great impression of Sourcegraph's Cody (https://sourcegraph.com/cody) a few months ago. At least with the enterprise version of Sourcegraph that had indexed most of the org's private repos.

The web UI (the VS Code extension was somehow worse, not sure why) was providing damn good responses, be it code generation or Q&A/explanation about code spanning multiple repos (e.g. Terraform modules living in different repos).

Afaiu it was using the Sourcegraph index under the hood. But I never really dived deep into Cody's design internals (not even sure if they are actually public).

That being said, I’ve departed from the org months ago and haven’t used cody since then, so take this with a grain of salt, since the whole comment could be outdated a lot.


I think this is an extremely unfavorable interpretation of the article. I wonder if we even read the same thing?

He sees a new tool that others have found interesting, and he identifies ways to use that tool that are useful for him, while also acknowledging where it's not useful. He backs it up with plenty of examples of where he found it not-useless. This is not a revolutionary insight, especially for a developer. We constantly use a variety of tools, such as programming languages, that have strengths and weaknesses. Why are LLMs so different? It seems foolish to claim they have zero strengths.


You might be surprised at the number of large companies who think that "GenAI" can be used to replace programmers due to non-technical executives having got the impression that a competency of LLMs is writing code ...

Of course they do have uses, but more related to discovery of APIs and documentation than actually writing code (esp. where bugs matter) for the most part.

I also have to wonder how long until open source code (e.g. GPL'd) regurgitated by LLMs and incorporated into corporate code bases becomes an issue. The C suite dudes seem concerned about employees using LLMs that may be publicly exposing their own company's code base, but illogically unworried about the reverse happening - maybe just due to not fully understanding the tech.


So you're suggesting that instead of asking an LLM, we should spend time on Fiverr/Upwork to find someone to do random coding tasks that might not fall under our expertise? Can you do that for less than $20/month?


I agree there are difficult coordination problems that our society has failed to grapple with, let alone solve.


I don't understand which part was devastating.


A respectable person offering nuance can be shattering if you identify as pro- or anti something.

I wish we had an easier time talking about ideas with a little more detachment.


In a frictionless market it would make sense to do that. But as Coase pointed out nearly a century ago[1] forming and monitoring the relationships that allow specialization involves a certain amount of overhead. At a certain point going through the hiring or vetting process to utilize another person's skills makes sense, but it looks like the author is very far from that point.

[1]https://en.wikipedia.org/wiki/The_Nature_of_the_Firm


Bad metaphor, and worse that you use it to inform your conclusion. If you must have one, then use training wheels: experts don't need training wheels, beginners do, simple as that. Though the utility of LLMs goes much further, as the author points out; it can often make the boring or tedious parts easier, so you can add assistive pedaling to the metaphor. To carry it to the end, once you have four wheels and a motor, it's not long before someone invents the car.


Like… hiring people? I think it's a little ridiculous to say "no don't use that screwdriver, go hire a workman instead." To say the least, there are some sizeable economic differences between those two options.


> He can lucidly see that this supposed tool is completely useless to him within his area of expertise.

Did you read the article? Throughout the entire post he clearly says LLMs have a lot of value in his workflow.


Parent seems to think that Antirez has been begrudgingly pulled along by the tide of LLM hype, not as if Antirez is an accomplished developer whose judgement on tools can be trusted.


This seems like a very impractical perspective to me. If there really was a "hardware store" that we could head off to to get what we need, it might be different. But in general, that's not the case.

There can also be significant overhead in looking elsewhere for a solution. That's a big part of why so many developers reinvent things. This is often dismissed as NIH syndrome, but there's more to it than that.

You raised "introductory macroeconomics". The economic effect that will most strongly apply in the case of LLMs is that of the technology treadmill (Cochrane 1958): when there's a tool that can improve productivity, it will be used competitively so that those who don't use it effectively, to improve their productivity, will be outcompeted.

This seems like an unavoidable result in typical capitalist economies.

Your point about leveraging hungry humans would require strong incentives to overcome the treadmill effect. Most Western countries don't have many ways to implement anything like that. The closest thing might be unions, but of course most software development is not unionized.


There is an impedance problem when working on a new project.

At the beginning, when there's 0% of the task done, and you need to start _somewhere_, with a hello world or a CMakeLists file or a Python script or whatever, it takes effort. Before ChatGPT/LLM, I had to pull that effort out from within myself, with my fingertips. Now, I can farm it out to ChatGPT.

It's less efficient, not as powerful as if I truly "sat down and did it myself," but it removes the cost of "deciding to sit down and do it myself." And even then, I'm cribbing and mashing together copy-pasted fragments from GitHub code search, Stackoverflow, random blog posts, reading docs, Discord, etc. After several attempts and retries, I have a "5% beginning" of a project when it finally takes form and I can truly work on it.

I sort of transition from copy-pasting ChatGPT crap to quickly create a bunch of shallow, bullshit proofs-of-concept, eventually gathering enough momentum to dive into it myself.

So, yes, it's slower, and more inefficient, and ChatGPT can't do it better than I can. But it's easier and I don't have to dig as deep. The end result is I have much more endurance in the actual important parts of the project (the middle and end), versus burning myself out on the beginning.


Was I digging too deeply before?

Was I asking the right questions from the beginning, and if not, can I effectively salvage my work?

Sunk costs disappear into a $20 subscription


> I have a problem, I need to quickly know something that I can verify if the LLM is feeding me nonsense. Well, in such cases, I use the LLM to speed up my need for knowledge.

This is the key insight from using LLMs in my opinion. One thing that makes programming especially well suited for LLMs is that it's often trivial to verify the correctness.

I've been toying around this concept for evaluating whether a LLM is the right tool for the job. Graph out "how important is it that the output is correct" vs "how easy is it to verify the output is correct". Using ChatGPT to make a list of songs featuring female artists who have won an Emmy is time consuming to verify it's correct, but it's also not very important and it's okay if it contains some errors.


> One thing that makes programming especially well suited for LLMs is that it's often trivial to verify the correctness.

is this why software never has any bugs?


Yeah, exactly.

Problems where coming up with a solution is hard but verifying a possible solution is easy.

And we all know what that class of problems is called.


If something is time consuming, not very important, and accuracy doesn’t matter, perhaps the correct answer is to not do it. The world is already full of irrelevant inaccurate drivel and we’d do well to have less, not accelerate its production.

This is not a comment on your specific example, but on the idea as a whole.


I use ChatGPT as my thinking partner when writing code. I chat with it all day every day to finish work.

My company has approved Copilot, but Copilot autocomplete has been an awful experience. The company hasn't approved Copilot Chat (which is what I need).

But I would love something similar that can run on my laptop, for my code, to generate unit tests, code comments, etc. (of course with my input and guidance).


> My company has approved copilot but Copilot autocomplete has been an awful experience

I had the same experience, I feel like I must be crazy because so many of my colleagues have been singing its praises. I found it immensely distracting and disabled it again after a couple of days.

It was like having someone trying to finish my sentence while I was still speaking; even when they were right, it was still annoying and knocked me out of my flow (and very often, it wasn’t right).


I actually find copilot quite useful but I use it from emacs and it only provides suggestions when I intentionally hit my defined shortcut for it, so it never gets in the way. It may be worth for you to try setting it up in a similar way in the tool you’re using it from, as I agree I’d find the experience awful if it was always trying to autocomplete my sentences.


If you use VS Code or a JetBrains IDE, Continue works well with Ollama and it’s really easy to get going.

[0] https://continue.dev/

[1] https://ollama.ai/
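
If you'd rather script against the local model directly instead of going through an editor plugin, Ollama also serves a small HTTP API on localhost (port 11434 by default). A minimal sketch, assuming the server is running and `ollama pull deepseek-coder` has been done - check the Ollama docs for the current API details:

  import json
  import urllib.request

  payload = {
      "model": "deepseek-coder",
      "prompt": "Write a Python function that parses an ISO 8601 date string.",
      "stream": False,  # ask for a single JSON reply instead of a token stream
  }
  req = urllib.request.Request(
      "http://localhost:11434/api/generate",
      data=json.dumps(payload).encode("utf-8"),
      headers={"Content-Type": "application/json"},
  )
  with urllib.request.urlopen(req) as resp:
      print(json.loads(resp.read())["response"])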


Any opinion on what the best experience is, currently?


What do you mean by experience?

If you mean the model, I’ve been happy with Deepseek Coder. Mistral is a popular alternative as well.


By experience, i mean the whole UX - which is as much the interface to the LLM as it is the LLM itself, imo. Ie i'm not terribly interested in constantly telling an LLM what i want, prodding it to do the right thing, etc. I'm more interested in ways it can be easy and correct - even if that's limited in functionality. Ie perhaps automatically suggesting docs, knowing what style to use, etc.


Right. Continue is a bit more hands on, conversational. I've used Llama Coder for code completions and generation right on the editor to some success.

A sibling comment suggested Wingman but I haven't tried it.


yes i'd second deepseek-coder and check out wingman. it's a relatively new extension, snappy and with good ux.



yes, it's the nvms one.


Fwiw, there are now some local models that rival 3.5-turbo in code chat, like the Codeninja I tried out the other day. Not nearly as good as 4 which iirc runs the copilot backend, but for sensitive data that can't leave the premises it's the only real option. Or getting a dedicated instance from OpenAI I guess.


Perhaps the most important point in the piece, and one that can't be repeated enough or understood enough as we head into what 2024 has in store:

> And then, do LLMs have some reasoning abilities, or is it all a bluff? Perhaps at times, they seem to reason only because, as semioticians would say, the "signifier" gives the impression of a meaning that actually does not exist. Those who have worked enough with LLMs, while accepting their limits, know for sure that it cannot be so: their ability to blend what they have seen before goes well beyond randomly regurgitating words. As much as their training was mostly carried out during pre-training, in predicting the next token, this goal forces the model to create some form of abstract model. This model is weak, patchy, and imperfect, but it must exist if we observe what we observe. If our mathematical certainties are doubtful and the greatest experts are often on opposing positions, believing what one sees with their own eyes seems a wise approach.


> Instead, many have deeply underestimated LLMs, saying that after all they were nothing more than somewhat advanced Markov chains, capable, at most, of regurgitating extremely limited variations of what they had seen in the training set. Then this notion of the parrot, in the face of evidence, was almost universally retracted.

I'd like to see this evidence, and by that I don't mean someone just writing a blog post or tweeting "hey I asked an LLM to do this, and wow". Is there a numerical measurement, like training loss or perplexity, that quantifies "outside the training set"? Otherwise, I find it difficult to take statements like the above seriously.

LLMs can do some interesting things with text, no doubt. But these models are trained on terabytes of data. Can you really guarantee "there is no part of my query that is in the training set, not even reworded"? Perhaps we can grep through the training set every time one of these claims is made.


Exactly. I think that it’s very hard for us to comprehend just how much is out there on the internet.

The perfect example of that is the tikz unicorn in the Sparks paper. Seemed like a unique task, until someone found a tikz unicorn on an obscure website.

There is plenty of evidence that LLMs struggle as you move out of distribution. Which makes perfect sense as long as you stop trying to attribute what they’re doing to magic.

This doesn’t mean they’re not useful, of course. But it means that we should should be skeptical about wild capability claims until we have better evidence than a tweet, as you put it.


They didn't actually find a unicorn; they found other tikz animals. It still generalized to the unicorn.

This was the package: https://ctan.org/pkg/tikzlings?lang=en


>Can you really guarantee "there is no part of my query that is in the training set, not even reworded"?

I mean..yes?

Multi digit arithmetic, translation, summarization. There are many tasks where this is trivial.


The most useful feature of LLMs is how much output you get from so little signal. Just yesterday I created a fairly advanced script with ChatGPT from my phone on the bus ride home, which was an absolute pleasure. I think multi-prompt conversations don't get nearly as much attention as they should in LLM evaluations.


I suppose multi-prompt conversations are just a variation on few-shot prompting. I do agree, though, that they don't play a big enough role in evals, or in the heads of many people. So many capable engineers I know nope out of GPT because the first answer isn't satisfactory, instead of continuing the dialog.


> These are all things I do not want to do, especially now, with Google having become a sea of spam in which to hunt for a few useful things.

Seriously, just don't use Google for search. Google search is just a way to get you to look at their ads.

Use a search engine that is aligned with your best interests, suppresses spammy sites, and lets you customise what you want it to surface.

I've used chatgpt as a coding assistant, with varying results. But my experience is that better search is orders of magnitude more useful.


> Use a search engine that suppresses spammy sites and lets you customise what you want it to surface.

Can you give an example of such a search engine? Which one(s) do you use and why?


I see a lot of recommendations for kagi, but no mention of brave search - specifically the (beta) feature called “goggles”. Afaiu it’s a blend of kagi’s “lenses” and the site ranking in search results.

https://search.brave.com/help/goggles

There is a list (search) of public goggles: https://search.brave.com/goggles

The goggles themselves are just text files with basic syntax and can be hosted on e.g. a GitHub gist (though you have to publish them to Brave).

https://github.com/brave/goggles-quickstart/blob/main/goggle...

Tbh, I can’t really compare brave search to kagi, since I never used kagi (though I’m using Orion - a webkit based browser from the same dev - and love it). Afaik, brave search is using its own index, thus making the results somewhat limited and inferior to Kagi's. Just wanted to throw out a (free) alternative here that works for me. :)

* Note that Brave Search, despite being privacy oriented, is still ad funded, and there were a few controversies about Brave's (browser) privacy in the past (if that's relevant for you).

* I’m not affiliated with Brave in any way.


Kagi. You have to pay - but it prioritises based on content, not ads, and it lets you pin / emphasise / deemphasise / block sites according to your needs.

(No connection with kagi.com except being a very satisfied user)


Pay for Kagi. It's a tool.

Pay for it, so search results are the product, instead of an ad platform sold to advertisers with you as the product.


Kagi does that, switched to it full time after years of giving alternatives like DDG a go and failing. Can recommend!


Can you give an example where kagi is better than google?

I've tried a couple of searches on the free tier and they gave pretty much the same results. I only have so many free searches to check too.


It allows me to remove websites from the results. That’s already one of the main selling points for me.


You can do this within google as well, by the way.


Sure, but can you please give me an example since I only have so many searches and I've switched over to chatgpt for most of my former googling tasks.


Google has been inundated with SEO spam, and sometimes I want current things so LLMs don't work that well. One example is I was buying a ... wait, actually I was putting together some examples for you to compare Kagi (I am an unlimited subscriber) to Google directly, and none of them work now. My Google results for things like "best running shoes 2024" or things like that returned basically the same results as Kagi, pushing sites like Reddit and Wirecutter and REI blog and other known-good blogs to the top. Tried this in Private Browsing as well.

This is definitely a departure because when I subscribed to Kagi a couple months ago, all of my Google results for similar searches were SEO spam blogs filled with Amazon affiliate links that look like they had just sucked some Amazon reviews automatically into some poor facade to generate affiliate revenue.

These results were a surprise to me. Not sure what changed.


Yes, that's what I was getting too.

I imagine what changed is that Kagi started getting traction on sites like here and some managers at google actually did something about it.

My own test "voynich illuminated manuscript" which used to give nothing but pintrest spam on google. Now there is just one result from pintrest in google and pretty much every result in Kagi is from pintrest.

There is an academic tab which seems interesting. I will give it a try later.


Above, I suggested pay for Kagi. A search engine is more than just serps:

https://blog.kagi.com/kagi-features

If you prefer LLMs to Googling, then at least consider "phind":

https://www.phind.com/search?home=true


I use phind.com, but perplexity.ai also works well


I like to use ChatGPT for the easy stuff: things I forgot, do too rarely to remember, or code in another language very similar to the one I already know.

I do quickly run into bumps, where search is necessary (a lot of times it's some variant of a breaking change in a dependent library). Once I find a good enough issue description, I just slap that back into chatgpt. It handles it very well and sticks for the rest of the conversation. Somehow chatgpt is aware that the context information takes precedence over trained data.

I also have the Kagi subscription, which I'm using for the above. I'm very happy with both tools working in tandem, and genuinely happy spending my time that way.


antirez, thank you for talking some sense. I’ve seen skilled devs discard LLMs entirely based on seeing one (too many) hallucinations, then proclaiming they are inferior and throwing the baby out with the bathwater. There is still plenty of use to be had from them even if they are imperfect.


What an ending...

> I have never loved learning the details of an obscure communication protocol or the convoluted methods of a library written by someone who wants to show how good they are. It seems like "junk knowledge" to me. LLMs save me from all this more and more every day.

This is depressing or tongue-in-cheek considering who he is -- the Redis creator -- and that he has an older post titled 'In defense of linked lists'; talking about linked lists in Rust is not "junk knowledge", nor something an LLM can run circles around any human on.

It's the best coding nihilism as a profession post I have read though.


There is a misunderstanding going on here. A linked list is a pure form of knowledge. What we see today is an explosion of arbitrary complexity that is the fruit, mostly, of bad design. If I learn the internals of React, I'm not really understanding anything fundamental. If I get to know the subtleties of Rust semantics and then Rust goes away, I'm left with nothing: it's not like learning Lisp. Think of all the folks that used to master M4 macros in Sendmail, 30 years ago. I was saying the same, back then: this is garbage knowledge.

Today we have a great example in Kubernetes, and all the other synthetic complexity out there. I'm in, instead, to learn important ML concepts, new data structures, new abstractions. Not the result of some poor design activity. LLMs allow you to offload this memorization out of your mind, to make space for distilled ideas.


Spot on - it is one of the main reasons I haven't enjoyed programming in recent years, so much of it is learning what you call "garbage knowledge". Yet another API, yet another DSL, yet another standard library. Endless reading of internal wiki pages to learn the byzantine deployment system of my current company. Even worse, when I know exactly what I want, but some little dependency or piece of tooling is bad and I spend hours, or days, trying to debug it.

I, too, find LLMs a balm for this pain. They have kind-of-basic level of knowledge, but about everything.

In short, it allows for a more efficient expenditure of mental and emotional energy!


To rephrase it a little bit.

Much of programming, coding and developing is done by a person who is a knowledge worker and writes code. A good proportion of the code to be written will be written just once and never again. The one-off code snippet will stay in a file collecting dust forever. There is no point in trying to remember it in the first place, because without the constant repetition of using it, it will be forgotten.

LLMs can help us focus our knowledge where it really matters, and discard a lot of the ephemeral stuff. That means we can be more knowledge workers and less coders. I will push it even further and state that we will become more knowledge workers and less coders until we are, eventually and gradually, just knowledge workers. We will need to know about algorithms, algorithmic complexity, abstractions and stuff like that.

We will need to know subjects like that Rust book [1] writes about.

[1]https://github.com/QMHTMY/RustBook/tree/main/books


> this erudite fool is at our disposal and answers all the questions asked of them,

Yes, but I have to double-check every answer. And that, for me, greatly mitigates or entirely negates their utility. Of what value is a pocket calculator that only gets the right answer 75% of the time, and you don't know ex ante which 75%?


- I can read the code and reading code is faster than writing it.

- I can also tell the llm to write tests for the code it wrote and i can validate that the tests are valid.

- LLMs are also valuable in introducing me to concepts and techniques I would never have had exposure to. For example, if I have a problem and explain it, it will bring up technologies or terms I never considered because I just didn't know about them. I can then do research into those technologies to decide if they are actually the right approach.


> I can also tell the llm to write tests for the code it wrote and i can validate that the tests are valid.

If I don't trust the generated code, why should I trust the generated code that tests the generated code?


Do you trust your ability to read?


As long as P != NP, verification should be much easier than producing a solution.

Or, from a different angle - all models are wrong, some are useful.

As it happens, LLMs are useful even if they're sometimes wrong.


> As long as P != NP, verification should be much easier than producing a solution.

Perhaps so. I guess it depends on how long it takes to code up property-based tests.

https://hypothesis.readthedocs.io/en/latest/
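
For a simple function, not very long. A minimal sketch with Hypothesis, assuming the hypothesis and pytest packages are installed; sort_numbers stands in for whatever the LLM generated, and the names are purely illustrative:

  from collections import Counter
  from hypothesis import given, strategies as st

  def sort_numbers(xs):
      # Stand-in for an LLM-generated implementation under test.
      return sorted(xs)

  @given(st.lists(st.integers()))
  def test_sort_numbers_properties(xs):
      result = sort_numbers(xs)
      # Property 1: the output is in non-decreasing order.
      assert all(a <= b for a, b in zip(result, result[1:]))
      # Property 2: the output is a permutation of the input.
      assert Counter(result) == Counter(xs)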


Programming is special because 99% of the time you can tell immediately if something works or not, so the risk of misinformation is very narrow.


Perhaps you've omitted some important context here, or you're using an extremely restricted definition of "works"? The interesting and hard question with software is not "did it compile" but rather "did it meet the never-clearly-articulated needs of the user"...

I would agree that it is a primary goal of software engineering to move as much as possible into the category of automatic verification, but we're a long, long way from 99%.


I agree with your point here.

I think that antirez is technically correct in that there is a vast amount of code that will not compile compared to the amount of code that will compile. So saying '99%' sort of makes sense.

But that doesn't capture the fact that of the code that compiles there is a vast amount of code that doesn't do what we want to happen at runtime compared to the code that does do what we want to happen.

And after that there is a vast amount of code that doesn't do what we want to happen 100% of the time at runtime compared to the code that only most of the times does what we want to happen at runtime.

The interesting thought experiment that came to me when thinking about this was that I would be more likely to trust LLM code in C# or Rust than I would be to trust LLM code in assembly or Ruby.

Which makes me wonder ... can LLMs write working Idris or ATS code?


I've seen people put untested AI hallucinations under review, with non-existent function names, passing CI just because they were under debug defines.

I've seen some refer to non-existent APIs while discussing migration to a new library major version. "Sure that's easy, we should just replace this function with this new one".

Imagine all those more subtle bugs that are harder to spot.


Hi! I'm a big fan of Redis and also the little Kilo editor you wrote.

But I have to disagree on this point, since many programs written in, e.g., C have security issues that take a long time to discover.


Doesn't seem very secure if it's entirely dependent on humans checking it manually. Humans are famously fallible.


I am glad you agree with my point that 99% of the time you can not immediately tell if code works or not.


I have found only a few cases where ChatGPT has been very useful to me, e.g. writing long SQL queries and certain mathematical functions like finding the area of intersection of two rectangles. And it hallucinates enough that a lot of the time I can't use it, because I know it would take more time to check it for correctness and edge cases than it would to just write it in the first place. Maybe I am using it wrong, but so far the results for me have been extremely impressive, yet not very useful.


I'm surprised it can do long SQL queries; I wouldn't have thought it could. I've been looking into PRQL or other solutions to cover that ground. Can it do reasonably complex things like window functions?
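
(By window functions I mean things like the query below - a toy sketch using Python's bundled sqlite3 module, which supports them as of SQLite 3.25; the table and data are made up purely for illustration:)

  import sqlite3

  con = sqlite3.connect(":memory:")
  con.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
  con.executemany("INSERT INTO orders VALUES (?, ?)",
                  [("alice", 10), ("alice", 30), ("bob", 20), ("bob", 5)])

  # Rank each customer's orders by amount, largest first.
  query = """
      SELECT customer, amount,
             RANK() OVER (PARTITION BY customer ORDER BY amount DESC) AS rnk
      FROM orders
  """
  for row in con.execute(query):
      print(row)  # e.g. ('alice', 30.0, 1)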


Never tried it with something like that. I just mean long as in a lot of text like selecting a lot of stuff from a lot of tables and grouping, etc.


> High levels of reasoning are not required. LLMs are quite good at doing this, although they remain strongly limited by the maximum size of their context. This should really make programmers think. Is it worth writing programs of this kind? Sure, you get paid, and quite handsomely, but if an LLM can do part of it, maybe it's not the best place to be in five or ten years

I appreciate the author writing this article. Whenever I read about the future of the field, I get anxious and confused, but then again the other options that were available to me interested me less.

I am now at a place where I still have the opportunity to pivot and focus on pure/applied mathematics rather than staying in the software field.

Honestly, I wanted to make money through this career, but I don't know what career to choose now.

I keep working on myself and don't compare myself to others, but if the argument is that only the top 1% of programmers will be needed in the future, then I doubt myself, because I still have a lot to learn, and then there is competing with people who are both experienced and knowledgeable.

I was thinking about pinpointing a target and then becoming an expert at it (by the 10,000-hour rule).

I'm sorry to ask, but today, and in general, I am very confused about which path/career to target related to computing and mathematics. Please give me your valuable advice. Thank you.


I’m not sure you’re focused on the important question. For example: Who you marry, if you choose to go that route, may very well be the most important decision you’ll ever make.

Putting that aside, based on your question and willingness to put it out there… I would say this: just surrender to what charms you right now. Do you feel drawn toward programming? Follow that. Or math? Follow that. They may not be mutually exclusive.

As you go, stay tuned in to how you feel about the activity in the moment. Not your anxiety about what you think about the future prospects, but just how it feels right now to be doing the thing. That feeling may change over time, and it will guide you if you stay tuned in.


I'd say study deep learning (nice mix of maths and CS) or do software but learn to use the AI tools in the process.

If you're looking at a 5-10 year timeline then even pure or applied mathematics may well heavily use AI models.

We're always going to need architects that build the scaffolding together with LLMs. Programmers + LLMs will be able to outcompete programmers without. If one programmer can do more it just means projects will become more ambitious not less programming needed.

I've never worked for a company that had too little work for their software engineers. Rather many projects are on long timelines because there are only so many hours available per month.

Another analogy: with a high level programming language you can do what previously needed 10x the lines of code in assembly. I don't think they caused job losses for software engineers.


I wouldn’t worry too much about the demand for programmers. Jevons' paradox has played out enough times that I’m sure as the cost of code goes to zero the demand will continue to increase. Look forward to the day that your toilet paper comes with an API.


But I don't want to write code to display ads on toilet paper :(


Nobody’s going to make you :)

Write code for the wise cracking door AI instead.


One of the areas that has sped up the most for me while using ChatGPT to code is having it write test cases. Paste it a class and it can write a pretty good set of specs if you iterate with it. Literally 10x faster than doing it myself. This speed-up can also occur with languages/frameworks I'm not familiar with.
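
To give a feel for it (an invented toy example, not actual ChatGPT output): paste in a small class, ask for pytest cases, and you get back something along these lines to review and iterate on:

  import pytest

  # The kind of small class you might paste in:
  class Stack:
      def __init__(self):
          self._items = []

      def push(self, item):
          self._items.append(item)

      def pop(self):
          if not self._items:
              raise IndexError("pop from empty stack")
          return self._items.pop()

  # The kind of specs the model drafts back:
  def test_push_then_pop_returns_last_item():
      s = Stack()
      s.push(1)
      s.push(2)
      assert s.pop() == 2

  def test_pop_on_empty_stack_raises():
      with pytest.raises(IndexError):
          Stack().pop()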


> This should really make programmers think. Is it worth writing programs of this kind? Sure, you get paid, and quite handsomely, but if an LLM can do part of it, maybe it's not the best place to be in five or ten years.

Someone, a person with a sense of responsibility, has to sign off on changes to the code. LLMs have been shown to come up with answers that make no sense or contain bugs. A person (for now) needs to decide whether the LLM's suggestion is acceptable, whether we need more tests, and whether we want to maintain it.

I think programmers will be needed for that, they will just be made more productive (as happened with the introduction of garbage collection, strongly typed languages, powerful IDEs, StackExchange, ...).


The deep coder example doesn't appear to actually be doing what the comments or the article say it does.

It appears no better than the mixtral example that it's supposedly an improvement on.


This is my cut & paste failure (I didn't re-check the GPT-4 output that fixed the grammar). Fixing...


Do you know of an article that covers LLMs from the point of view of a tutor/study partner/reading group?

Yours is the first blog which matches my experiences with the code side of things, but I've found them even more useful in the learning side of things.


I wrote an article subtitled "llms flatten steep learning curves": https://earthly.dev/blog/future-is-rusty/


> At the same time, however, my experience over the past few months suggests that for system programming, LLMs almost never provide acceptable solutions if you are already an experienced programmer.

Hmm, this suggests to me that in a better world, the systems problems would have been solved with code, and the sorts of one-off problems which current LLMs do handle well would have been solved with formulae in a (shell-like? not necessarily turing-complete?) DSL.


> LLMs almost never provide acceptable solutions if you are already an experienced programmer.

or, the other stuff GPT was producing is just as bad, but he's not experienced enough in the domain to see it, whereas the stuff he is experienced with looks immediately sus or subpar.



Yes, but this is working. I literally didn't look at the code... but at the result. In other fields GPT can't deliver working code at all.


I find that I am very unproductive if I am connected to the internet. The only way for me to get any real work done is to turn off the router.

At the same time, GPT apparently doubles programming productivity. (Though obviously this depends on the task.)

I've long wished to have the best of both worlds. It seems I may soon get my wish: local LLMs will probably catch up with GPT-4 this year, or even outpace it!


For me LLMs revealed how easy it is to manipulate the masses with properly done marketing. Despite these tools being obviously unreliable, tens of people on here report how well they work and how much they changed their lives. Shows that with sufficient propaganda you can make people see and feel things which are not there - not a new concept. But what’s new to me is just how easy it is.


I'm ok with a degree of unreliability when experimenting with new ideas.

That's a tradeoff I already make when using relatively new third-party libraries/services to accelerate experimentation.


That is ok, I do it too - that's why I use tools such as chatgpt. But from that to calling it life changing there's a wide gap. The tool is nowhere near what's advertised, far from it.


> At the same time, however, my experience over the past few months suggests that for system programming, LLMs almost never provide acceptable solutions if you are already an experienced programmer.

In one off tasks where someone is not enough of an expert to know its flaws, and such expertise is not required, "the marvel is not that the bear dances well, but that the bear dances at all".


LLMs are going to have to get much cheaper to train to be useful in corporations, where the questions you want to ask are going to depend on proprietary code. You can't ask "What does subsystem FooBar do, and where does it fit in the overall architecture?" You'd want to be able to continuously retrain the model, as the code base evolves.


> Since the advent of ChatGPT, and later by using LLMs that operate locally

Does HN have any favorite local LLMs for coding-related tasks?


Phind-CodeLlama-34B-v2 seems to work well for our team.


deepseek-coder has been decent for my purposes


Currently, what I get out of it is a good quick overview with some hallucinations. You have to actually know what you're doing to check the code. However, this is a fast-moving target and it will in no time be doing that part as well. I think it's worth stepping back and thinking: maybe this thing is just giving us more and more agency, and what can we do with that? We need to adapt and not constrain ourselves to just being programmers. We are humans with agency, and if we can adapt to this, we can be more and more powerful, using the technical insight we've gained over the years to do some really cool things. I have a startup, and with ChatGPT I've managed to do all parts of the stack with confidence, and used it for all sorts of business-related things outside of coding that have really helped move the business forward quickly.


> I regret to say it, but it's true: most of today's programming consists of regurgitating the same things in slightly different forms. High levels of reasoning are not required.

But that's basically what engineering (and medicine, and law, and all sorts of professions out there) has always been about. Engineers build railways and bridges based on the same proven principles in slightly different forms, adapting to the specific needs of each project. Their job is not to come up with groundbreaking inventions every day.


This quote in particular struck me as relevant:

> And now Google is unusable: using LLMs even just as a compressed form of documentation is a good idea.

Beyond all the hype, it's undeniable that LLMs are good at matching your query about a programming problem to an answer without inundating you with ads and blog spam. LLMs are, at the very least, just better at answering your questions than putting your question into Google and searching Stack Overflow.

About two years ago I got so sick of how awful Google was for any serious technical questions that I started building up a collection of reference books again just because it was quickly becoming the only way to get answers about many topics I cared about. I still find these are helpful since even GPT-4 struggles with more nuanced topics, but at least I have a fantastic solution for all those mundane problems that come up.

Thinking about it, it's not surprising that Google completely dropped the ball on AI since their business model has become bad search (i.e. they derive all their profit from adding things you don't want to your search experience). At their most basic, LLMs are just really powerful search engines, it would take some cleverness to make them bad in the way Google benefits from.


How many of us remember that at the beginning of last year the fear was that programming by programmers will get obsolete by 2024 and LLMs will be doing all the job?

How much has changed?


I remember some people were saying things vaguely, but not explicitly, in that direction, but given OpenAI's stance was "we're not trying to make bigger models for now, we're trying to learn more about the ones we've already got and how to make sure they're safe", I dismissed them as fantasists.

What has happened is GPT-4 came out (which is certainly better in some domains but not everywhere), but mainly the models have become much cheaper and slightly easier to run, and people are pairing LLMs with other things rather than using them as a single solution for all possible tasks — which they probably could do in principle if scaled up sufficiently, but there may well not be enough training data and there certainly aren't computers with enough RAM.

And, like with the self-driving cars, we've learned a lot of surprising failure modes.

(As I'm currently job-hunting, I hope what I wrote here is true and not just… is "copium" the appropriate neologism?)


Conclusion in the blog post says it all:

> I regret to say it, but it's true: most of today's programming consists of regurgitating the same things in slightly different forms. High levels of reasoning are not required. LLMs are quite good at doing this, although they remain strongly limited by the maximum size of their context. This should really make programmers think. Is it worth writing programs of this kind? Sure, you get paid, and quite handsomely, but if an LLM can do part of it, maybe it's not the best place to be in five or ten years.


>> I regret to say it, but it's true: most of today's programming consists of regurgitating the same things in slightly different forms.

I wonder how different this would be if software was not hindered by "intellectual property" laws.


I wouldn't call myself an expert, but my gut tells me we're close to a local maximum when it comes to the core capabilities of LLMs. I might be wrong of course. If I'm right, I don't know when or if we'll get out of that. But it seems the work of putting LLMs to good use is gonna continue for the next years regardless. I imagine hybrid systems between traditional deterministic IDE features and LLMs could become way more powerful than what we have today. I think for the foreseeable future, any system that's supposed to be reliable and well understood (most software, I hope) will require people willing and capable of understanding it; that's, in my mind, the core thing programmers are and will continue to be needed for. But anyway: I do expect fewer programmers will be needed if demand remains constant.

As for demand, that's difficult to predict. I'd argue a lot of software being written today doesn't really need to be written. Lots of weird ideas were being tried because the money was there, pursuing ever new hypes, with an entire sub industry building ever more specialised tools fueling all this. And with all that growth, ever more programmers have been thrown at dysfunctional organisations to get a little more work done. My gut tells me that we'll see less of that in the next years, but I feel even less competent to predict where the market will go than where the tech will go.

So long story short, I guess we'll still need programmers until there's a major leap towards GAI, but fewer than today.


The compiler is still not part of the picture, when LLMs start being able to produce binaries straight out of prompts, then programmers will indeed be obsolete.

This is the holy grail of low-code products.


Why is an unauditable result the holy grail? Is the goal to blindly trust the code generated by an LLM, with at best a suite of tests that can only validate the surface of the black box?


Money. Low-code is the holy grail where businesses no longer need IT folks, or at the very least can reduce the number of FTEs they need to care about.

See all the SaaS products, without any access to their implementation, programmable via graphical tooling, or orchestrated via Web API integration tools, e.g. Boomi.


Is it no different to you when the black box is created by an LLM rather than a company with guarantees of service and a legal entity you can go after in case of breach of contract?

Where does the trust in a binary spit out by an LLM come from? The binary is likely unique and therefore your trust can't be based on other users' experience, there likely isn't any financial incentive or risk on the part of the LLM should the binary have bugs or vulnerabilities, and you can't audit it if you wanted to.


As usual, this kind of thing will get sorted out, and developers will have to search for something else.

QA, acceptance testing, whatever - no different from buying closed-source software.

Only those that never observed the replacement of factory workers by completely robot-based assembly chains can think this will never happen to them.

Here is a taste of the future,

https://www.microsoft.com/en-us/power-platform/products/powe...


Assembly line robots are still a bit different from LLMs directly generating binaries though, right?

An assembly line robot is programmed with a very specific repeatable task that can easily be quality tested to ensure that there aren't manufacturing defects. An LLM generating binaries is doing this one off, meaning it isn't repeatable, and the logic of the binary isn't human auditable meaning we have to trust that it does what was asked of it and nothing more.


The same line of argument came from assembly language developers against FORTRAN compilers and the machine code they could generate.

There are ACM papers about it.

It didn't hold up.

Do you really inspect the machine code generated by your AOT or JIT compilers, in every single execution of the compiler?

Do you manually inspect every single binary installed into the computer?


There's a fundamental difference between a compiler and a generative LLM algorithm, though. One is predictable, repeatable, and testable. The other will answer the same question slightly differently every time it's asked.

Would you trust a compiler's byte code if it spit out slightly different instructions every time you gave it the same input? Would you feel confident in the reliability and performance of the output? How can you meaningfully debug or performance profile your program when you don't know what the LLM did and can't reproduce the issue locally short of running the exact copy of the deployed binary?

Comparing compilers and LLMs really is apples and oranges. That doesn't mean LLMs aren't sometimes helpful or that they should never be used in any situation, but LLMs fundamentally are a bad fit for the requirements of a compiler.


So who is instructing the LLMs on what sort of binaries to produce? Who is testing the binaries? Who is deploying them? Who is instructing the LLMs to perform maintenance and upgrades? You think the managers are up for all that? Or the customers who don’t know what they want?


Just like offshoring nowadays, you take the developers out of the loop, and keep PO, architects and QA.

Instead of warm bodies somewhere on the other side of the planet, it is a LLM.


Nobody of any interest said this. This is something you are saying now using a thin rhetorical strategy meant to make you look correct over an opponent that doesn't exist.


That's like the people saying, "they said the ice caps would melt, ha, hasn't happened, all fake". Meanwhile, nobody said that.


Can't say I saw anyone thinking programmers would be obsolete by 2024...


I remember 10 years ago when the fear was that cheaper programmers in developing countries (India mostly) would be doing all the programming.

It's just a scam to keep you scared and stop you from empathizing with your fellow workers.


Outsourcing was more of a threat than AI. And a lot of jobs really did move. It is still a real thing, not that many programming jobs moved back to the states.


That's legit. I've managed to dodge it but many jobs have moved overseas. Many of my coworkers the past years have been contractors living in other countries.

This is what happened to America's manufacturing industry. Shouldn't empathizing with fellow workers mean recognizing the pattern instead of dismissing it as FUD?


I don't think it's a scam or a conspiracy. It's human nature to worry, and when given a reasonable-sounding but scary idea, we tend to spread it to others.


> It's just a scam to keep you scared and stop you from empathizing with your fellow workers.

I am quite unconvinced this is the reason. Seems rather conspiratorial.


So far I've seen bad programmers create more (and possibly worse) bad code and good ones use LLM to their advantage.


I really like the argument about misinformation vs testing. I'm not totally sold on "you can just see it", but I do think something like TDD could suddenly be really productive in this world.

I've found autocomplete via these systems to be improving rapidly. For some work, it's already a big boost, and it's close to a difference in kind from the original IntelliSense. Amusingly though, I primarily write in an editor without any autocomplete, so I don't experience this often. But I do, precisely for the throwaway code and lower-value changes.

Finally, it's not clear to me that the distinction is between systems programming and scripting. My sense is that ChatGPT and similar models are (a) heavily influenced by the large corpus of Python, so they're better at it than at C, and (b) the examples here involved more clever bit manipulation than most software engineers ever interact with.


> the examples here involved more clever bit manipulation than most software engineers ever interact with

Perlis once quoth:

> 18. A program without a loop and a structured variable isn't worth writing.

After 5 minutes of thought, I'd update that, for my hacking, to:

"A program without some convergence reasoning and a non-injective change of representation isn't worth writing."

(iow, I'd be happy to let LLMs, or at least other people, wrangle glue and parsley code, according to the taxonomy of: https://news.ycombinator.com/item?id=32498382 )


I like the term parsley code!

I do suspect though that both the hashing and 6-bit weight examples are just extremely rare in the corpus. It wasn't confused about loops, or hashing generally, but just didn't do as well as antirez would have liked. The description of the 6-bit to "why don't I just cast this to 8-bits" thing is definitely a problem. And worse, it's a problem a more junior engineer might not understand. But I suspect that a model trained on a corpus with lots more bit manipulation would have been fine, as it wasn't complex.

Clearly we just need a fine tuned one :).


Besides using ChatGPT for certain pieces of code that use a third-party library, I successfully used it as a "code reviewer". I recently copied the functions of a Symfony PHP controller and asked for a code review and suggestions for refactoring, with code and reasons. Surprisingly, it worked very well and I was able to refactor a good amount of code.


I feel that I'm being too conservative with how I use AI. Currently I use Copilot Autocomplete with a bit of Copilot Chat, which is great and almost always gets small snippets correct, but I sometimes worry that I'm not using it to its full potential -- so that I could be faster with my side projects -- for example, by generating entire classes.


In general Copilot is much weaker than bigger/slower models, so if you have the feeling you are not using AI enough, the first thing to try IMHO is to chat with powerful models like GPT-4 to see what the current state of the art in code generation is.


Thank you for the suggestion, antirez! Unfortunately that option does cost money (Copilot is free for students), although Bing might be an alright alternative.


Indeed, you are right. If you have an M[1,2,3] MacBook with enough RAM, you may want to run a model like DeepSeek-coder 34B locally. Or the smaller one, but it is going to be weaker.


This is one of the best pieces I’ve read that articulates what it’s like to work closely with LLMs as creative partners.


I definitely mostly use it in the same way - generating discrete snippets.

Haven’t had much luck with code completion thus far.


Here it is said that the earlier assumption of LLMs being parrots has been universally retracted in the face of the evidence. But if you look at the NYT case against OpenAI, being uncontrolled parrots is exactly what ChatGPT is being accused of. Which is the truth?


  LLMs are like stupid savants who know a lot of things.
Leaving the requisite “no, that’s not what language models are, you’re misunderstanding what’s important here, the best knowledge model already exists and it’s called Wikipedia”


While there clearly was a lot of hype, retrieval augmented generation (RAG) proved to be an effective technique with LLMs. Using RAG with project documentation and/or code can be useful.
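
As a minimal sketch of the idea (plain Python, with naive word-overlap scoring standing in for a real embedding index, and ask_llm() as a placeholder for whatever model you actually call):

  # Naive RAG: rank documentation chunks by word overlap with the question,
  # then prepend the best ones to the prompt before asking the model.

  def overlap(chunk, question):
      return len(set(chunk.lower().split()) & set(question.lower().split()))

  def build_prompt(question, doc_chunks, top_k=3):
      best = sorted(doc_chunks, key=lambda c: overlap(c, question), reverse=True)[:top_k]
      context = "\n---\n".join(best)
      return f"Using only this documentation:\n{context}\n\nQuestion: {question}"

  # prompt = build_prompt("How do I configure retries?", chunks)
  # answer = ask_llm(prompt)   # hypothetical call to your model of choice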


Honestly speaking, code generation is a form of augmented retrieval. And going further back, I would say human memory is generated from context rather than retrieved (which is why it's often fallible - we hallucinate details).

LLMs today are, for me, the equivalent of a large-scale human memory for code, or faster augmented retrieval. Do they hallucinate details? Quite often. But do I find them more utilitarian than dragging myself through documentation details? More often than not.


I think LLMs are good for quick prototyping first drafts of small functions or simple systems.

For me they help when time is short and when I want to maximize creative exploration.


@dang something is wrong with the ranking of this post.


I enjoy antirez's work, and I enjoyed this essay, but I disagree with many of its conclusions. In particular:

> this goal forces the model to create some form of abstract model. This model is weak, patchy, and imperfect, but it must exist if we observe what we observe.

Is a completely fallacious line of reasoning, and I'm surprised that he draws this conclusion. The whole reason the "problem of other minds" is still a problem in philosophy is precisely because we cannot be certain that some "abstract model" exists in someone's head (man or machine; if you argue it does, show it to me) simply because an output meeting certain constraints exists. This is exactly the problem of education. A student who studies to answer questions correctly on a test may not have an abstract model of the subject area at all. They may not even be conducting what we call reasoning. If a student aces a test, can you confidently say they actually understand the domain? Or did they simply ace a test?

Furthermore, LLMs' lack of consistency, their inability to answer basic mathematical questions, and their limitation to purely text-based areas of concern and representation are all much stronger arguments for siding with the notion that they really are just sophisticated, stochastic machines, incapable of what we'd normally call reason in a human context. If LLMs "reason", it is a much different form of reasoning than that which human beings are capable of, and I'm highly skeptical that any such network will achieve parity with human reason until it can "grow up" and learn embodied in a rich, multi-sensory environment, just like human beings. For machines to achieve reason, they will need to break out of the text-only/digital-only box first.


> is still a problem in philosophy

Exactly! This is why I removed these fundamental questions from my post: at the moment they don't have any clear answer and would basically make an already complex landscape even more complex. I believe that right now, whatever is happening inside LLMs, we need to focus on investigating the practical level of their "reasoning" abilities. They are very different objects from human brains, but they can do certain limited tasks that before LLMs we thought to be completely in the domain of humans.

We know that LLMs are just very complex functions interpolating their inputs, but these functions are so convoluted that, in practical terms, they can solve problems that were, before LLMs, completely outside the reach of automatic systems. Whatever is happening inside those systems is not really important for the way they can or can't reshape our society.


TLS cert has expired on the antirez site.


> Homo sapiens invented neural networks

Is it just me, or did anyone else smile at this sentence? The first paragraph sounds like the academic way of saying "we invented huge neural networks but we can't understand them".


This might be a good article, I wouldn't know because I can't read this monospace atrocity.

Reader view in Safari preserves the monospace font... /facepalm



  In order to protect your delicate sensibilities
  I would further suggest to avoid consulting most
  research output from before the mid 1980s.
eg https://www.rand.org/content/dam/rand/pubs/research_memorand...


I also found it looks awful, which made it hard for me to read, but Firefox reader mode did at least change the font.


bro it's 2024, get SSL



