
Please give some examples then. I've found the GPT-4 version to be remarkably accurate, and when it makes mistakes it's not hard to spot them.

For example, I commented last week that I've found ChatGPT to be a great tool for managing my task list, and for whatever reason the "verbal" back-and-forth works much better for my brain than a simple checklist-based todo app: https://news.ycombinator.com/item?id=35390644. But I also pointed out how it will get the sums for my "task estimate totals by group" wrong. This mistake is so easy to spot, though, and I've developed such a good sense of when it's likely to occur, that it doesn't lessen the value I get from the tool.
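To make that concrete, here's an invented illustration of the failure mode (the numbers are made up, not from my actual list):

    # The per-task estimates come back fine; it's the group total
    # that sometimes gets mangled (invented example).
    estimates = [30, 45, 60]   # minutes per task in one group
    print(sum(estimates))      # 135 -> 2h15m; it might report "2h30m"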



OK, here's one: this substack [1] was flying around a week or two ago, asserting that the marginal value of programmers will fall to zero by 2030. What a dream! No more annoying nerds!

The code in the post is wrong. For this "trivial" example, if you just blindly copied it into your code, it would not do what you want it to do. I love this example not just because it's ironic, but because it's a perfect illustration of how you need to know the answer before you ask for the solution. If you don't know what you're doing, you're gonna have a bad time.

I'm not at all concerned about the value of programmers falling to zero. I'm concerned that a lot of bad programmers are going to get their pants pulled down.

[1] https://skventures.substack.com/p/societys-technical-debt-an...

(Edit: and as a totally hot take, while I'm not worried about good programmers, I think the marginal value of multi-thousand-word think-piece blog posts is rapidly falling to zero. Who needs to pay Paul Kedrosky and Eric Norlin to write silly, incorrect articles when ChatGPT will do it for free?)


OK, so we are 100% in agreement then? I absolutely don't believe the marginal value of programmers will fall to zero by 2030 (though, to clarify, from the way you phrased your original sentence I thought an LLM had made that assertion, not some random VC dudes). I also highlighted in my posts that I use AI as an aid to my process: "That is, for me, the output of ChatGPT or other AI tools is the starting point of my investigation, not the end output. Yes, if you just blindly paste the output from an AI tool you're going to have a bad time, but we also standardize code reviews into the human code-writing process - this isn't that different."

Also, I think the coding example in that substack highlights that one of the most important skills of good programmers has always been clarifying requirements. I had to read the phrase "remove all ASCII emojis except the one for shrugs" a couple of times because it wasn't immediately clear to me what was meant by "ASCII emojis". The example also shows what happens when two "VC bros" who don't know what they're talking about praise the "clever" nature of what ChatGPT did, when its output is totally wrong. Still, I'd easily bet that I could write a much clearer prompt, give it to ChatGPT, and get better results, and still have it save me time in writing the boilerplate structure for my code.
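For instance, here's a minimal sketch of what a clearer prompt might yield, assuming "ASCII emojis" means punctuation-built emoticons like :) and :-D (my interpretation, not the article's):

    import re

    # Sketch only: treats "ASCII emojis" as plain-text emoticons
    # built from punctuation, e.g. :) ;-( :-D -- an assumption,
    # since the original prompt never defines the term.
    EMOTICON = re.compile(r"[:;=8][\-o^']?[)(\[\]dDpP/\\|*]")

    def strip_emoticons(text: str) -> str:
        return EMOTICON.sub("", text)

    print(strip_emoticons("ship it :) or not :-("))
    # -> "ship it  or not "

(The shrug kaomoji contains no ASCII emoticon, so it survives untouched either way - which is exactly the oddity discussed further down.)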


You asked for an example and I provided one that I thought illustrated the mistakes GPT makes in a vivid way -- mistakes that are already leading people astray. The fact that this particular example was coupled with a silly prediction is just gravy.

In short, I don't know if we "agree", but I think OP is/was correct that GPT generates lots of subtle mistakes. I'd go so far as to say that the folks filling this thread with "I don't see any problems!" comments are probably revealing that they're not very critical readers of the output.

Now for a wild prediction of my own: maybe the rise of GPT will finally mean the end of these absurd leetcode interview problems. The marginal value of remembering leetcode solutions is falling to zero; the marginal value of detecting an error in code is shooting up. Completely different skills.


Getting back to the example from that post, though: thinking about it more, "remove all ASCII emojis except the one for shrugs" makes absolutely no sense, because you can't represent a shrug in ASCII at all, whether as the Unicode "person shrugging" emoji or as the kaomoji from that code sample, which uses Japanese characters. So yes, asking an LLM a nonsensical question is likely to get you a nonsensical response, and it's important to know when you're asking a nonsensical question.
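A one-line sanity check in Python (3.7+, for str.isascii) makes the point; both the macron and the katakana in the kaomoji are non-ASCII:

    # Neither ¯ (U+00AF) nor ツ (katakana "tsu") is ASCII, so an
    # "ASCII emoji for shrugs" can't exist in the first place.
    print("¯\\_(ツ)_/¯".isascii())  # False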


Well, explain it however you like, but the point is that GPT is more than happy to confidently emit gibberish, and if you don't know enough to write the code yourself (or you're outsourcing your thinking to it), then you're going to get fooled.

I'd possibly argue that knowing how to ask the right question is tantamount to knowing the answer.


That code is wrong, and I wonder whether the author is familiar with the property of code captured by the halting problem: in general, reading code does not grant you the ability to predict what will happen when that code runs.
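A classic illustration (my example, not the parent's): nobody knows whether this loop terminates for every positive n, because that is the open Collatz conjecture, and no amount of reading the code settles it:

    def collatz(n: int) -> None:
        # Halts for every n ever tried, but whether it halts for
        # *all* positive n is the unsolved Collatz conjecture.
        while n != 1:
            n = 3 * n + 1 if n % 2 else n // 2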

Whatever, time will tell. I still haven't quite figured out how to make good use of GPT-4 in my daily workflow, though it seems it might be possible.

Has anyone asked it to make an entry for the IOCCC (the International Obfuscated C Code Contest)?


For a time, I was attempting to use it for game advice during my recent playthrough of the Demon's Souls remake (What's the best build for X? What's the best weapon for X?). I asked ChatGPT where to find the NPC The Filthy Woman in a certain level. ChatGPT answered that that NPC doesn't exist, and perhaps I had the wrong game? That NPC most certainly does exist.

I was also using it to generate some Java code for a bit. That is, until it started giving me Maven dependencies that didn't exist, and classes that didn't exist but definitely looked like they would at first glance.


> I asked ChatGPT where to find the NPC The Filthy Woman in a certain level. ChatGPT answered that that NPC doesn't exist, and perhaps I had the wrong game? That NPC most certainly does exist.

OK, wow - that example kind of perfectly proves my point. If I were to ask ChatGPT an extremely specific, low-level question about an extremely niche topic, then I would absolutely be on "high alert" that it wouldn't know the answer. And while I agree the "confidence" with which ChatGPT asserts its answers (though I'd argue the GPT-4 version does a much better job at not being over-confident than 3.5) is off-putting, I think it's pretty easy to detect where it's wrong.

I'd also be curious about your Java example. There was a good YouTube video of a guy who got ChatGPT to write a "population" game for him. In some cases, on the first try it would output code that failed to run, e.g. because it had wrong versions of Python dependencies. He would just paste the errors back into ChatGPT and ChatGPT would correct itself. Again, though, this highlights my point that I use ChatGPT as the start of my process, a first draft if you will. I don't just ask it to write some code and then, when I get an error, throw my hands up and say "see how dumb ChatGPT is." To each their own, though.


> OK, wow - that example kind of perfectly proves my point. If I were to ask ChatGPT an extremely specific, low-level question about an extremely niche topic, then I would absolutely be on "high alert" that it wouldn't know the answer. And while I agree the "confidence" with which ChatGPT asserts its answers (though I'd argue the GPT-4 version does a much better job at not being over-confident than 3.5) is off-putting, I think it's pretty easy to detect where it's wrong.

I don't consider a popular video game from 2009 to be "extremely niche", and I also shouldn't have to know what ChatGPT knows. And no, I don't think it's easy to detect where it's wrong if you don't know the right answer, and it's actually pretty useless when you have to spend time confirming answers.


I think these types of errors get mostly resolved with a search plugin.


Out of curiosity was this 3.5 or 4?


I don't believe it was version 4 yet.


Do you happen to know which messages get dropped by the client if the conversation becomes too long?



