Hacker News

I've read this same comment for every model since GPT-3.


So have I, and this time I wrote it. This broken clock only has to be right once, forever.

It's beneficial to check it.


That's because it's been true for every major new model release since GPT-3.

HN is full of people who tried the free version of ChatGPT a couple of years ago, got a load of random hallucinated slop, and concluded it was all a bunch of useless hype. They enjoy parroting a lot of obsolete stuff they read once about stochastic parrots, without the slightest sense of irony.

When I was growing up, my old man recalled engineers who reacted the same way to transistors, which sucked even more than early-generation LLMs when they first came out. Most of them got over vacuum tubes eventually, though. So will most of the HN'ers, likely including you.


I don't know whose thoughts you're trying to attribute to me, but I've stayed pretty up-to-date myself (while keeping to a $20/month budget, switching between providers every few months).

Every generation has been at best a 5% improvement over the last, nothing revolutionary. Absolutely no significant difference between me selecting, say, Claude 3.7 Sonnet, Claude Sonnet 4, or Claude Sonnet 4.5. Still the same somewhat useful tool for specific purposes, still not good enough to be let loose on production, still not better than what I can do myself with 10 or so years of experience.

Genuinely useful from time to time? Sure, I agree completely. Anywhere near as revolutionary as people here insist on gaslighting themselves into believing? Absolutely not. Not even remotely.


I'll grant that progress in coding hasn't been as impressive, although if you don't see a difference between Claude 3.7 Sonnet and 4.5 Opus (which is the actual SotA for Claude) I'd suspect a skill issue. I haven't seen miraculous progress in baseline code generation, but there has been a lot of progress in tool use. I can tell Claude CLI to install a complex package, for instance, and it will handle all the Python versioning and dependency-hell issues for me, then test and evaluate the results. 3.7 Sonnet couldn't do that, at least not reliably.

What's really different now is that LLMs have become useful research tools. Three or four years ago, models couldn't cite their sources at all. Two or three years ago, they started to gain the capability, but a large proportion of the citations would either be hallucinated or irrelevant. About a year ago, significant improvements started to emerge. At this point, I can give Gemini 3 Pro or GPT 5.2 Pro a multilayered research task, and end up with a report indistinguishable in accuracy and bibliographic quality from what a good human might produce in a couple of weeks.

It might take a half hour to get the answer, and I'm not sure if you could get the same result at the $20/month level, but the hype and promises we started hearing a couple of years ago are starting to bear fruit. The research models are now capable of performing at grad-student level. Not all the time, and not without making stuff up on occasion... but to argue that no progress at all has been made is nothing more or less than moon-landing denial.


> (which is the actual SotA for Claude)

Cool, how much of it can I use with a $20/month subscription? I'd reach the monthly limit within hours while gaining nothing. It's not that I never tried it; it's that I did, and the difference is nowhere near worth 10x the money (nor even the wait time needed to get results).

> I'd suspect a skill issue.

There's zero skill involved in asking a magic 8-ball to solve your problem for you. There is some skill in consistently making it work better for yourself, by cramming as much useful information as possible into the context window through very specific custom agents / skills / MCP servers / whatever. But I have yet to find a client who would pay me to spend an ungodly amount of time fucking around with my toolbelt instead of delivering results.

> Three or four years ago, models couldn't cite their sources at all. Two or three years ago, they started to gain the capability, but a large proportion of the citations would either be hallucinated or irrelevant. About a year ago, significant improvements started to emerge.

Let's translate that: you don't understand what they're doing under the hood when they "do that", you don't understand where that "improvement" is coming from, and I'm not interested in spending my time teaching you for free. For as little as €50/h, I'd be happy to! Contact at my username.

> but the hype and promises we started hearing a couple of years ago are starting to bear fruit.

You're gaslighting yourself again in order to justify the price you're currently paying. Downgrade yourself back to the $20 subscription, stick to it for one whole month, learn its limits properly, and then, if you feel like you need to upgrade to a higher tier again, do so! Spoiler alert: it's gonna appear "significantly crappier" for about a week, and then after that week you'll get used to it and realise you've wasted a fuck ton of money on nothing.

> but to argue that no progress at all has been made is nothing more or less than moon-landing denial.

Where's that coming from? You're putting words into my mouth again. I specifically acknowledged small improvements; what I dismissed was drastic ones. Zero of what you said convinced me otherwise. For the love of all that's holy, use your own brain a little bit to actually understand the tools you're advocating for. Saying that you drank too much Kool-Aid would be a severe understatement.


> but I have yet to find a client that would pay me to spend an ungodly amount of time fucking around with my toolbelt instead of delivering results.

Right, yeah. I got started by being able to expense the $200 plan, and now I don't expense it, but there's no going back, mostly because this is extremely undervalued and it's unclear how long this deal will be around.

Claude's $200/mo Max x20 plan is the best deal in the game: it's equivalent to about $2600 worth of API credits, and offers about 4x the usage of the $100 plan (Max x5).
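Taking the commenter's figures at face value (the $2600 equivalent-credit number is their estimate, not verified pricing), the value math works out like this:

```python
# Commenter's claimed figures (assumptions, not verified Anthropic pricing):
max_x20_cost = 200        # $/month, Max x20 plan
claimed_api_value = 2600  # $/month of equivalent API credits (their estimate)
max_x5_cost = 100         # $/month, Max x5 plan
usage_ratio_x20_vs_x5 = 4  # claimed usage multiple of x20 over x5

# Claimed value multiple versus paying raw API rates:
multiple = claimed_api_value / max_x20_cost
print(f"Claimed value vs raw API: {multiple:.0f}x")  # 13x

# If x20 really gives 4x the usage for 2x the price of x5,
# the per-dollar usage advantage is:
per_dollar = usage_ratio_x20_vs_x5 * max_x5_cost / max_x20_cost
print(f"Per-dollar usage vs Max x5: {per_dollar:.0f}x")  # 2x
```

In other words, the claim is a ~13x discount against API pricing, and twice the usage per dollar compared with the mid tier, which is the whole basis of the "best deal in the game" argument.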

All roads lead to not really caring about why you're resistant to this. By the time Anthropic raises prices it will be too late, and all the reasons behind your reservations will only grow, unless someone else undercuts them just as well.


This is so true, and it's been genuinely painful for me to realize how fucking lazy and close-minded many in our industry actually are when you get down to it.



