To me, AI hype seems to be the most tangible/real hype in a decade.
Ever since the mobile & cloud era peaked around 2012-2014, we’ve had Crypto, AR, VR, and now AI.
I have some pocket-change Bitcoin and Ethereum, and I've played around for two minutes on my dust-gathering Oculus & Vision Pro; but man, oh man! Am I hooked on ChatGPT or what!
It’s truly remarkably useful!
You just couldn’t get this kind of answer in one click before.
For example, here’s my latest engineering productivity boosting query:
“when using a cfg file on the cmd line what does "@" as a prefix do?”
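For anyone wondering, the answer is the "response file" convention: many CLIs expand @file into the arguments read from that file. Here's a minimal sketch in Python, whose argparse supports the convention natively (the flag names are made up for illustration):

    import argparse

    # "@file" tells the parser to read additional arguments from that
    # file, one per line, as if they were typed on the command line.
    parser = argparse.ArgumentParser(fromfile_prefix_chars="@")
    parser.add_argument("--verbose", action="store_true")
    parser.add_argument("--output")

    # Invoked as: python tool.py @build.cfg
    # where build.cfg contains, e.g.:
    #   --verbose
    #   --output
    #   out.bin
    args = parser.parse_args()
    print(args)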
It's astonishing how the two camps of LLM believers vs. LLM doubters have evolved, even though we as people are largely very similar, doing similar work.
Why is it that you, for example, believe LLMs are truly revolutionary, whereas I think they are not? What are the things you are doing with LLMs day to day that are life-changing, which I am not doing? I'm so curious.
When I think of things that would be revolutionary for my job, I imagine: something that could input a description + a few resources, and write all the code, docs, etc for me - creating an application that is correct, maintainable, efficient, and scalable. That would solve 80% of my job. From my trials of LLMs, they are nowhere near that level, and barely pass the "correct" requirement.
Further, the cynic in me wonders what work we can possibly be doing where text generation is revolutionary. Keep in mind that most of our jobs are ultimately largely pointless anyway, which implies a limit on the true usefulness of any tool. Why does it matter if I can make a website in 1/10th the time if the website doesn't contribute meaningfully to society?
> I imagine: something that could input a description + a few resources, and write all the code, docs, etc for me
It could be that you’re falling into a complete-solution fallacy. LLMs can already be great at working on each of these problems. It helps to work on a small piece of the problem at a time. It does take practice, and any sufficiently complicated problem will require multiple attempts.
But the more you practice with them, the more you get a feel for it, and these things start to eat away at the 80% you’re describing.
It is not self-driving. If anything, software-engineering automation is only accessible to those who nerd out on it, the same way using a PC once was, whether for sending email or for programming.
A lot of the attention is on being able to run increasingly capable models on machines with fewer resources. But there’s not much use fussing over Gemini 2.5 Pro if you don’t already have a pretty good feel for deep interaction with Sonnet or GPT-4o.
It is already impressive and can seriously accelerate software engineering.
But the complete-solution fallacy is exactly what the believers are claiming will occur, isn't it? I'm 100% with you that LLMs will make subsets of problems easier, similar to how great progress in image recognition has been made with other ML techniques. That seems like a very reasonable take. However, that wouldn't be "revolutionary", I don't think. It's not "fire all your developers because most jobs will be replaced by AI in a few years" (a legitimate sentiment shared with me by an AI-hyped colleague).
The thing is, you're doing what a lot of critics do: lumping together different people saying different things about LLMs into one bucket, "believers", and attributing the biggest "hype" predictions to all of them.
Yes, some people are saying the "complete solution" will occur; they might be right or they might be wrong. But this whole thread started with someone saying LLMs today are useful, so it's not hype. That's a whole different claim, one that is almost objective, or at least hard for you to disprove. It's people literally saying "I'm using this tool today in a way that is useful to me".
Of course, you also said:
> Keeping in mind that most of our jobs are ultimately largely pointless anyway, so that implies a limit on the true usefulness of any tool.
Yeah, if you think most of the economy and most economic activity people do is pointless, that colors a lot about how you look at things. I don't think that's accurate and have no idea how you can even coherently hold that position.
I think the difference is between people who accept nondeterministic behavior from their computers and those who don’t. If you accept your computer being confidently wrong some unknowable percentage of the time, then LLMs are miraculous and game changing software. If you don’t, then the same LLMs are defective and unreliable toys, not suitable as serious tools.
People have different expectations out of computers, and that accounts for the wildly different views on current AI capabilities.
Perhaps. Then how do you handle the computer being confidently wrong a large proportion of the time? From my experience it's inaccurate in proportion to the significance of the task. So by the time it's writing real code it's more wrong than right. How can you turn that into something useful? I don't think the system around us is configured to handle such an unreliable agent. I don't want things in my life to be less reliable, I want them to be more reliable.
(Also if you exist in an ecosystem where being confidently wrong 70% of the time is acceptable, that's kinda suspect and I'll return to the argument of "useless jobs")
Filters. If you can come up with a problem where incorrect solutions can be filtered out, and you accept that LLM outputs are closer to a correct answer than a random string, then LLMs are a way to get to a correct answer faster than previously possible, for a whole class of problems we previously didn't have answer generators for.
And that's just the theory; in practice, LLMs are orders of magnitude closer to generating correct answers than anything we previously had.
And then there's the meta aspect: they can also act as filters themselves. What becomes possible if you can come up with filters for almost any problem a human can filter for, even if that filter has a chance of being incorrect? The possibilities are impossible to tell, but to me they are very exciting/worrying. LLMs really have expanded the realm of what it is possible to do with a computer, and in a much more useful domain than fintech.
As long as it’s right more than random chance, it’s potentially useful - you just have to iterate enough times to reach your desired level of statistical certainty.
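As a sketch of what that iteration looks like (llm_generate and passes_tests here are hypothetical stand-ins, not any real API):

    # Generate-and-filter: sample until a candidate passes the filter.
    # If each sample is independently correct with probability p, the
    # chance that all n samples fail is (1 - p)**n; e.g. with p = 0.3,
    # n = 13 tries gives 0.7**13 ~= 0.01, i.e. ~99% chance of a success.
    def sample_until_valid(llm_generate, passes_tests, max_tries=13):
        for attempt in range(1, max_tries + 1):
            candidate = llm_generate()   # hypothetical LLM call
            if passes_tests(candidate):  # cheap verifier, e.g. a test suite
                return candidate, attempt
        return None, max_tries

The whole approach hinges on the filter being much cheaper to run than producing a correct answer yourself.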
If you take the current trend in the cost of inference and assume it’s going to continue for even a few more cycles, then current models already have sufficient accuracy to more than satisfy the hype.
Firstly, something has to verify the work is correct, right? Assuming you have a robust way to do this (even with human-written code it's challenging!), at some point the accuracy is so low that it's faster to create the thing manually than to verify many iterations, a problem I frequently run into with LLM autocomplete and small-scale features.
Second, on certain topics the LLM is biased towards the wrong answer, and it is further biased by previous wrong reasoning if it's building on its own output. It becomes less likely that the LLM will choose the right method. Without strong guidance it will iterate itself to garbage, as we see with vibe-coding shenanigans. How would you iterate on an entire application created by an LLM if any individual step it takes is likely to be wrong?
Third, I reckon it's just plain inefficient to iterate many times to get something we humans could've gotten correct in 1 or 2 tries. Many people seem to forget the environmental impact from running AI models. Personally I think we need to be doing less of everything, not producing more stuff at an increasing rate (even if the underlying technology gets incrementally more efficient).
Now maybe these things will be solved by future models, in which case, then and only then, I will be more excited. It does seem like an open question whether this technology will keep scaling to where we hope it will be.
I guess everyone has a different interpretation of revolutionary. Some people think ChatGPT is just faster search. But 10x faster search is revolutionary in terms of productivity.
Your example is a better search engine. The AI hype however is the promise that it will be smarter (not just more knowledgeable) than humans and replace all jobs.
And it isn't on the way there. Just today, a leading state-of-the-art model, one that supposedly passed all the most difficult math entry exams and whatever else they "benchmark", reasoned from the assumption of "60 days in January". It would simply assume that and draw conclusions, as if that were normal. It also wasn't able to correctly fill out all possible scores in a two-player game with four moves and three rules that I made up. It would get them wrong over and over.
It's not a better search engine; it's qualitatively different from search. An LLM composes its answers based on what you ask it. Search returns pre-existing texts to you.
Do you ever think or worry about not being able to test these things? (Or is that just me? :))
Details:
I ack/understand this comes from a dependency (ReAct agents), not langmanus directly.
But still, I'm curious what the HN/tech community thinks of testability, veracity, and potentially conflicting or overlapping instructions across agents, etc., with respect to "prompts" as sources of logic. I acknowledge it's a general practice with LLMs.
Absolutely! Even for inference! For all commercial purposes, the SOTA models need to run on a consumer’s device.
Running Grok-2, DeepSeek, or even Llama-405B requires nearly 400-500 GB of memory.
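That figure is roughly the weights alone; some back-of-the-envelope arithmetic (my own, not a vendor spec):

    # Weight memory only, ignoring KV cache and activations:
    params = 405e9  # Llama-405B
    for bits, name in [(16, "fp16"), (8, "int8"), (4, "int4")]:
        print(f"{name}: ~{params * bits / 8 / 1e9:.0f} GB")
    # fp16: ~810 GB, int8: ~405 GB, int4: ~203 GB

So the 400-500 GB range corresponds to roughly 8-bit weights.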
Buying a tinybox with enough GPU memory costs $15k-25k, or about the same if you build your own.
A distributed Mac cluster costs about the same, if not more, if you’re buying two or three M2 Ultras, each with 192 GB of memory.
So people are absolutely constrained by price/supply here. Every engineer, analyst, and scientist would be far more untethered from rules & regulations or policies & terms-of-service nitty-gritty if they could trust that the LLM they use is completely local, without telemetry or tracking, and licensed fairly for commercial use (perhaps this excludes Llama).
Not a lot of people can afford to spend $15k-30k on a computer that can run these SOTA LLMs. But you can bet a billion people will buy one when it’s $1k.
Not to mention, the north star is to get to a place where we have the hardware to do training at home. We're a long way off, but once we're not restricted by needing that hardware, ideally we'd make a model that is continually being trained.
It could be much worse: how many projects simply die locked away in a corporate basement because some corporate attorney decided it’s “too risky to leak IP”?
Despite all the layoffs & the Black-founding-fathers debacle Google as an institution has had recently, it still has the systems in place to let passionate engineering projects see the light of day.
I’m on an H1B visa since 2017. I extended in 2021 and again in 2023 (thanks to approved I-140). But my visa stamp in my passport is from 2017 which expired in 2020. I haven’t made efforts to get another stamp mainly because of the pandemic in 2020 and long appointment backlogs thereafter.
Is there a necessity to always keep a valid visa stamp in my passport? Apart from ease of travel, are there any other reasons to always keep a current visa stamp? I do have all my valid I-797 status documents and have kept my status current at all times.
There's no requirement to maintain a valid H-1B visa stamp, but under certain circumstances having one can make it easier to change H-1B employers.
In the 12 or so years I have lived in America, I have observed that people always hold the door open for those walking behind them.
Always. Everywhere. DC, Boston, LA, New York, Seattle, Cupertino, everywhere.
Nobody cares about anything.
Somebody cares about something.
Everybody cares about everything.
I think the truth is somewhere in the middle. This makes me optimistic. At least it's nice that people care to hold doors open : )
> Or please tell me how to use the magic mouse while it's charging? Am I just holding it wrong?
Is that really a deal-breaker when deciding to buy an iMac? Yeah, sure, I agree it's a super silly design decision to put the port under the mouse; but c'mon, does it really matter?
As far as I can tell, I have never had to explain to my 61-year-old Indian mother how to use a Mac the way I have had to debug every little thing on Windows PCs. Macs & Apple products _truly_ do "just work".
How can you say "does it really matter" about something as stupid as the charging port under the mouse? It's in-your-face, outrageously bad design, and very much in line with "doesn't just work".
It's one thing to pay a comparative fortune for a mouse that's got mostly looks going for it; it's another to have to uproot your work/free time because your mouse ran out of batteries and you didn't routinely charge it like a phone overnight.
Yes, of course it matters. It matters because it's dumb and we pay the dumbness price. People paying for it regardless is the reason it continues the way it is.
Can you read? I guess not, because I literally wrote that it's a great OS.
It just doesn't "just work"; it has issues. That doesn't mean that Windows or Linux don't have issues. They do; they all have their warts, and that's fine. But that makes the slogan "it just works" idiotic.
It has by far the best vertical integration, with the fewest issues when switching between devices, sure.
That still doesn't make "it just works" a reality, because that's an unachievable pipe dream!
Well over half of a software engineer’s time is spent on bug triage, reproducing bugs, simulating components in tests, and debugging fixes.
Doesn’t matter what the computer becomes (AI, AGI, or God-incarnate): there’s always a role between that and the end user. That role today is called software engineer. Tomorrow, it’ll be called whatever. Perhaps paid the same, or less, or more. Doesn’t matter.
There’s always an intermediary to deal with the shit.
Hmm, I wonder if that’s the role priests & the clergy have been playing all this while. Except maybe humanity is the shit God (as an end user) has to deal with.
I couldn't agree more! Fish shell is the absolute best! I think I may have switched over to it in 2017 _because_ of Julia's blog, and it has been a boon for my productivity.
I set up a whole dotfiles-tracking system and periodically make backups of the config. Fish has been the sole system that has traveled with me to different machines and different companies: running on Macs, on Linux, on personal laptops and iMacs, and on so many different versions/instances of workstations on servers, in Docker, in AWS, etc.
I worked at Amazon in 2017, and we had a whole "developer environment" system built off of Apollo; my Fish config fit right in with the company defaults for various build systems, log systems, metrics, yada yada. I wrote myself a ton of nice aliases, new functions, and new scripts.
I moved to Apple in 2019 and brought my Fish config over (sans any Amazon-specific things, of course), and all my customizations ported over nicely. They play well with all the Apple-y unique configs this company now gives its engineers.
Fish has been snappy & delightful.
I guess one could argue zsh could work just as well, and it has the benefit of being compatible with native bash syntax. But I found zsh and its ilk (oh-my-zsh) too slow, tbh. They're nothing (and I mean zilch, nada) compared to the speed of Fish. Fish with Bass (love the puns in the fish community, btw) accomplishes much of the backward compatibility with bash and bash-like syntax, while still performing at Fish's snappy speeds.
Love Fish! Love Julia's notes! Ahh life is just perfect sometimes!
While we're fanboying over Julia, here's a picture of the time I made myself a t-shirt from Julia's zines of Recurse Center's values: https://x.com/b0rk/status/876571293491109889