
Trite, and wrong. Stalin died of a stroke at 74. To take just two more examples, Mao and Franco both died at 82, also of natural causes.

As an extension of this idea: for some tasks, rather than asking Claude Code to do a thing, you can often get better results from asking Claude Code to write and run a script to do the thing.

Example: read this log file and extract XYZ from it and show me a table of the results. Instead of having the agent read in the whole log file into the context and try to process it with raw LLM attention, you can get it to read in a sample and then write a script to process the whole thing. This works particularly well when you want to do something with math, like compute a mean or a median. LLMs are bad at doing math on their own, and good at writing scripts to do math for them.
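As a concrete illustration, here's the kind of throwaway script an agent might write for this. The `latency_ms=` log format is invented for the example; the point is that the math lives in `statistics`, not in the LLM:

```python
# Hypothetical one-off script: pull latency values out of a log and
# compute summary stats, rather than having the LLM eyeball the numbers.
import re
import statistics

def latency_stats(log_text):
    # Extract every "latency_ms=<n>" field from the raw log text.
    values = [int(m.group(1))
              for m in re.finditer(r"latency_ms=(\d+)", log_text)]
    if not values:
        return None
    return {"count": len(values),
            "mean": statistics.mean(values),
            "median": statistics.median(values)}
```

The agent only needs to read a small sample of the log to guess the format and write this; the script then handles the whole file exactly, however large it is.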

A lot of interesting techniques become possible when you have an agent that can write quick scripts or CLI tools for you, on the fly, and run them as well.


It's a bit annoying that you have to tell it to do it, though. Humans (or at least programmers) "build the tools to solve the problem" so intuitively and automatically when the problem starts to "feel hard", that it doesn't often occur to the average programmer that LLMs don't think like this.

When you tell an LLM to check the code for errors, the LLM could simply "realize" that the problem is complex enough to warrant building [or finding+configuring] an appropriate tool to solve the problem, and so start doing that... but instead, even for the hardest problems, the LLM will try to brute-force a solution just by "staring at the code really hard."

(To quote a certain cartoon squirrel, "that trick never works!" And to paraphrase the LLM's predictable response, "this time for sure!")


As the other commenter said, these days Claude Code often does actually reach for a script on its own, or for simpler tasks it will do a bash incantation with grep and sed.

That is for tasks where a programmatic script solution is a good idea though. I don't think your example of "check the code for errors" really falls in that category - how would you write a script to do that? "Staring at the code really hard" to catch errors that could never have been caught with any static analysis tool is actually where an LLM really shines! Unless by "check for errors" you just meant "run a static analysis tool", in which case sure, it should run the linter or typechecker or whatever.


Running “the” existing configured linter (or what-have-you) is the easy problem. The interesting question is whether the LLM would decide of its own volition to add a linter to a project that doesn’t have one; and where the invoking user potentially doesn’t even know that linting is a thing, and certainly didn’t ask the LLM to do anything to the project workflow, only to solve the immediate problem of proving that a certain code file is syntactically valid / “not broken” / etc.

After all, when an immediate problem seems likely to come up again, an experienced human engineer (if not pressed for time) would likely take the opportunity to solve it for good by introducing workflow automation.


I've had multiple cases where it would rather write a script to test a thing than actually add a damn unit test for it :)

> Humans (or at least programmers) "build the tools to solve the problem" so intuitively and automatically when the problem starts to "feel hard", that it doesn't often occur to the average programmer that LLMs don't think like this.

Hmm. My experience of "the average programmer" doesn't look like yours and looks more like the LLM :/

I'm constantly flabbergasted as to how way too many devs fumble through digging into logs or extracting information or what have you because it simply doesn't occur to them that tools can be composed together.


> Humans (or at least programmers) "build the tools to solve the problem" so intuitively and automatically

From my experience, only a few rare devs do this. Most will stick with (broken/wrong) GUI tools they know made by others, by convenience.


I have the opposite experience.

I used Claude to translate my application and asked it to translate each text in the application to the best of its ability.

That worked great for one view, but when I asked it to translate the rest of the application in the same fashion, it got lazy and started writing a script to substitute some words instead of actually translating sentences.


Cursor likes to create one-off scripts, yesterday it filled a folder with 10 of them until it figured out a bug. All the while I was thinking - will it remember to delete the scripts or is it going to spam me like that?

>It's a bit annoying that you have to tell it to do it, though.

https://www.youtube.com/watch?v=kBLkX2VaQs4


Cursor does this for me already all the time though, give that another shot maybe. For refactoring tasks in particular: it uses regex to find interesting locations, and the other day, after maybe 10 rounds of slow "ok now let me update this file... ok now let me update this file...", it suddenly paused, looked at the pattern so far, and then decided to write a python script to do the refactoring and executed it. For some reason it considered its work done even though the files didn't even pass linters, but that's polish.

+1, cursor and Claude code do this automatically for me. Take a big analysis task and they’ll write python scripts to find the needles in the haystacks that I’m looking through

Yeah, I had Cursor refactor a large TypeScript file today and it used a script to do it. I was impressed.

Codex is a lot better at this. It will even try this on its own sometimes. It also has much better sandboxing (which means it needs approvals far less often), which makes this much faster.

Same here. I have a SQLite db that I let it look over and extract data from. I let it build the scripts, then I run them myself, since they would time out otherwise and I don't want Claude sitting and waiting for 30 minutes. So I do all the data investigations with Claude as an expert who can traverse the data much faster than me.

I've noticed Claude doing this for most tasks without even asking it to. Maybe a recent thing?

Yes. But not always. It's better if you add a line somewhere reminding it.

No, that's not the point of this new checkpoints feature. It's already been possible for a while to rewind context in Claude Code by pressing <ESC><ESC>. This feature rewinds code state alongside context:

> Our new checkpoint system automatically saves your code state before each change, and you can instantly rewind to previous versions by tapping Esc twice or using the /rewind command.

https://www.anthropic.com/news/enabling-claude-code-to-work-...

Lots of us were doing something like this already with a combination of WIP git commits and rewinding context. This feature just links the two together and eliminates the manual git stuff.


From the docs it looks like this feature only reverts the edit tool calls, and not e.g. bash commands that have been executed:

> Checkpoints apply to Claude’s edits and not user edits or bash commands, and we recommend using them in combination with version control


How could they possibly hope to undo bash commands, whose side effects could be anything, anywhere?

Hey Claude... uh... unlaunch those


I mean a naive implementation of this would just make regular git commits to a special hidden repo and revert them (ignoring changes outside project root). I always assumed that’s how cursor did it. Presumably they have good reasons not to do this, probably related to not accidentally reverting user edits.

By tracking changes made by a command, like you might with git.

And how do you propose they track those changes? Do you want to run your LLM through permanent sudo?

Since they recommend still using version control anyway, looks like I will stick to my solution of using a git-colocated jj (jujutsu) SCM which automatically makes label-less (no commit message) commits with every file change to every tracked file (new files are automatically tracked). So you get infinite undo.

It’s rather older than this. Any argument about “the market is swamped by bad X in circulation and good X are much rarer in the market than you would naively expect” (software developers, online dating, used cars, debased currency…) is just a version of Gresham’s Law, which is 500 years old: https://en.m.wikipedia.org/wiki/Gresham%27s_law

I don’t disagree with your point. The earth is pretty special as the only known location for intelligent life. However:

1) Most people haven’t internalized at all how big and empty and terrifying the universe is. This is a rhetorical device to make that point.

2) Carl Sagan also famously said “We are a way for the universe to know itself” so I think you may be straw-manning his overall position just a little bit here.


He plainly says "the folly of human conceits". It's nice if he does a u-turn on this elsewhere, but the speech is still gleefully wallowing in putting down humans as big-headed ... for what? And why are "superstar" and "supreme leader" in quotes like that? Those superstars and supreme leaders on earth are the actual genuine ones. I know of no others elsewhere, do you? I mean OK important people may tend to be pompous, but some people are indeed important. Sagan, for instance, was a celebrity, meaning that he was celebrated by other people on account of his being some good. I don't need him to deny it, and I don't need him to make this speech vicariously denying the importance of everybody else.

This is pretty flakey reasoning. First, the only reason “internalizing” the size of the universe would have any relevance here is if you were going to use that to draw the bad inference of insignificance. That was the context. Otherwise, it is, at best, an interesting fact that can be equally interpreted as awe-inspiring (I do not see why it would be terrifying; I think it is grand and wonderful).

I think Sagan was, in his clumsy way, perhaps trying to recover something lost with the loss of religion, a kind of reverential humility that recognition of God previously inspired. The closest thing he could find was the universe. Naturally, this led toward a kind of pagan materialist pantheism (which likely explains his interest in Hinduism).


> It is one of the few breeds with clear genetic links back to Arctic wolf populations, which is why it sits closer to the early proto-dog split than nearly all modern breeds.

What does this mean? Aren’t there just as many generations between a Siberian Husky and the wolf-dog split as between a pug and the wolf-dog split? Unless you’re saying there was interbreeding with wolves in the husky’s ancestry (but not the pug’s) after the initial split?


Yes, all breeds are the same distance from the original dog/wolf split, but Huskies (northern snow dogs) are different because their lineage picked up extra genes from ancient Arctic wolves like the 35k-year-old Taimyr wolf. They also show direct continuity with ~9000 year old sled dogs from Zhokhov Island, while most modern breeds only go back a few hundred years. Add in thousands of years of breeding in isolation for Arctic work, and the functional traits that were developed over that long timespan left Huskies with more distinct genes than most other breeds.

That would be interesting, because the modern thinking goes that modern wolves are as different from wolves around the time of the wolf-dog split as modern dogs are. So if there was recent wolf interbreeding in the husky lineage, it's a different kind of wolf than what the first dogs were descended from. They're all very similar animals, so it may only show up on DNA tests, but there may be a sort of genetic timestamp showing when the last wolf admixture was that's visible on a DNA test.

I have doubts that changes in selection pressure on wolves in their natural environment, no matter how many generations, is remotely close to domestication and selective breeding.

The thousands of years of selection in the Arctic shaped Huskies, not the old wolf/husky interbreeding, which only left small but distinct genetic markers. Those came from an ancient Arctic wolf that split off around the same time as the ancestors of dogs and modern wolves, which is why they still show up so clearly today.

Yes! - and I wish this was easier to do with common coding agents like Claude Code. Currently you can kind of do it manually by copying the results of the context-busting search, rewinding history (Esc Esc) to remove the now-useless stuff, and then dropping in the results.

Of course, subagents are a good solution here, as another poster already pointed out. But it would be nice to have something more lightweight and automated, maybe just turning on a mode where the LLM is asked to throw things out according to its own judgement, if you know you're going to be doing work with a lot of context pollution.


This is why I'm writing my own agent code instead of using simonw's excellent tools or just using Claude; the most interesting decisions are in the structure of the LLM loop itself, not in how many random tools I can plug into it. It's an unbelievably small amount of code to get to the point of super-useful results; maybe like 1500 lines, including a TUI.

And even if you do use Claude for actual work, there is also immense pedagogical value in writing an agent from scratch. Something really clicks when you actually write the LLM + tool calls loop yourself. I ran a workshop on this at my company and we wrote a basic CLI agent in only 120 lines of Python, with just three tools: list files, read file, and (over)write file. (At that point, the agent becomes capable enough that you can set it to modifying itself and ask it to add more tools!) I think it was an eye-opener for a lot of people to see what the core of these things looks like. There is no magic dust in the agent; it's all in the LLM black box.
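The core of that workshop agent really is tiny. Here's a sketch of the shape it takes, assuming the Anthropic Python SDK's messages API for the loop itself (the tool names and the dispatch layout are illustrative, not the workshop's exact code); the tools and dispatcher below run standalone:

```python
# Minimal agent core: three file tools, a dispatcher, and the LLM loop.
import json
import os

def list_files(path="."):
    return json.dumps(sorted(os.listdir(path)))

def read_file(path):
    with open(path) as f:
        return f.read()

def write_file(path, content):
    with open(path, "w") as f:
        f.write(content)
    return f"wrote {len(content)} bytes to {path}"

TOOLS = {"list_files": list_files, "read_file": read_file,
         "write_file": write_file}

def run_tool(name, args):
    # Execute a tool call requested by the model; errors go back as text
    # so the model can read them and try something else.
    try:
        return TOOLS[name](**args)
    except Exception as e:
        return f"error: {e}"

def agent_loop(client, model, messages, tool_schemas):
    # The whole "agent": call the model, execute any tool calls it makes,
    # feed results back, repeat until it stops asking for tools.
    while True:
        resp = client.messages.create(model=model, max_tokens=4096,
                                      tools=tool_schemas, messages=messages)
        messages.append({"role": "assistant", "content": resp.content})
        calls = [b for b in resp.content if b.type == "tool_use"]
        if not calls:
            return resp
        results = [{"type": "tool_result", "tool_use_id": b.id,
                    "content": run_tool(b.name, b.input)}
                   for b in calls]
        messages.append({"role": "user", "content": results})
```

Everything interesting the agent does emerges from that while loop; the "magic" is entirely inside the model call.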

I hadn't considered actually rolling my own for day-to-day use, but now maybe I will. Although it's worth noting that Claude Code Hooks do give you the ability to insert your own code into the LLM loop - though not to the point of Eternal Sunshining your context, it's true.


Do you have this workshop available online? I’m really struggling to understand what “tool calls” and MCP are!

Relevant (800 comment!) 2024 HN discussion on how we got here with Google Search: https://news.ycombinator.com/item?id=40133976

Not an expert, but it’s probably related to cooling. Every joule of that electricity that goes in must also leave the datacenter as heat. And the whole design of a datacenter is centered around cooling requirements.


Exactly. To add to that, I'd like to point out that when this person says every joule, he is not exaggerating (only a teeny tiny bit). The actual computation itself barely uses any energy at all.
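The back-of-the-envelope version of this: since essentially all electrical input ends up as heat, IT load and cooling load are roughly the same number. A quick illustrative conversion (the 3.517 kW-per-ton figure is the standard definition of a "ton of refrigeration"):

```python
# Rough sizing: if ~every joule of electricity becomes heat, then
# cooling capacity must match IT power draw.
def cooling_tons(it_load_mw):
    # 1 ton of refrigeration = 3.517 kW of heat removal.
    return it_load_mw * 1000 / 3.517
```

So a 1 MW hall needs on the order of 280+ tons of cooling before you've accounted for anything else, which is why the cooling plant dominates the design.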


This should be the top-voted comment of the whole thread. I used to teach history; it makes me roll my eyes when I hear comparisons between Nazi Germany and the current moment. It reflects both a lack of historical familiarity with the unique circumstances of Germany in the 1920s and 30s (including recently losing a world war), and also, as you say, a lack of knowledge of other more relevant historical examples — of which I’d also put Erdoğan at the top. It’s just a conversation-stopper and a rhetorical cudgel rather than a serious attempt at historical contextualization.


Surely the Venn diagram needs not be a circle for you to draw parallels, nor does the existence of a more direct comparison make other comparisons moot.


Surely the fact that the current ruling party has an influential faction who explicitly reference Nazi Germany as an ideal worth striving for is relevant to setting the current moment in historical context. Yes we're not LITERALLY Nazi Germany for a variety of reasons but that doesn't mean it doesn't paint a picture of what they want to do, regardless of how successful they will be or what that will look like in practice.

Personally I think the most apt historical comparison is the Fourth Crusade and the Sack of Constantinople, but since we don't LITERALLY live in the Middle Ages and have ethnic divisions between Greeks and Latins one might say that's not a relevant comparison either.

