Chris Lattner, inventor of the Swift programming language, recently took a look at a compiler entirely written by Claude AI. Lattner found nothing innovative in the code generated by AI [1]. And this is why humans will be needed to advance the state of the art.
AI tends to accept conventional wisdom. Because of this, it struggles with genuine critical thinking and cannot independently advance the state of the art.
AI systems are trained on vast bodies of human work and generate answers near the center of existing thought. A human might occasionally step back and question conventional wisdom, but AI systems do not do this on their own. They align with consensus rather than challenge it. As a result, they cannot independently push knowledge forward. Humans can innovate with help from AI, but AI still requires human direction.
You can prod AI systems to think critically, but they tend to revert to the mean. When a conversation moves away from consensus thinking, you can feel the system pulling back toward the safe middle.
As Apple’s “Think Different” campaign in the late 90s put it: the people crazy enough to think they can change the world are the ones who do—the misfits, the rebels, the troublemakers, the round pegs in square holes, the ones who see things differently. AI is none of that. AI is a conformist. That is its strength, and that is its weakness.
You know where LLMs boost me the most? When I need to integrate a bunch of systems together, each with their own sets of documentation. Instead of spending hours getting two or three systems to integrate with mine with the proper OAuth scopes or SAML and so on, an LLM can get me working integrations in a short time. None of that is ever going to be innovative; it's purely an exercise in perseverance as an engineer to read through the docs and make guesses about the missing gaps. LLMs are just better at that.
I spend the other time talking through my thoughts with AI, kind of like the proverbial rubber duck used for debugging, but it tends to give pretty thoughtful responses. In those cases, I'm writing less code but wanting to capture the invariants, expected failure modes and find leaky abstractions before they happen. Then I can write code or give it good instructions about what I want to see, and it makes it happen.
I'm honestly not sure how a non-practitioner could have these kinds of conversations beyond a certain level of complexity.
Every time I request the wrong OAuth scope that doesn't have the authorization to do what I need, then make a failing request, I hear Jim Gaffigan affecting a funny authoritative voice saying, "No." I can't be the only one who defensively requests too much authority beyond what I need with extra OAuth scopes, hoping one of them will give me the correct access. I've had much better luck with LLMs telling me exactly which scopes to select.
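For what it's worth, the scope list is just a parameter on the authorization request, so keeping it minimal is cheap. A rough sketch for a generic OAuth 2.0 provider (the endpoint, client ID, and scope strings here are all made up for illustration):

```python
from urllib.parse import urlencode

def build_auth_url(base, client_id, redirect_uri, scopes):
    """Build an OAuth 2.0 authorization URL requesting only the scopes listed."""
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        # Scopes are space-delimited per RFC 6749; request the minimum you need.
        "scope": " ".join(scopes),
    }
    return f"{base}?{urlencode(params)}"

url = build_auth_url(
    "https://auth.example.com/authorize",    # hypothetical provider
    "my-client-id",
    "https://myapp.example.com/callback",
    ["calendar.read"],                       # one minimal scope, not a grab-bag
)
```

The defensive "request everything" habit lives entirely in that `scopes` list, which is why an LLM that has read the provider's docs can tighten it so easily.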
And the libraries provided by the various OAuth vendors are only adding fuel to the fire.
A while ago I spent some time debugging a superfluous redirect and the reason was that the library would always kick off with a "not authenticated" when it didn't find stored tokens, even if it was redirecting back after successful log in (as the tokens weren't stored yet).
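The fix usually amounts to exempting the callback route from the "do we have stored tokens yet?" check. A minimal sketch of the idea (all names here are hypothetical, not from any particular library):

```python
def needs_login_redirect(path, stored_tokens, callback_path="/oauth/callback"):
    """Decide whether to bounce the user to the login flow.

    The bug described above: a library that only checks for stored tokens
    will redirect even while it is handling the post-login callback,
    because the tokens haven't been persisted yet. Exempting the callback
    route avoids the superfluous redirect.
    """
    if path.startswith(callback_path):
        return False  # mid-callback: tokens are about to be stored
    return stored_tokens is None
```

With the naive version (`return stored_tokens is None` alone), the callback request itself triggers another round trip to the identity provider.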
oauth is the one area where I genuinely trust the LLM more than myself. not because it gets it right but because at least it reads all the docs instead of rage-quitting after the third wrong scope
Maybe you are already an expert in those, so it's fine. But for anybody else, using LLMs extensively would mean becoming far less proficient in those topics: skipping the deeper senior knowledge and settling for rather shallow knowledge. Any time I grokked anything deeper (encryption, these SAML/JWT auth flows, complex algorithms), it was only and unavoidably because I had to go deep with it.
Good for the company, not so much for the given engineer. But I get the motivation; we are all naturally lazy and normally avoid doing stuff if we can. It's just that there are also downsides, and they fall on us, the engineers.
The worst integration problems tend to be conceptual mismatches between the systems, where--even with the same names--they have different definitions and ideas of how things work.
That's a category of problem I wouldn't expect a text-based system to detect very well. Though it might disguise the problem with a solution that seems to work until it blows up one day, or until people discover a lot of hard-to-fix data.
Well that's another use I have for LLMs: asking questions about these informational or architectural impedance mismatches. LLMs get it wrong sometimes, but with proper guidance (channel your inner Karl Popper), they can be quite helpful. But this doesn't really speed me up that much, though it makes me more confident that my deliverable is correct.
This is fundamental. Well, not really - a strategy SV tried to use is absolute market dominance to the point where you have to integrate with them. But in spaces where true interoperability is required, it's just philosophically hard. People don't mean the same thing.
Couldn’t agree more, especially when some docs are incorrect and AI is able to guesstimate the correction based on other implementations or parallel docs that it’s found. It goes from “Let me spend a few days scouring the internet and our internal repo to see if I can maybe find a workaround” to “This can definitely get done”.
IT-ops-related work in general is so suitable for AI agents. Configuring clusters of barebone servers. I normally spend days configuring things like NFS, sysctls, firewalls, upgrades, disks, crons, monitoring, etc.; now it's hours max. I can literally ask it to SSH into 50 VPS machines and perform all the tasks I tell it to do.
MCP connects the LLM to the APIs, which can be consulted with "tool calls." I'm talking about integrating the software I produce (with LLM assistance) to APIs. Traditionally, this is a nightmare given poor documentation. LLMs have helped me cut through the noise.
I built an MCP server to speak WHOIS/RDAP so I could have Claude give me better domain name suggestions that weren't already taken. It can also be used in LLM-enabled applications (provided that the model is "tool calling" and that there's an orchestrator).
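For anyone curious, the RDAP side of that is small: registries answer 404 for names with no registration record, and 200 with a domain object otherwise. A rough sketch of the response handling (using canned responses rather than live HTTP; a real tool would fetch from an RDAP endpoint such as rdap.org):

```python
import json

def rdap_domain_taken(status_code, body):
    """Interpret an RDAP domain lookup result.

    RDAP servers return 404 for names with no registration record;
    a 200 carrying a domain object means the name is taken.
    """
    if status_code == 404:
        return False
    data = json.loads(body)
    return data.get("objectClassName") == "domain"

# Simulated responses; live lookups would use an HTTP client against the registry:
taken = rdap_domain_taken(200, '{"objectClassName": "domain", "ldhName": "EXAMPLE.COM"}')
free = rdap_domain_taken(404, "")
```

Wrapping a check like this in an MCP tool call is what lets the model loop over its own name suggestions and discard the taken ones.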
In principle, MCP servers can be created for just about any OAuth-protected API. However, you still need to create the server, and this is where the usage I'm talking about shines: when working on the MCP server, an LLM can be quite helpful in getting the right APIs integrated.
The same goes for other development that doesn't need an LLM context built-in. If I wanted to sync two calendars, for instance, I wouldn't build an MCP that speaks CalDav and Exchange and then let it loose (though this so-called agentic workflow is becoming more popular); I'd want to build software with an LLM's help that can speak both protocols by having it generate code to handle whatever OAuth tokens and scopes are necessary and then help me deploy the thing.
> Chris Lattner, inventor of the Swift programming language, recently took a look at a compiler entirely written by Claude AI. Lattner found nothing innovative in the code generated by AI [1]. And this is why humans will be needed to advance the state of the art.
I’ve recently taken a look at our codebase, written entirely by humans, and found nothing innovative there. On the contrary, I see such brainrot that it makes me curious what kind of biology was needed to produce this outcome.
So maybe Chris Lattner, inventor of the Swift programming language, is safe; the majority of so-called “software engineers” sure as hell are not. Just like the majority of people are NOT splitting atoms.
Also: if that one particular AI-produced compiler has nothing innovative, that only means that the human "director" behind the AI didn't ask it to produce anything innovative; what it does not mean is that AI can never produce anything innovative in a compiler.
> if that one particular AI-produced compiler has nothing innovative, that only means that the human "director" behind the AI didn't ask it to produce anything innovative
Couldn't it also be true that the AI didn't produce innovative output even though the human asked it to produce something innovative?
Otherwise you're saying an AI always produces innovative output, if it is asked to produce something innovative. And I don't think that is a perfection that AI has achieved. Sometimes AI can't even produce correct output even when non-innovative output is requested.
> Couldn't it also be true that the AI didn't produce innovative output even though the human asked it to produce something innovative?
It could have been, but unless said human in this case was lying, there is no indication that they did. In fact, what they have said is that they steered it towards including things that make for a very conventional compiler architecture at this point, such as telling it to use SSA.
> Otherwise you're saying an AI always produces innovative output
They did not say that. They suggested that the AI output closely matches what the human asks for.
> And I don't think that is a perfection that AI has achieved.
I won't answer for the person you replied to, but while I think AI can innovate, I would still 100% agree with this. It is of course by no means perfect at it. Arguably often not even good.
> Sometimes AI can't even produce correct output even when non-innovative output is requested.
Sometimes humans can't either. And that is true for innovation as well.
But on this subject, let me add that in one of my first chats with GPT 5.1 (I think it was), I asked it a question on parallelised parsing. That in itself is not entirely new, but it came up with a particular scheme for parallelised (GPU-friendly) parsing and compiler transformations that I have not found in the literature (I wouldn't call myself an expert, but I have kept tabs on the field for ~30 years). I might have missed something, so I intend to do a further literature search. It's also not clear how practical it is, but it is interesting enough that when I have time, I'll set up a harness to let it explore it further and write it up, as irrespective of whether it'd be applicable for a production compiler, the ideas are fascinating.
I’ve built some live programming systems in the past that are innovative, but not very practical, and now I’m trying to figure out how to get a 1.5B model (a small language model) into the pipeline of a custom small programming language. That is human-driven innovation, but an LLM is definitely very useful.
> Lattner found nothing innovative in the code generated by AI
I don't think the replacement is binary. Instead, it’s a spectrum. The real concern for many software engineers is whether AI reduces demand enough to leave the field oversupplied. And that should be a question of economy: are we going to have enough new business problems to solve? If we do, AI will help us but will not replace us. If not, well, we are going to do a lot of bike-shedding work anyway, which means many of us will lose our jobs, with or without AI.
Before software, there were accountants. It was The qualification to have.
Today accountants are still needed. But it's a commodified job. And you start at the absolute bottom of the bottom rungs and slave it out till you can separate yourself and take on a role on a path to CFO or some respectable level of seniority.
I'm oversimplifying here but that is sufficient to show A path forward for software engineers imo. In this parallel, most of us will become AI drivers. We'll go work in large companies but we'll also go work in a back room department of small to medium businesses, piloting AI on a bottom of the rung salary. Some folks will take on specialisms and gain certifications in difficult areas (similar to ACCA). Or maybe ultra competitive areas like how it is in actuarial science. Those few will eventually separate themselves and lead departments of software engineers (soon to be known as AI pilots). Others will embed in research and advance state of art that eventually is commoditized by AI. Those people will either be paid mega bucks or will be some poor academia based researcher.
The vast majority? Overworked drones having to be ready to stumble to their AI agent's interface when their boss calls them at 10 PM saying the directors want to see a feature setup for the meeting tomorrow.
Business problems are essentially neverending. And humans have a broader type of intelligence that LLMs lack but that is needed to solve many novel problems. I wouldn't worry.
Unless you're one of the bulk of 1x programmers who aren't doing anything novel. I think it will be like most industries that got very helpful technology - the survivors have to do more sophisticated work and the less capable people are excluded. Then we need more education to supply those sophisticated workers but the existing education burden on professionals is already huge and costly. Will they be spending 10 years at university instead of 3-4? Will a greater proportion of the population be excluded from the workforce because there's not enough demand for low-innate-ability or low-educated people?
To add, just keeping up in this industry was already a problem. I don't know of many professions[1] with such demands on time outside of a work day to keep your skills updated. It was perhaps an acceptable compromise when the market was hot and the salaries high. But I am hearing from more and more people who are just leaving the field entirely labeling it as "not worth it anymore".
[1] Medicine may be one example of an industry with poor work-life balance for some, specifically specialists. But job security there is unmatched and compensation is eye-watering.
> I don't know of many professions[1] with such demands on time outside of a work day to keep your skills updated.
This is an extremely myopic view (or maybe trolling).
The vast majority of software developers never study, learn, or write any code outside of their work hours.
In contrast, almost all professions have enormous, _legally required_ upskilling, retraining, and professional competence maintenance.
If you honestly believe that developers have anywhere near the demands (both in terms of time and cost) in staying up to date that other professions have, you are - as politely as I can - completely out-of-touch.
Sure, but those same professional certifications and development hours also allow them to not need to re-prove their basic competency when interviewing.
Problems are never-ending, but the amount of money that can be made in the short (or even medium) term by solving these problems is limited. Every dollar spent on LLMs is a dollar not spent on salaries.
That feels overly optimistic. LLMs seem on track to automate away basically any "email job" or "spreadsheet job," in which case we'll be looking at higher unemployment numbers than the Great Depression for at least some period of time. Combine that with increased automation...
There are a LOT of people in the world and already a not insignificant portion can't find work despite wanting to. Seems the most likely thing is that the value of most labor is reduced to pennies.
Do you really think the billionaires are willing to have consumers so impoverished that they can’t continue to spend large sums of discretionary income buying the things that make the billionaires themselves richer?
I've read a theory that as the ultra rich divide their wealth among their descendants, eventually they capture so much of it among their families that trying to extract more from the working class is hardly worth the effort. The only option then, for the descendants of the ultra wealthy, is to start turning on each other. The theory states that the last time this happened was WWI.
The billionaires are already billionaires. People like Sam Altman are not building a doomsday bunker because they believe in the longevity of established society. They are doing it because they've already won and are taking their ball.
Well what would each billionaire do? Give out money so that the poor can give some of it back?
You cannot just point at a system, say it’d be unsustainable and then assume nobody will let that happen.
Monarchies, lords, etc. have had much more reason to support their own countryfolk, yet many throughout history have not - has society changed enough that the billionaires have changed on this?
Megacap investors already cargo cult business practices that reduce their own return and harm employees. This is why they all over-hired at the start of covid only to begin layoffs a couple of years later.
In summary: billionaires aren't as competent as you'd hope.
I know multiple engineers who have spent months or even years trying to find a job. How can you say not to worry when the industry has already gotten this bad?
It's no consolation, but this situation is temporary. Everyone is just distracted with AI.
"Temporary" might mean "the next three years", but at the same time some acted as if the Zero Interest Rate Policy would continue indefinitely, so this situation might end suddenly and unexpectedly.
To me the opportunity is with agents. Especially Copilot and whatever Amazon's agent is. Figure out how to code using them. Build something cool in the space you're interested in finding a job in. That's the skill enterprise companies are fighting for. Nobody knows how to do it.
"n the field of machine learning, the universal approximation theorems (UATs) state that neural networks with a certain structure can, in principle, approximate any continuous function to any desired degree of accuracy. These theorems provide a mathematical justification for using neural networks, assuring researchers that a sufficiently large or deep network can model the complex, non-linear relationships often found in real-world data."
And then: "Notice also that the neural network is only required to approximate within a compact set K {\displaystyle K}. The proof does not describe how the function would be extrapolated outside of the region."
NNs, LLMs included, are interpolators, not extrapolators.
And the region NN approximates within can be quite complex and not easily defined as "X:R^N drawn from N(c,s)^N" as SolidGoldMagiKarp [2] clearly shows.
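The interpolation-versus-extrapolation point is easy to see even with the simplest possible "model": fit a line to f(x) = x^2 on [0, 1], and the fit is decent inside the interval and useless outside it. A toy illustration:

```python
# A model fit on a compact interval can approximate well inside it
# yet diverge badly outside. Here the "model" is a straight line
# through two samples of f(x) = x^2 taken at x = 0 and x = 1.
f = lambda x: x * x

# Linear interpolant through (0, f(0)) and (1, f(1)): y = x
model = lambda x: x

# Worst-case error inside the fitted region [0, 1]:
inside_error = max(abs(f(x / 100) - model(x / 100)) for x in range(101))
# Error far outside the fitted region, at x = 5:
outside_error = abs(f(5.0) - model(5.0))

print(inside_error)   # 0.25, at x = 0.5
print(outside_error)  # 20.0: the fit says nothing about x = 5
```

Nothing about the fitting procedure bounds the error outside the sampled region, which is the same caveat the UAT proof carries for neural networks.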
It has been proven that recurrent neural networks are Turing complete [0]. So for every computable function, there is a neural network that computes it. That doesn't say anything about size or efficiency, but in principle this allows neural networks to simulate a wide range of intelligent and creative behavior, including the kind of extrapolation you're talking about.
I think you cannot take the step from any Turing machine being representable as a neural network to saying anything about the prowess of learned neural networks, as opposed to specifically crafted ones.
I think a good example is calculations or counting letters: it's trivial to write Turing machines that do those correctly, so you could create neural networks that do just that. From LLMs we know that they are bad at those tasks.
> So for every computable function, there is a neural network that computes it. That doesn't say anything about size or efficiency
It also doesn't say anything about finding the desired function, rather than a different function which approximates it closely on some compact set but diverges from it outside that set. That's the trouble with extrapolation: you don't know how to compute the function you're looking for because you don't know anything about its behaviour outside of your sample.
No, but unless you find evidence to suggest we exceed the Turing computable, Turing completeness is sufficient to show that such systems are not precluded from creativity or intelligence.
I believe that quantum oracles are more powerful than Turing oracles, because quantum oracles can be constructed, from what I understand, and Turing oracles need infinite tape.
Our brains use quantum computation within each neuron [1].
The difference is quantum oracles can be constructed [1] and Turing oracle can't be [2]: "An oracle machine or o-machine is a Turing a-machine that pauses its computation at state "o" while, to complete its calculation, it "awaits the decision" of "the oracle"—an entity unspecified by Turing "apart from saying that it cannot be a machine" (Turing (1939)."
This is meaningless. A Turing machine is defined in terms of state transitions. Between those state transitions, there is a pause in computation at any point where the operation takes time. Those pauses are just not part of the definition because they are irrelevant to the computational outcome.
And given we have no evidence that quantum oracles exceed the Turing computable, all the evidence we have suggests that they are Turing machines.
Turing machines grew out of constructive mathematics [1], where proofs are constructions of the objects or, in other words, algorithms to compute them.
Saying that there is no difference between things that can be constructed (quantum oracles) and things that are given and cannot be constructed (Turing oracles, which are not even machines of any sort) directly contradicts the very foundation of Turing machine theory.
That's an irrelevant strawman. It tells us nothing about how to create such a system ... how to pluck it out of the infinity of TMs. It's like saying that bridges are necessarily built from atoms and adhere to the laws of physics--that's of no help to engineers trying to build a bridge.
And there's also the other side of the GP's point--Turing completeness not necessary for creativity--not by a long shot. (In fact, humans are not Turing complete.)
No, twisting it to be about how to create such a system is the strawman.
> Turing completeness not necessary for creativity--not by a long shot.
This is by far a more extreme claim than the others in this thread. A system that is not even Turing complete is extremely limited. It's near impossible to construct a system with the ability to loop and branch that isn't Turing complete, for example.
>(In fact, humans are not Turing complete.)
Humans are at least trivially Turing complete - to be Turing complete, all we need to be able to do is to read and write a tape or simulation of one, and use a lookup table with 6 entries (for the proven minimal (2,3) Turing machine) to choose which steps to follow.
Maybe you mean to suggest we exceed it. There is no evidence we can.
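To make the "lookup table plus tape" point concrete, here is a small two-state machine (the 2-state busy beaver, a standard textbook example) executed exactly that way; every step is a single table lookup:

```python
# Mechanically executing a Turing machine from a lookup table.
# Rules map (state, symbol) -> (symbol_to_write, head_move, next_state).
# The tape is a dict defaulting to 0, i.e. blank in both directions.
RULES = {
    ("A", 0): (1, +1, "B"), ("A", 1): (1, -1, "B"),
    ("B", 0): (1, -1, "A"), ("B", 1): (1, +1, "HALT"),
}

tape, head, state, steps = {}, 0, "A", 0
while state != "HALT":
    write, move, state = RULES[(state, tape.get(head, 0))]
    tape[head] = write
    head += move
    steps += 1

print(steps, sum(tape.values()))  # halts after 6 steps with four 1s on the tape
```

Following a table like this is exactly the kind of rote step execution the comment above describes; nothing beyond reading, writing, and looking up an entry is required.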
> P.S. everything in the response is wrong ... this person has no idea what it means to be Turing complete.
I know very well what it means to be Turing complete. All the evidence so far, on the other hand, suggests you don't.
> An infinite tape. And to be Turing complete we must "simulate" that tape--the tape head is not Turing complete, the whole UTM is.
An IO port is logically equivalent to infinite tape.
> PDAs are not "extremely limited", and we are more limited than PDAs because of our very finite nature.
You can trivially execute every step in a Turing machine, hence you are Turing equivalent. It is clear you do not understand the subject at even a basic level.
> You can trivially execute every step in a Turing machine, hence you are Turing equivalent. It is clear you do not understand the subject at even a basic level.
LOL. Such projection. Humans are provably not Turing Complete because they are guaranteed to halt.
Judging from what I read, their work is subject to regular hardware constraints, such as limited stack size, because the paper describes a mapping from regular hardware circuits to continuous circuits.
As an example, I would like to ask how to parse balanced brackets grammar (S ::= B <EOS>; B ::= | BB | (B) | [B] | {B};) with that Turing complete recurrent network and how it will deal with precision loss for relatively short inputs.
The paper also does not address training (i.e., automatic search for the processors' equations given inputs and outputs).
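For reference, the classical (non-neural) recognizer for that grammar is a few lines with an explicit stack; the question above is whether a fixed-precision recurrent net can emulate this unbounded stack on longer inputs:

```python
def balanced(s):
    """Recognize the bracket grammar B ::= empty | BB | (B) | [B] | {B}
    using an explicit stack. This is the structure an RNN would have to
    encode in bounded-precision hidden state, which is where such
    constructions tend to break down as inputs grow."""
    pairs = {")": "(", "]": "[", "}": "{"}
    stack = []
    for ch in s:
        if ch in "([{":
            stack.append(ch)
        elif ch in pairs:
            if not stack or stack.pop() != pairs[ch]:
                return False
        else:
            return False  # the grammar admits only bracket characters
    return not stack  # accept iff every opener was closed
```

The stack depth grows with nesting depth, so any finite-precision encoding of it loses information on sufficiently deep inputs.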
This is one of the reasons current AI tech is so poor at learning physical world dynamics.
Relationships in the physical world are sparse, metastable graphs with non-linear dynamics at every resolution. And then we measure these dynamics using sparse, irregular sampling with a high noise floor. It is just about the worst possible data model for conventional AI stacks at a theoretical level.
> AI tends to accept conventional wisdom. Because of this, it struggles with genuine critical thinking and cannot independently advance the state of the art.
Of course! But that's what makes them so powerful. In 99% of cases that's what you want - something that is conventional.
The AI can come up with novel things if it has agency and can learn on its own (using e.g. RL). But we don't want that in most use cases, because it's unpredictable; we want a tool instead.
It's not true that this lack of creativity implies lack of intelligence or critical thinking. AI clearly can reason and be critical, if asked to do so.
Conceptually, the breakthrough of AI systems (especially in coding, but it's to some extent true in other disciplines) is that they have an ability to take a fuzzy and potentially conflicting idea, and clean up the contradictions by producing a working, albeit conventional, implementation, by finding less contradictory pieces from the training data. The strength lies in intuition of what contradictions to remove. (You can think of it as an error-correcting code for human thoughts.)
For example, if I ask AI to "draw seven red lines, perpendicular, in blue ink, some of them transparent", it can find some solution that removes the contradictions from these constraints, or ask clarifying questions (what is the domain?) so it can decide which contradictory statements to drop.
I actually put it to Claude and it gave a beautiful answer:
"I appreciate the creativity, but I'm afraid this request contains a few geometric (and chromatic) impossibilities: [..]
So, to faithfully fulfill this request, I would have to draw zero lines — which is roughly the only honest answer.
This is, of course, a nod to the classic comedy sketch by Vihart / the "Seven Red Lines" bit, where a consultant hilariously agrees to deliver exactly this impossible specification. The joke is a perfect satire of how clients sometimes request things that are logically or physically nonsensical, and how people sometimes just... agree to do it anyway.
Would you like me to draw something actually drawable instead? "
This clearly shows that AI can think critically and reason.
That shows it knew this bit of satire more than anything. Also, the problem as stated isn't actually constrained enough to be unsolvable: https://youtu.be/B7MIJP90biM
Feel free to ask Claude about any other contradictory request. I use Claude Code and it often asks clarifying questions when it is unsure how to implement something, or autocorrects my request if something I am asking for is wrong (like a typo in a filename). Of course sometimes it misunderstands; then you have to be more specific and/or divide the work into smaller pieces. Try it if you haven't.
I have. In fact, I've been building my own coding agent for 2 years at this point (i.e. before claude code existed). So it's fair to say I get the point you're making and have said all the same stuff to others. But this experience has taught me that LLMs, in their current form, will always have gaps: it's in the nature of the tech. Every time a new model comes out, even the latest opus versions, while they are always better, I always eventually find their limits when pushing them hard enough and enough times to see these failure modes. Anything sufficiently out of distribution will lead to more or less nonsensical results.
The big flagship AI models aren't just LLMs anymore, though. They are also trained with RL to respond better to user requests. Reading a lot of text is just one technique they employ to build the model of the world.
I think there are three different types of gaps, each with different remedies:
1. A definition problem - if I say "airplane", what do I mean? Probably something like a jumbo jet or a Cessna, less likely an SR-71. This is something that we can never perfectly agree on, and AI will always be limited to the best definition available to it. And if there is not enough training data or an agreed definition for a particular (specialized) term, AI can just get this wrong (a nice example is the "Vihart" concept from above, which got mixed up with the "Seven red lines" sketch). So this is always going to be painful to get corrected, because it depends on each individual concept, regardless of the machine learning technology used. The frame problem is related to this: the question of what hidden assumptions I am making when saying something.
2. The limits of reasoning with neural networks. What is really happening, IMHO, is that AI models can learn rules of "informal" logical reasoning by observing humans doing it. Informal logic learned through observation will always have logical gaps, simply because logical lapses occur in the training data. We could probably formalize this logic by defining some nice set of modal and fuzzy operators; however, no one has been able to put it together yet. Then most, if not all, reasoning problems would reduce to solving a constraint problem; and even if we manage to quantize those and convert them to SAT, it would still be NP-complete and as such potentially require large amounts of computation. AI models, even when they reason (and apply learned logical rules), don't do that large amount of computation in a formal way. So there are two tradeoffs - one is that AIs learned these rules informally and so they have gaps, and the other is that it is desirable in practice to limit the amount of reasoning the AI will give to a given problem, which will lead to incomplete logical calculations. This gap is potentially fixable by using more formal logic (and that's what happens when you run the AI's program through tests, type checking, etc.), with the mentioned tradeoffs.
3. Going back to the "AI as an error-correcting code" analogy, if the input you give to the AI (for example, a fragment of logical reasoning) is too noisy (or contradictory), then it will just not respond as you expect it to (for example, it will correct the reasoning fragment in a way you didn't expect). This is similar to when an error-correcting code is faced with an input that is too noisy and outside its ability to correct - it will just choose a different word as the correction. In AI models, this is compounded by the fact that nobody really understands the manifold of points that the AI considers to be correct ideas (these are the code words in the error-correcting-code analogy). In any case, this is again an unsolvable gap; AI will never be a magical mind reader, although it can potentially be mitigated by the AI having more context about what problem you are really trying to solve (the downside is that this will be more intrusive to your life).
I think these things, especially point 2, will improve over time. They already have improved to the point that AI is very much usable in practice, and can be a huge time saver.
You had me at "fuzzy", but lost me at "clean up" - because that's what I usually have to do after it went on another wild refactoring spree. It's a stochastic thing, maybe you're lucky and it fuzzy-matches exactly what you want, maybe the distributions lead it astray.
On the line test, I guess it's highly probable that the joke and a few hundred discussions or blog pieces about it were in its training data.
I was recently optimizing an old code base. If I tell it to optimize, it does stupid stuff, but if I tell it to write a profiler first and then slowly attack each piece one at a time, it does really well. It's only a matter of time before it does this automatically.
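The "profiler first" step is cheap to script, which is presumably why the agent handles it well once told to. A minimal sketch with Python's built-in cProfile (the slow function is a made-up example):

```python
import cProfile
import io
import pstats

def slow_join(n):
    # Deliberately quadratic: repeated string concatenation.
    s = ""
    for _ in range(n):
        s += "x" * 10
    return s

# Profile the hot path, then print the top entries by cumulative time.
profiler = cProfile.Profile()
profiler.enable()
slow_join(2000)
profiler.disable()

buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(5)
report = buf.getvalue()
print(report)
```

A report like this is what turns "optimize this" into "attack the top entry in the listing", which is the piecewise workflow that works.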
> Chris Lattner, inventor of the Swift programming language, recently took a look at a compiler entirely written by Claude AI. Lattner found nothing innovative in the code generated by AI [1]. And this is why humans will be needed to advance the state of the art.
This feels like an unfair comparison to me; the objective of the compiler was not to be innovative, it was to prove it can be done at all. That doesn't demonstrate anything with regards to present or future capabilities in innovation.
As others have mentioned, it's not entirely clear to me what the limit of the agentic paradigm is, let alone what future training and evolution can accomplish. AlphaDev and AlphaEvolve demonstrate that it is possible to combine the retained knowledge of LLMs with exploratory abilities to innovate in both programming and mathematics; there's no reason to believe that it'll stop there.
Yeah, it's a bit like taking the output of a student project in a compiler construction class and using it to judge whether said student is capable of innovation without telling them in advance they'd be judged on that rather than on the stated requirements of the course.
It'd be interesting to prompt it to do the same job but try to be innovative.
To your point, yeah, I mostly don't want AI to be innovative unless I'm asking for it to be. In fact, I spend much more time asking it "is that a conventional/idiomatic choice?" (usually when I'm working on a platform I'm not super experienced with) than I do saying "hey, be more innovative."
Yeah, I'd love to find time to. But e.g. I think that is also a "later stage". If you want to come up with novel optimizations, for example, it's better to start with a working but simple compiler, so it can focus on a single improvement. Trying to innovate on every aspect of a compiler from scratch is an easy way of getting yourself into a quagmire that it takes ages to get out of as a human as well.
E.g. the Claude compiler uses SSA because that is what it was directed to use, and that's fine. Following up by getting it to implement a set of the conventional optimizations, and then asking it to research novel alternatives to SSA that allows restarting the existing optimizations and additional optimisations and showing it can get better results or simpler code, for example, would be a really interesting test that might be possible to judge objectively enough (e.g. code complexity metrics vs. benchmarked performance), though validating correctness of the produced code gets a bit thorny (but the same approach of compiling major existing projects that have good test suite is a good start).
If I had unlimited tokens, this is a project I'd love to do. As it is, I need to prioritise my projects, as I can hit the most expensive Claude plan's subscription limits every week with any of 5+ projects of mine...
Yeah I think he had a pretty sane take in that article:
>CCC shows that AI systems can internalize the textbook knowledge of a field and apply it coherently at scale. AI can now reliably operate within established engineering practice. This is a genuine milestone that removes much of the drudgery of repetition and allows engineers to start closer to the state of the art.
And also
> The most effective engineers will not compete with AI at producing code, but will learn to collaborate with it, by using AI to explore ideas faster, iterate more broadly, and focus human effort on direction and design. Lower barriers to implementation do not reduce the importance of engineers; instead, they elevate the importance of vision, judgment, and taste. When creation becomes easier, deciding what is worth creating becomes the harder problem. AI accelerates execution, but meaning, direction, and responsibility remain fundamentally human.
> allows engineers to start closer to the state of the art
This reminds me of the Slate Star Codex story "Ars Longa, Vita Brevis"[1], where it took almost an entire lifespan just to learn what the earlier alchemists had found, so only the last few hours of an alchemist's life were actually valuable. Now we can all skip ahead.
Another perspective: AI is fast turning [0.1x to 0.5x] low-cost X-world software engineers into >1x engineers.
Unlike in the pre-AI era, one of my close relatives has become a very good "understand / write the requirement" guy. HN may be dominated by >1x engineers, but another revolution is happening at the lower/bulk end of the spectrum as well.
AI makes it possible for someone who has never written code to generate a program that does what they want. One of my friends wanted to simulate a 7,9 against a dealer 10 upcard in the card game blackjack. GPT was able to write the simulation for him in javascript/html. So it took a 0.001x coder and turned him into a 0.2x coder.
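For anyone curious what such a generated program boils down to, here is a minimal sketch of that blackjack scenario in Python rather than javascript/html (names and the infinite-deck simplification are mine, not from the original program): the player stands on 7+9 = 16 against a dealer 10 upcard, so they win only when the dealer busts.

```python
import random

# Infinite-deck model: ranks 2-9 and ace (11), with ten-value cards weighted 4x.
CARDS = [2, 3, 4, 5, 6, 7, 8, 9, 11] + [10] * 4


def dealer_total(upcard):
    """Dealer draws to 17+, counting aces as 11 and downgrading them on bust."""
    hand = [upcard, random.choice(CARDS)]
    while True:
        total = sum(hand)
        aces = hand.count(11)
        while total > 21 and aces:  # soft hand goes hard
            total -= 10
            aces -= 1
        if total >= 17:  # stands on 17+, or returns a bust total > 21
            return total
        hand.append(random.choice(CARDS))


def simulate_stand_16(trials=100_000):
    """Win rate when standing on hard 16 vs a dealer 10: wins only on dealer bust."""
    wins = sum(1 for _ in range(trials) if dealer_total(10) > 21)
    return wins / trials


print(f"stand-on-16 win rate vs dealer 10: {simulate_stand_16():.3f}")
```

Nothing here is innovative, which is exactly the point: it's a mechanical Monte Carlo loop, and turning a plain-English question into this kind of program is where the "0.001x to 0.2x" jump happens.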
I think Lattner was too generous and missed a couple of crucial points in the CCC experiment. He wrote:
> CCC shows that AI systems can internalize the textbook knowledge of a field and apply it coherently at scale.
Except that's not what happened. There was neither (just) textbook knowledge nor a "coherent application at scale":
1. The agents relied on thousands of human written tests embodying many person-years of "preparation effort", not to mention a complete spec. Furthermore, their models were also trained not only on the spec (and on the tests) but also on a reference implementation and the agents were given access to the reference implementation as a test oracle. None of that is found in a textbook.
2. Despite the extraordinary effort required to help the agents in this case - something that isn't available for most software - the models ultimately failed to write a workable C compiler, and couldn't converge. They reached a point where any bug fix caused another bug and that's when the people running the agents stopped the experiment.
The main issue wasn't that there was nothing innovative in the code, but that even after imbibing textbooks and relying on an impractical amount of preparation effort and help, the agents couldn't write a workable C compiler (which isn't some humongous task to begin with).
There is no critical thought, you can't prod an LLM to do such a thing. Even CoT is just the LLM producing text that looks like it could be a likely response based on what it generated before.
Sometimes that text looks like critical thought, but it does not at all reflect the logical method or means the AI used to generate it. It's just riffing.
Sure but there's somebody somewhere who had a relevant critical thought and the LLM can find it and adapt it to your case. That's good enough much of the time.
That's impressive, but it isn't thought, any more than neurons in a dish that learn to play Tetris have thoughts, or than if you spent eons painstakingly calculating what the TPU did with the model to come up with the same output tokens, but via pen and paper instead.
When the TPU does it, is the TPU thinking? Where does the critical thinking take place in the endless pages of matrix math that eventually evaluates into the same token output as the TPU?
With the way modern development often goes, this essentially means using spicy autocomplete for code is just a fast track to the cargo-culted solutions of whatever day the model was trained on.
1. The experiment was to show that AI can generate working code for a fairly complicated spec. Was it even asked to do things in a novel way? If not, why would we expect it do anything other than follow tried and tested approaches?
2. Compilers have been studied for decades, so it's reasonable to presume humans have already found the optimal architectures and designs. Should we complain that the AI "did nothing novel" or celebrate because it "followed best practices"?
I'm actually curious, are there radically different compiler designs that people have hypothesized but not yet built for whatever reasons? Maybe somebody should repeat the experiment explicitly prompting AI agents to try novel designs out, would be fascinating to see the results.
"We", yes, but my point is that most people who write compilers do nothing but implement known techniques. If you then judge human ability to innovate by investigating a single compiler for innovation, odds are you would get the entirely wrong idea of what we are capable of.
>AI tends to accept conventional wisdom. Because of this, it struggles with genuine critical thinking and cannot independently advance the state of the art.
All AI works on patterns; it's not very different from playing chess. Chess engines use a similar method: learn patterns, then use them.
While it's true that the training data is what creates the patterns, so you don't get any new "pattern" that isn't already in the data,
the interesting thing is that when a pattern is applied to the external world, you get some effect.
When the pattern then works on this effect, it creates some other effect.
This is also how you came into existence, through genetic recombination.
Even though your ancestral DNA is being copied forward, the data is lossy and the effect of the environment can be profound. You probably don't look very different from your grandparents, but your grandchildren may look very different from your grandparents.
At some point you are so many orders removed from the "original" pattern that it's indistinguishable from a "new thing".
In simple terms: combinatorial explosion + environment interaction.
"The AI made a compiler, but it wasn't that novel, so AI is not novel" is a very poor rhetorical foundation
Man - just think about what you said
Two years ago that would have been beyond shocking.
If 'AI is making compilers' - then that's 'beyond disruptive'.
It's very true that AI has 'reversion to the mean' characteristics - kind of like everything in life ..
... but it's just unfair to imply that 'AI can't be creative'.
The AI is already very 'creative' (call it 'synthetic creativity' or whatever you want) - but sufficiently 'creative' to do new things, and, it's getting better at that.
It's more than plausible that for a given project 'creativity' was not the goal.
AI will help new language designers try and iterate over new ideas, very quickly, and that alone will be disruptive.
"The AI made a compiler" is an argument for the disruptive power of AI, not against it.
The LLM didn’t make a compiler. It generated code that could plausibly implement one. Humans made the compilers it was trained on. It took many such examples and examples of other compilers and thousands of books and articles and blog posts to train the model. It took years of tweaking, fitting, aligning and other tricks to make the model respond to queries with better, more plausible output. It never made, invented, or reasoned about compilers. It’s an algorithm and system running on a bunch of computers.
The C compiler Anthropic got excited about was not a “working” compiler in the sense that you could replace GCC with it and compile the Linux kernel for all of the target platforms it supports. Their definition of, “works,” was that it passed some very basic tests.
Same with SQLite translation from C to Rust. Gaping, poorly specified English prose is insufficient. Even with a human in the loop iterating on it. The Rust version is orders of magnitude slower and uses tons more memory. It’s not a drop in Rust-native replacement for SQLite. It’s something else if you want to try that.
What mechanism in these systems is responsible for guessing the requirements and constraints missing in the prompts? If we improve that mechanism will we get it to generate a slightly more plausible C compiler or will it tell us that our specifications are insufficient and that we should learn more about compilers first?
I’m sure it’s possible that there are cases where these tools can be useful. I’m not sure this is it though. AGI is purely hypothetical. We don’t simulate a black hole inside a computer and expect gravity to come out of it. We don’t simulate the weather systems on Earth and expect hurricanes to manifest from the computer. Whatever bar the people selling AI systems have for AGI is a moving goalpost, a gimmick, a dream of potential to keep us hooked on what they’re selling right now.
It’s unfortunate that the author nearly hits on why but just misses it. The quotes they chose to use nail it. The blog post they reference nearly gets it too. But they both end up giving AI too much credit.
Generating a whole React application is probably a breath of fresh air. I don’t doubt anyone would enjoy that and marvel at it. Writing React code is very tedious. There’s just no reason to believe that it is anything more than it is or that we will see anything more than incremental and small improvements from here. If we see any more at all. It’s possible we’re near the limits of what we can do with LLMs.
Slightly off-topic perhaps, but it makes me wonder: if it's so tedious, how did it catch on in the first place...
i feel like llms are abstracting away that tedium sometimes yes, but i feel its probably because the languages and frameworks we use aren't hitting the right abstractions and are too low level for what we are trying to do... idk just a thought
I wouldn’t call what LLMs are doing an abstraction. They generate code. You just don’t have to write it. It can feel like it’s hiding details behind a new, precise semantic layer… but you’ll find out once the project gets to a certain size that is not the case: the details absolutely matter and you’ll be untangling a large knot of code (or prompting the AI to fix it for the seventh time).
It’s a good thought and I tend to think that this is the way I would feel more productive: better languages that give us the ability to write better abstractions. Abstractions should provide us with new semantic layers that lose no precision and encapsulate lots of detail.
They shouldn’t require us to follow patterns in our code and religiously generate boilerplate and configuration. That’s indirection and slop. It’s wasted code, wasted effort, and is why I find frameworks like React to be… not pleasant to use. I would rather generate the code that adds a button. It should be a single expression but for many reasons, in React, it isn’t.
Humans learned from prior art, and most of their inventions are modifications of prior art. You are, after all, mostly a biological machine.
The point is - there are so many combinations and permutations of reality, that AI can easily create synthetically novel outcomes by exploring those options.
It's just wrong to suggest that 'it was all in some textbook'.
"There’s just no reason to believe that it is anything more than it is"
It's almost ridiculous at face value, given that millions of people are using it for more than 'helping to write react apps every day'.
It's far more likely that you've come to this conclusion because you're simply not using the tools creatively, or trying to elicit 'synthetic creativity' out of the AI, because it's frankly not that hard, and the kinds of work that it does goes well beyond 'automation'.
This is not an argument, it's the lived experience of large swaths of individuals.
> Chris Lattner, inventor of the Swift programming language recently took a look at a compiler entirely written by Claude AI. Lattner found nothing innovative in the code generated by AI [1]. And this is why humans will be needed to advance the state of the art.
Lots of people have ideas for programming languages; some of those ideas may be original-but many of those people lack the time/skills/motivation to actually implement their ideas. If AI makes it easier to get from idea to implementation, then even if all the original ideas still come from humans, we still may stand to make much faster progress in the field than we have previously.
So the problem with Chris’ take is “This one for fun project didn’t produce anything particularly interesting.”
So, outside of the fact that we have magic now that can just produce "conventional" compilers: take it to a Moore's Law situation. Start 1,000 "create a compiler" projects, each with a temperature to try new things, experiment, mutate. Collate, find new findings, reiterate: another 1,000 runs seeded with some of the novel findings. Assume this is effectively free to do.
The stance that this - which can be done (albeit badly) today and will get better and/or cheaper - won’t produce new directions for software engineering seems entirely naive.
Moore's law states that the number of transistors in an integrated circuit doubles about every two years. It has nothing to say about the capabilities of statistical models.
In fact, in statistics we have another law which states that the more you increase parameters, the more you risk overfitting. And overfitting already seems to be a major problem with state-of-the-art LLMs. When you start overfitting, you are pretty much just re-creating stuff that is already in the dataset.
In their example it doesn't matter whether the models get better or not. It matters whether inference gets cheap enough that we can afford to throw huge amounts of tokens at exploring the problem space.
Further model improvements would be a bonus, but it's not required for us to get much further.
> Modern LLMs showed that overfitting disappears if you add more and more parameters.
I have not seen that. In fact this is the first time I hear this claim, and frankly it sounds ludicrous. I don’t know how modern LLMs are dealing with overfitting, but I would guess there is simply a content-matching algorithm after the inference, and if there is a copyright match the program does something to alter or block the generation. That is, I suspect the overfitting prevention is algorithmic and not part of the model.
>Chris Lattner, inventor of the Swift programming language recently took a look at a compiler entirely written by Claude AI. Lattner found nothing innovative in the code generated by AI [1]. And this is why humans will be needed to advance the state of the art.
"Needed to advance the state of the art" and actually deployed to do so are two different things. More likely either AI will learn to advance the state of the art itself, or the state of the art won't be advancing much anymore...
>AI systems are trained on vast bodies of human work and generate answers near the center of existing thought. A human might occasionally step back and question conventional wisdom, but AI systems do not do this on their own. They align with consensus rather than challenge it. As a result, they cannot independently push knowledge forward.
But AI companies keep telling us AGI is 6 months into the future.
LLMs helping with code that is average to above average might be an improvement overall across most projects. I have also found that some things LLMs suggest that are new to me can feel innovative, but in areas where I have experience I often have a different or more effective starting point, rather than iterating toward it while trying to contain complexity.
I think the fact that AI can make a working compiler is crazy, especially compared to what most of us thought was possible in this space 4 years ago.
Lately, there have been a few examples of AI tackling what have traditionally been thought of as "hard" problems -- writing browsers and writing compilers. To Chris Lattner's point, these problems are only hard if you're doing it from scratch or doing something novel. But they're not particularly hard if you're just rewriting a reference implementation.
Writing a clean room implementation of a browser or a compiler is really hard. Writing a new compiler or browser referencing existing implementations, but doing something novel is also really hard.
But writing a new version of gcc or webkit by rephrasing their code isn't hard, it's just tedious. I'm sure many humans with zero compiler or browser programming experience could do it, but most people don't bother because what's the point?
Now we have LLMs that can act as reference implementation launderers, and do it for the cost of tokens, so why not?
Reinforcement Learning changes this though - remember Move 37?
The issue is you need verifiable rewards for that (and a good environment set-up), and it's hard to get rewards that cover everything humans want (security, simplicity, performance, readability, etc.)
> Chris Lattner, inventor of the Swift programming language recently took a look at a compiler entirely written by Claude AI. Lattner found nothing innovative in the code generated by AI [1].
Well, of course. Despite people applying the label of AI to them, LLMs don't have a shred of intelligence. That is inherent to how they work. They don't understand, only synthesize from the data they were trained on.
99% of humans in a particular specialization, sure. It's the 1% who become experts in that specialization who are able to advance the state of the art. But it's a different 1% for every area of expertise! Add it all up and you get a lot more than 1% of humans contributing to the sum of knowledge.
And of course, if you don't limit yourself to "advancing the state of the art at the far frontiers of human knowledge" but allow for ordinary people to make everyday contributions in their daily lives, you get even more. Sure, much of this knowledge may not be widespread (it may be locked up within private institutions) but its impact can still be felt throughout the economy.
If 1% of the people in each specialization are advancers, and you add up all the specializations together, then 1% of the total number of people are advancers.
Even this assumes that everyone has a specialization in which 1% of people contribute to the sum of human knowledge. I would probably challenge that. There are a lot of people in the world who do not do knowledge-oriented work at all.
You don’t need to do knowledge work to advance the state of the art. You could be working in a shoe factory and discover a better way to tie your shoes.
Your math assumes each person has exactly one thing they do in life. The shoe factory worker could also be a gardener. He might not make any advancements in gardening, but his contribution means that if you add up all the fields of specialization the sum is greater than the population of humans. Take 1% of that sum and it’s greater than 1% of humans. 1% of people in a specialization is not the same as 1% of specialists. In fact, I would say it’s a much higher proportion of specialists making contributions (especially through collaboration).
Oh, and don’t get caught up on the 1% number. I used it as shorthand for whatever small number it is. Maybe it’s only 10 people in some hyper-specialized field. But that doesn’t matter. Some other field may have thousands of contributors. You don’t have to be a specialist in a field to make a contribution to that field, for example: glassmakers advanced the science of astronomy by making the telescope possible.
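The counting argument above can be made concrete with a toy simulation (all numbers are invented for illustration): give each person one to three fields, call the top 1% of each field's practitioners its "advancers", and the union of advancers comes out larger than 1% of the population precisely because people belong to more than one field.

```python
import random

random.seed(0)
POPULATION = 50_000
FIELDS = 40

# Each person practices one to three fields; nobody has exactly one thing in life.
members = {f: [] for f in range(FIELDS)}
for person in range(POPULATION):
    for field in random.sample(range(FIELDS), k=random.randint(1, 3)):
        members[field].append(person)

# Take the top 1% of each field's practitioners as that field's advancers.
advancers = set()
for people in members.values():
    advancers.update(random.sample(people, k=max(1, len(people) // 100)))

share = len(advancers) / POPULATION
print(f"{share:.2%} of the population advances at least one field")
assert share > 0.01  # strictly more than "1% of humans"
```

With these toy numbers the share lands around 2%; the exact figure is meaningless, but the direction of the inequality is the point of the comment above.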
>99% of humans in a particular specialization, sure. It's the 1% who become experts in that specialization who are able to advance the state of the art
How? By also "synthesizing the data they were trained on" (their experience, education, memories, etc.).
Can we be sure? Maybe it's just very rare for experience, education and memories to line up in exactly the way that allows synthesizing something innovative. So it requires a few billion candidates and maybe a couple of generations too.
I want to point back to my remark about everyday people.
if you don't limit yourself to "advancing the state of the art at the far frontiers of human knowledge" but allow for ordinary people to make everyday contributions in their daily lives, you get even more
This isn't a throwaway comment. I do this all the time myself, at work. Everywhere I've worked, I do this. I challenge the assumptions and try to make things better. It's not a rare thing at all, it's just not revolutionary.
Revolutions are rare. Perhaps only a handful of them have ever happened in any one particular field. But you simply will not ever go from Aristotelian physics to Newtonian physics to General Relativity by merely "synthesizing the data they were trained on", as the previous comment supposed.
Edit: I should also say something about experimentation. You can't do it from an armchair, which is all an LLM has access to (at present). Real people learn things all the time by conducting experiments in the world and observing the results, without necessarily working as formal scientists. Babies learn a lot by experimenting, for example. This is one particular avenue of new knowledge which is entirely separate from experience, education, memories, etc. because an experiment always has the potential to contradict all of that.
Experimentation leads to experience, so I feel like this was included by the parent comment. And in the case of writing software, agents are able to experiment today. They run tests, check log output, search DBs... Sure, they can't have apples fall on their heads like Newton had but they can totally observe the apple falling on someones head in a video.
Of course it does, but only after the fact. You don't have any experience of the result of the experiment before you perform it.
Sure, they can't have apples fall on their heads like Newton had but they can totally observe the apple falling on someones head in a video
I have strong doubts that LLMs have any understanding whatsoever of what's happening in images (let alone videos). The claim (I've sometimes heard) that they possess a world model and are able to interpret an image according to that model is an extremely strong one, that's strongly contradicted by the fact that they: a) continue to hallucinate in pretty glaring ways, and b) continue to mis-identify doctored (adversarial) images that no human would mis-identify (because they don't drastically alter the subject).
In software, they can and do perform experiments (make a change then observe the log output). I don't think they possess a "world model" or that it's worth spending too much thought on... My reasoning is more along the lines that our brains are also just [very advanced] inference machines. We also hallucinate and mis-identify images (there are image/video classification tasks where humans have lower scores).
For me the most glaring difference to how humans work is the lack of online learning. If that prevents them from being able to innovate, I'm not so sure.
Software is not the world. It’s a tiny bit of what humans do.
The lack of online learning is a critical fault. Much of what humans learn (such as anything based on mathematics) has a dependency tree of stuff to learn. But even mundane stuff involves a lot of dependent learning. For example, ask an LLM to write a cookbook and it can synthesize from recipes that are already out there but good luck having it invent new cooking techniques that require experimentation or invention (new heat source, new cooking utensils, etc).
I guess we'll just have to wait and see how things turn out. Currently it seems we have examples of where it seems like the technology allows some amount of innovation (AlphaGo, software, math proofs) and examples where they seem surprisingly stupid (recipes?).
Well, humans do experiments for one, as I explained elsewhere in the discussion. Experiments give us access to new knowledge from the world itself, which is not merely a synthesis of what we already know.
Real progress in science is made by the hard collection and cataloguing of data every single day, not by armchair philosophizing.
How can you ever say that about humans? The human brain is not trained once on all the data before you start using it; it is constantly training and rewiring in real time while being used. That is quite a dramatic difference from how LLM transformers work. Humans can form new abstractions from sparse experience, which is the true conceptual reasoning LLMs struggle with.
100% (or close to it) of material AI trains on was human generated, but that doesn't mean 100% of humans are generating useful material for AI training.
Yes, and the natural extension is that a lot of what people do day to day is not driven by intelligence; it is just reusing a known solution to a presented problem in a bespoke manner. However, this is something that AI excels at.
>Despite people applying the label of AI to them, LLMs don't have a shred of intelligence. That is inherent to how they work. They don't understand, only synthesize from the data they were trained on
People also "synthesize from the data they were trained on". Intelligence is a result of that. So this dead-end argument then turns into begging the question: LLMs don't have intelligence because LLMs can't have intelligence.
You could say the same thing about Chris Lattner. How did he advance the state of the art with Swift? It’s essentially just a subjective rearranging of deck chairs: “I like this but not that.” Someone had to explain to Lattner why it was a good idea to support tail recursion in LLVM, for example - something he would have already known if he had been trained differently. He regurgitates his training just like most of us do.
That might read like an insult to Lattner, but what I’m really pointing out is that we tend to hold AIs to a much higher standard than we do humans, because the real goal of such commentary is to attempt to dismiss a perceived competitive threat.
The point is that saying the LLM failed to do what the overwhelming majority of devs can't do isn't exactly damning.
It's like Stephen King saying an AI-generated novel isn't as good as his. Fine, but most of us have much lesser ambitions than topping the work of the most successful people in the field.
> Lattner found nothing innovative in the code generated by AI [1].
In theory, we are just one good innovation away from changing this. In reality it's probably still some years away, but we are not yet in a situation where we have to seriously speculate about this possibility.
> And this is why humans will be needed to advance the state of the art.
But we only need a minority for innovations, progress and control. The bulk of IT is boring repetitive slop, lacking any innovation and just following patterns. The endgame will still result in probably 99% of humans being useless for the machinery. And this is not really new. In any industry, the majority of workers are just average, without any real influence on their industry's progress, and just following conventional wisdom to make some bucks for surviving the next day.
Humans have the advantage of millions of years of training baked into their genes. There is nothing magical about being human. Once algorithms have the ability to collect data from the real world (robotics), to do experiments in the real world, and to mimic nature, all these advantages will fall away.
The rate of change is accelerating. I worry we don't have much time left unless we get serious about merging with machines.
The innovation isn't the output but the provenance.
We don't necessarily need a Chris Lattner to make a compiler now.
End of the day Chris Lattner is a single individual, not a magic being. A single individual posting submarine ads for his cleverness in the knowledge work subfield of language compilers. Of course he is going to drag the competition.
Languages are abstractions over memory addresses that provide something friendlier for human consumption. It's a field that's decades old and repeats itself constantly, since it revolves around the same old thing: developing a compression technique to deduplicate the language's more verbose syntax and transpile it to machine code.
Building a compiler is itself just programming. None of this is truly novel nor has it been since the 60s-70s. All that's changing is the user interface; the syntax.
Intelligence gives rise to our language capacity. The languages themselves are merely visual art that fits the preferences of the language creator. They arbitrarily decided nesting the dolls their way makes the most sense.
Currently have agents iterating on "prompt to binary". Reversing a headless Debian system into a model and optimizing to output tailored images. Opcodes, power use in system all tucked into a model to spit back out just the functions needed to achieve the electromagnetic geometry desired[1]
So someone who is a proven expert in his field, who writes a detailed, well-reasoned, balanced assessment of the state of compiler development and the role LLMs play in this, is according to you, “A single individual posting submarine ads for his cleverness in the knowledge work subfield of language compilers. Of course he is going to drag the competition.”?
Chris Lattner has forgotten more about language and compiler design than most of us will know in a lifetime. If you’re going to mis-characterize him you need to bring more to the table than some reductionist pseudo intelligent babbling.
[1] https://www.modular.com/blog/the-claude-c-compiler-what-it-r...