I wrote one! It works well with cutting-edge LLMs. You feed it one or more source files that contain natural language, or stdin, and it produces a design spec, a README, and a test suite. Then it writes C code, compiles with cosmocc (for portability) and tests, in a loop, until everything is passing. All in one binary. It's been a great personal tool and I plan to open source it soon.
A programming language implementation produces results that are controllable, reproducible, and well-defined. An LLM has none of those properties, which makes the comparison moot.
Having an LLM make up underspecified details willy-nilly, or worse, ignore clear instructions, is very different from programming languages "handling a lot of low-level stuff."
You can set temperature to 0 in many LLMs and get deterministic results (on the same hardware, given floating-point shenanigans). You can provide a well-defined spec and test suite. You can constrain and control the output.
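A minimal sketch of why temperature 0 gives repeatable picks, assuming a toy four-token vocabulary with made-up scores. `sample_token` and the numbers are hypothetical, not any particular library's API:

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Pick a token index from raw scores (hypothetical helper)."""
    if temperature == 0:
        # Greedy decoding: always take the highest-scoring token,
        # so the random generator is never consulted.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Softmax with temperature, then a weighted random draw.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

logits = [1.2, 3.4, 0.5, 3.1]  # made-up scores for a 4-token vocabulary

# Same input, 100 different seeds: temperature 0 ignores the seed entirely.
greedy = {sample_token(logits, 0, random.Random(seed)) for seed in range(100)}
sampled = {sample_token(logits, 1.0, random.Random(seed)) for seed in range(100)}
print(greedy, sampled)
```

With temperature 0 the set collapses to a single index regardless of seed; with temperature 1.0 the choice depends on the seed.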
LLMs produce deterministic results? Now, that's a big [citation needed]. Where can I find the specs?
Edit: This is assuming by "deterministic," you mean the same thing I said about programming language implementations being "controllable, reproducible, and well-defined." If you mean it produces random but same results for the same inputs, then you haven't made any meaningful points.
I'd recommend learning how transformers work, and the concept of temperature. I don't think I need to cite information that is broadly and readily available, but here:
I also qualified the requirement of needing the same hardware, due to FP shenanigans. I could further clarify that you need the same stack (PyTorch, TensorFlow, etc.).
You claimed they weren't deterministic; I have shown that they can be. I'm not sure what your point is.
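For anyone wondering what the FP shenanigans look like in practice, here is a small sketch. It assumes the hardware dependence comes from summation order (for example, parallel reductions grouping terms differently), which is one common cause rather than the only one:

```python
# Floating-point addition is not associative: grouping changes the result.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # one evaluation order
right = a + (b + c)  # same numbers, different grouping

print(left == right)  # False
print(left, right)
```

Both results are valid roundings, but they differ in the last bit, and a tiny difference in a score can flip which token ranks highest.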
And it is incorrect to base your analysis of future transformer performance on current transformer performance. There is a lot of ongoing research in this area and we have seen continual progress.
> This is assuming by "deterministic," you mean the same thing I said about programming language implementations being "controllable, reproducible, and well-defined." If you mean it produces random but same results for the same inputs, then you haven't made any meaningful points.
"Determinism" is a word that you brought up in response to my comment, which I charitably interpreted to mean the same thing I was originally talking about.
Also, it's 100% correct to analyze things based on their fundamental properties. It's absurd to criticize people for assuming 2 + 2 = 4 because "continual progress" might make it 5 in the future.
What are these fundamental properties you speak of? 8 years ago this was all a pipe dream. Are you claiming to know what the next 8 years of transformer development will look like?
That LLMs are by definition models of human speech and have no cognitive capabilities. There is no sound logic behind what LLMs spit out, and it will stay that way because they merely mimic their training data. No amount of vague future transformers will transform away how the underlying technology works.
But let's say we have something more than an LLM; that still wouldn't make natural languages a good replacement for programming languages. This is because natural languages are, as the article mentions, imprecise. It just isn't a good tool. And no, transformers can't change how languages work. It can only "recontextualize," or as some people might call it, "hallucinate."
Citation needed. Modern transformers are much, much more than just speech models. Precisely define "cognitive capabilities", and provide proof as to why neural models cannot ever mimic these cognitive capabilities.
> But let's say we have something more than an LLM
We do. Modern multi-modal transformers.
> This is because natural languages are, as the article mentions, imprecise
Two different programmers can take a well-enough defined spec and produce two separate code bases that may (but not must) differ in implementation, while still having the exact same interfaces and testable behavior.
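To make that concrete, here is a sketch with two deliberately different implementations checked against one shared test suite. The function names and test cases are made up for illustration:

```python
def sort_a(items):
    """Insertion sort: build the output one element at a time."""
    out = []
    for x in items:
        i = len(out)
        while i > 0 and out[i - 1] > x:
            i -= 1
        out.insert(i, x)
    return out

def sort_b(items):
    """Merge sort: split, sort each half, merge."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left, right = sort_b(items[:mid]), sort_b(items[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]

def passes_spec(sort_fn):
    """The shared 'spec': same interface, same observable behavior."""
    cases = [[], [1], [3, 1, 2], [2, 2, 1], [5, 4, 3, 2, 1]]
    return all(sort_fn(case) == sorted(case) for case in cases)

print(passes_spec(sort_a), passes_spec(sort_b))
```

The internals share almost nothing, yet both satisfy the same tests, which is all the spec asks for.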
> And no, transformers can't change how languages work. It can only "recontextualize," or as some people might call it, "hallucinate."
You don't understand recontextualization if you think it means hallucination. Or vice versa. Hallucination is about returning incorrect or false data. Recontextualization is akin to decompression, and can be lossy or "effectively" lossless (within a probabilistic framework; again, the interfaces and behavior just need to match).
The burden of proof is on the one making extraordinary claims. There has been no indication from any credible source that LLMs are able to think for themselves. Human brains are still a mystery. I don't know why you can so confidently claim that neural models can mimic what humanity knows so little about.
> Two different programmers can take a well-enough defined spec and produce two separate code bases that may (but not must) differ in implementation, while still having the exact same interfaces and testable behavior.
Imagine doing that without a rigid and concise way of expressing your intentions. Or trying again and again in vain to get the LLM to produce the software that you want. Or debugging it. Software development will become chaotic and a lot less fun in that hypothetical future.
The burden of proof is not on the person telling you that a citation is needed when claiming that something is impossible. Vague phrases mean nothing. You need to prove that there are these fundamental limitations, and you have not done that. I have been careful to express that this is all theoretical and possible; you, on the other hand, are claiming it is impossible, a much stronger claim that deserves a strong argument.
> I don't know why you can so confidently claim that neural models can mimic what humanity knows so little about.
I'm simply not ruling it out. But you're confidently claiming that it's flat out never going to happen. Do you see the difference?
You can't just make extraordinary claims [1][2], demand rigorous citation from those who question them, even going as far as to word-lawyer the definition of cognition [3], and reverse the burden of proof, all the while providing no evidence beyond what essentially boils down to "anything and everything is possible."
> Vague phrases mean nothing.
Yep, you made my point.
> Do you see the difference?
Yes, I clearly state my reasons. I can confidently claim that LLMs are no replacements for programming languages for two reasons.
1. Programming languages are superior to natural languages for software development. Nothing on earth, not even transformers, can make up for the unavoidable lack of specificity in the hypothetical natural language programs without making things up because that's how logic works.
2. LLMs, as impressive as they may be, are fundamentally computerized parrots, so you can't understand or control how they generate code, unlike with compilers like GCC, which provide all that through source code.
This is just stating the obvious here, no surprises.
Your error is in assuming (or at least not disproving) that natural language cannot fully capture the precision of a programming language. But we already see in real life how higher-level languages, while sometimes making you give up control of underlying mechanisms, allow you to still create the same programs you'd create with other languages, barring any specific technical feature. What is different here though is that natural language actually allows you to reduce and increase precision as needed, anywhere you want, offering both high and low level descriptions of a program.
You aren't stating the obvious. You're making unbacked claims based on your intuition of what transformers are. And even offering up the tired "stochastic parrot" claim. If you can't back up your claims, I don't know what else to tell you. You can't flip it around and ask me to prove the negative.
If labeling claims as "tired" makes them false, not a single fact in the world can be considered as backed by evidence. I'm not flipping anything around either, because again, it's squarely on you to provide proof for your claims and not on those who question them. You're essentially making the claim that transformers can reverse a non-reversible function. That's like saying you can reverse a hash although multiple inputs can result in the same hash. That's not even "unbacked claims" territory, it defies logic.
I'm still not convinced LLMs are mere abstractions in the same way programming language implementations are. Even though programmers might give up some control of the implementation details when writing code, language implementors still decide all those details. With LLMs, no one does. That's not an abstraction, that's chaos.
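The hash analogy can be sketched with a toy function. `tiny_hash` is made up for illustration and is not a real hashing algorithm; it only shows that a many-to-one mapping cannot be inverted to a unique input:

```python
from collections import defaultdict

def tiny_hash(text):
    """Toy 4-bit hash: sum of the byte values modulo 16 (illustrative only)."""
    return sum(text.encode("utf-8")) % 16

# Group inputs by their hash value to expose collisions.
preimages = defaultdict(list)
for word in ["ab", "ba", "cat", "act", "dog", "god"]:
    preimages[tiny_hash(word)].append(word)

for digest, words in sorted(preimages.items()):
    print(digest, words)
```

Anagrams always collide here because the sum ignores byte order, so given only the hash value there is no way to tell which of the colliding inputs produced it.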
I have been careful to use language like "theoretically" throughout my posts, and to focus on leaving doors open until we know for sure they are closed. You are claiming they're already closed, without evidence. This is a big difference in how we are engaging with this subject. I'm sure we would find we agree on a number of things but I don't think we're going to move the needle on this discussion much more. I'm fine with just amicably ending it here if you'd like.
The index contains a file name that you can append to the CommonCrawl url to download the archive and view.
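A rough sketch of building that download URL. The base URL, the `filename`/`offset`/`length` field names, and the idea of fetching just one record with an HTTP Range header are my assumptions about the index format, and the record values below are placeholders, so check them against the current CommonCrawl docs:

```python
BASE_URL = "https://data.commoncrawl.org/"  # assumed download prefix

def archive_request(record):
    """Return (url, headers) for one index record (hypothetical helper)."""
    # Append the file name from the index to the base URL.
    url = BASE_URL + record["filename"].lstrip("/")
    # Assumed: offset and length locate a single record inside the archive.
    start = int(record["offset"])
    end = start + int(record["length"]) - 1  # Range ends are inclusive
    return url, {"Range": f"bytes={start}-{end}"}

# Placeholder record, not a real index entry.
record = {
    "filename": "crawl-data/EXAMPLE/example.warc.gz",
    "offset": "100",
    "length": "50",
}
url, headers = archive_request(record)
print(url, headers)
```

Dropping the Range header would download the whole archive file instead of one record.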
More detailed information on downloading archives here:
normal arrays in bash are implemented as linked lists. bash stores a pointer to the last accessed element, which turns the most common case of iteration into O(1), but the performance is terrible if you need to jump around
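a toy model of that access pattern, counting pointer hops. this is a sketch of the idea only, not bash's actual source; `CachedList` and its fields are made-up names:

```python
class Node:
    def __init__(self, index, value):
        self.index, self.value, self.next = index, value, None

class CachedList:
    """Singly linked list that remembers the last accessed node."""

    def __init__(self, values):
        self.head = self.last = None
        self.steps = 0  # pointer hops taken so far
        prev = None
        for i, v in enumerate(values):
            node = Node(i, v)
            if prev is None:
                self.head = node
            else:
                prev.next = node
            prev = node

    def get(self, index):
        # Resume from the cached node when it is at or before the target,
        # otherwise start over from the head.
        cached = self.last
        node = cached if cached is not None and cached.index <= index else self.head
        while node is not None and node.index < index:
            node = node.next
            self.steps += 1
        if node is None or node.index != index:
            raise IndexError(index)
        self.last = node
        return node.value

forward = CachedList(range(1000))
for i in range(1000):
    forward.get(i)

backward = CachedList(range(1000))
for i in reversed(range(1000)):
    backward.get(i)

print(forward.steps, backward.steps)
```

in this model forward iteration costs one hop per element, while walking backwards restarts from the head on every access.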
there are also associative arrays which are bucketed hash tables, which are fine for string keys but imho they are hardly ever worth it as a replacement for indexed arrays
[[ is a keyword, and exec is a builtin. With the {name}< syntax, exec is opening a file descriptor and assigning its numerical value to $name, and {name}>&- closes it