Hacker News new | past | comments | ask | show | jobs | submit | izabera's comments login

>sufficiently descriptive natural language specification https://www.commitstrip.com/en/2016/08/25/a-very-comprehensi...

sounds like it would pair well with a suitably smart compiler

I wrote one! It works well with cutting-edge LLMs. You feed it one or more source files that contain natural language, or stdin, and it produces a design spec, a README, and a test suite. Then it writes C code, compiles with cosmocc (for portability) and tests, in a loop, until everything is passing. All in one binary. It's been a great personal tool and I plan to open source it soon.

No, the key difference is that an engineer becomes more product-oriented, and the technicalities of the implementation are deprioritized.

It is a different paradigm, in the same way that a high-level language like JavaScript handles a lot of low-level stuff for me.


A programming language implementation produces results that are controllable, reproducible, and well-defined. An LLM has none of those properties, which makes the comparison moot.

Having an LLM make up underspecified details willy-nilly, or worse, ignore clear instructions is very different from programming languages "handling a lot of low-level stuff."


[citation needed]

You can set temperature to 0 in many LLMs and get deterministic results (on the same hardware, given floating-point shenanigans). You can provide a well-defined spec and test suite. You can constrain and control the output.


LLMs produce deterministic results? Now, that's a big [citation needed]. Where can I find the specs?

Edit: This is assuming by "deterministic," you mean the same thing I said about programming language implementations being "controllable, reproducible, and well-defined." If you mean it produces random but same results for the same inputs, then you haven't made any meaningful points.


I'd recommend learning how transformers work, and the concept of temperature. I don't think I need to cite information that is broadly and readily available, but here:

https://medium.com/google-cloud/is-a-zero-temperature-determ...

I also qualified the requirement of needing the same hardware, due to FP shenanigans. I could further clarify that you need the same stack (pytorch, tensorflow, etc)


This gcc script that I created below is just as "deterministic" as an LLM. It produces the same result every time. Doesn't make it useful though.

    echo '#!/usr/bin/env bash' > gcc
    echo 'cat <<EOF' >> gcc
    openssl rand -base64 100 >> gcc
    echo 'EOF' >> gcc
    chmod +x gcc
Also, how transformers work is not a spec of the LLM that anyone can use to learn how LLM produces code. It's no gcc source code.

You claimed they weren't deterministic, I have shown that they can be. I'm not sure what your point is.

And it is incorrect to base your analysis of future transformer performance on current transformer performance. There is a lot of ongoing research in this area and we have seen continual progress.


I reiterate:

> This is assuming by "deterministic," you mean the same thing I said about programming language implementations being "controllable, reproducible, and well-defined." If you mean it produces random but same results for the same inputs, then you haven't made any meaningful points.

"Determinism" is a word that you brought up in response to my comment, which I charitably interpreted to mean the same thing I was originally talking about.

Also, it's 100% correct to analyze things based on its fundamental properties. It's absurd to criticize people for assuming 2 + 2 = 4 because "continual progress" might make it 5 in the future.


What are these fundamental properties you speak of? 8 years ago this was all a pipe dream. Are you claiming to know what the next 8 years of transformer development will look like?

That LLMs are by definition models of human speech and have no cognitive capabilities. There is no sound logic behind what LLMs spit out, and will stay that way because it merely mimics its training data. No amount of vague future transformers will transform away how the underlying technology works.

But let's say we have something more than an LLM, that still wouldn't make natural languages a good replacement for programming languages. This is because natural languages are, as the article mentions, imprecise. It just isn't a good tool. And no, transformers can't change how languages work. It can only "recontextualize," or as some people might call it, "hallucinate."


Citation needed. Modern transformers are much, much more than just speech models. Precisely define "cognitive capabilities", and provide proof as to why neural models cannot ever mimic these cognitive capabilities.

> But let's say we have something more than an LLM

We do. Modern multi-modal transformers.

> This is because natural languages are, as the article mentions, imprecise

Two different programmers can take a well-enough defined spec and produce two separate code bases that may (but not must) differ in implementation, while still having the exact same interfaces and testable behavior.

> And no, transformers can't change how languages work. It can only "recontextualize," or as some people might call it, "hallucinate."

You don't understand recontextualization if you think it means hallucination. Or vice versa. Hallucination is about returning incorrect or false data. Recontextualization is akin to decompression, and can be lossy or "effectively" lossless (within a probabilistic framework; again, the interfaces and behavior just need to match)


The burden of proof is on the one making extraordinary claims. There has been no indication from any credible source that LLMs are able to think for itself. Human brains are still a mystery. I don't know why you can so confidently claim that neural models can mimic what humanity knows so little about.

> Two different programmers can take a well-enough defined spec and produce two separate code bases that may (but not must) differ in implementation, while still having the exact same interfaces and testable behavior.

Imagine doing that without a rigid and concise way of expressing your intentions. Or trying again and again in vain to get the LLM produce the software that you want. Or debugging it. Software development will become chaotic and lot less fun in that hypothetical future.


The burden of proof is not on the person telling you that a citation is needed when claiming that something is impossible. Vague phrases mean nothing. You need to prove that there are these fundamental limitations, and you have not done that. I have been careful to express that this is all theoretical and possible, you on the other hand are claiming it is impossible; a much stronger claim, which deserves a strong argument.

> I don't know why you can so confidently claim that neural models can mimic what humanity knows so little about.

I'm simply not ruling it out. But you're confidently claiming that it's flat out never going to happen. Do you see the difference?


You can't just make extraordinary claims [1][2], demand rigorous citation for those who question it, even going as far as to word lawyer the definition of cognition [3], and reverse the burden of proof. All the while providing no evidence beyond what essentially boils down to "anything and everything is possible."

> Vague phrases mean nothing.

Yep, you made my point.

> Do you see the difference?

Yes, I clearly state my reasons. I can confidently claim that LLMs are no replacements for programming languages for two reasons.

1. Programming languages are superior to natural languages for software development. Nothing on earth, not even transformers, can make up for the unavoidable lack of specificity in the hypothetical natural language programs without making things up because that's how logic works.

2. LLMs, as impressive as they may be, are fundamentally computerized parrots so you can't understand or control how they generate code unlike with compilers like GCC which provides all that through source code.

This is just stating the obvious here, no surprises.

[1]: https://news.ycombinator.com/item?id=43567653

[2]: https://news.ycombinator.com/item?id=43568699

[3]: https://news.ycombinator.com/item?id=43585498


Your error is in assuming (or at least not disproving) that natural language cannot fully capture the precision of a programming language. But we already see in real life how higher-level languages, while sometimes making you give up control of underlying mechanisms, allow you to still create the same programs you'd create with other languages, barring any specific technical feature. What is different here though is that natural language actually allows you to reduce and increase precision as needed, anywhere you want, offering both high and low level descriptions of a program.

You aren't stating the obvious. You're making unbacked claims based on your intuition of what transformers are. And even offering up the tired "stochastic parrot" claim. If you can't back up your claims, I don't know what else to tell you. You can't flip it around and ask me to prove the negative.


If labeling claims as "tired" makes it false, not a single fact in the world can be considered as backed by evidence. I'm not flipping anything around either, because again, it's squarely on you to provide proof for your claims and not those who question it. You're essentially making the claim that transformers can reverse a non-reversible function. That's like saying you can reverse a hash although multiple inputs can result in the same hash. That's not even "unbacked claims" territory, it defies logic.

I'm still not convinced LLMs are mere abstractions in the same way programming language implementations are. Even though programmers might give up some control of the implementation details when writing code, language implementors still decides all those details. With LLMs, no one does. That's not an abstraction, that's chaos.


I have been careful to use language like "theoretically" throughout my posts, and to focus on leaving doors open until we know for sure they are closed. You are claiming they're already closed, without evidence. This is a big difference in how we are engaging with this subject. I'm sure we would find we agree on a number of things but I don't think we're going to move the needle on this discussion much more. I'm fine with just amicably ending it here if you'd like.

slow?


Just for context, there is a new post about OpenAI DDoS'ing half the internet every other day on hn

https://news.ycombinator.com/item?id=42660377

https://news.ycombinator.com/item?id=42549624


Just for context, the author of the second link in your comment verifiably lied about blocking crawlers via robots.txt

CommonCrawl archives robots.txt

For convenience, you can view the extracted data here:

https://pastebin.com/VSHMTThJ

You are welcome to verify for yourself by searching for “wiki.diasporafoundation.org/robots.txt” in the CommonCrawl index here:

https://index.commoncrawl.org/

The index contains a file name that you can append to the CommonCrawl url to download the archive and view. More detailed information on downloading archives here:

https://commoncrawl.org/get-started

From September to December, the robots.txt at wiki.diasporafoundation.org contained this, and only this:

>User-agent: * >Disallow: /w/


If you ask OpenAI to stop, using robots.txt, they actually will.

What Aaron was trying to achieve was great, how he want about it is what ruined his life.


It is a well known fact that OpenAI stole content by scraping sites with illegally uploaded content on it.


Nobody really asked Aaron about anything they collected more evidence and wanted to put him to jail.

School should have unplugged his machine bring him for questioning and tell him not to do that.


try `bash game.bash 2>errors` and see the content of that file when the game exits


normal arrays in bash are implemented as linked lists. bash stores a pointer to the last accessed element, which turns the most common case of iteration into O(1), but the performance is terrible if you need to jump around

see some basic benchmarks here https://gist.github.com/izabera/16a46ed79c2248349a1fb8384468...

there are also associative arrays which are bucketed hash tables, which are fine for string keys but imho they are hardly ever worth it as a replacement for indexed arrays


Thanks a lot, that's very interesting re. it using linked lists, and nice benchmark!


That's really cool! The explanation on their blog is great!


Thanks for the hint! That's exactly what I was looking for as I have never looked into raycasting but having an interest in it.

Direct Links:

- https://nthorn.com/articles/batch_raycaster/ (batch variant)

- https://lodev.org/cgtutor/raycasting.html (general intro)


yeah i really liked that tutorial


just type it all in your current live shell manually :)


you can put it in a function, functions don't fork


[[ is a keyword, and exec is a builtin. With the {name}< syntax, exec is opening a file descriptor and assigning it's numerical value to $name, and {name}>&- closes it


He's my fiancé :)


I can still ask myself if I feel I need a different license! ;-)


But that's hardly the most optimal approach in this scenario :-)


Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: