
Then you'll get code that passes the tests you generate, where "tests" includes whatever you feed the fuzzer to detect problems. (Just crashes? Timeouts? Comparison with a gold standard?)
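For concreteness, a minimal sketch of that loop in Python; target_sort stands in for the generated code, reference_sort for the gold standard, and the timeout trick is Unix-only:

    import random
    import signal

    def reference_sort(xs):
        # The "gold standard": slow but trusted.
        return sorted(xs)

    def target_sort(xs):
        # Stand-in for the code under test (e.g. the generated code).
        return sorted(xs)

    def _timeout(signum, frame):
        raise TimeoutError("target hung")

    signal.signal(signal.SIGALRM, _timeout)  # Unix-only

    def fuzz_once():
        xs = [random.randint(-1000, 1000)
              for _ in range(random.randint(0, 100))]
        signal.alarm(1)                  # detect timeouts
        try:
            got = target_sort(list(xs))  # crashes surface as exceptions
        finally:
            signal.alarm(0)
        # Detect wrong answers by comparison with the gold standard.
        assert got == reference_sort(xs), (xs, got)

    if __name__ == "__main__":
        for _ in range(10_000):
            fuzz_once()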

Sorry, I'm failing to see your point.

Are you implying that the above is good enough, for a useful definition of good enough? I'm not disagreeing, and in fact that was my starting assumption in the message you're replying to.

Crap code can pass tests. Slow code can pass tests. Weird code can pass tests. Sometimes it's fine for code to be crap, slow, and/or weird. If that's your situation, then go ahead and use the code.
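A contrived illustration of "slow but passing": the dedupe below is quadratic, yet every test it ships with is green:

    def dedupe(xs):
        out = []
        for x in xs:
            if x not in out:   # linear scan per element: quadratic overall
                out.append(x)
        return out

    # The entire test suite it passes:
    assert dedupe([]) == []
    assert dedupe([1, 1, 2]) == [1, 2]
    assert dedupe(list(range(5))) == list(range(5))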

To expand on why someone might not want such code, think of your overall codebase as having a time budget, a complexity budget, a debuggability budget, an incoherence budget, and a maintenance budget. Yes, those overlap a bunch. A pile of AI-written code has a higher chance of exceeding some of those budgets than a human-written codebase would. Yes, there will be counterexamples. But humans will at least attempt to optimize for such things. AIs mostly won't. The AI-and-AI-using-human system will optimize for making it through your lint-fuzz-test cycle successfully and little else.

Different constraints, different outputs. Only you can decide whether the difference matters to you.



> Then you'll get code that passes the tests you generate

Just recently, I think here on HN, there was a discussion about how neural networks optimize toward whatever goal they are given. In this case that means exactly what you wrote: the code will do things in wrong ways just to pass the given tests.
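The degenerate version of that looks something like this contrived sketch (names made up): the implementation memorizes the test inputs instead of implementing the function:

    def is_prime(n):
        # "Passes" by memorizing exactly the cases the tests probe.
        return n in {2, 3, 5, 7, 11, 13}

    # The whole test suite:
    assert is_prime(7)
    assert is_prime(13)
    assert not is_prime(8)
    assert not is_prime(9)
    # is_prime(17) -> False: wrong, but no test ever asks.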

Where do the tests come from? Initially from a specification of what "that thing" is supposed to do, and also not supposed to do. Everyone who has had to deal with specifications in a serious way knows how insanely difficult they are to get right: there are things left unsaid, corner cases not covered, and so on. So the problem of correctness is just shifted, and as for the assumption that this may require less time than actually coding ... I wouldn't bet on it.
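A classic toy example of "things unsaid": specify sorting only as "the output must be non-decreasing", and this cheat satisfies the spec perfectly:

    def sorted_by_spec(xs):
        # Satisfies the stated property perfectly...
        return []          # ...by discarding the input.

    def check_spec(xs):
        out = sorted_by_spec(xs)
        # The spec as written: output must be non-decreasing.
        assert all(a <= b for a, b in zip(out, out[1:]))
        # Unsaid: output must be a permutation of the input.
        # assert out == sorted(xs)   # only this catches the cheat

    check_spec([3, 1, 2])  # passes; the spec was incomplete, not the code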

Conceptually the idea should work, though.


What if you thought of your codebase as something similar to human DNA, the LLM as nature, and the entire process as a sort of evolution? The fitness function would be no panics, no exceptions, and low latency, instead of some random KPI or OKR, or who likes working with whom, or who made whom laugh. A toy sketch of what I mean is below.
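Here llm_mutate is hypothetical and just returns its input; a real version would ask the model for a code variant:

    import time

    def fitness(candidate, inputs):
        # Fitness = no exceptions/panics first, then low latency.
        start = time.perf_counter()
        try:
            for x in inputs:
                candidate(x)
        except Exception:
            return float("-inf")              # any crash is maximally unfit
        return -(time.perf_counter() - start)  # faster = fitter

    def llm_mutate(candidate):
        # Hypothetical: ask the LLM ("nature") for a variant of this code.
        # Stand-in here: return the candidate unchanged.
        return candidate

    def evolve(seed, inputs, generations=100, brood=8):
        best = seed
        for _ in range(generations):
            pool = [llm_mutate(best) for _ in range(brood)] + [best]
            best = max(pool, key=lambda c: fitness(c, inputs))
        return best

    best = evolve(lambda x: x * x, inputs=list(range(1000)))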

It's what our lord and savior Jesus Christ uses for us humans; if it's good enough for him, it's good enough for me. And surely Google is not laying off 25k people because it believes humans are better than its LLMs :)



