Reinforcement Learning changes this though - remember Move 37?
The issue is you need verifiable rewards for that (and a good environment set-up), and it's hard to get rewards that cover everything humans want (security, simplicity, performance, readability, etc.)
The issue is you need verifiable rewards for that (and a good environment set-up), and it's hard to get rewards that cover everything humans want (security, simplicity, performance, readability, etc.)