
When I use LLMs, that's what it does: spawns commands, edits files, runs tests, evaluates outputs, and iterates on solutions under my guidance.



The key here is "under your guidance". LLMs are a major productivity boost for many kinds of jobs, but can LLM-based agents be trusted to act fully autonomously on tasks with real-world consequences? I think the answer is still no, and will be for a long time. I wouldn't trust an LLM to even order my groceries without review, let alone push code into production.

To reach anything close to a definition of AGI, LLM agents should be able to independently talk to customers, iteratively develop requirements, produce and test solutions, and push them to production once customers are happy. After that, they should be able to fix any issues arising in production. All of this reliably, without babysitting, review, or guidance from human devs.




