
The impression I get from using all the cutting-edge AI tools:

1. Sonnet 3.7 is at least a mid-level web developer.

2. DeepResearch is about as good an analyst as an MBA from a school ranked 50th or lower nationally, and no better than that. EY, not McKinsey.

3. Grok 3/GPT-4.5 are good enough as $0.05/word article writers.

It's not replacing the A-players, but it's good enough to replace B-players and definitely better than C- and D-players.

I'd expect a mid-level developer to show more understanding and better reasoning. So far it looks like a junior dev who has read a lot of books and is good at copy-pasting from Stack Overflow.

(Based on my everyday experience with Sonnet and Cursor.)


A mid-level web developer should do a whole lot more than just respond to chat messages and do exactly what they're told, nothing more.


When I use LLMs, that's what they do: spawn commands, edit files, run tests, evaluate outputs, and iterate on solutions under my guidance.
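Roughly, the loop looks like this (a minimal sketch in Python, not any particular tool's internals; llm_propose_patch is a hypothetical stand-in for the model call, everything else is standard library):

    import subprocess

    def iterate_on_task(task, max_rounds=5):
        feedback = ""
        for _ in range(max_rounds):
            # llm_propose_patch is hypothetical: ask the model for a diff,
            # passing the previous round's test failures as feedback
            patch = llm_propose_patch(task, feedback)
            # apply the proposed diff to the working tree
            subprocess.run(["git", "apply", "-"], input=patch, text=True, check=True)
            # run the test suite and capture its output
            result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
            if result.returncode == 0:
                return "tests pass; hand off to a human for review"
            feedback = result.stdout + result.stderr
        return "still failing; escalate to the human"

Note that a human sits at both exits of that loop, which is exactly the "under my guidance" part.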


The key here is "under your guidance". LLMs are a major productivity boost for many kinds of jobs, but can LLM-based agents be trusted to act fully autonomously on tasks with real-world consequences? I think the answer is still no, and will be for a long time. I wouldn't trust an LLM to order my groceries without review, let alone push code into production.

To reach anything close to the definition of AGI, LLM agents should be able to independently talk to customers, iteratively develop requirements, produce and test solutions, and push them to production once customers are happy. After that, they should be able to fix any issues that arise in production. All of this without babysitting, review, or guidance from human devs, and reliably.
