
A human could get a valid end state most of the time. Based on the examples posted here, GPT-4 seems to get it wrong more often than it gets it right. So to me it seems like GPT-4 is worse than humans.

GPT-4 with help from a competent human will of course do better than most humans, but that isn't what we are discussing.




> valid end state most of the time

I disagree. Don't assume "most humans" are anything like Silicon Valley startup developers. Most developers out there in the wild would definitely struggle to solve problems like this.

For example, a common criticism of AI-generated code is the risk of introducing vulnerabilities.

I just sat in a meeting for an hour, literally begging several developers to stop writing code vulnerable to SQL injection! They just couldn't understand what I was even talking about. They kept trying to use various ineffective hacky workarounds ("silver bullets") because they just didn't grok the problem.
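
To make the SQL injection point concrete, here is a minimal sketch of the difference between splicing input into the query text and binding it as a parameter. It is not the code from that meeting: the `users` table and the function names are hypothetical, and it assumes Python's standard `sqlite3` module purely for illustration.

```python
import sqlite3


def make_demo_db():
    # Hypothetical in-memory database with a tiny "users" table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn, name):
    # Vulnerable: the input becomes part of the SQL text itself,
    # so a crafted value can change the structure of the query.
    query = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Parameterized: the driver passes the value separately from the SQL,
    # so it is only ever treated as data.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()
```

With an input such as `x' OR '1'='1`, the unsafe version matches every row in the table, while the parameterized version just looks for a user with that odd literal name and finds none.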

I've found that GPT-4 outperforms the median human developer.



