
Have you seen https://www.swebench.com/ ?

Once you engage agentic behaviour, it can take you way further than chat alone. We're already in the "resolving JIRA tickets" area - it's just hard to set up, not very well known, and may be expensive.



> We're already in the "resolving JIRA tickets" area

For very simple tasks maybe, but not for the kinds of things I get paid to do.

I don't think it will be able to get to the level of reliably doing difficult programming tasks that require understanding and inferring requirements without having AGI, in which case society has other things to worry about than programmers losing their jobs.


Looks like the definition of "resolving a ticket" here is "come up with a patch that ensures all tests pass", which does not necessarily include "add a new test", "make sure the patch is actually doing something meaningful", "communicate how this is fixed". Based on my experience and what I saw in the reports in the logs, a solution could be just hallucinating completely useless code -- as long as it doesn't fail a test.

Of course, it is still impressive, and definitely would help with the small bugs that require small fixes, especially for open source projects that have thousands of open issues. But is it going to make a big difference? Probably not yet.

Also, good luck doing that on our poorly written, poorly documented and under-tested codebase. By any standard, Django is a much better codebase than the one I work on every day.
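
To make the point about the pass criterion concrete, here is a toy sketch in Python. It is not SWE-bench's actual harness; every name in it (`useless_patch_abs`, `weak_suite`, `all_tests_pass`, and so on) is made up for illustration. It only shows how an "all tests pass" check, as described above, can accept a patch that does nothing when the suite is under-tested.

```python
from typing import Callable, List

Impl = Callable[[int], int]
Test = Callable[[Impl], bool]


def buggy_abs(x: int) -> int:
    """Hypothetical buggy function: should return |x|."""
    return x  # bug: negative inputs come back unchanged


def useless_patch_abs(x: int) -> int:
    """A 'patch' that adds code but does not fix the bug."""
    _unused = x * 0  # dead code, changes nothing
    return x


def real_fix_abs(x: int) -> int:
    """A patch that actually fixes the bug."""
    return -x if x < 0 else x


# Under-tested suite: never exercises a negative input.
weak_suite: List[Test] = [
    lambda f: f(0) == 0,
    lambda f: f(3) == 3,
]

# Same suite plus one test that covers the reported bug.
strong_suite: List[Test] = weak_suite + [lambda f: f(-3) == 3]


def all_tests_pass(impl: Impl, suite: List[Test]) -> bool:
    """The pass criterion as described in the comment: no test fails."""
    return all(test(impl) for test in suite)


if __name__ == "__main__":
    for name, impl in [("useless", useless_patch_abs), ("real", real_fix_abs)]:
        print(name, all_tests_pass(impl, weak_suite), all_tests_pass(impl, strong_suite))
```

With the weak suite, the useless patch and the real fix are indistinguishable. Only a test that covers the reported behaviour tells them apart, which is why the quality of the test suite matters so much here.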


Some are happy with creating tests as well, but you probably want to mostly write them yourself. I mean, only you know the real world context - if the ticket didn't explain it well enough, LLMs can't do magic.

Actually, poor documentation and poorly written code are not a huge issue in my experience. Being under-tested matters far more if you want to automate that work.



