Yeah, if you have these tools in place to validate its changes, you can quickly iterate with it to the right results. But think through how it's making UI changes and it quickly becomes obvious why it can make absolutely wrong and terrible guesses about the implementation details: it can't _see_ what it's doing or interact with it; it's just pattern matching against other implementations it's seen.


Yeah, the next breakthrough for Codex or Claude Code would be to actually use and test the app like a real human would during the development process.


Here's a document produced by Claude Code using my Showboat testing tool this morning to help explore SeaweedFS (a local S3 clone) - it includes trying things out with curl and getting screenshots from Chrome using my Rodney tool: https://github.com/simonw/research/blob/main/seaweedfs-testi...



