Hacker News: fumeux_fume's favorites

Open request to skeptics or curious minds - do you have a task that's at least somewhat easier for me to set up than SWE-bench?

I'd be happy to build you a base agent and a fine-tuned agent, and OSS the traces so you can look at them however you like.

And if it's really compelling, visualize them in a hosted frontend :-)

