
Feels like we're about a year away from local LLMs that can debug code reliably (by being hooked into console error output as well), which will be quite the exciting day.



Have you tried Code Llama? How do you know it can't do it already?

In my applications, GPT-4 connected to a VM or SQL engine can and does debug code when given error messages. "Reliably" is very subjective. The main problem I have seen is that it can be stubborn about trying to use outdated APIs and it's not easy to give it a search result with the correct API. But with a good web search and up to date APIs, it can do it.
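
For the VM case, the loop is roughly: run the code, capture stderr, hand the traceback back to the model, apply its patch, and retry. A minimal sketch with the OpenAI Python client (the prompt wording and the blind overwrite-the-file step are simplifications for illustration, not my actual setup):

    # Sketch: run a script, feed the traceback to GPT-4, let it rewrite the file, retry.
    import subprocess, sys
    from openai import OpenAI

    client = OpenAI()  # needs OPENAI_API_KEY in the environment

    def run_script(path):
        # Execute the target script and capture its stderr.
        proc = subprocess.run([sys.executable, path], capture_output=True, text=True)
        return proc.returncode, proc.stderr

    def debug_once(path):
        returncode, err = run_script(path)
        if returncode == 0:
            return True
        source = open(path).read()
        prompt = ("This script fails with the error below. "
                  "Reply with only the corrected file.\n\n"
                  f"--- script ---\n{source}\n\n--- error ---\n{err}")
        resp = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": prompt}],
        )
        open(path, "w").write(resp.choices[0].message.content)
        return False

    # Crude retry loop; a real setup validates the patch instead of overwriting blindly.
    for _ in range(3):
        if debug_once("script.py"):
            break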

I'm interested to see general coding benchmarks for Code Llama versus GPT-4.


> But with a good web search and up to date APIs, it can do it.

How do you do that?


What does "GPT-4 connected to a VM or SQL engine" mean?


https://aidev.codes shows GPT-4 connected to a VM.


Have you tried giving it up-to-date APIs as context?
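
I mean something as simple as pasting the relevant doc excerpt into the system message so it stops reaching for the deprecated API. Roughly (the doc file and the task here are made up):

    # Sketch: inject a current API reference as context before asking for code.
    from openai import OpenAI

    client = OpenAI()
    docs = open("current_api_excerpt.md").read()  # hypothetical excerpt you curated

    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "Use only the API described below.\n\n" + docs},
            {"role": "user", "content": "Rewrite my loader to use the current API."},
        ],
    )
    print(resp.choices[0].message.content)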


That sounds like an interesting finetuning dataset.

Imagine a database of "Here is the console error, here is the fix in the code".

Maybe one could scrape git issues with console output and tagged commits.
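
Something like this against the GitHub API, where the search query, the closed-by-commit heuristic, and the pairing logic are guesses about how you'd filter rather than a tested pipeline:

    # Sketch: mine (console error, fix diff) pairs from issues closed by commits.
    import requests

    API = "https://api.github.com"
    HEADERS = {"Accept": "application/vnd.github+json"}  # add an auth token for real use

    def error_fix_pairs(repo, per_page=20):
        # 1. Closed issues in this repo whose body contains a Python traceback.
        q = f'repo:{repo} is:issue is:closed "Traceback (most recent call last)"'
        issues = requests.get(f"{API}/search/issues",
                              params={"q": q, "per_page": per_page},
                              headers=HEADERS).json().get("items", [])
        for issue in issues:
            # 2. Issue events carry a commit_id when a commit closed/referenced the issue.
            events = requests.get(f"{API}/repos/{repo}/issues/{issue['number']}/events",
                                  headers=HEADERS).json()
            shas = [e["commit_id"] for e in events if e.get("commit_id")]
            for sha in shas:
                # 3. The commit's patch is the candidate "fix" for the reported error.
                commit = requests.get(f"{API}/repos/{repo}/commits/{sha}",
                                      headers=HEADERS).json()
                diff = "\n".join(f.get("patch", "") for f in commit.get("files", []))
                yield {"error": issue["body"], "fix": diff}

    # e.g. list(error_fix_pairs("pandas-dev/pandas"))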


I'd be surprised if GPT-4 couldn't already do that with the caveat that piping in so much code might cost you billionaire money at scale.



