I'll respond to this comment to provide a general response for all of the sub-comments here.
As I highlighted in my post, LLMs are generally still not in a position to replace a developer for more complex tasks and refactoring. We're in the early days of the technology, but we have seen extremely strong improvements in it over the last year. We on the team have QA'd thousands of results for public and private repositories. The private ones are particularly interesting because the LLMs do not have that code in their corpus, and we have still seen very strong fix results.
Most people assume we're just wrapping an LLM, but a lot has to happen under the hood to ensure that fixes are secure and correct. Here are the standards we're setting for fix quality:
- The fix needs to be best-practice and complete. A partial security fix isn't a security fix. This is something we're constantly working on.
- Supporting the widest coverage of CWEs.
- Not introducing any breaking changes in the rest of the code.
- Understanding the language, the framework being used, and any specific packages. For example, fixing a CSRF issue in Django is different from fixing one in Flask. Both are Python frameworks, but they approach it differently.
- Reusing existing packages correctly to improve security and, if a fix does need to add a package, adding it in a standard way.
- Placing imports in the correct part of the file.
- Not using deprecated or risky packages.
- Avoiding LLM hallucinations.
- Ensuring syntax and formatting are correct.
- Following the coding and naming conventions in the file being fixed.
- Making sure fixes are consistent within the same issue type.
- Explaining the fix properly and clearly so that someone can understand it.
- Avoiding assumptions that could cause problems.
- Not removing any code that is not part of the issue.
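To make two of these standards concrete (correct syntax, and not removing code outside the issue), here is a minimal Python sketch. It is an illustration under stated assumptions, not a description of the actual pipeline: the function names `syntax_ok` and `lines_removed_outside`, and the idea that the scanner reports a flagged line range, are hypothetical.

```python
import ast
import difflib


def syntax_ok(source: str) -> bool:
    """Return True if the candidate fix parses as valid Python."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


def lines_removed_outside(original: str, fixed: str, start: int, end: int) -> list[int]:
    """Return 1-based line numbers of the original file that the fix deleted
    or rewrote outside the flagged range [start, end] (an assumed input)."""
    a = original.splitlines()
    b = fixed.splitlines()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    touched = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        # Pure insertions (e.g. a new import) are allowed; only deletions and
        # replacements of original lines are counted.
        if tag in ("delete", "replace"):
            touched.extend(n for n in range(i1 + 1, i2 + 1) if not start <= n <= end)
    return touched
```

A fix would pass these two checks when `syntax_ok(fixed)` is true and `lines_removed_outside(original, fixed, start, end)` returns an empty list. Checks like these only cover the mechanical standards; whether a fix is actually complete and secure still needs review.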
Our goal is to get to 90-95% accuracy in fixes this year, and we're on a trajectory to do that. I will be the first to say 100% accuracy is impossible; our goal is to get it right more often than engineers would.
We take fix quality and transparency extremely seriously. We'll be publishing a whitepaper showing the accuracy of our results because it's the right thing to do. I hope this helps.
If you don't know that - or rather, if nobody on your team recognized this issue and brought it up - you should not be selling and shipping this product.