Hi HN, I’m the founder of Corgea (
https://corgea.com). We help companies fix their vulnerable source code using AI.
Originally, we started with a data security product that would detect data leaks at companies. Despite initial successes and customer acquisitions, we frequently heard that highlighting issues wasn't enough; customers wanted proactive fixes. They had hundreds (yes hundreds!) of security tools alerting them about vulnerabilities, but couldn’t afford a dedicated team to go through them all and fix them. One prospect we spoke to had tens of thousands of reported vulnerabilities in their SAST tool. With the rise of AI code generation, we saw an opportunity to give customers what they really wanted.
Having Corgea is like having a security engineer on staff focused on making your code more secure. We want security to be an enabler of engineering rather than a blocker to it, and the reverse to be true. To accomplish this, we built it on top of existing LLMs to issue code fixes.
To show Corgea’s capabilities, we took some popular vulnerable-by-design applications like Juice Shop (https://github.com/juice-shop/juice-shop), scanned them and issued fixes for their vulnerabilities. You can see some of them here: https://demo.corgea.com. Some examples of vulnerabilities it solves are like SQL injection, Path Traversal and XSS.
What makes this tough is that currently LLMs struggle at generalist coding tasks because it has to understand your whole code base, the domain you’re in, and the user’s request to do something. This can lead to a lot of unintended behavior where it codes things incorrectly because it’s giving a best guess at what you want. Adam, one of the founding engineers on the team coined it well: LLMs don’t reason, they fuzz.
We made several decisions that helped the LLM become more deterministic. First, what we’re doing is extremely domain specific: vulnerable code fixes in a limited number of programming languages. There are roughly 900 security vulnerabilities in code, called CWE’s (https://cwe.mitre.org/), that we’ve built into Corgea. An SQL injection vulnerability in a Javascript app is the same regardless if you’re a payments company or a travel booking website. Second, we have no user generated input going into the LLM, because SAST scanners everything needed to issue a fix. This makes it much more predictable and reproducible for us and customers. We can also create robust QA processes and checks.
To illustrate the point, let’s put some of this to the test using some napkin math. Assume you’re serving 5,000 enterprises that ship on average 300 domain specific features a year in 5 different programming languages that each require 30 lines of code changes across multiple files. You’ll have about 300m permutations the product needs to support. What a nightmare!
Using the same napkin math, Corgea needs to support the ~900 vulnerabilities (CWE’s). Most of them require 1 - 2 line changes. It doesn’t need to understand the whole codebase since the problem is usually isolated to a few lines. We want to support the 5 most popular programming languages. If we have 5,000 customers, we have to support ~4,500 permutations (900 issues x 5 different languages). This leads to a massive difference in accuracy. Obviously, this is an oversimplification of the whole thing but it illustrates the point.
What makes this different from Copilot and other code-gen tools is that they do not specialize in security and we’ve seen them inadvertently introduce security issues unbeknownst to the engineer. Additionally, they do not integrate into existing scanning tools that companies are using to resolve those issues. So unless a developer is working on every part of the product, they’re unable to clear security backlogs, which can be in the thousands of tickets.
As for security scanners, the current market is flooded with tools that report and overwhelm security teams and are not effective at fixing what they’re reporting. Most vulnerability scanners do not remediate issues, and if they do they’re mostly limited to upgrading packages from one version to another to reduce a CVSS. If they do offer CWE remediation capabilities their success rates are very low because they’re often based on traditional AI methodologies. Additionally, they do not integrate with each other because they want to only serve their own findings. Enterprises use multiple tools like Snyk, Semgrep, Checkmarx, but also have a penetration testing program, and a bug bounty program. They need a solution that consolidates across their existing tools. They also use Github, Gitlab and Bitbucket for their code repository.
We’re offering a free tier for smaller teams and priced tiers. We believe we can reduce 80% of the engineering effort for security fixes, which would equate to at least $10m a year for enterprises.
We’re really excited to share this with you all and we’d love any thoughts, feedback, and comments!