The amount of fear I see from people around C is so surprising to me. Not saying you fit into this, but so far my anecdata suggests that it's largely people who don't know (or know very little) C. Especially among CS grads whose main exposure to C was in the stack smashing exercise in a security class, all they know about C is the part that was intentionally made vulnerable so it would be easy to exploit. Unless you're being wildly negligent and reckless with your programming, C is really not that scary.
What percentage of non-C applications did you review? I suspect C has enough footguns to be an issue, but its popularity, especially in low-level software (kernels, codecs, firmware), ensures that it'll show up in security issues regardless of how bad the language itself is.
I reviewed mostly C, then reviewed mostly Golang. But I also reviewed codebases in Erlang, Perl, Java, C++, Rust, Python, JavaScript, etc.
Mind you, I was mostly reviewing cryptography-related applications, but most C applications contained bugs that had nothing to do with the logic (lots of memory corruption bugs) while most Golang applications contained logic bugs. Or at least I would find logic bugs because Golang was both rid of most memory corruption bugs, and also an extremely readable language (so easier for me to understand the code and find logic bugs). Although Golang still had nil dereference bugs (happens a lot when people use protobuf), because it doesn't have sum types. Today I think a great language would be a mix between a readable language like Go (with good defaults, tooling, stdlib) and a safe language like Rust.
There's a difference in GC languages though. Statically typed languages like Golang will always be more secure than dynamically typed languages like Python.
Interesting, I had the same job in the mid to late 00s, although I wasn't a consultant so my sample was the company's codebases (of which there were a lot because we built a lot of embedded systems on top of VxWorks that did a lot of network communications, sometimes in very niche protocols), not necessarily the codebases of companies that are worried enough that they hire a consultant. That was right around the time when compilers and security tools were becoming available that could flag nearly every possible problem. At that point false positives became a big challenge.
What years were you a consultant reviewing C applications?
I'm guessing you were using tools like Coverity? I actually never used such tools. I mostly did manual reviews and sometimes implemented fuzzers with AFL. But most of the code I looked at was crypto code. Did that at Matasano/NCC Group from 2015-2019.
It's been 15 years so I don't remember the names of the tools, but Coverity rings a bell. There was one that we used to make fun of a lot because it was written in Java, but it was by far the best at finding stuff. It would even show you the AST to help point out problems. I'm suddenly feeling really nostalgic about GUIs written in Swing and SWT :-D
Definitely. Though Linux code has probably been tested more thoroughly, and run through more static analysis, than any other C code base. That does help me sleep a little better at night.
And yet, the most significant part of the C code issues we find is memory corruption. Which is either significantly harder to cause or impossible-by-design in many alternatives. Unless you can realistically say "people working daily, for years, on huge C projects write those bugs, but they're reckless and I'm better than them" - yes, C should be scary to you these days.
We can't even agree on safe string functions for C, half a century later. You shouldn't have security bugs baked into the standard library and you shouldn't have to do a mountain of research to know which functions are safe and in which cases.
However, for most things non-string, non-pointer, and non-array, I agree with you.
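To make the string-function complaint concrete, here's a minimal sketch of one classic trap: strncpy does not write a terminator when the source is at least as long as the buffer. The helper names copy_string and strncpy_left_terminator are made up for illustration (they are not standard functions), and wrapping snprintf is just one common workaround, not an agreed-upon answer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the standard library): copy src into dst
   and always NUL-terminate. Returns 0 on success, -1 if src did not fit. */
int copy_string(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    /* snprintf writes at most dst_size bytes, terminator included, and
       returns the length the full string would have needed. */
    int needed = snprintf(dst, dst_size, "%s", src);
    if (needed < 0 || (size_t)needed >= dst_size)
        return -1; /* truncated (or encoding error); dst is still terminated on truncation */
    return 0;
}

/* Illustrates the strncpy pitfall: returns 1 if the buffer ended up with a
   terminator somewhere inside it, 0 if it did not. */
int strncpy_left_terminator(void)
{
    char buf[4];
    memset(buf, 'X', sizeof buf);
    /* Source is longer than buf, so strncpy fills all 4 bytes and stops
       without writing '\0'. */
    strncpy(buf, "abcdef", sizeof buf);
    return memchr(buf, '\0', sizeof buf) != NULL;
}
```

Calling copy_string on a 4-byte buffer with "abcdef" reports the truncation and leaves a terminated "abc" behind, while the strncpy version leaves a buffer that any later strlen or printf would read past.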
It's easy to say "You shouldn't have security bugs baked into the standard library" but it's a lot harder to say, "we're breaking decades worth of working code by removing some functions that have been part of the standard lib and were widely used."
We don't even have a string library in C. Strings are Unicode, not just zero-terminated buffers. You cannot find strings, nor compare them. In the kernel you have filesystems and login systems using unidentifiable names. Because the kernel has no identifier support.
And for insecure standards, the committees would rather eliminate the safe functions than fix the spec bugs or add u8 support.
I've been programming in C for 20 years, and I think it's scary. Not because I don't know it, but because I know exactly how many cases of seemingly normal code can hide UB. I know from experience that even the best programmers following best practices will make mistakes (or run into someone else's). C is an extraordinary amplifier of bug severity. I know how much diligence, effort, and tooling it takes to merely not screw things up in C.
I've seen time after time people saying "nah, C is fine, you just avoid this and that, use these tools, etc." and this turning out to be insufficient. I've heard many times "maybe you're just a bad programmer and can't handle C, but I'm a good programmer and have no problems" and their code not surviving 5 minutes of fuzzing. I've seen people conclude that multi-threading beyond simplest constructs is just infeasible to get right, and think that's an inherent property of threading, and not fragility of C.
Yeah, I find it astonishing to find C programmers that see no problem, even though I think it would be reasonable to say that they see no reason to change. It's mind boggling.
Have they not worked on large projects with other developers? Have they not seen the myriad ways things can silently go wrong for seemingly no technical benefit? (Although I know there are often less obvious reasons, e.g. UB, performance, platform specificity, etc.)
All that said, I do think the following might be reasonable:
> "nah, C is fine, you just avoid this and that, use these tools, etc."
I guess you mean they write off the risks entirely? You should never be this "handwavy", and should always take the risk seriously, especially with a language like C. However, I think it's fair to say, that C is a good choice, many of the risks can be mitigated, and it's not THAT big a deal. In which case, the above doesn't seem that absurd.
Following basic common sense and making an effort to identify and eliminate some of the sketchier situations, backed by some really good integration testing, can really help. I feel reasonably safe under those circumstances (I mean not really, but other languages can be "unsafe" too). A huge chunk of the really evil things I've seen have been the result of taking absurd risks, and/or disregarding the rules entirely. If you were paying attention and "trying" to write good C code, they would never happen; these aren't just individual developer things, but project-wide. E.g. I had a compiler that didn't even warn for implicit function declarations... jerk move by TI, but that should be flagged and dealt with. Instead, the team just thought "great, no compilation errors".
I will say there's a lot of developers that are just uninterested in any of this and will deliver some really, really sketchy C code. In their mind they're smart programmers and their code will just be right, and they don't seem to understand any of these issues. Just plow ahead and patch around the bugs, then move on to the next gig.
Skilled C programmers make these mistakes day in, day out. We write too much code to trust a language that just lets it happen. Especially one that encourages writing the same code multiple times rather than reusing it.
I don't know about that. I agree about it being a popular idea with people that don't understand C. Plenty do and still harbor those opinions. I've been mortified by what I've seen in C, and it seems really preventable.
Lots of developers out there are wildly negligent. C provides them with enough rope to hang themselves. To be honest, I'm a little surprised to find veteran C developers that AREN'T scared. I guess they just see every disaster as the fault of "negligence and recklessness". If you're not scared of your own code (a mistake), you should certainly be scared of other people's.