While I really enjoy Theo's talks and writings, I wonder whether the fact that the VCS is CVS isn't a security issue in itself?
It's been a really long time since I last used CVS, but I remember that attempts to introduce backdoors into projects using Git as a (D)VCS have been caught (it was in the Linux kernel, I think). IIRC some attempts were caught precisely because it's hard to fake SHA hashes, so people can't really "mess" with the history of a DVCS like Git: too many people noticed a critical file that had no business being modified being, well... modified.
Once again, it was quite a while ago, but I'm pretty certain that both the fact that Git was decentralized and the fact that it used cryptographically secure hashes were touted as a "Good Thing" [TM] that helped catch the backdooring attempts.
Git uses SHA-1 hashes, which have not been considered cryptographically secure since 2005.
Git and CVS are both just tools. They each provide a server implementation, but it's uncommon to use either of these for write access in large projects. It's more common to put CVS or Git behind a separate access layer such as HTTPS or SSH. My guess is that the OpenBSD guys use OpenSSH.
This team is fanatical about security and process. I am completely comfortable with them using whichever tools they want.
SHA-1 has been shown to not be collision-resistant. Correct me if you've heard otherwise, but I believe SHA-1 is still believed to be second-preimage resistant.
In other words, with fewer than 2^80 trials, an attacker can generate two files that have the same SHA-1 hash. (In the case of C source files, this probably means embedding a nonsense comment in the middle of each file, probably several hundred to several thousand ASCII characters.) If an attacker can get the very carefully constructed benign file past code review and have the ability to modify the repository, they can substitute the very carefully crafted malicious file for the very carefully crafted benign file without changing the root of the Merkle tree.
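To make the Merkle-tree part of this concrete, here's a minimal Python sketch of how git hashes file contents into a blob object (the helper name and the sample contents are made up for illustration; this is not anyone's attack code). The point is just that if an attacker could ever produce two different contents with the same blob SHA-1, every tree and commit hash above them, up to the Merkle root, would be unchanged by the swap.

    import hashlib

    def git_blob_sha1(content: bytes) -> str:
        # Git hashes "blob <size>\0" followed by the raw contents,
        # not the file bytes alone.
        header = b"blob " + str(len(content)).encode() + b"\0"
        return hashlib.sha1(header + content).hexdigest()

    benign = b"/* carefully reviewed, apparently harmless file */\nint feature_enabled = 1;\n"
    print(git_blob_sha1(benign))

    # If some malicious contents collided with `benign` (same blob SHA-1),
    # swapping them in the object store would leave every tree and commit
    # hash above, i.e. the root of the Merkle tree, exactly the same.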
Assuming that SHA-1 is still second-preimage resistant, it will take an attacker about 2^159 attempts to come up with a file that has the same hash as a legitimate file not carefully constructed by the attacker.
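Just to put those two exponents side by side (this is only the arithmetic already quoted above, not a new result):

    # Effort figures from the comment above, nothing more.
    collision_search = 2 ** 80     # generic birthday-bound collision search on a 160-bit hash
    second_preimage = 2 ** 159     # roughly brute-force search against one specific, fixed file

    print(f"second preimage vs. collision search: 2^{159 - 80} times more work")
    print(f"roughly {second_preimage / collision_search:.1e} times the effort")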
So, the weaknesses in SHA-1 probably mean that exploiting them in the context of git still requires a mole in the development team. Though I wouldn't want to bet my life on reviewers always noticing a big nonsense comment in the middle of a C file, or on nobody figuring out how to construct a reasonably reviewable C file as the carefully crafted benign one.
In any case, even a weakened SHA-1 likely makes it significantly difficult to forge a git history without planting a mole in the dev team. It's much better than having no cryptographic barrier to forgery at all.
There are also other signing techniques: GPG-signed tags are an officially supported method.
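For reference, a minimal sketch of that workflow (the tag name and message are hypothetical, and it assumes git plus a configured GPG signing key; the last command just shows that the signed payload names the commit, i.e. the Merkle root, only by its SHA-1):

    import subprocess

    # Create a GPG-signed (annotated) tag. Hypothetical tag name and message.
    subprocess.run(["git", "tag", "-s", "v0.0-example", "-m", "example signed tag"], check=True)

    # Verify the signature; `git verify-tag v0.0-example` does the same thing.
    subprocess.run(["git", "tag", "-v", "v0.0-example"], check=True)

    # Pretty-print the tag object: it references the tagged commit by SHA-1 only,
    # so the signature is only as strong as the hash it covers.
    subprocess.run(["git", "cat-file", "-p", "v0.0-example"], check=True)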
Probably more importantly though, Git encourages everyone to have the full repository lying around. Even if you inserted a vulnerability into a master repository, there would still be thousands of copies of the code which could be independently compared to find the exact changes that were made.
I think GPG signs just the SHA-1 the tag points to (the root of the Merkle tree). Also, when comparing the local repo against the remote repo during a fetch, I think git assumes that as long as the SHA-1 of a commit did not change, there is no need to compare further. So the substitution will not get propagated to people who do "git pull", but people who do "git clone" will get it.
Linus Torvalds: "the point is the SHA-1, as far as Git is concerned, isn't even a security feature. It's purely a consistency check. The security parts are elsewhere, so a lot of people assume that since Git uses SHA-1 and SHA-1 is used for cryptographically secure stuff, they think that, OK, it's a huge security feature. It has nothing at all to do with security, it's just the best hash you can get."
My reply was not intended as an attack on Git. I use it daily and would choose it 10 times out of 10 vs. CVS for a new project. I just think the assertion that Git 'saved' Linux from some backdooring attempts because it's decentralized and uses cryptographic hashes is wrong; it's not the tools that make this happen, it's the processes around the use of these tools which do that.
I don't know any OpenBSD developers nor do I have any inside knowledge of how their team works, but I know from observation that they are a small team with high standards for code style and quality. They don't just let anyone commit code and appear to be thorough with code review. When procedural/practice problems are identified in the industry, they are proactive about mitigating or fixing those. They have a demonstrated track record of good releases. Basically, I don't see any reason to question their use of CVS.
(first note that I didn't assert anything: I asked question(s) and used "IIRC" etc.)
I found the story again, and things are, IMHO, actually quite interesting... if only because the attempt was made after someone ill-intentioned gained access to Linux's CVS repository.
Back then Linux was still using BitKeeper (decentralized), since Linus hadn't created Git yet (so I was not remembering things correctly here). But apparently some people didn't like BitKeeper, so there was a CVS clone of the BitKeeper tree. And it was in that CVS repo that the attempt took place (after someone hacked his way into the server hosting the CVS repo).
Now, even though Linus didn't choose SHA-1 for its cryptographic properties, and even if SHA-1 is neither SHA-256 nor SHA-3, it still looks like an attacker gaining access to a CVS repo would have a much easier time inserting a backdoor than an attacker gaining access to a DVCS using cryptographic hashes (which user KMag here explained nicely).
> Basically, I don't see any reason to question their use of CVS.
Why not?
With CVS, security relies on the security of a single server. Anybody with root access to the CVS server can modify history, and nobody would notice.
> Git uses SHA-1 hashes, which have not been considered cryptographically secure since 2005.
That's a vast oversimplification. There are still no publicly known preimage or second-preimage attacks against SHA-1. Even the collision attack in 2005 was limited insofar as it reduced the search from a brute-force 2^80 to 2^69.
Perhaps you're referring to a length extension attack, but I admit I don't know much about those in the context of Git's use of SHA-1.
In terms of C source files, unless the first C source file being extended is at least 64 petabytes long, a length extension attack is going to embed null bytes in the C source file. I don't know chapter and verse of any of the C standards or GCC/Clang extensions, but I wouldn't be surprised if even string literals and comments including nulls cause problems for both GCC and Clang.
Anyone care to chime in/experiment with ways to embed nulls in C files such that either GCC or Clang will continue compiling code after hitting a null byte in the middle of a file? (I'm not talking about an escaped null in a character or string literal, but actually a 00 showing up in hexdump -C of the source.)
EDIT: I think it's safe to assume people will notice someone trying to sneak a 64-petabyte C source file into the codebase. With apologies to Sweet Brown, aint nobody got time for [downloading] that.
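For the curious, here's where those NUL bytes come from. The sketch below just reconstructs the standard SHA-1 (Merkle-Damgård) padding that a length-extension attack would have to splice in between the original file and the appended data; the helper is mine, not taken from any of the tools discussed here:

    import struct

    def sha1_padding(msg_len_bytes: int) -> bytes:
        # SHA-1 padding: a 0x80 byte, then zero bytes until the length is
        # 56 mod 64, then the message length in bits as a 64-bit big-endian int.
        pad = b"\x80"
        pad += b"\x00" * ((56 - (msg_len_bytes + 1)) % 64)
        pad += struct.pack(">Q", msg_len_bytes * 8)
        return pad

    # For any realistically sized C file, this padding is mostly NUL bytes,
    # and those bytes would end up embedded in the "extended" source file.
    print(sha1_padding(10_000).hex())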
"I am completely comfortable with them using whichever tools they want."
Forgive me for saying so, and I doubt you intend it this way, but statements like this are what get us into trouble. At a certain point we need to trust the people who build the foundation for us, but only once we've done our due diligence. Most people (maybe not you, since you seem to know more about the team) have not yet done so.
From what I read about this some time ago, it boils down to the following reasons:
- CVS is simple - it's small (fits on a floppy with bsd.rd) and easy to use on exotic/slow architectures
- CVS is preferable for their current contribution workflow (centralized)
- CVS is treated as a tool that, currently, just works (tm)
It seems ever-so-slightly ironic that support for exotic architectures is considered a plus in this case, while support for exotic architectures is being ripped out of the OpenSSL/LibreSSL codebase. (Though probably not deeply ironic, since CVS is comparatively simple.)
It might seem ironic to people who do not understand the difference. OpenSSL includes wrappers and (probably untested since N years ago and therefore broken) code branches to work around bugs in old broken exotic systems that do not get fixed. This sort of code is a maintenance burden if you try to keep it working and just useless if you don't. It is also likely to introduce bugs. Getting rid of it simplifies the code and gets rid of bugs along with it.
OpenBSD's system-level support for exotic platforms has more to do with drivers, compiler & tool chain support, boot code, etcetera. These do not create a jungle of ifdefs and workarounds in userspace applications. And if these platforms are broken, they are fixed instead of worked around with more jungle and weed. So support for these things does not add bugs for others.
On the contrary -- getting your software to run on a SPARC64 is more likely to expose flaws in your code.