Perhaps this is a good opportunity to change that. Actually, I'd expect at least the bigger distributions to use git already. Git enables them to pick and choose from the updates and merge them with their own patches (alternative schedulers and the like).
But you're right, as long as they serve tarballs, they should make sure they are safe. I've rolled some kernels myself and always used the tarballs from kernel.org. Time to practice what I preach and rebuild from git.
All tarballs are signed with the Linux Kernel Archives GPG key, which according to the kernel.org security breach announcement was not compromised. So I don't see a problem here.
The downloaders will have to be aware of the "no warranty" clause of the license Linux uses, and take responsibility for the software they bring into their environment. The kernel developers provide instructions[1] on how to cryptographically verify the authenticity of a downloaded kernel, including building a PGP "trust path" to their key.
So? An attacker's not going to have the private key they were created with, and probably won't have a private key that you have an existing trust relationship to.
The Linux Kernel Archives public key (which is needed to verify the signature) is well known, and many people have had their own copy of it for several years. But if the attacker gained access to the private key, he would be able to sign trojanized tarballs without anybody noticing. On the one hand, the kernel.org page doesn't mention this kind of breach and the key is still in use; on the other hand, there are rumors (https://lwn.net/Articles/457142/) that the private key was available on the compromised server, so the attacker could have produced trojanized tarballs with a proper signature. I find that unlikely, though, because the kernel.org admins haven't warned users about it and haven't changed the key.
this post assumes (if i have understood the argument correctly) SHA-1 is secure. it's isn't - it is well and truly broken. people don't use SHA-1 for security any more (well, they certainly shouldn't). is their any real analysis that shows that git is secure, given the weak nature of SHA-1?
It's not quite that simple. When cryptographic functions are broken, practical vulnerabilities are usually years away. A reduced SHA-1 collision time is certainly a good reason to move away from SHA-1, but it doesn't mean that an attacker can compromise an existing piece of code secured by a SHA-1 hash.
In this case, it is not sufficient merely to find a collision; the attacker has to find a collision that is also a valid Git commit, implements a vulnerability, and, ideally, is not obviously a hack at first glance (e.g. by having kilobytes of garbage data in the commit diff). This is much harder than just finding a random set of bytes that happens to match the commit SHA-1.
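To make the constraint concrete, here is a minimal sketch of how Git derives an object ID: it hashes a short header (object type and byte length) followed by the object body. The function name `git_object_id` is my own, not a Git API; the two printed values should match what `git hash-object` reports for the same inputs.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """Return the SHA-1 object ID Git would assign to this object.

    Git hashes "<type> <length>\\0" followed by the raw body, so a forged
    object must match on type and length as well as on the digest.
    """
    header = f"{obj_type} {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


# The empty blob and a one-line blob, as hashed by Git.
print(git_object_id("blob", b""))
print(git_object_id("blob", b"hello world\n"))
```

Because the length is part of the hashed data, and a commit body has to parse as a commit (tree, parent, author, message), an attacker cannot simply append arbitrary bytes to reach a target hash.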
Now, it could be that an organisation like the NSA is sufficiently ahead of public crypto technology that it is capable of not only finding a SHA-1 collision in feasible time, but is also able to craft a malicious Git commit with an identical SHA-1 to a legitimate commit. Inserting such a commit into the Linux source in an appropriate place might result in compromised kernels appearing in commercial products.
But... I kinda doubt anyone is that far ahead, and in any case, it seems extremely risky to try to play such a high-value advantage in a public repository. If someone happens to spot your faked commit, suddenly everyone knows what you are capable of.
So whilst Git would be more secure using SHA-512 or an equivalent, it's currently very unlikely that anyone has the practical capability and will to compromise the kernel's commit log.
i am not saying git is insecure - i am saying someone needs to do exactly the analysis you are sketching. you can wave your hands and say that this is not a problem, but it's been 6 years since sha-1 was broken and that's an awfully long time.
also, i find it absolutely typical of the place that HN has become that my original post asking a reasonable, informed question with reference is voted down. you're a bunch of mindless fucking morons.
Your comment was accurate, but somewhat combative, which may be why it was down-voted. I suspect if you had said the same thing, but had phrased it a little differently, it would have been up-voted.
That said, it was not an unreasonable question to ask, and so I've up-voted you.
What doesn't help is calling everyone "mindless fucking morons" after only one or two down-votes. It's not constructive, and won't get you anywhere. Also: chill. Just because one or two individuals didn't like the tone of your original comment, doesn't mean everyone on HN is out to get you. Well, at least not before you insulted them.
Between the spelling mistake, lack of capitalization, and tone I certainly would have voted it down if it hadn't been the only post in this thread bringing up an important point. As it was I felt I had to vote up.
If they could slip something into the git source then from there, they could tamper with any git project undetected. But since git is presumably self-hosted, it would still be profoundly difficult to get everyone to upgrade to the evil git without noticing the attack. It would require some devilishly underhanded code: http://underhanded.xcott.com/
Speaking as a regular git user, I assure you that a hash change would be noticeable. Rebase would look for a common hash between your history and the remote history you just fetched and rebase your commits on top of it. When there is a mismatch, you experience a twilight zone where there are attempted merges of code that your changes had nothing to do with. At this point, it is apparent the remote history has changed, and in the process of trying to figure out how to rebase cleanly, your eyes will be on that foreign code.
Given the number of changes coming in every release, more twilight zone experiences increase the number of eyes on the discrepancy.
I am confident in the sanctity of the git-sourced code, thanks to there being multiple versions of the same repository around that people are actively working on. I am more worried about all the stuff I might rely on that is not git-sourced.
If the history is tampered with during a rebase, how would you notice?
Suppose a tree well known for rebasing frequently is rebased on kernel.org, and the dev doing it works from the k.org servers (might be possible, since they give shell access).
Then downstream would just see it as yet another rebase, no?
Is this the hypothetical situation of someone doing an interactive rebase on public/published branches and making changes midway through? My current understanding is that, although it is possible, the community avoids doing that (policy: published code history set in stone). That style requires strong communication between developers. It makes people do extra investigation and work, and we all hate doing extra work, right? :)
I cannot speak for what they really do over at kernel.org, but if they do rebase their public repositories often (which will break people doing pushes and fetch/merges without some regular communication), yes, it is possible to sneak something in because the public rebase will not be as exceptional.
This is where you tell me that's what they do there and make me scared again. ;)
The latest git SHA1 sum is not enough to check. The interesting part for intruders is "grafts":
"Graft points or grafts enable two otherwise different lines of development to be joined together. It works by letting users record fake ancestry information for commits."
https://git.wiki.kernel.org/index.php/GraftPoint
Really the point is that nothing can be injected into the git history undetected. To add any new code to the repository, it needs to be put at the top of the 'stack' as the last commit. So even if someone got access to Linus's machine (or that of any other high-level dev), they wouldn't be able to inject malware undetected, as a look at recent commits would show the changes made.
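A toy model of why that holds, assuming nothing about Git's real commit format beyond the fact that each commit ID covers its parent's ID: the helper names `commit_id` and `build_history` and the commit messages are made up for illustration, and real commits also hash a tree, author, and committer.

```python
import hashlib


def commit_id(parent_id: str, content: str) -> str:
    """Toy commit ID: SHA-1 over the parent's ID plus this commit's content."""
    payload = f"parent {parent_id}\n\n{content}".encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


def build_history(contents):
    """Chain the commits in order and return the list of their IDs."""
    ids = []
    parent = ""  # the root commit has no parent
    for content in contents:
        parent = commit_id(parent, content)
        ids.append(parent)
    return ids


original = build_history(["add scheduler", "fix locking", "update docs"])
tampered = build_history(["add scheduler", "fix locking + backdoor", "update docs"])

# Report, per position, whether the two histories still agree.
for old, new in zip(original, tampered):
    print(old[:12], new[:12], "same" if old == new else "DIFFERENT")
```

Changing the middle commit changes its ID and the ID of everything after it, so anyone holding the original tip hash sees a mismatch; only a commit added on top leaves the earlier IDs intact.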
Yep, the very detailed article comes down to one basic fact: "we always have lots of copies on our and other people's machines so we can always track ANY possible modifications."
In that regard, I guess it could be more "likely" to sneak in a very, very well hidden and cryptic exploit in contributed code.
Linus, however, is not distributed. If a malicious commit was discreetly slipped into his repository as a seemingly Linus-sourced change, there is enough trust in him that the change would likely propagate.
What if they wanted to modify the source archives of some other project hosted there? Obviously the kernel is a big target, but it seems there are lots of other less-widespread things on there that would be useful to backdoor.
So, this is very annoying for people who took kernel tarballs from kernel.org in the past.
Sure, they can regenerate all the tarballs from the git repositories, but that doesn't help with tarballs that were already downloaded and used.