Protecting Mozilla’s GitHub Repositories from Malicious Modification (blog.mozilla.org)
152 points by jvehent on Sept 14, 2018 | 54 comments



> Production branches should be identified and configured:

> ...

> Require all commits to be GPG signed, using keys known in advance.

Is it possible to configure "all commits gpg signed" on Github? I haven't seen this option.

Another interesting thing that Github lacks is signed git pushes (`git push --signed`), which would allow audit logging of who moved which object to which ref.


> Is it possible to configure "all commits gpg signed" on Github? I haven't seen this option.

Settings > Branches > [Add Rule] > <Apply Rule To> "*" > check "require signed commits".

At least according to the help text:

> Commits pushed to matching branches must have verified signatures.
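For what it's worth, the same rule can also be set through GitHub's REST API; a sketch, where OWNER, REPO, BRANCH, and the token are placeholders you'd supply yourself:

```shell
# Enable "require signed commits" on a branch that already has a
# protection rule, via the GitHub REST API.
curl -X POST \
  -H "Authorization: Bearer $GITHUB_TOKEN" \
  -H "Accept: application/vnd.github+json" \
  "https://api.github.com/repos/OWNER/REPO/branches/BRANCH/protection/required_signatures"
```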


I've been signing my GPG commits for five years now.

The problem is that they don't allow GPG keys to be sunsetted. "Verified" should be a property on the commit, not something computed. If I replace my GPG key with something more secure, but I have no reason to believe my former GPG key was stolen, I should be able to keep trusting the other commits.


If you rotate your signing subkey then this would work as expected.

Rotating primary/master keys is a problem in OpenPGP in general (not just Github).
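A sketch of what subkey-only rotation looks like with GnuPG 2.1+; KEYID is a placeholder for your primary key's fingerprint:

```shell
# Add a fresh signing subkey (valid one year) under the same primary key:
gpg --quick-add-key KEYID ed25519 sign 1y

# Retire the old subkey interactively: select it with `key N`, then
# `expire` (or `revkey`), then `save`:
gpg --edit-key KEYID

# Publish the updated key and point git at the new subkey; the trailing
# `!` tells gpg to use exactly that subkey:
gpg --send-keys KEYID
git config --global user.signingkey KEYID!
```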


Yeah.

I guess I've tended to shy away from subkeys because I figure losing the primary is a function of time. Basically at some point a virus, or improperly secured backup, is going to get my key.

Though I suppose the pro way of doing this would be to set up some kind of air-gapped PC, generate the subkey, then display it as a QR code or something.


Yes, that's what I did, but instead of copying the subkey via QR I put it on a Yubikey. Actually it's more convenient that way, as it's possible to use it on any computer.

But from my perspective, if you keep your master key super secure, subkeys can be your "operational keys" that are easily swapped when needed (I rotated my subs when the Yubico Infineon bug surfaced).


An air-gapped PC could be any old smartphone you (or a friend) have lying around: set it to airplane mode, then use it to generate keys as QR images. (If you are truly paranoid, root it and delete the wifi/bluetooth drivers, or use a home-made Faraday cage.)


Thank you!


Proposed code changes should in any case be reviewed, tested, then committed by someone with the right to do so (which should be a restricted right).

This is what really prevents malicious code changes.


Signed git pushes give you an opportunity to check if they really were "committed by someone with the right to do so". Without git push --signed you have to basically trust Github (which may be fine for most people's needs).


I don't know how Github works, but I would hope that there are user-based restrictions.

"you have to basically trust Github"

And that's why entrusting a third party with managing something as critical as your source code is unwise.


>And that's why entrusting a third party with managing something as critical as your source code is unwise.

Yes, but signing your code strongly reduces the need for trust. If every commit is signed, and every checkout and clone checks that commits are signed by one of the trusted parties, Github has no chance to insert or modify code.


> (...) Github has no chance to insert or modify code.

Yes, exactly! The only slim window is refs. All-signed-commits still does not protect against someone pushing a signed commit from branch "testing" to branch "master". But `git push --signed` [0] makes it possible to have an audit log of all ref modifications.

[0]: https://github.com/git/git/commit/a85b377d0419a9dfaca8af2320...
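For the curious, a rough sketch of what a signed push involves (the repo layout and seed value are placeholders, and the client needs a GPG key configured):

```shell
# On the server (bare repo): set a nonce seed so clients can request
# signed pushes without replay:
git config receive.certNonceSeed "some-random-secret"

# On the client: sign the push certificate covering the ref updates:
git push --signed origin master

# Server-side hooks (pre-receive/post-receive) then see the certificate via
# environment variables, which can be archived as an audit log:
#   GIT_PUSH_CERT        - the blob of the signed push certificate
#   GIT_PUSH_CERT_STATUS - signature status (G for good, B for bad, ...)
#   GIT_PUSH_CERT_SIGNER - the key that signed the push
```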


If you do not trust Github then signed commits will not help since Github physically controls the repository.


Github can't fake the sigs


Sigs are not important if they hold the repository to start with. If the premise is "I don't trust Github" then the only option is not to use Github.

And, in any case, sigs do not prevent malicious code changes.


I don't understand what you mean exactly. What attack vector are you considering?

If I sign a commit and push it to Github, anybody else can pull it from Github, and if they have my key they can validate that it is indeed me who made it. The only things GH can do are: modify it and strip the signature; modify it and replace the signature with a "fake" one that may fool people who don't have my public key; or simply drop my commit altogether if, for instance, it contains a security fix (but then they also have to drop all future commits referencing this one, since the hashes won't match).
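To make that concrete, here's a sketch of checking the signatures yourself instead of trusting GH's "Verified" badge (`maintainer.pub` is a placeholder file name for the key you obtained out of band):

```shell
# Import the maintainer's public key, obtained from a trusted channel:
gpg --import maintainer.pub

# Verify the signature on a single commit:
git verify-commit HEAD

# Or show verification results inline for recent history:
git log --show-signature -5
```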


@danieldk: "If we compare the hashes of the latest commits of two git repositories"

Which implies that you hold a copy of the repository, which is therefore the master copy, as opposed to trusting a third party to do that.


> Which implies that you hold a copy of the repository, which is therefore the master copy, as opposed to trusting a third party to do that.

No, you do not need to hold a copy of the repository. You only need a known-good hash. Commit signing is a way to vouch that a hash is good.

Similarly, when downloading a distribution ISO, you do not have to trust the mirror, you only need someone trusted (e.g. the distribution maintainer) to vouch the correct hash of an ISO.

In both cases, you can hash the data that you retrieved and compare to a hash that is known to be good.
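The ISO case can be tried in a couple of lines (file names here are made up; in reality the hash file is published by the maintainer, ideally signed):

```shell
# Verify a download against a known-good hash.
cd "$(mktemp -d)"
echo "fake iso contents" > distro.iso        # stand-in for the downloaded image
sha256sum distro.iso > distro.iso.sha256     # in reality, published by the maintainer
sha256sum -c distro.iso.sha256               # prints "distro.iso: OK" if intact
```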


@danieldk: "No, you do not need to hold a copy of the repository. You only need a known-good hash. Commit signing is a way to vouch that a hash is good."

The overhead is such that you are better off holding the repo yourself a la linux kernel.

Note also that SHA1 collisions are a practical attack these days: https://shattered.io/ . So if you rely on SHA1 hashes you have no guarantee...


HackerNews pro-tip: you can react to a comment during the cool-down period by clicking on the timestamp of the comment ;). Of course, don't abuse it, there is a reason why the cool-down is there.


shattered is not (yet) a practical attack on git because of the way the data is represented before being hashed. Managing to get a SHA1 collision while simultaneously getting a valid git object and a meaningful commit (with, say, a backdoor inserted instead of pseudorandom garbage that will fail to compile) is still vastly too difficult to be practical, at least with publicly available knowledge.

Regarding the overhead, I don't know what you mean, and you're really grasping at straws at this point. How is validating a GPG signature harder than validating that, say, a website is actually hosting the official repo of a project instead of a malicious fork? What if my self-hosted website gets hacked, MITM'd, or DNS-hijacked? GPG can still be used to validate that my commits are valid in these situations. Actually, I'm not even tied to Github or anything else in this situation; any mirror can be used without fear as long as my keys are secured.


Collision attacks allow somebody (say Greg) to produce two documents A and A' that hash the same. So hash(A) = hash(A'), with the result that signatures over A can be attached to A' and they work.

But this attack does NOT allow an adversary who sees some document B to produce a forgery B' such that hash(B) = hash(B')

If you believe Greg is a bad guy, don't let Greg make signed commits to your system. Trusting Alice and Bob to make signed commits doesn't allow Greg to attack you with a collision, only trusting Greg would do that, so don't trust Greg.


"If you believe Greg is a bad guy, don't let Greg make signed commits to your system."

That's exactly my point.


Is "let untrustworthy people change our code" a common approach in GitHub? Every GitHub project I've ever worked on expects all but their trusted insiders to submit PRs

If you can't trust Greg not to make collisions, you definitely shouldn't let him write code!


You cannot guarantee the integrity of a repository that is controlled by a third party. You have to trust them.


You can. Git's DAG is a Merkle tree. If we compare the hashes of the latest commits of two git repositories and they are identical, then they have exactly the same history up to and including that commit [1].

If GitHub modified a commit, the hashes of all commits starting at that commit would change. This is why commit signing is important: it makes it impossible to change commits up to the signed commit, since doing so would change the hash of the signed commit and signature verification would fail.

[1] Or you found a way to modify the repository that leads to a hash collision. Which is unlikely.
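A tiny demonstration of that property (repo names are made up):

```shell
# Two repositories whose tips hash the same necessarily share the same
# history, because each commit hash covers its tree and all its parents.
cd "$(mktemp -d)"
git init -q upstream
git -C upstream -c user.email=a@example.com -c user.name=A \
    commit -q --allow-empty -m "initial"
git clone -q upstream mirror

git -C upstream rev-parse HEAD   # prints the same hash as...
git -C mirror rev-parse HEAD     # ...this one: identical history
```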


I don't trust my ISP (that much), yet I feel fine to use my ISP to access my bank. That's because TLS protects me from malicious modifications by a middle man whose job is just to pipe my data.

Suppose I don't trust GitHub. What makes you think I can't use signatures properly so that I can still trust the code hosted on GitHub, safe from malicious modifications by GitHub whose job is just to host my code?


> Sigs are not important if they hold the repository to start with.

Why not? Someone can clone the repository and verify using my public key that I signed a commit. If GitHub modifies the repository, the chain of hashes changes, and the signature would be incorrect.

> And, in any case, sigs do not prevent malicious code changes.

That is true. But if every commit was signed using a known signature, then you know who injected the malicious code. For a third party to inject malicious code, they would have to compromise the one committer's machine and/or key, rather than GitHub or a specific GitHub account. Also, once the attack is detected, you know which commits are potentially bad, namely those signed using the compromised key.


At Mozilla's size, I wonder what the threshold would be for them to decide to move to their own GitHub-like interface to manage all of that.

I assume even maintaining a private GitLab would have a non-trivial cost and need dedicated ops as well at their size.


They already do that for their main codebases; as the article says, this is just about their use of GitHub for other projects.


It's not just a matter of operational cost.

We spend a lot of time and effort making our code accessible to new contributors, and lowering that barrier is very important. Contributing to code in Mercurial is hard (but getting easier with Phabricator), whereas code in GitHub is easily accessible.

There is also the question of integrating with external services. GitHub has a very mature ecosystem that's helpful to developers and lets us ship better products faster.


Considering how sensitive a piece of software browsers are these days, I would think that security should be the priority.


It is a shame that a lot of critical projects (including compilers, browsers...) still try to do things a la CVS/SVN (even if they use a DVCS).

Please, stop it. Do it the way the kernel does it. A hierarchy of maintainers that reviews the work sent by others and a single person with commit access to the main repository.

I am amazed that these smart people have not realized yet that unrestricted commit access is simply a no-go, with or without signed commits/tags.


Github lowers barriers to entry for a lot of people and I guess Mozilla likes that.

Compare creating a one line fix for a project on github and having a discussion and review to sending a properly formatted patch to a mailing list and following up on that.

I know that having properly configured mutt and git send-email eases most of these pains but not everyone uses mutt and browsers have better consistency between them than e-mail clients. You don't need to go through a guide to configure your browser to send patches and participate in a review (vs https://nanxiao.me/en/configure-thunderbird-to-send-patch-fr... ).


It's also a balance... Most experimental research projects don't need high security or high code quality. They just need to prove viability, etc.


> Compare creating a one line fix for a project on github and having a discussion and review to sending a properly formatted patch to a mailing list and following up on that.

For the former:

* Create a Github account

* Fork the repository

* Set up the new git remote

* Push to that remote

* Open the pull request

* Wait for notifications via email

* Go back to the webpage and find the comment

* Modify the code, add, commit, push, and type a comment

* Repeat

Versus

* Configure git to use one's ISP's email server

* Run git format-patch

* Run git send-email

* Check email for replies

* Amend the commit, run git format-patch, and git send-email

* Reply to the maintainer email

* Repeat

To reply to emails, you can use any email client or even do that in the browser. You only need to use the format-patch and send-email commands to send patches.
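A sketch of the one-time setup for the second list, with placeholder SMTP values:

```shell
# One-time: point git send-email at your mail provider's SMTP server
# (server, user, and addresses below are placeholders):
git config --global sendemail.smtpServer     smtp.example.com
git config --global sendemail.smtpServerPort 587
git config --global sendemail.smtpEncryption tls
git config --global sendemail.smtpUser       you@example.com

# Then, per contribution:
git format-patch -1                           # write the last commit as 0001-*.patch
git send-email --to=maintainer@example.com 0001-*.patch
```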


1. Bias:

(a) You included setting up your local repo from the fork in the Github list but it's implied in the 2nd list

(b) You included "Modify the code, add, commit, push, and type a comment" in the first list: this is literally how you use Git for normal development. It isn't an exclusive part of using Github

2. Network effect:

(a) Creating a Github account is a one-time step: this will be omitted for any platform with a large project base

(b) Most popular modern tooling/documentation/etc. is set up for the Github-ish workflows, and general user knowledge of `format-patch` and `send-email` as commands is considerably more sparse.


> You included setting up your local repo from the fork in the Github list but it's implied in the 2nd list

Presumably one would have to clone the repo in order to make the modification locally before forking. But whether or not one would fork the repo through the GUI before cloning it depends on the person. I don't know which scenario is more common. If it's the former, then they would have to handle setting up the new remote for the fork before pushing up their code.

> You included "Modify the code, add, commit, push, and type a comment" in the first list

In the second list, I did state: "Amend the commit, run git format-patch, and git send-email" which is essentially the same thing. The only difference is that the first list implies that you make more commits (which is in line with the expected pull request workflow on Github) while the second involves amending the commit and resubmitting the patch (which is in line with the expected workflow for email based review).

> general user knowledge of `format-patch` and `send-email` as commands is considerably more sparse.

Github explicitly tells you how to set up a remote when you create a new repo and tells you how to clone a repo and has pages that tell you how to handle pushing code up to the remote and how to set up your git config for your name and email address. They don't assume that anyone has knowledge of those commands (git push, git config --global --add user.name, git clone, git branch, git remote add, etc). There's no reason why similar quick to read documentation for git format-patch and git send-email couldn't also be provided.


> You included setting up your local repo from the fork in the Github list but it's implied in the 2nd list

I didn't read your statement carefully enough when I replied earlier. For the second list, you would only have to clone the repository and make your modifications locally. You wouldn't push your changes up to a remote repository; you would only send email messages with the commit message information and the diff.

This means you wouldn't have to go through the step of forking the repository and setting up the new remote before pushing the changes and going back to the UI to open a pull request (though setting up the new remote isn't required if you fork the repository first and then clone from the fork--though that would make keeping your local copy of the repository up to date more difficult).


> Create a Github account

Which most people have already and it allows you to contribute to a gazillion of repositories.

> Run git send-email, Check email for replies

There is one important step missing here: find out who to send the patch to, or subscribe to a mailing list (and later unsubscribe). In the former case, if a maintainer goes AWOL, the patches that they received are lost. In the latter case, subscribing to a mailing list, let alone for every project that you want to do a drive-by submission to, also involves several steps:

- Figure out how to subscribe

- Subscribe

- Set up filter in your e-mail client to avoid cluttering up your inbox.

- Unsubscribe when you are done

> To reply to emails, you can use any email client or even do that in the browser.

You can also reply to GitHub issue e-mails.

(I am not a big fan of GitHub, but I think it is important to understand why a lot of people find GitHub so convenient.)


> Which most people have already and it allows you to contribute to a gazillion of repositories.

And pretty much everyone has an email account. In fact, I don't think it's possible to sign up for a Github account without providing an email address.

> There is one important step missing here: Find out who to send the patch to

Many projects have a README or CONTRIBUTING file that describes the necessary steps. Looking at projects that don't accept Github pull requests (like the Linux kernel or git itself), there are (as of the time I'm writing this post) 225 and 155 open pull requests on each respective project, which indicates that people aren't reading the documentation before trying to make a contribution.

> In the former case, if a maintainer goes AWOL, the patches that they received are lost.

The same thing can happen on Github if the maintainer never bothers to acknowledge the pull request or just closes it. Finding a particular patch out of the hundreds of open but unacknowledged pull requests in the linux and git repositories isn't going to be easy.

>> To reply to emails, you can use any email client or even do that in the browser.

> You can also reply to GitHub issue e-mails.

I made the statement about using any email client (local or web based) because the post I replied to stated that one needed to use a TUI client like mutt.


> I made the statement about using any email client (local or web based) because the post I replied to stated that one needed to use a TUI client like mutt.

Wouldn't using HTML email (default in most modern clients and webmails) be frowned upon by the ML participants?

(for the record I liked your comparison)


> Wouldn't using HTML email (default in most modern clients and webmails) be frowned upon by the ML participants?

It depends on the mailing list. They may not mind in terms of general correspondence or review of code, but they would definitely have strict requirements for emails that contain patches that a maintainer would have to apply (so that it works with git am).
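A minimal sketch of that round trip (names, addresses, and file contents are made up). The reason plain text matters is that `git am` consumes the patch body verbatim; HTML mail would corrupt it:

```shell
cd "$(mktemp -d)"

# Maintainer's repository with one file:
git init -q upstream
git -C upstream config user.email maintainer@example.com
git -C upstream config user.name  Maintainer
echo one > upstream/file.txt
git -C upstream add file.txt
git -C upstream commit -qm "initial"

# Contributor clones, commits a fix, and exports it with format-patch:
git clone -q upstream contributor
git -C contributor config user.email contributor@example.com
git -C contributor config user.name  Contributor
echo two >> contributor/file.txt
git -C contributor commit -aqm "fix: append two"
git -C contributor format-patch -1 -o ../patches   # writes patches/0001-*.patch

# The maintainer applies the mailbox file, preserving authorship:
git -C upstream am "$(pwd)"/patches/0001-*.patch
```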


For most folks who already have an account, it's just a matter of clicking the "edit" icon on the top right of a file, making the change, then saving it. GitHub does everything else under the hood, and creates a fork and pull request programmatically.

We can debate the value of centralized web interfaces to vcs systems all day long, but the fact that they immensely simplify contributing to open source software cannot be overstated.


> For most folks who already have an account, it's just a matter of clicking the "edit" icon on the top right of a file, making the change, then saving it.

From what I've read, it could lead to badly formatted files and commit messages (since the online editor may not handle wrapping or spacing correctly). Plus, how does one test a change they make if they just made it in an online editor? If their contribution has syntax errors because they didn't even run the code, then is it a worthwhile contribution?


Modern projects don't rely on local linting and testing, but instead leverage continuous integration platforms like Travis CI or CircleCI for verification.


Does that apply to forks of the project? Or is the only way to test the project is to open the PR and check the build status on the upstream?

Personally, I would like to know that my code is in a working state before I actually submit it to the maintainer for review.


There is no proper tooling to support maintainer hierarchies beyond git itself, and most people would view it as a step backwards if a major open source project moved entirely to email.

I agree that letting everyone commit is a bad idea once you reach a certain size (though I prefer to limit commit rights to a group of code reviewers, which integrates nicely with github&co). But not every mozilla project has reached a size where the overhead is worth it.


It depends on which kernel you mean!

https://www.freebsd.org/developers/cvs.html

https://cvsweb.openbsd.org/

Dragonfly and Illumos use git, though.


It seems strange to me too. It's not just that; I have more than once seen a well-meaning person change history and push it, causing problems for other maintainers. Git is a complex tool, and as a result more complex actions tend to fall out of memory and tend towards copy-pasting from a Stack Overflow answer.

> A hierarchy of maintainers that reviews the work sent by others and a single person with commit access to the main repository.

This is surely the whole point of having a DVCS. You can just produce a request to pull, and the person with push access can review it before it's committed.


I completely agree.

Critical projects are used in self-driving cars now. It's time to grow up.


Is this comment targeted at Mozilla? My understanding is that they fully host on Mercurial.


Firefox and related code is on Mercurial, self-hosted, and mirrored on GitHub. But Mozilla is a lot more than just Firefox, and tons of our applications, services, experiments and so on are hosted on GitHub.



