If you think about it a bit longer you'll realize it can not be a single name. Yes, someone wrote this code. But then someone else reviewed it and then a third person signed off on shipping the end result. It takes (or really, it should take) more than one person at a company like Juniper to get away with a trick like this. Unless their processes are completely broken, let's for the moment assume that they are not otherwise all bets are off. Juniper is going to be scratched from a lot of POs because of this story, if it turns out they don't have a process in place to protect against stuff like this and it does not involve two people or more operating in concert to circumvent that process then they're pretty much dead.
If I was doing an attack like this, I'd wait for an opportunity to slip it into a huge, mundane change.
For example, changing the repository from Mercurial to Git. Splitting or combining two repositories. Moving lots of files between directories. Running an autoformatter over the entire codebase. Something like that.
Every code review tool I've seen there's /some/ way you can get it to highlight thousands of trivial changes. And I don't know about you, but I can't promise I'd spot 1 evil change among 1000 trivial ones.
Yes, that would be the way to do it. And that is exactly why a change like that is just as dangerous as any other.
I did quite a bit of code review during a contract earlier this year. One day one of the programmers shows up with a 100K line changeset, a reformatting and clean up that he'd embarked on of his own accord. Imagine his surprise when I point blank refused the commit. He was extremely upset (understandably, it was a lot of work and well intended) but at the back of my mind were two things: one, we are already under pressure to get things to work and this does not add anything functional. two, if it does add anything functional it will be an error or a bug and there is no way I'm going to catch this manually reviewing all these changes (across 100's of files). So the safe thing to do is to simply not accept the change unless I can free up developer time to break this massive commit into smaller ones that can be reviewed and accepted one-by-one over a longer period of time. And that was a luxury we could not afford.
So too bad, one very inflexible reviewer and one very pissed off programmer. 1000 trivial changes in a single commit isn't going to fly (at least, not with me), unless I review each and every one of those with just as much attention as I would review a much smaller commit. It's that sort of attention to process details that probably makes me 'less than easy' (to put it mildly) to work with but I really feel that if a customer trusts me that I'm going to have to earn that and rubberstamping a change set because it is too large to review is definitely a breach of that trust.
That's a great way to do it: fix-as-you-go. The same goes for adding unit tests where those are missing (in bulk, rather than as a set going with a recently touched piece of code), fixing up documentation issues and other global changes such as naming conventions and so on.
Hitting all of the code base in one go with a thing like that is asking for trouble.
Downvoters are cordially invited to state why they disagree with this.
I work on a team maintaining a large code base. We follow a similar rule concerning unit tests. Our testing over 5 years has slowly grown from 0% to give us ~75% coverage, by using the methodology you describe.
I believe that's the wrong way to do it. You can only prettify/reformat if you aren't making changes to the file. That way you review only the prettify changes and not the functional changes.
Agreed, refactoring and functional changes should certainly be in separate commits, and ideally in separate pull requests. jacquesm can still have her/his rule to only change code involved in the course of functional changes; it's just that the refactoring and functional-change PRs will come in close succession and the refactoring PR exists only because the functional-change PR does also.
I agree with you, beautifying and reformatting code can mess up the diff, and all of a sudden, someone ends up with their name pointing to code they didn't right. Beautifying, reformatting should only done alone.
I'll offer another opinion on this - I an a frontend developer and so it's very common for me to have to deal with files that are written in PHP, HTML, JavaScript, and also sometimes including CSS embedded right in it.
Roughly speaking, PHP and JavaScript have a similar syntax and can be formatted/styled in a similar way, but HTML it a totally different beast and has very different structure, as well as CSS which is a glorified list of rules.
I don't refactor code for the sake of it, however, in an effort to read related files I often clean things up that I must read in order to edit another section. I don't discard my cleanups because I'm leaving it better than when I found it (and cleaning up a messy repository is tough, but a little in each commit can help).
So I agree that you should style where you're working - but there are valid reasons to refactor or restyle code in other places too. Here's an example of a file I've been updating lately:
There are exceptions to every rule, and yours is a valid exception to the rule that in general one should not commit non-functional change-sets. If something is particularly ugly then you could clean it up but that change should not be taken for granted not to have any adverse effects, there is really no such thing as an 'innocent' change. I've seen whitespace changes break code far too frequently to not be suspicious of any change at all.
You could use a tool to compare AST's for equality, at the least you could use it to highlight the real changes as opposed to whitespace cleanup, which would make this job much easier. Or perhaps compare binaries directly if you set up deterministic compilation.
Most repositories allow you to ignore whitespace changes if you want to.
For instance, github allows you to do a ?w=1 appended to a diff url to see the differences without seeing the whitespace differences (that's like git diff -w).
That's the correct choice, although it would have been nicer to review it (if possible).
That said, huge commits can be ok if they can (a) be reviewed very quickly, (b) be entirely generated by a script (review the script, and ensure that running it generates the proposed change), or (c) are small under e.g. git diff -w.
Incidentally, removing $SubversionThingy$ from the header of each file in a moderately-sized repository is enough to break the FishEye code reviewing tool...
every place i've ever worked would require the developer to notify or ask permission from the project lead before making a change of that scope, if for nothing else, just the time commitment.
That's how it should have gone. But to the persons defense, the codebase really was a mess and a cleanup would have really helped us in getting a better insight in what the heck was going on, it really was with the best of intentions. But code review tends to be a bottle-neck anyway and there is no such thing as an 'innocent change' in my book, especially not in code-bases that have too little automated test coverage and zero documentation.
You'd want one better -- slip it in when your co-worker was about to make a change, update the yet to be committed code in their local working directory. They'd have a low chance of noticing it but the blame would lie with them.
There certainly will be multiple people to blame for this at least because nobody noticed who should have been looking but, it could have been also been a single rogue/disgruntled employee as well. About to be laid off perhaps, or also playing on the black market selling 0-day exploits. Even better they managed to slip in the name of the co-worker they hate the most in the committer field.
I'll take that over a national intelligence operation: infiltrating, gaining trust, paying off multiple people (and thus risking revealing their hand and operations), I can see a single person, who found out what the price for 0-day exploits are and who was perhaps been unfairly treated, was about to be laid of or leave the country. This was their way of getting an extra bonus on the way out of the door.
You'd be surprised. While the -official- code will typically always get QA'd and reviewed, there are plenty of cases where custom code for a specific use case is done by one engineer and bypasses the proper channel because of deadlines.
I don't think it's necessarily true, I remember Linus saying that for Linux' kernel they used a web of trust. At one point you had to trust some people, you can't watch every updates. And he said something along the line "if you're not doing cryptography/security like this, you're doing it wrong"
> At one point you had to trust some people, you can't watch every updates.
Yes, so Linus doesn't. But his web of trust is hierarchical. In case of some malicious code making its way into the kernel, there will be a chain of trusted people that can be made accountable for it.
If you're suggesting the reviewer wrote the patch you've eliminated the developer but you still have someone higher up in the chain that needs to be totally incompetent. Even so, that's still two people, not one.
Linus' web-of-trust is more like a pyramid of trust, he has a couple of lieutenants that he trusts but I highly doubt they in turn will blindly sign off on a commit on something as critical as this.