> I’ve recently come across a discussion where a new developer joined a team and faced over 300 PR comments on their first contribution. Most of it was stylistic nitpicking. This isn’t just unproductive, it’s outright toxic.
For me this says more about the company culture than any inherent flaws with the code review process.
> This is why I love things like cargo fmt / go fmt / eslint / etc.
I agree. Once I had the displeasure of working with a junior dev who was very prolific in posting comments on style and if a space should be at the left or at the right of a symbol. It took me a few days of dealing with that noise to onboard a linter.
Even so the junior dev felt entitled to manifest how high their standards were by posting a torrent of comments in the PR that onboarded the team's official linter config.
But once the PR was approved and merged, surprise surprise: the junior dev's PR comment metrics dropped from dozens per PR to zero. The style guide didn't even had to be enforced.
The last I've heard about the junior dev was throwing a tantrum when one of their PRs received a comment from another team member asking to run the linter because it failed to adhere to the team's official style guide. Apparently the high standards and this attention to detail only went one way.
That’s why you hook the linter up so it runs on every commit / push. No need to ask, it always runs. And no need to quibble over style. Don’t like it? Change the linter.
I agree that it should be hooked up to run as a precommit hook, but unfortunately that doesn’t always solve things. People can- and do- bypass precommit hooks. It happens all the time at my company with certain teams, but I’ve been unable to figure out why. Any precommit hook that should run- needs a correlating required PR action to verify any precommit expectations are met. Otherwise, people reviewing the PRs just assume it ran and the people bypassing it get away with it.
Install pre-commit with various linters for all the languages and data files out there, and the problem is solved. I do it even for Makefile's and cmake.
That's also why I dislike it: The mind that will go apeshit over trivial style nonsense will still provide unhelpful feedback on other topics... but it will become far less obvious that their comments should be downgraded.
Formatters also do a poor job if your language is flexible and expressive. Great for go, but if your language is the kind that easily supports internal DSLs, formatters are not even helpful at making the code regular
What's your definition of "trivial style nonsense"? Is it "something I personally don't understand or care"?
Things like tabs vs spaces is important, and so is how many spaces an indentation level should take. This affects how editors reformat code, how other people's editors present the code, and even how many lines a commit has and is blamed for a change.
That's why linters are of critical importance to a team, and so is adopting a shared standard and editor config.
The only people who don't understand the importance of a style guide and how to make it a non-issue by enforcing tools to enforce it are those with little to no experience working with software.
Linting and style guides are not of "critical importance". No business objective will go unachieved because some checkins use tabs and others use spaces.
Whatever problem style issues might cause can be resolved before lunch by running a formatter and committing the result.
> Linting and style guides are not of "critical importance".
This is simply false, as attested by the huge volume of comments in this thread by those with actual professional experience working on real-world software projects.
You're also oblivious to the problem domain, because otherwise you'd understand that the critical problems are not whether a space should be at the left or at the right of a symbol, but all the churn that is required to manually address style problems in PRs.
Try to think about the problem. You post a PR that screws up all formatting. It takes time for a team member to review a PR. Once you start to get reviews,you notice comments pointing out failures in complying with a specific style. Whether you go the passive-aggressive path of waiting for any other team member to review your code or you do the right thing and fix the problems you introduced, that requires another round of PR reviews. The time that you take with each iteration is the time your work is delayed to be merged. Now think about how many hours per month you waste just because you can't manage to format your code properly.
I’m not sure if I’m understanding you correctly, but how on earth would a pull request even make it to the review state if it fails to lint in the pipeline?
I sort of agree with GP in that the discussions are a waste of time. I also agree with you that you should simply automate it through tools. Styling doesn’t have to be a democracy or about personal preference, all styles work, it’s all about picking one and then forcing everyone to use it. Of course you do it in a much more involving process than what I make it sound like here, but ultimately someone with decision making powers is going to have to lock down the process so no further time is wasted on it.
Only one place I have worked at had the goal of writing code as if it were from just one person and it was pretty nice, honestly. The diff tool output was easy to understand (not a lot of noise). What made it work was everyone was pretty mature and understood that this was a group effort, and not everyone was going to get their personal itch scratched around brace style, etc.
We did have some outside contractors who didn't get it at first, but after several of their submissions were rejected (with potential financial penalties) they got onboard and followed the guidelines we had sent them. "These people mean it."
> it will become far less obvious that their comments should be downgraded.
I'm curious why you think this would happen... I would imagine that their comments will now have be to about things that matter, and if they are unhelpful they will stand out more.
It is obvious that a style comment is of no significance, and that the person who made it chose to spend time and personal capital on something of no significance.
It is not obvious that an actual code change is of poor quality.
To show that takes infinitely more experience, work to analyse all the direct and indirect ramifications of both the original and proposed approaches, capacity to push back and decide that your own engineering judgement is equal or better than whoever is purporting to correct you, willingness to suffer everyone else accusing you of arrogance for that on top.
Even with all of that, it's a lot harder to prove the valueless comment is valueless because the more you know, the more you know that practically every single line of code could be done 30 other ways with some valid argument for each one.
It's completely insidious both for the junior and the senior.
The junior has no way to know the junk comment is junk. So they internalize the comment and everyone is worse for it.
The senior has it almost worse. They know enough to know they don't know everything and the comment might be valid, so they put in all kinds of work to try to figure out if they actually missed something, did they have point etc. Maybe in the end the code doesn't end up as bad as the junior just rolling over, but it sucked for everyone and the challenge was not really an intentional productive crucible, it was just a douche that everyone had to waste time and energy taking seriously.
Only someone who actually is arrogant (whatever level, whichever side of the pr) has it easy. Again bad for everyone except them maybe.
If the seniors are too passive to say no faster and faster, and management is absent, the junior’s going to end up being the tech lead. I’ve seen this. :)
It’s not necessarily even bad if there is no meaningful leadership above or alongside the junior. The fast-rising junior may end up hiring developers who are more assertive and still capable.
I don't particularly care which one you choose - although I admit to having a preference - but I want that style consistently applied throughout the project, so I'm not stumbling over stylistic differences as I'm scrolling through code trying to investigate something.
And keep in mind that this particular example, that some people can read clearly, is just five lines. When you're looking at a 3k line file, one of the dozen you're digging through during an outage, that shit matters.
You need to work with code from multitude of sources, then you will learn how to read code efficiently. And how futile are your dreams about styles. We are not supermodels and fasion designers, far from it, get over it!
Of course you can, it's a tiny, isolated example consisting of 5 or so tokens. That's not how real programs look like so you cannot take the example so literally.
The second form is superior for at least three reasons:
1. Enables a property corollary to the happy path rule, which is that every return statement of a function is at the beginning of a line. This property is crucially important for readability, it makes it reliably possible to determine every exit point of a function by merely scanning the left hand vertical slice of it as opposed to needing our eyes to jump around erratically searching for arbitrarily placed returns.
2. Uniformity: mixing different if statement styles in a single codebase interferes with our ability to match visual patterns. Since we can no longer rely on every if statement looking the same, we cannot quickly skim over code anymore.
3. Refactorability: it's easier to add another line of code to the if statement if there's no need to also add brackets.
Bugs. Subtle horrible bugs that can take forever to find. Not having the braces makes it so easy to accidentally break in later changes or refactoring. Sure on its own it's super obvious, but next to other changes or refactoring it is really easy to miss that suddenly logic falls out of the condition or doesn't make sense anymore.
I don't remember the details but I've had some typo in a one-line-if-condition once and it took me days to find it. Might have been an accidental semicolon in C. A linter enforced the braces and line-breaks and made it obvious.
Anyways, running a linter also helps because all anger or frustration can be directed towards the machine.
> Not having the braces makes it so easy to accidentally break in later changes or refactoring.
Did you mean “not having a linter”? Because once you have a linter you no longer need the braces because the autoformatting will always fix the indentation.
Perfectly valid and people do it. But then later on someone might put a debug statement above the return to check something. Or add some other logic.
if (foo)
doSomething();
return;
And suddenly the logic is broken and the return falls out of the if-condition. It looks fairly contrived with an example like this but easy to do when code is a bit more complicated or people are tired or rushed. It's a good habit to always enforce braces for if-conditions as it defeats that whole category of mistakes. Fairly innocent but can be so hard to find because of that.
A linter would enforce having braces and solve the issue. And for codebases without a linter it is a good habit to just go with the braces.
The trouble is with those rush code out without really looking what they are doing. They make those mistakes, very easily, indeed.
No style could save us from those devils!
People should take the necessary attention to make quality work (and use proper techniques, where proper does not entail style or other visuals or appearances, not at all. That's for superficious persons). This is true for any part of life not just coding.
Using snake_case everywhere when you should be using camelCase should have been one or two comments at most. It's toxic to go through every use and comment
Toxic is a strong word. It’s useless, but not toxic, some people might think they’re making the reviewee’s work easier by flagging all the occurrences of an issue. Maybe they like that on their own PR. Maybe they never thought about it and think thoroughness is expected. If someone’s code review practice is bothering you, just mention it instead of labeling it as toxic.
IMO the best way to handle stylistic choices (if at all) is to have the reviewer just adjust it themselves. Ping-pong with comments like "add empty line" is a waste of time for everyone.
Not just stylistic choices - I absolutely love the way ex. gitlab lets you do a suggested change in a comment, precisely because it lets me do all the work and the MR owner just has to click to accept if they agree. That lowers the bar for what I'm willing to bring up in the first place (though I personally still don't like to comment on matters of style or taste) while making it less work for everyone, which makes it much easier to give useful suggestions:)
I'm not sure about gitlab, but on github this is a distinct feature where you post a comment wrapped in three backticks and the language as suggestion. It'll format it as a diff that the author can one-click create a commit from.
Services like GitHub and GitLab support preventing changes that overwrite a repo's history, in addition to tracking history even when you force update a branch.
You can revert and even force-update the branch to rewrite it's history, but GitHub still tracks the old commits and even lists in the PR the events that rewrote the branch history.
I like GitLab's feature of "suggested change" whereby the submitter can offer the change as a one-click "accept change" but importantly the change is still under the original author, because in any sane(?) review process, authors are not able to approve their own merge request. In your workflow if the reviewer checked out the branch, did the needful, committed, and then resumed reviewing they'd have locked themselves out of approving (err, assuming the aforementioned "authors cannot approve their own merges")
I like this idea, it has the bonus of revealing how much the reviewer actually cares. I bet half the things they wouldn’t take the time to fix themselves.
> I like this idea, it has the bonus of revealing how much the reviewer actually cares.
And now you contributed to turn your team culture into shit, with a toxic mix of PR creators purposely shitting on the team's style guide and PR reviewers who have to constantly post follow-up commits to your work because you can't even put together and acceptable PR or clean up after yourself.
What if anyone who posts a PR addresses feedback anyone posts on their work and addresses them until you get the necessary and sufficient approvals?
> IMO the best way to handle stylistic choices (if at all) is to have the reviewer just adjust it themselves.
How does that inform the PR author that they failed to pay attention to a relevant detail, and more importantly contribute to not make it again?
If there is no feedback, problems will persist.
Nit comments add noise and failures to comply with a style guide is always a distraction on a PR, but that does not mean style guide violations should be ignored. It means that your team must work their way into making them a non-issue.
This means establishing an objective and usable set of rules and guidelines, and enforce them programmatically with a linter, not turning a blind eye to style violations.
> And any manager that lets it happen without severe reprimands to the time-waster isn't worth working for.
That would be something only a terribly incompetent manager would ever do.
A manager who is not terribly incompetent would either not take sides until explicitly asked to weigh in, and if and only if he was dragged into the discussion then the manager's only acceptable input would be to eliminate the problem for good by adopting a linter.
Everything else is a mistake and a complete waste of time.
Passive-aggressive minefield. Someone will see a change done to THEIR code, without the changer even asking, and it will feel like the person who did it is a passive aggressive dickhead. Resentment will brew, tempers will be lost. It will only get worse from there. Software engineers already aren't exactly known for their humbleness and ability to swallow their ego.
Isn’t that a given in 2024? I find it amazing that it still seems to be an issue in some places. Using a linter/formatter is a no-brainer at this point, and it’s been for years.
It says PR comments benefit the reviewer in some way, and they're not always left to improve the code or to help the person who opened the PR. For example, if you work in a culture where being seen to leave PR comments is seen as positive regardless of the quality of them, or even that code review is a tracked metric that contributes to your performance review.
That culture will turn even the most conscientious dev into a monster.
This kind of feedback has largely been solved, for me / my team at least, with linters and formatters (as mentioned in the article). So I'd say it is still a reflection on the code review process being broken.
There is a lot of stuff a linter can't do, which however are stylistic choices some people care a lot about. Naming being a big one. Which terms to abbreviate how etc. Also there are a lot of things where automatic highlighting is possible, but any rule needs exceptions which a re subjective (nesting, usage of "raw" looos, early exit, ...)
Trying to get some consistency there can be good. And teaching new team members the "habits and traditions" may simplify long term maintenance.
However could review tools aren't a good place for distinguishing between "hey, this is good, but I'd like this slightly different" from "there is an actual issue" but that has to be solved somehow by communication. Which often isn't the best skill for engineers.
In lots of case linters might be harmful also.
It is one thing to have a style guide people follow in most cases, it is another one to rigidly force everywhere the style that the linter enforce.
Especially because most linters will impose that everyone on the team will use them.
For example, if we take Python, there is black that is bat shit and makes your code less readable but adopted by so many teams as cargo cult because doing that signal that you are a cool, hype, best practice following team...
In the end with "black", people are forced to abide by the style of the random guy that created it, that is not exactly following the PEP8 or the zen of python, and is just itself created in cargo cult inspired by go and rust, and Python creator even advised against using it for readability except in special cases like very very big teams.
> CI should reject the feature branch if the linter fails.
Your CI pipeline is broken if it refuses to run because of style issues. Linting is either applied as a pre commit hook or manually by the developer. Anything else is a mistake you are making without any concrete tradeoff.
The same goes for other mistakes such as handling warnings as errors.
Imagine going into a meeting with a senior manager and explain that you cannot release a hot fix because your pipeline is broken due to the last commit having 5 spaces instead of 4.
Requiring manual intervention to handle a blocked pipeline over a non-issue defeats the whole purpose of continuous integration, not to mention that you are suggesting adding twice the complexity as an alternative to not adding any complexity at all.
And should I stress again that there is absolutely zero positive tradeoffs?
Who’s blocking any pipelines here? If you’re running a PR process at all, these changes are being pushed to a feature branch and reviewed before getting merged to master. If you have a feature branch that’s failing CI, that doesn’t block me from merging my feature branch once my PR is approved.
As I specifically said, the CI should fail on the feature branch. If you’re only running CI on master, you’re going to run into the exact same problem if your unit tests fail. The way to avoid that problem is to run the unit tests on your feature branch, and if you’re doing that you might as well run the linters too.
> Your CI pipeline is broken if it refuses to run because of style issues.
The whole point of CI is to automatically verify the code. Linters are a method of doing that, the same as tests.
> Imagine going into a meeting with a senior manager and explain that you cannot release a hot fix because your pipeline is broken due to the last commit having 5 spaces instead of 4.
Sounds like an easy fix to me. And if your CI is set up properly, the misformatted branch wouldn’t have been merged to master in the first place.
on a different side: it also tell you (a lot) about specific people.
I've seen good/great people call out the nitpicks (in my case it was often mis-spelling, due to not being a native speaker of english) but will approve the PR anyway (implicitly expecting another revision to be sent, trusting the submitter).
On the other hand bad/toxic people will drown you with stylistic nitpicks and won't approve (and trust) you to do your best work. You will be essentially blocked pending their approval (so that nitpicks are changed according to their likings).
The weird thing is that all this traceability leaves traces for management to see who's doing a good pr review job and who's not... But I've learned that management usually does not care much.
In my current role I review a lot of PRs, they tend to be large due to the waterfall way things work... If I don't ask to fix the spelling now, It might not happen for a year or two. That said if it can be fixed in separate PR before release, that's fine.
Really, it's best to have some terms or explicitness.
For example, with my teams 'Nitpick' means I'm just being nitpicky, Doesn't have to change unless it's on the edge (and I'm explicit as to why it should change, i.e. I know the next thing will need the change anyway). "Consider" means It doesn't have to happen, but here's some food for thought. "PLZ FIX" is fairly self explanatory.
Also, making sure management (especially for contract houses) knows that PRs are a 'judgement free zone' and should not be held against people for perf reviews etc; that should be collected by other peer feedback channels instead.
Agree. Why did the junior developer feel like their changes were ready to be reviewed. Looks like they had little guidance. Why didn't a senior developer suggest to close the PR, apply the formatting or whatever and then re-open the PR or something like that. 300 comments, that's either a lot of developers commenting or just a lot of back and forth. I don't get how anyone would let it come to that point. Makes me feel grateful for the places I worked at and the experience I have now.
* I've seen good/great people call out the nitpicks (in my case it was often mis-spelling, due to not being a native speaker of english) but will approve the PR anyway (implicitly expecting another revision to be sent, trusting the submitter).*
I always thought this was the best way.
I wish these systems had a way to assign severity to comments, and urgency to the commit.
If you have a jr developer it is your job to give them stylistic feedback, the problem comes from mixing it in with security holes or sneaky bugs. And when the process doesn’t identify when we need to ship it yesterday, vs in the next few months.
Depending on your company culture, you can also just explicitly write something like "This is not a blocker, but I would suggest...". Of course if your culture uses explicit "press the button to unblock" then that's probably redundant.
Please see my other comment. I meant misspelling of English words in the comments. Misspelled code will not compile and/or pass the tests so the pr won’t be accepted anyway.
For me it's the opposite. If I'm making a PR, please point out ALL of the nits you can even think about. My next PR is bound to have at most 25% of that since by now I get more about what the team values and considers important.
This highlights something very important to me that sometimes gets lost when thinking about the tooling and that is the human element and the importance of communication, existing relationships, and rapport. The PR feedback I give and receive is extremely influenced by the relationship I have with the person - for some PRs any nit pick feels frustrating but I have other coworkers that can slam my code and even pepper in insults but our relationship lets me read it in the jocular and secretly constructive way it is intended.
The code is the same way and there was a great comment higher up about an ignored lint rule being an indicator of someone thinking about it and making a judgement call that this time the rule shouldn't apply. I have SOME peers where that my reading of that line of code and others where my assumption is the opposite and they were just lazy or didn't know how to do it "the right way". It's the same line of code!
I don't know how tooling will ever replace this part so I think it's important to keep it as the bedrock of any collaboration process.
> The PR feedback I give and receive is extremely influenced by the relationship I have with the person - for some PRs any nit pick feels frustrating (...)
Posting PR comments doesn't prevent you from accepting the PR. You can post a torrent of nit comments and still approve it so that minor issues don't turn into blockers.
It's impossible for any code formatter to be 100%. Where to put a blank line? Where to break a line? How to name a variable/function? etc. etc. Some try (e.g. prettier), and what you end up is frankly just bizarre code that's just ugly. Never mind tons of other small things that often don't really matter.
The solution is to just not be too anal about it. It really is a cultural problem.
For example a few weeks ago I reviewed a PR from a new team member; there were some seriously structural problems with his approach, so I commented on that and ignored all the small stuff for now. Another programmer on my team also reviewed at the same time, and only commented on the small stuff that, IMHO, don't really matter, and didn't look at the general approach at all (which really was just all wrong, and also quite obviously wrong).
Not to be too arrogant about it, but I feel this sort of stuff is what distinguishes a "good engineer" from a "mid engineer".
> The solution is to just not be too anal about it. It really is a cultural problem
"Any proposal that requires everyone to just is not a solution, because everyone will not just"
People are anal. You aren't going to get them to stop being anal
A real solution is to have the team agree on a shared style guide, then enforce it with a linter and formatter. If anyone cannot come to an agreement with the rest of the team, or continues to be anal about things that do not appear in the style guide after this, then that person has singled themselves out and the company will need to find a way to deal with the behavior
I agree that good engineers focus more on the actual structure and problems instead of nitpicky things like formatting
That said, code cleanliness and consistency is important too. It makes codebases much easier to maintain and understand if everything is formatter consistently. It's a pretty mid engineer take to think it's not important at all
You can't "enforce" all of these things. That was my main point.
And you absolutely can stop people from doing that. Simply accepting dickish asshole toxic behaviour as "well, people are like that shrug" and never telling people off is exactly the problem.
> If you start telling people off at work, there is a good chance that you will be perceived as being the problem
The whole concept of a PR is to review the changes you ask your team to pull into the repository.
What do you think PR comments are intended to be? Pats on the back and public announcements on how awesome you are?
No, the whole point of a PR is to allow others to review the changes you proposed so that the mistakes you are trying to introduce are easier to spot and prevent.
What do you call comments that flag a problem with your code changes? Do you call it "being the problem"?
> If there's a team agreement and someone continues to violate it (...)
How will they tell if you do not point out those violations in the PRs? That's precisely why they exist.
> I agree that good engineers focus more on the actual structure and problems instead of nitpicky things like formatting
I don't think you understand that formatting is of critical importance.
Sure, your code won't break if you add a space at the right or at the left of a symbol.
But your code will be reformatted the next time someone like you works on that file and takes the same naive approach to that code that you took.
That leads to PRs having a larger code footprint for no reason other than fixing the previous PR's failure to comply with a style guide.
This means tools like Git blame start to flag parts of the code as having changed recently just because you failed to pay attention to the style guide.
Now that regression that was introduced by an unrelated commit becomes slightly harder to track because it's buried between commits that add and remove white spaces around the problem, and the last change is just nitpicking around something you should have gotten right in the very moment you posted your PR if only you ran a linter or paid attention to the comments posted in your PR.
style fixes should always be separate commits. if they fix a PR, they should be part of that PR or an independent PR, but ideally not included in the next code changing PR
Not really. If you're already changing the code and running linters afterwards introduces changes over your change, this means you are the one introducing the problems. Separate commits just add noise.
Your comment is like saying that bug fixes should be separate commits when arguing about how not to add bugs to begin with.
> "Any proposal that requires everyone to just is not a solution, because everyone will not just"
invalidates this:
> A real solution is to have the team agree on a shared style guide, then enforce it with a linter and formatter.
... since the "team" is everyone. It's basically the same problem.
The other issue is that linters/formatters don't "solve" all formatting/stylistic choices. Most formatters, fortunately, still allow you to choose where you do line breaks for example, since they do matter and shouldn't be arbitrary.
A shared style guide is an external impetus to adjust behavior. It comes with external accountability, and also an implicit understanding that violating the shared expectation may bring consequences
"Everyone should just be less anal" is expecting an internal impetus to adjust behavior. There no external accountability, and there's no expectation that failing to do so has consequences
> The other issue is that linters/formatters don't "solve" all formatting/stylistic choices.
The point where this is the same is "everyone just needs to agree on a common style".
> 80% of a solution is better than 0%
I'm not sure if it's 80% or rather 20%, I guess it's a POV / arbitrary number.
In any case, this is far from a solved problem, while I often see in these type of discussions the idea that auto-formatters solve formatting/code style problem. "Just use black" while dismissing the idea that the rest has to be tackled informally / culturally.
Yes, but aligning the team culture is in theory also a one-time thing. And this is IMHO necessary in any case, no matter if you decide to enforce this alignment using auto-formatters or not.
> It's impossible for any code formatter to be 100%. Where to put a blank line? Where to break a line?
Not true. Those are objectively covered by any linter.
> How to name a variable/function?
Unless the issue is things like snake_case vs PascalCase, that's not the job of a linter. That's exactly what PR comments are about.
> Some try (e.g. prettier), and what you end up is frankly just bizarre code that's just ugly.
Not true. Without Prettier you always get bizarre code that's just ugly. You don't notice because you paid no attention to the code you wrote, it followed your personal subjective opinion you happened to have at that moment, and you didn't felt strongly enough to verify it.
Without a linter, others would certainly point out the problems they saw in your PR as that would certainly not comply with their personal subjective opinion they might hold at that moment and not before or after.
The importance of Prettier is that it objectively enforces a set of rules. If it's ugly it's because you configured it to be ugly, but it's less of a problem because it will be objectively, systematically and reproducibly ugly.
> Enforce a code quality standard with a linter and formatter and be done with it
Adopting a linter does not eliminate style issues. It just eliminates the number of possible comments a PR can get over style issues by replacing all detailed feedback with just one comment with a very clear actionable request that's trivial to satisfy: "It seems X doesn't look right. Could you please run the linter?"
This. You either don't care about style or push the style automatically without any interaction. Every other option is bad to some extent.
But only use a linter if you will add your rules over the empty set. If you get a pre-built set and are expected to remove the ones you don't need, you are making a toxic environment.
Automating this with linter and formatter is great. It moves the argument over style and format to a one liner change to a lint config instead of mingling it with the with the main code change.
If you continue to have people bringing up arguments over the linter and formatter after an initial agreement is made, then you can talk to those people
There are formatters that will enforce the basic case/underscore rules of naming conventions. I haven't seen one for the actual text, but I can see how that might be nice if it worked reliably.
My team has a bike shedding sort of problem where a 100 loc PR will sometimes get scrutinized to hell, but a 3,000 loc PR will get LGTM'd by enough of the team to be merged before anyone that actually cares gets a chance to look at it. I would say the second half of that is the much bigger problem.
People know who to ask to get a quick lgtm.
I don't know what to do about it. I can't make people actually review. I've had unrealistic dreams like holding people responsible for things they approved that had bugs, but anything like that would be supremely unpopular. I could do something like requiring review from the team members that actually review, but they already feel overwhelmed by being the only ones that actually review.
We tried setting soft limits on the size of PRs, but that comes with a lot of PR that are hard to review because the work is poorly divided and doesn't make sense in isolation.
The reality is that you can't make people to care more about quality than their (yours) management cares about it.
I would suggest discuss this with manager, and if he cares enough about quality, he can try to slowly coach the individuals to give better reviews. Or he can suggest that multiple experienced/reliable people also review the work.
Have you tried the code owners feature (assuming you're on Github).
IMO a good approach is to have the actual code owners (i.e. the team responsible for a specific service or library) review the PR. If they think a shallow LGTM review of 3k LOC is enough, they can also deal with the bugs :-)
If you don't have specific ownership in your code base I'd start there.
The answer is to require a high percentage of test coverage. Then you skim through the code to get a general idea of what it does and how, then scrutinize the tests to verify they enforce that understanding.
This does three things. First, it prevents people from dropping big chunks of code and asking their reviewers to make it work (if coverage is too low, it doesn't work). Second, the tests spare you the cognitive work of verifying the code does it's supposed to--especially edge and failure cases. Third, this frees you up to think about the bigger picture. Is this the right way to do this, should we be doing it at all, does this conflict with something else in the works, etc
Something helpful I have seen in this situation is to require the reviewer to test the changes for long PRs, and post proofs (a screenshot, logs, ...).
Even with the best of intentions, it's easier to miss things in longer PRs, plus it gives more time for others if they want to have a look. It's not bulletproof of course, but at least it makes sure the basic functionality is working.
Agree. I've made it a habit the past few years to include before/after screenshots in my PRs and sometimes a screen capture of the UI/workflow changes. Also, screenshots in the issues pages for bugs or anything UI related.
I find it makes it easier for people to see and understand the goal of your changes and helps them focus on the how in the code review.
When others do that in their PRs I can jump straight into the tests files and get a better understanding of the what is proposed (or at least start the review process in a more substantial way - get the important stuff addressed early on).
Speaks volumes to the maturity of our work that a 3000 LOC PR is seen not only as acceptable, but expected.
Raising a planning change equivalent to a 3000 LOC PR in a civil engineering firm would get your assignment swiftly handed over to someone more competent.
LOC is a bad measure of literally anything. It's not related to productivity, efficiency, bug count, readability, etc. What it's most related to is the choice of language and libraries. For me, shunning big contributions just for being big just stifles my drive to contribute. If the solution does what it's supposed to, is valid, efficient and there's no obvious unnecessary duplication of codethen who cares if it's big. All valid contributions push the project forward.
It is actually directly related to the time and complexity of code review. Just because LOC is an imperfect metric for productivity etc., does not mean it's a poor metric for risk, complexity, poor sequencing of changes, or other attributes.
Whenever I encounter a pull request that I find many issues with, I ask to meet with the engineer and review it one on one. More than half of the workplace problems engineers have is due to their introverted nature and their refusal to get on a teams/zoom call to explain their issues and get resolution. Comments are a really poor mechanism for teaching programming best practices.
At my last job, the rule was if the PR is big enough, get on a call and go over it in person. ideally with more than one "reviewer", to break stalemates. Worked really well!
I used to be that introverted guy, but a few years of pair programming completely cured that, and I'm now very comfortable discussing design in detail and at length, in a (if I may say so myself) friendly and constructive way.
If you've never discussed design with other people, it's a genuinely difficult thing to do!
A PR big enough should probably have its changes implemented and reviewed unit by unit on a feature branch before its creation. Or it should be an already approved work (refactoring, reformatting,...)
And if it's refactoring / reformatting, then it should be pair-programmed to validate that exactly what was said to have been done was done and the pair programmer should confirm such in the PR.
I would love this approach, but the people reviewing my PRs are 10 tzs ahead of me, so we basically can’t meet synchronously unless someone is willing to meet early in the morning or late in the evening.
Maybe prerecorded code walkthroughs would help here, but nothing replaces instant face to face pairing. That’s where you can provide somewhat nitpicky feedback but with the empathy that comes with our instinctual responses to body language and voice tone… which over time builds a much better culture IMO.
Personally I have a strong distaste for projects that try to use some metric for how long a function should be, e.g. line count or cyclomatic complexity. I'm not sure if this one is better automated, human judgement for what makes sense seems better to me. Sometimes the cleanest and highest performance way to write some code is going to be basically one relatively large function. If there's a function complexity/size lint I'll often chunk off helper functions that are virtually useless outside of that specific function, and in doing so, often make it harder to follow what's going on.
This isn't universally true of course, but it's a "when not if" situation if you have a hard lint limit and it's not very high. (e.g. In my opinion if you're going to have a lint for this, set it to something rather obscene, something that would be extremely hard to cross without doing something clearly awful. In my opinion it's always been set at least 4 times lower than it should be.)
Linux C style guide has this summarized pretty well:
"The maximum length of a function is inversely proportional to the complexity and indentation level of that function. So, if you have a conceptually simple function that is just one long (but simple) case-statement, where you have to do lots of small things for a lot of different cases, it’s OK to have a longer function.
However, if you have a complex function, and you suspect that a less-than-gifted first-year high-school student might not even understand what the function is all about, you should adhere to the maximum limits all the more closely."
"Less-than-gifted first-year student" is btw a great bar for the conceptual complexity. Another one I've seen elsewhere is "An SRE not familiar with the code woken up at 3 AM to troubleshoot an outage".
What really matters is "can I understand what this function roughly does at a glance?" In general, "length of function" is a reasonably-ish proxy for it, but as your quote says, sometimes longer is okay, and sometimes there are also other factors than just "length of function".
Either way, the question you should be asking during review is "can I understand this function?" and not "how long is this function?"
Unfortunately I think that cyclomatic complexity is a rough measure of this. It’s an attempt at an objective measurement, but I’ve never found it to be a particularly good one.
It shouldn't be a hard rule, but in most cases a sufficiently long function will have opportunities for pulling out functions with sensible signatures that make sense outside that context.
All a linter does automatically is require adding the "ignore this line" comments that are then visible to humans in diffs.
A linter directive also tells humans that somebody has decided that the function being long is acceptable in this specific location. It's been acknowledged. In many cases, that's all I want.
Then you go from pre-linter arguments about line/function/class length to arguments over whether it’s appropriate to override the linter rule in those same instances.
> in most cases a sufficiently long function will have opportunities for pulling out functions with sensible signatures that make sense outside that context.
And will make understanding the function a lot more difficult.
If you want an example of this, look at any C++ code that uses DX12 or Vulkan. It will be atomized sufficiently that you'll have to hunt through dozens of files to figure out which constructor/destructor where something fired. Part of the problem is that C++ discourages naked functions operating on structs rather than member functions on classes.
Whereas, if those functions were still all linear in a single function, you'd find what you were looking for in the same file you were already in.
That having been said, I'll probably organize that big function as though it were separate functions. But, then, I'm not allergic to adding an extra block level here or there which GASP might exceed 80 columns.
One core issue is that PRs are often too large. I don't believe there are 300 comments on a small PR unless the team is dysfunctional.
Aim to make them as atomic as possible. If it involves adding a significant feature that requires substantial changes to other modules, create a separate branch for it. Within this branch, break the work down into a series of smaller PRs.
When PRs are too large, the process usually suffers from insufficient reviews, excessive nitpicking, or (most often) a combination of both.
I can't imagine going back to working on codebases in languages that don't come with strongly opinionated defaults.
I'm very lucky to spend the majority of my dayjob hours working on Rust code, so everyone just runs cargo fmt && cargo clippy, and this in enforced in CI. You can't even publish a PR until those basic bars have been met.
I can't imagine the absolute insanity of working on JS projects where there are more ways to do things than there are people on a team, and where getting those people with strongly held opinions about things that ultimately just don't matter to agree on style and convention is like pulling teeth.
JS has excellent tooling for standardizing code style, Prettier for opinionated formatting and ESLint for everything else. If some teams decide to not use these tools, that’s their mistake.
I'm shocked how many companies still don't focus on DevX and wonder why not only is productivity in the gutter, but morale as well. There's so many companies that help solve DevX problems like this, that's it's practically easy to throw money at it.
I ask myself this often, usually while despairing that the lead dev doesn’t see any point to build pipelines, unit tests, SQL migrations (literally right clicks and edits a database schema), and the requirement of needing Remote Desktop to a VM to do anything.
Can confirm GitHub's process is poor for small PRs (if that's what you mean by "multi-commit features"). My team has mostly gotten around this with custom CLI tools (for pushing small PRs chained together) and web apps (for concisely viewing your code review status, both giving and receiving).
Recently I allowed everyone in my team to push to develop without PRs and even without doing feature branches. We review the code all together (max 2 ppl) just before the release and that's it. But great CI pipeline is must for this. It will get even better soon when AI will be able to slightly refactor the code.
Yeah, I've heard arguments for this, and it always feels weird to me. Why go so long without review? That's just giving developers - junior and senior - opportunity to dig themselves into a hole that they don't notice until someone points out, "hey, your last five commits are buggy, but that matters less than the fact that they're implementing the wrong thing."
Good point, but my team is small and has only top engineers with both soft and hard skills. Many times you can do much more with team of 3 than team of 10 ppl.
I’d also add that it’s helpful to have a review guideline.
When I get to the PR phase, I’ve already done the design work, considered the trade-offs, and am asking folks to check that the code is mergeable per our guidelines.
A pet peeve of mine are reviewers with a bone to pick who will leaving blocking comments to redo the work in the way they would have approached it.
A guideline helps here because you can, along with your team, manage expectations for the process.
Some folks like to throw up code for critiques. They use PRs to get feedback on the design/approach itself. That can be useful!
Going in to manage expectations helps smooth things along so that you’re giving the constructive feedback the author is looking for.
Part of me wonders if the real issue is we don't have an author/editor type system. If the reviewer/editor could just make the nitpicky changes in about the same time it takes to call them out, the relationship might be much more healthy. Things could go back and forth in a much more healthy way.
We do, in a way, but it's not recognized as such. The "author" would be the mythical 10x developer that races to get feature tickets completed, the editors are the ones getting stuck with integrating a pile of stinking hotness into the existing codebase. That kind of setup can work fine if everyone is aware of the others' value, but usually non-technical managers only see the hotness.
Sorry, I was not clear.
I don’t mean software system, but a culture of operating in that way.
Even the language “Code Review” is not one of cooperation, but creates thoughts of a “movie review” where one rates and finds error; not collaboration and mutual editing.
I have always found articles/discussions of code reviews fascinating. In the 10(ish) years I've been employed as a programmer I've never worked in a department with more than ten people and every time the work is distributed such that each person works as an IC on their own siloed project(s). Receiving anything besides the "LGTM" rubber stamp or nitpick comment seems almost impossible.
This is really important to get right. There's a balance as you want to avoid introducing crap but a little restraint and wider awareness makes a huge difference.
I recall being significantly put off after making my first significant contribution online to a non-trivial repo.
I was a volunteer and had been making various very modest contributions (ie simple fixes) along with efforts around docs and answering questions from users. A major contributor was known for being cantankerous. He knew what I was bringing in general, and seemed grateful yet didn't hold off with pretty forceful feedback on my first attempt at something more substantial - I had taken a day off to get it in good shape, I knew it was solving something with broad use (ie not just for me!). There were a litany of points. I fixed several but each time there would be more, it was like water on Gremlins! I gave up and stood back for a bit. Ironically it would have been trivial for him to take the code in but it sat forever and the whole thing was a waste of time but I guess I learned a few things!
Watched a guy get fired mostly because he made code review such a nightmare for everyone else while getting lapped by everyone in terms of actual output. I’m not sure there’s much to be done once you’ve hit “toxic.” The only question is whether the standard can be reset for the whole team.
The only thing I disagree with him on this is that function length doesn't matter. Longer functions do tend to be harder to understand. I will admit that sometimes a lot of helper functions can make it harder to piece together, so you can overdo it, but usually putting one or two of them back into the main function gives a very readable function.
So I would tend towards more smaller more decomposed functions with descriptive (but concise as possible) names by default. This is very different from old-fashioned C standards, which I feel are very outdated for modern tooling and hardware.
It's definitely important to have formatting standards to avoid arguments about that. But you are never going to completely eliminate stylistic things because part of that is function decomposition or intersects with the actual substantive design. Because some people feel quite differently about function size for example.
I've worked with plenty of devs who embody the toxicity here, lots of stylistic nitpicks and don't see the big picture. All the noise generated distracts from catching the important stuff.
People being an obstacle to positive change, because they believe a better change is possible.. when the changes wouldn't in any way prevent further positive changes from happening.. are the bane of my life. 'bikeshedding' probably covers it, but i feel like if someone could come up with a better metaphor that wasn't specifically about triviality or level of comprehension it could be a profound contribution to human behaviour
I’ve had a PR into an open source Anthropic library open for almost 2 months. It passes all of their rules, fixes an obvious bug that causes functional issues, passes all tests, and has a description that is descriptive with a complete user story explaining the functional issue with images. I really like using Claude and I like Anthropic’s mission but it definitely kills morale to fix something and be put into PR purgatory.
2) Ship / Show / Ask (https://martinfowler.com/articles/ship-show-ask.html), where "Show" and "Ship" are non-blocking PRs (or even directly committing to trunk, if you use trunk-based development), since not every(!) PR needs reviewing and/or should block the PR creator
You are not allowed to complain about style in code reviews.
If it’s important enough for your to comment about, it’s important enough for you to add a linter to the build and block the build if the style isn’t met.
If it’s not worth enough of your time to add a linter, it’s certainly not worth your peer’s time to deal with comments about style.
We have some styling rules that are so much in the "Style Over Substance" category that I have a strong tendency to subconsciously categorize them as "not important, at all".
Especially when the IDE doesn't respect them when you use the code generation.
But as it was a strong irritant for some PR reviewers, and as I couldn't force my mind stop doing it, I automated it by writing a custom linter.
GIT seams to be optimized for network of trust. With one person at the top approving what gets merged into the release.
This person of course does not do all of the verification, other then broad strokes of what the change does, and who wrote it, reviewed it and tested it.
I feel like companies do not want a large tree like structure for their development teams.
Without a network of trust it can become mob rule, which is what this article appears to be describing.
If you join a company and are confronted with dozens of PR comments - assuming this process isn't sufficiently well documented - see if a tech lead or manager will host an open forum review. Review the code and the comments and discuss + document. A one off discussion won't be a permanent fix, but it can help!
> Using CREG (Code Review Emoji Guide) puts more ownership on the reviewer to give the reviewee added context and clarity to follow up on code review. For example, knowing whether something really requires action (), highlighting nit-picky comments (), flagging out of scope items for follow-up () and clarifying items that don’t necessarily require action but are worth saying ( , , )
I don't know the original story, but 300 hundred PR comments on a first PR (or any PR for that matter) is insane, and my guess is that somebody on the team is having an argument back-and-forth in that PR, rather than focusing on whether the code should be merged.
If that happens to me, I would kindly invite them to talk about it elsewhere and give me pass, because the fact that you guys are still debating it, means that it's not a decided thing, and you can't prosecute me with a law that hasn't passed yet. Once it's actually settled, I'll be happy to go back and refactor it, but it is illogical to make it a blocker in the present.
It also sounds like something is wrong with the on-boarding process. The first PR should not be of such a nature that would invite loads of comments. It should be something that is functionally simple, and already has an agreed-upon design.
I was once on a team where this was not unusual, and the problem was the tech lead was the one engaging in 300 PR comment arguments.
This would result in stuff sitting in purgatory forever.
The funniest part was this was, in 20 years, the absolute worst codebase I'd ever seen, so he wasn't even maintaining some high standards in the codebase by doing so.
Sounds about right. If so many technical decisions are made by a single individual while reviewing PR, rather than having a proper standard agreed upon before hand, it's no wonder that it grows into a horrible code base.
I think PRs are also the wrong place to litigate any of this.
Requirements, specs, architecture, etc sure.
But most systems I've worked on that were especially BAD were that way not because of some PR-reviewable syntax or style, but because the whole foundation was wrong.
The most idiomatic, consistent, linted, unit-tested, etc code in the world doesn't matter that much if the underlying architecture is a mess.
Combine this with a large timezone difference and you get a (literally) slow motion disaster. Imagine waking up and finding your change thoroughly reviewed but not approved because of a triviality. Not a great day.
This is my life right now and it is terrible, can confirm. I’ve tried advocating for what you say to no avail. Or like, just opening a new PR into my branch, or directly committing to my branch. I’m not territorial, I just want progress. Instead people love to bicker and cover their own ass.
I worked at a startup where a developer used to bully other developers in PRs. Management looked away. I warned them but they didn't listen. That team is now dissolved.
...just how big was that "first ever contribution" pull request that it gathered 300 comments? My first commit in the current company was a 2-liner addition to an existing function plus another 20 lines for a test that verified that indeed, the change does affect the outcome.
Even a simple (and very appropriate for a junior/newbie) change like you suggested could invite dozens of comments if the engineers don't know what they're doing.
"Dozens" means "at least 24", so that'd been slightly more than 2 comments per line in my example. Which is an insane amount of comments by any standard imaginable: because long before you reach 1-1 comments-to-code ratio, you'd stop, and write a single "no, all of this is a wholly incorrect approach, complete re-do is needed" comment instead.
You’re right, but some people out there will flex and grandstand in their reviews if they operate adversarially instead of collaboratively, which is what your suggestion would encourage.
Code reviews are weird. I joined a very small team as a new developer. I got a task, solved it but at code review the senior developer told me to solve it in another specific way. I coded it and it was fine. The situation repeated for few next tasks. It was a bit frustrating because I felt like doing the job twice while "pre-review" of my tasks by the ultimate code reviewer before I started doing them could save both my time and his. But it just wasn't part of the process.
Does such thing have a name? A quick check up with a reviewer about whether my plan for implementing the task is align with their preferences?
When it comes to code reviews, the return on investment faces the Law Of Diminishing Returns. While many of the comments made in code reviews might be interesting, they are not so interesting that they pay for themselves. If you put some dollar value on the time invested, you'll find that the vast majority of this process is simply burning money. And not only money, but also, as this article says, morale.
A curious fact about the politics of modern tech teams is that some of the same people who consider themselves "anti bureaucracy" are strongly in favor of this type of bureaucracy, even when it brings no measurable benefits.
My opinion of code reviews has become more negative over time, mostly due to the fact that I have not seen it achieve reliable positive outcomes.
I'll give specific examples:
https://www.futurestay.com had the worst code that I'd ever seen. When I joined the company I was horrified at the level of tech debt. And yet, for years, they had a rule that at least two engineers had to review every PR. So this terrible code, the worst I'd seen in my 24 years of coding, had been approved by two engineers.
And likewise:
https://openroadmedia.com also had very bad code, and also had a rule that nothing could be pushed to production until at least 2 engineers had reviewed the PR.
In both companies the code review slowed down the deployments, slowed the down the team, and created a culture of internal sniping, while leading to no improvements.
But how can I say the code was objectively bad? Because many bugs were found in production. And what would reduced the number of bugs in production? More tests, especially end-to-end tests. And so I eventually came to this conclusion: it is best to skip code review, and instead have the team invest that same time in writing more tests, especially high level tests. For the most part, if the engineers on a team are bad then the code reviews will also be bad, but if the engineers on the team are good, the the code reviews will not be needed. And in both cases, the path to improvement comes from writing the kinds of tests that ensure no bugs get into production.
As I've grown in my career, and taken on higher level management jobs, I've also realized that code reviews do not scale to large-scale leadership, but tests do. Code review seemed like a good idea when I was leading a team of 3 engineers, but not when I was leading a team of 30. When I was leading a team of 30, the only tool-of-oversight that worked for me was automated testing. And so I've concluded this is where the effort should be made. Sitting at my own computer, I cannot review the work of 30 engineers, but I can still run the tests on the various software, to see if they are passing. In particular, I can look at API interfaces and then change the dummy test data to break the tests, and this lets me see quickly thorough the test coverage is, and how prepared we are for unexpected shocks.
When I am running a team I will assign software developers the task of writing various tests, including high-level end-to-end stress tests. These tests sometimes reveal problems. We then respond to the problems. But we don't waste time responding to problems that have not been proven by a test, which is to say we do not do code reviews. Code reviews are all about predicting what might be a problem. But much of the concerns are phantoms. It is better to respond to real problems that have been revealed by tests.
I've also come to realize that software developers, quite naturally, develop strong opinions about questions of style, and yet these questions have no long-term impact on the health of the tech in the company. Other than enforcing the rules of some automated linter, the time spent on issues of style are 100% wasted. If you allow the team to spend even one minute discussing issues of style, then you are setting money on fire. But I know this opinion is unpopular, because software developers enjoy enforcing their own opinions about style. But it is all bikeshedding, it has no real-world impact.
I have the impression that there is some middle era, in the career of a software developer, where concerns about issues of style and organization tend to peak. The junior developer does not know enough to care about such issues, but somewhere between 4 years of experience and 10 years of experience, these issues feel important. I think you need more than 10 years of experience to see the waste. In particular, you need to run one large project where the team invests a lot of time into code reviews, and you need to notice how much tech debt builds up, despite the code reviews, to realize that code reviews do not offer a path forward.
On recent projects I've told the team there will be no code reviews, but instead we will focus on building tests. I get a surprising amount of pushback on this. Less experienced software developers get angry with me and tell me that I'm being unprofessional. In some sense they are correct, in the sense that "professional" refers to "standard norms that are accepted by a profession" -- I am deviating from those, clearly.
The most reasonable criticism I get is that code reviews could catch not just problems of style but problems of algorithms. What if, they ask, some junior developer introduces code that is ignorant of the implications of Big O Notation? What if they introduce code that runs in polynomial time? But I would ask, how do we know if it is running slowly? We can only know that through tests. So lets build tests that measure time, and stress test with large loads. Big O Notation offers an excellent example of where software developers can worry about the wrong thing: what if a junior level software developer writes code that runs in polynomial time but on data that will only ever have a few hundred records? While the algorithm might be sloppy, the code will run quickly because there are few records? In that case, any time invested to find a different algorithm will be a poor investment. Once I'm running a team of 30, the only thing I care about is whether code is actually slow, and I discover that through end-to-end stress tests, not code reviews.
Kent Beck, when he invented Xtreme Programming, also popularized the phrase "Do not write a comment, instead, write a method with an easy-to-understand name that communicates what the comment was going to communicate."
I've come to a similar conclusion. Do not write a comment in a code review, instead, write a test that would catch whatever danger you want to warn about. And if you cannot find a way to express your concern as a test, the return-on-investment of worrying about that concern is probably zero, so we should ignore it.
> I've come to a similar conclusion. Do not write a comment in a code review, instead, write a test that would catch whatever danger you want to warn about. And if you cannot find a way to express your concern as a test, the return-on-investment of worrying about that concern is probably zero, so we should ignore it.
What if your concern is "this approach makes the code difficult to test"?
You discover that by writing the test. You don't try to predict that ahead of time. The goal is to get away from trying to predict problems ahead of time, because most of those problems turn out to be phantoms.
For me this says more about the company culture than any inherent flaws with the code review process.
reply