How I review code (engineering.tumblr.com)
234 points by peterstensmyr on Jan 24, 2018 | 138 comments



"Senior engineers sometimes need to be reminded that highly performant, abstract, or clever code is often difficult to read and understand later, which usually means asking them for more inline comments and documentation."

Ha! That's not a senior engineer. Senior engineers write the simplest-looking code that just works, in every rainy-day scenario imaginable. The clever code writers aren't there yet.


Reminds me of the "Evolution of a Haskell programmer":

https://www.willamette.edu/~fruehr/haskell/evolution.html

Make sure you don't miss the punchline, "Tenured professor".

It's the same with the progression of engineering seniority: increasing levels of cleverness and unnecessary sophistication, until you reach a point where you don't have anything to prove anymore, and you can feel comfortable writing the simplest and most elegant solution.


That was great, though not being versed in Haskell, I tried to follow their link to the original version, but it had died.

Here's the original:

http://www.ariel.com.au/jokes/The_Evolution_of_a_Programmer....

The punchline got me pretty good.


It's kind of amusing to me that a lot of engineering interviews also seem to be focused on doing things that you generally won't and shouldn't be doing for the position. What's the point of seeing if someone knows data structures and algorithms they won't be using if they can't write a web application that interfaces with a SQL database without doing queries in a for loop?


> What's the point of seeing if someone knows data structures and algorithms they won't be using if they can't write a web application that interfaces with a SQL database without doing queries in a for loop?

Isn't this the basic architecture of a Node.js "event loop" CMS that gets content from an RDBMS? ;-)

  # pseudo code
  while new_user_request:
      query_db with request parameters
I know that's not what you meant; you meant to do the above, rather than:

  # pseudo code
  while new_user_request:
      query_db for user data
      query_db for session data

      for item in (parameters, user_data, session_data):
          query_db with each item
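To make the contrast concrete, here's a minimal sketch in Python using an in-memory SQLite database (the table, column names, and data are invented for illustration): the first loop is the queries-in-a-for-loop shape, often called the N+1 query problem, and the second fetches everything in one round trip.

```python
import sqlite3

# Toy database standing in for the CMS's RDBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(1, "ada"), (2, "grace"), (3, "alan")])

ids = [1, 2, 3]

# Query-in-a-for-loop: one database round trip per id.
rows_slow = [conn.execute("SELECT name FROM users WHERE id = ?", (i,)).fetchone()
             for i in ids]

# Batched alternative: a single query fetches all the rows at once.
placeholders = ", ".join("?" * len(ids))
rows_fast = conn.execute(
    "SELECT name FROM users WHERE id IN (%s)" % placeholders, ids).fetchall()
```

With SQLite the difference is negligible, but against a remote database the loop pays network latency once per id, while the batched query pays it once total.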


I've been doing a lot of Haskell on Codewars.com recently. This is exactly what I see. I write up a long solution that uses the basics, like pattern matching and taking heads of lists. The solutions with the most "Best practice" up-votes are usually something involving importing Control.Monad and other similar stuff.


If you write code that other people have to ask you about, you will reach a point where you don’t have time to write code anymore because you’re too busy answering for your own code. You may interpret this as “helping your less fortunate teammates” but the wise know they can reduce your influence on the team by sending more people to ask you questions.

Giving them more time to clean up your mess and you none to make it worse.


Sites like Codewars seem to me like good ideas starting out, but it doesn't take long for them to turn into an arms race of complex one-liners that people vote up as the coolest solution, not the best in terms of productivity for a company.


Ha, it does make me think of this tweet: https://twitter.com/KevlinHenney/status/381021802941906944

You nailed it, really: senior engineers' code is the simplest-looking, as generally they've picked the right abstraction for the problem.


Is this a "senior engineers are literally superheroes" meme I've missed?

Everyone is capable of making poor decisions and straight-up logic errors. Senior devs sometimes more so, because we tend to get entrenched in a particular issue solo for longer periods.


Not at all, and it's not how I view senior engineers. True senior engineers do tend to find the right abstractions more often than other engineers, but that doesn't mean they can't make huge mistakes with bigger implications.

My thoughts are also about actual senior engineers, not people with two years of experience.


I think of it more as a 'no true senior engineer' argument.


Text of the tweet so people don't have to click:

A common fallacy is to assume authors of incomprehensible code will somehow be able to express themselves lucidly and clearly in comments.


Senior engineers are probably in meetings, not writing code.


ding ding ding


There is a conundrum here: architectural mismatch.

Often, the "right abstraction" for a problem is not call/return based. So you get to choose between having the right abstraction, and code that is simple when viewed with that abstraction in mind, but "weird". Or alternatively choose the wrong but better-supported abstraction and have code that is needlessly complex but "straightforward".


> they've picked the right abstraction for the problem

I hate to agree with you, but "picking the right abstraction for the problem" elegantly expresses what I've been trying, far less succinctly, to tell people for the last few years (I usually ramble on about "underlying data models" and "what this actually is in the real world, not just how we view it in our application").

The reason I hate to agree with you is that I just became very disappointed that this is a difficult skill to master. "Just use the right model and the code is easy, duh. Why aren't you doing this?" Well. Now I just feel like a jerk. I thought they just gave me the "senior" title because I was old.

It's unfortunate that one of the most important skills in the industry is so intangible and difficult to quantify. Even more difficult to teach.


Ha! That's good engineers. Senior engineers write an ambitious unusable framework, get promoted and move on to the next project. So the article's advice is ironically spot on.


Tell me more about getting them to move on. I’m asking for a friend.


We call people senior after five years. Mastery can take a lot longer than that (and mastery doesn’t mean the end of learning).

Sooner or later someone will have to debug this code late at night. Don’t make it require brain cells.


Is this a no true Scotsman?


That might be true in theory, but in practice most people I know with that title (& most with titles above it) do this sort of thing more often than junior developers.


I echo the author's point in "Review the code with its author in mind".

Without comments, sometimes it's really, really difficult to navigate the code. I have been adding more comments than ever: don't assume every line is obvious; write a comment to explain what the next few lines really do.

    # base case: stop dividing when we find the largest square.
    if width == height:
        return width, height
    else:
        # otherwise, we know that we can break the land into several
        # pieces, some are already square-sized, but there must be
        # left over, which is the difference of the width and height,
        # and can be divided again.

        remain = abs(width - height)
        return largest_square_plot(remain, min([width, height]))
^ This code is only for me to read, so I didn't really care much about grammar... but a year from now I shouldn't have trouble understanding the code in a minute or two.

Some of my functions/methods have pretty long docstrings, which may include the rationale and perhaps even some ASCII diagrams. If you have trouble understanding a piece of code after a few passes, that's bad code. Also, use more newlines...

> I check every Github email I get; I make sure that I don’t get notified for everything that happens in the repo

Not sure about others, but I am tired of reading PR notifications in my mailbox. I don't know how kernel developers live with this.

I have been thinking about just building a bot that:

* receives PUSH from GitHub

* adds events to a queue

* notifies based on priority

* pings me once in a while to remind me that I have outstanding PRs to review (as reviewer and as author of the patch).

If someone needs me to review right away, he/she can reach out to me directly in chat.
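The queuing part of such a bot is essentially a priority queue that gets drained on a timer. A toy sketch of just that piece (the event kinds and their priority ordering are invented; the actual GitHub webhook wiring is left out):

```python
import heapq
import itertools

class NotificationQueue:
    """Collects webhook events and yields them highest-priority first."""

    # Hypothetical priority levels: lower number = more urgent.
    PRIORITY = {"review_requested": 0, "mention": 1, "push": 2}

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps FIFO within a priority

    def add(self, kind, payload):
        prio = self.PRIORITY.get(kind, 99)  # unknown events sink to the bottom
        heapq.heappush(self._heap, (prio, next(self._order), kind, payload))

    def drain(self):
        """Pop everything, most urgent first; e.g. called once an hour."""
        while self._heap:
            _, _, kind, payload = heapq.heappop(self._heap)
            yield kind, payload
```

So a `push` event queued before a `review_requested` event still comes out after it when the queue drains, which matches the "notifies based on priority" step above.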


On a tangent: Assuming the first line is "def largest_square_plot(width, height):", you may be interested to know that your code computes the greatest common divisor [1].

If I'd been the one to write this function, I would have done it like:

  def largest_square_plot(width, height):
    """Computes the largest square to tile a plot of the given width and height."""

    # because we want the grid of squares to fit exactly
    # the size of the squares needs to divide both width and height
    # to get the largest square, we use the greatest common divisor
    from math import gcd
    square_size = gcd(width, height)
    return (square_size, square_size)
... but now I realize that probably not everyone is intuitively familiar with the kind of math that involves the GCD, so your code is actually much more intuitive without that background.

[1] https://en.wikipedia.org/wiki/Greatest_common_divisor


I recently read an article that was linked from HN comments on a different topic. It spoke about how software is not about code or documentation but rather about building complex mental models of the systems, which the author called the Theory Building View. It was interesting to read but one thing popped into my head while reading it. Too often we make comments about what a function does, which may be necessary but is not sufficient. My light bulb clicked on and I thought that what we really need is to document the function's reason for existing. This doesn't change even if the code inside does and is the thing that you actually really care about as it gives you some idea about the architecture.


A good counter-argument to comments:

https://blog.codinghorror.com/coding-without-comments/


That article doesn't make a compelling case.

It suggests taking this code:

  // square root of n with Newton-Raphson approximation
  r = n / 2;
  while ( abs( r - (n/r) ) > t ) {
      r = 0.5 * ( r + (n/r) );
  }

  System.out.println( "r = " + r );
And refactoring it to this function:

  private double SquareRootApproximation(n) {
      r = n / 2;
      while ( abs( r - (n/r) ) > t ) {
          r = 0.5 * ( r + (n/r) );
      }
      return r;
  }

  System.out.println( "r = " + SquareRootApproximation(r) );
I'm all for this refactoring, but something was lost in the process. What kind of square root approximation is being used? Does the algorithm have a name? What would I search for if I wanted to read more about it? That information was in the original comment.
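One compromise is to keep the algorithm's name in the function name or docstring, so the searchable reference survives the refactoring. A sketch in Python (the `tolerance` parameter is my addition; the original snippet leaves `t` undefined):

```python
def sqrt_newton_raphson(n, tolerance=1e-9):
    """Approximate the square root of n (n > 0) via Newton-Raphson iteration.

    Naming the algorithm here preserves what the deleted inline comment
    provided: a reader can still search for "Newton-Raphson" to learn
    how and why this converges.
    """
    r = n / 2  # initial guess
    while abs(r - n / r) > tolerance:
        r = 0.5 * (r + n / r)  # average the guess with n / guess
    return r
```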


There's an infinite amount of detail that's impossible to capture in a comment and which invariably changes over time and doesn't hold in the future.

For my team, the solution has been writing longer commit messages detailing not only what has changed, but also the why and other considerations, potential pitfalls and so forth.

So in this case, a good commit message might read like:

  Created square root approximation function

  This is needed for rendering new polygons in renderer Foo in an
  efficient way, as those don't need a high degree of accuracy.

  The algorithm used was Newton-Raphson approximation; the accuracy
  was chosen by initial testing:

  [[Test code here showing why a thing was chosen]]

  Potential pitfalls here include foo and bar. X and Y were also
  considered, but left out due to unclear benefit over the simpler
  algorithm.

With an editor with good `git blame` support (or using GitHub to dig through the layers), this gives me a lot of confidence when reading code, as I can go back in time and read what the author was thinking originally. This way I can properly evaluate whether conditions have changed, rather than worry about the next Cthulhu comment that no longer applies.


Now you have to look in two places for the information: the code and the commit messages.


How so? The code still documents what is happening; the commits, however, lay out the whys.

The point is that these two are separate questions, and that trying to use comments as a crutch to join the two religiously is a headache. It's impossible to keep everything in sync, and I don't want to read needless, or worse, misleading information.

What's worse, in comments we often omit the important details, such as why the change was made, what other choices were considered, how the thing was benchmarked, etc.

That said, comments still have a place. Just not everywhere for everything and especially not for documenting history.


I disagree. I think the "whys" belong in the comments; in fact, that's the most important part of the comment if the code is cleanly written. I don't want to be happily coding along, get to a glob of code, have to go to the repo pane, hunt for the commit that explains this particular thing, then read a commit message. Put it in a comment in the code. Pretty please.


That isn't actually important unless you have multiple algorithms, which is when you create an ISquareRootApproximator interface, and have NewtonRaphsonAlgorithm and any other classes implement it.

Then you can have them duke it out running identical unit tests in the profiler.

That creates a clean separation between the people who just need to know what a function does and those that need to know how that function works. People who just need an approximated square root fast can understand perfectly well what ISquareRootApproximator.ApproximateSquareRoot( r ) does, and don't necessarily care whether your "get me a square root approximator" function returns a NewtonRaphsonAlgorithm, or a CORDICAlgorithm, or a BitshiftsMagicNumbersAndLookupTablesAlgorithm, or something else, so long as it approximates a square root.
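Sketched in Python with an abstract base class (the class names mirror the hypothetical ones above; this illustrates the pattern rather than any real API):

```python
from abc import ABC, abstractmethod

class SquareRootApproximator(ABC):
    """Callers depend only on this; which algorithm runs is an implementation detail."""

    @abstractmethod
    def approximate(self, n: float) -> float:
        ...

class NewtonRaphsonAlgorithm(SquareRootApproximator):
    """One interchangeable implementation; a CORDIC version would be another subclass."""

    def __init__(self, tolerance: float = 1e-9):
        self.tolerance = tolerance

    def approximate(self, n: float) -> float:
        r = n / 2
        while abs(r - n / r) > self.tolerance:
            r = 0.5 * (r + n / r)
        return r
```

Identical unit tests can then run against every subclass, which is exactly the "duke it out in the profiler" setup the parent comment describes.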


I won't speak for anyone else, but I've written some really good code and then had a hard time understanding what the heck I did after a year.

Jeff's arguments are weak:

* code should be readable <--- I agree

* good comments require good writers <---- not really; comments are for fellow programmers to read. If you want to document your APIs, yes, spend good time on it and get feedback from your colleagues. Inline code comments are internal, so write them in the most natural way possible. I've read "shit" and "fuck" before (see the Firefox codebase, very funny).

* refactoring <--- I agree, but refactoring and documentation are not mutually exclusive; treating them as such is wrong.

* His example is way too simple. Try a more difficult approximation function.

Jeff needs to realize that code produced at work is not meant for one single individual. If I leave, I want my co-workers and their future co-workers to have a good time navigating the codebase.

Use comments wisely, but don't avoid them! Adding 10 extra lines of comments to the file is better than a one-liner no one can understand. Let's not run a 100-line competition when we are writing code professionally. I shouldn't frustrate my reviewers and make them beg me to explain. Use newlines to make your code more readable as well.


> * refactoring <--- I agree, but refactoring and documentation are not mutually exclusive; treating them as such is wrong.

Commenting is in tension with refactoring, because nothing enforces that comments remain correct when code is refactored, so they tend to go wrong.

> Use comments wisely, but don't avoid them! Adding 10 extra lines of comments to the file is better than a one-liner no one can understand.

But worse than a one-liner everyone can understand. I think Jeff has it right: write comments as a last resort, when you can't make the code readable enough without them. But don't use them as a crutch to avoid fixing the code.


I think his example proves the opposite of what he intends. His example is just begging for a discussion of how the approximation actually works, edge case considerations, error bounds, assumptions, limitations, complexity, and references. At least that's what I would want if I was looking at it with fresh eyes. Sure, the author's contact info is a little tongue-in-cheek but without comments I can only know what the code does, not what it's supposed to do.


> I echo the author's point in "Review the code with its author in mind".

I don't understand how the rest of your post relates to that, although I think it's an interesting point and following discussion.

On the topic of reviewing code with the author in mind, I'm not sure I agree at all with the linked article. Does it matter who wrote the code in any way? Good code is good, and bad code is bad. It may be a helpful hint to remember the author was a senior engineer (who may "know more" than you do), but is it really something to keep in mind the entire time you review?


If you know your team well, it will help you keep an eye out for common mistakes they've made in the past. It may also help adjust your tone, as developers you've worked with for a long time will understand light humor or other well-intended comments that might be read as off-putting by newer devs.


The anti-pattern to avoid here is assuming code written by senior engineers is inherently "better" in some way than that by a junior. Yes, it typically is, but the code should speak for itself. Ad hominem assumptions add little value.

Similar to blind testing in musical auditions. http://gap.hks.harvard.edu/orchestrating-impartiality-impact...


FYI: if your variables are floats, you will be splitting hairs by the time this code terminates.


A conundrum for me is how to get other people to code review the way I want to be code reviewed. In particular, I've noticed the code reviewers on my team are pretty pedantic, obsessed with correctness, and need to have every change explained to them before they'll agree it's okay. These are people who regularly write good-quality code themselves, but there is a high amount of distrust. Why doesn't a team of talented programmers trust each other?

(In case it needs to be stated: to date, I have reviewed about as much code as I have written, upwards of 100k lines, as have most of the other people on my team. We aren't amateurs, but it often seems like we're treated as babies.)


>obsessed with correctness, and need to be explained why each change is okay.

Isn't this the point of code review? When I click accept on a code review, I am saying that I have looked over the change and believe that it is correct and okay. If I just arrive at that conclusion by saying "Joe wrote this, and I trust Joe", then there is no point in me reviewing it.


It continues with "and need to be explained why each change is okay". If the reviewer suspects there is a bug, the reviewer should check for it instead of having the reviewee explain or defend every detail. It is micromanagement this way: not just similar to or kind of like micromanagement, but literally micromanagement. If every five-line code change in a run-of-the-mill frontend requires a two-hour negotiation, then something is wrong.

When the code is unreadable and it is hard to see whether it works, that is a different thing, but then the complaint in the review should be some specific variant of "hard to read".

I review mostly for architectural compliance and "bad idioms" or code smells. There is a difference between "not how I would write it" and a badly written mess, and many programmers confuse the two.


>If the reviewer suspects there is a bug, the reviewer should check for it instead of having the reviewee explain or defend every detail.

It is not a matter of whether the reviewer "suspects there is a bug", but rather of whether the reviewer is convinced that there is not a bug.

If the reviewer needs to have someone explain why every minor code change is correct, they are either lazy or incompetent (or the code is badly written). Further, it is generally a bad idea to go to the person who wrote the code for an explanation of why it is correct, as you are then much more likely to make the same mistake the original author made. However, being "okay" often goes beyond correctness and into business decisions. Often these business decisions are not documented (or, if they are, it is in some management document not linked from the code), so in order to review the change, the reviewer has to ask the author what business consideration led to it.


And other times, the reviewer is doing that to look detail-oriented and responsible. He can do that so everybody sees how great and attentive he is. It is the opposite of laziness in that case, but still unfair to the reviewee.

It is similar with being petty in code review: sometimes people do it because they think it makes them look like better coders.

As for checking the business case: yes, sometimes you have to do it. But there is also such a thing as doing it too often and second-guessing every requirement you see during code review. This has a trickle-down effect on analysts, managers, or even customers (one got pissed off after a while of this, true story) who then have to litigate and defend every detail and resolve programmers' conflicts over insignificant details.


> However, being "okay" often goes beyond correctness, and into business decisions.

Code reviews occur far too late in the development process for an uninvolved developer to provide a good, timely review of “bigger picture” concerns like business consideration (or software design) simply because the reviewing developer needs time to come up to speed on why decisions were made. In these cases it’s better to have the reviewing developer involved in a review of business requirements before the code is written. The code review can then stay focused on code.


Imo, business reasonableness should be reviewed after the coder finishes work, but not so much by a programmer. It should be done by an analyst, project manager, or tester who knows a lot about the business; basically, by the person who communicated with the customer or is responsible for the overall vision.


> I review mostly for architectural compliance and "bad idioms" or code smells. There is a difference between "not how I would write it" and a badly written mess, and many programmers confuse the two.

This ^ I just had this discussion last week in a PR: if a comment is a matter of opinion, the whole team should agree and put a rule in the linter or style guide. If not, everyone is free to have their own preferences.


Proving correctness is not the point of a code review. In fact, it would be difficult to make such a proof for sufficiently complex software. Functional correctness is typically "proven" by tests.

A code review ensures that the non-functional quality of the code is high, i.e. that the code is understandable/maintainable by someone other than the programmer himself, that there are no anti-patterns or dangerous-but-correct usages of language features, that the implementation fits the intent of the specification, etc.


Disagree. A code review is not purely checking for non-functional properties of the code; it's checking for overall code quality, and that includes bugs.

Occasionally a reviewer will spot a bug. Occasionally there will be a false positive that turns out not to be a bug.

You really want to deliberately discard this bug-finding opportunity? Why?

Even if it's a false positive, doesn't that indicate that something bears commenting?


> Occasionally a reviewer will spot a bug. Occasionally there will be a false positive that turns out not to be a bug.

And most often of all, while explaining what they've done, the reviewee will say "and this does... oh crap, wait" and fix the mistake themselves. Among all their other virtues, a code reviewer is a level 2 debugging duck. (Although all of my code review experience is with buddy check-in style reviewing rather than sending it off to be reviewed. I can understand endless nitpicky questioning being annoying if you're doing it offline.)


This "oh crap" effect is one of the reasons I like "self review" as the first step in a code review process.

That is, go over the diff and add commentary explaining to the reviewer why you made particular changes (not explaining what the changes do, that should be in comments in the source).


GitHub supports posting comments to a commit. Might be a good fit.


> Occasionally a reviewer will spot a bug

Isn't the code review generally pretty late, i.e. just prior to release? At that point, the code should be passing all unit tests and I'd expect obvious bugs to be pretty unlikely. Non-obvious bugs generally won't be spotted in a code review setting.


It's not impossible that the reviewer might spot a bug. There's value in more eyeballs. It's useful for spotting dangerous constructs, or things like UB in C/C++ that appear to work fine... for now.

Also, lots of software houses don't have formal testing of all new code. Not unusual with GUI code, for instance.


Not sure what you mean by 'late'. I guess it all depends on process defined for your project. Most SCM systems have some way to make code visible before it is published to the live code base (via branches, etc).

Also, passing tests is meaningless if the tests are worthless or incomplete. So those alone can't be used as a metric of code sanity, meaning obvious bugs are still possible.

Tests and reviews just reduce the number of released bugs and, hopefully, increase maintainability.

Also, what's obvious to one developer might not be to another.


> Proving correctness is not the point of a code review.

> A code review ensures that [...] the implementation fits the intent of the specification.

I don't see how both of these can be true.

> Functional correctness is typically "proven" by tests.

You can (and, in my view, should) review tests, too.


I'm an overly pedantic reviewer because I feel the review is one of the few places where I can counteract the accumulation of technical debt. Once the less than ideal code is in the codebase it's there to stay. I also ask a lot of pretty dumb questions during a review because I want to make sure that my understanding of the requirements matches the understanding the other engineer had.


>I'm an overly pedantic reviewer because I feel the review is one of the few places where I can counteract the accumulation of technical debt.

Pedantry is the wrong approach to technical debt, though. Almost every aspect of technical debt is about trade-offs.

If what you're doing appears pedantic, it should come accompanied by an explanation of why the trade-off the developer took is not ideal.

>I also ask a lot of pretty dumb questions during a review because I want to make sure that my understanding of the requirements matches the understanding the other engineer had.

Your code should really come accompanied by tests which you can read and use to understand what the precise interpretation of those requirements was.


> Once the less than ideal code is in the codebase it's there to stay.

That is not an objective fact across projects. It means either the process or the culture is bad.


I've never seen a project where you go back to fix something just because its implementation is not ideal. If it works, it stays, because there is always something more important to do.


I do that regularly. Not really when it is less than ideal (I don't think less than ideal necessarily constitutes technical debt), but when it was hard to read and I just had to read it, or hard to modify and I had to modify it. Or when I was modifying something nearby and the fix came for free. We also have occasional actions like "remove this particular code smell from everywhere you find it" and explicit "refactor that part" issues that are routinely assigned as part of sprints.

Obviously, we do weigh how much time it takes to fix something against how much of a debt it actually is. We would not have that much independence in these decisions if we did not.


This, this and this.

You get told to quickly write something as a proof of concept. Then you get told: it works, so put it in the code and release it. We will rewrite it later.

You never get to rewrite it later.

Over time the code base goes to hell. Every single time.


I have a hard time believing you never refactor a function you come across. I see how refactoring the whole architecture or something major is avoided, but when it comes to smaller functions and pieces of code, refactoring as you go takes around the same amount of time as refactoring during that detailed code review (i.e., often very little).

The really big messes tend to be emergent: when features and the code base grow too much for the original architecture over time. You can't prevent that one with detailed code review, and it takes a lot of effort to fix.


At my last job, if you had unnecessary changes for the work item (e.g. refactoring), you failed the code review.

But then we had 5% unit-test coverage on a 5-million+ LOC code base; they were terrified that refactoring (with no tests to back it up) might break things.

We were writing medical software and they were very risk-averse (obviously), so if I refactored a method I would have to raise that, which meant test cases would have to be written and manually run to ensure I hadn't broken anything.

So refactoring hardly happened (the code was an absolute mess). When I left, they were working on a project to fix it, but the timescale was 5 to 10 years.


Not exactly. In a lot of fast-paced environments it's rare to get a chance to rewrite something that functions properly just because it could be labeled technical debt. It's not about culture; it's about moving on to the next thing. I code review hard to make sure we have the smallest amount of technical debt possible, because it's usually there to stay for a considerable amount of time.


That would fall under "culture".


Exactly, just sounds like good review practices to me.


If you can't see why a line of code is okay just by looking at it, doesn't that mean it might need a comment or something explaining why it is?

As a corollary, when I ask a question in code review, I usually don't want it to be answered there - I'd prefer it to be answered in the code, if need be using a comment.


Exactly, because you don't want every future reader of the code to have the same question.

This is why I hate implicit assumptions. If I need some special knowledge or some special assumption that is nowhere in the immediate vicinity of the code, then maybe your approach to the problem is flawed. Sure, it'll work, but good luck to future coders working with it.


I mostly agree, with a small twist: sometimes the question is about the code; sometimes it is about the change. The code should be described in the file; the change should be described in the review.

This discussion makes me wonder about more tightly connecting the code with past reviews. What if, when reading some code, the user had access not only to the revision history but also to the review history around the code in question? Something along the lines of storing reviews in revision-control metadata.


It took me a couple of teams and some patient mentoring to get the picture. If everyone is playing along, it's a total joy.

When posting the review, you try to eliminate as much of the cruft as possible.

When reviewing code, sit down: you're actually going to read it for understanding.

Your mutual goal is to improve code and product quality. If after a few passes something seems wrong or unclear, you raise it. Now the two of you get to have a pleasant discussion about what the right thing to do is. Maybe some rework. Maybe just a comment about unhandled error conditions. Maybe nothing.

If someone raises an issue that's not important either way, just change it to save time; and don't raise something unless you think it's important. So, if you both think it's important, you can have a discussion and reach consensus.

I think that's the real problem: somewhere along the way the consensus culture got lost. That was a really important thing.


Too many companies have "standards" that dictate exactly how things like code formatting should look, and are much weaker about good design. So reviewers look for problems where the light is: standards violations are easy to see and point out, whereas more important problems tend to be more nebulous and subjective.

I would take the position that written code standards enforced by humans should not be allowed. Either a given standard is important, in which case you enforce it automatically, or it isn't, in which case you drop it. That frees up code review to focus on more important issues.


>A conundrum for me is how to get other people to code review the way I want to be code reviewed?

Write a document which describes how you want code to be reviewed, share it with the team and then lobby for everybody to agree upon it / sign it during retrospectives?

I think it's better to develop a strong rapport with members on your team and only join teams with strong shared values about what constitutes good code, but in lieu of that you could do worse than to draw up a shared written agreement about what constitutes good code (and iterate upon it).


I've run into this pedantry in the past. I've noticed the obsessive-compulsive behaviour of being attached to code. People I had to work with cared more about having every for-loop contain a ++i instead of i++, when most compilers optimize that away anyway. They ignored the fact that I pointed out a problem in their architecture and basically told me to get on with it. If someone is attached to code and takes offence because you gave constructive criticism, they shouldn't be in the job.


>> they shouldn't be in the job

or maybe we should find a job where constructive criticism is part of the culture and leave the current one! It's relatively easier to find a place where that culture already exists than to invest the effort to bring about a change.


You probably need to get everyone on the same page on the coding idioms and style.

Personally I find that non-idiomatic code can be a bit distracting when I'm trying to evaluate the code for its overall approach and architecture.


And if people are writing lines of 150-plus characters, those are really, no joke, a lot harder to read. If the line is going to break anyway, why not just break it yourself at a spot that makes sense?


> These are people that regularly write good quality code themselves, but there is a high amount of distrust. Why doesn't a team of talented programmers trust each other?

This has nothing to do with trust or any other personal matter. Everybody makes mistakes, everybody might misinterpret code they use, and not every programmer has a complete understanding of the scope of the changes they are making. This happens to both senior and junior engineers. (Obviously one would expect it to happen much less with seniors.)

The only way to minimise the number of future issues is to have thorough, rigorous and pedantic code reviews. Another major benefit of this approach to code reviews is having more people on the team deeply understand the component you coded.


> Particularly, I noticed code reviewers on my team are pretty pedantic, obsessed with correctness, and need to be explained why each change is okay.

That’s funny; I thought you were going the exact opposite direction with that. I wish my code reviewers were pedantic and obsessed with correctness!

The few times I've had the pleasure of working with them (usually in open source) were the times I've learned the most about programming in the shortest time, by far. Especially about readability and quality of comments.


> Why doesn't a team of talented programmers trust each other?

If it's not immediately clear to a reviewer that your change is correct, then either your code isn't clear, you're not writing tests, your colleagues don't have sufficient understanding of the code base, or some (non-trivial) linear combination of the above.

The first two cases are something you can work on. If people don't understand your code, then try to write code that is more clear. If your colleagues are as talented as you say, then they should be able to understand your code without explanation. If you're not writing tests, then well, maybe there is just no hope.

If your team does not have sufficient understanding of the code base, then do communal code-review sessions where you pick parts of the system and read through it together. Or organize knowledge share sessions.

Ultimately, as Lenin said, 'trust is good, but control is better'. The whole point of code reviews is that you want to get extra eyes on your code. Trying to bypass that by asking your colleagues to blindly trust you is undermining that process and ultimately undermining the quality of your system.


One thing we did to get past the pickiest review comments was to institute some static analysis with agreed-upon rules, run automatically during build checks of pull requests (Sonar and Checkstyle). If someone brings up an issue along these lines that you believe is unrelated to the task at hand, you can ask them to propose a change to the rules instead.
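The same idea applies in a JS codebase: put the agreed rules in version control so disputes become pull requests against the config, not comments on someone's feature PR. A hypothetical `eslint.config.js` sketch (the specific rules and limits here are invented for illustration):

```javascript
// Hypothetical eslint.config.js: the team-agreed rules live in the repo
// and are enforced by CI, so reviewers never have to argue about them.
module.exports = [
  {
    rules: {
      // style rules enforced by the build, not by humans
      'max-len': ['error', { code: 120 }],
      'eqeqeq': 'error',
      'no-unused-vars': 'warn',
    },
  },
];
```

Anyone who disagrees with `max-len: 120` sends a PR against this file, with the discussion happening there, once, instead of on every change that touches a long line.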

That said, in the end, correctness is important, and justifying and communicating your change to others is the purpose of a change request. If you aren't following defined standards and idioms, then you should get on board. If they aren't defined, then they should be defined and reviewed to get everyone on the same page. If everyone is using code reviews to grandstand or show off their knowledge, that's a cultural problem that needs to be addressed.


It's probably not that they think you are wrong or can't be trusted. Think of it like defending a thesis. A typical code review process for me is to just ask the author a bunch of questions. If they have satisfying answers then great, the code goes in and we now have recorded answers for a bunch of questions about it (helpful if issues arise later on). But it shouldn't be surprising if there is something they haven't considered, despite being someone who writes quality code. Perhaps your team is guessing about correctness rather than asking about it and it comes across badly?


> We have repositories for the PHP backend, our database schemas, our iOS (Swift/Obj-C) and Android (Java/Kotlin) mobile apps, infrastructure projects written in Go, C/C++, Lua, Ruby, Perl, and many other projects written in Scala, Node.js, Python, and more

Why do organizations allow this? I realize that some platforms require their own languages (iOS, Android), but outside of that, just pick one or two and hold the line.


Well there's a can of worms!

Thoughts based on experience:

Asymmetric information situation: newly hired smart engineer says that new subsystem should be written in <language-du-jour>. He says it is so much better than <dinosaur-languages>. Everyone on HN says it is so much better. You however, haven't had the time to try it out on a medium sized project to determine if this is true or the usual new language hype. New Engineer seems to know plenty about it and is insistent. Do you tell them no, or do you let them run? Well now you have N+1 languages. Iterate.

Some engineer in your organization develops something that turns out to be useful, as a back-burner project without official approval. They do that in whatever language they think will look good on their resume. Do you pay to have that thing re-written in one of the house languages or do you let it ride and add it to the mix. N+1 languages. Iterate.

And...in general good luck with not "allowing this" in the context of software developers. Cat herding and all that.


The answer to both situations is "No" and "No". Many engineering managers have a difficult time saying "no", to the detriment of the business.

We are building a product for customers, not a playground or a post-graduate program. There are legitimate reasons to add another language, but they must be evaluated with the needs of the business in mind; these include long-term maintenance costs and hiring/training costs for new/esoteric skill sets, among others.


This sounds great in theory, and in companies that I personally own I am able to achieve this level of control (mostly). However in my experience on the battlefield you rarely have the luxury of this level of absolute control.


Surely there's a middle ground? Plenty of good companies allow for fun experimentation. However, I also wouldn't say that switching to a new language or framework that may boost long-term productivity and hiring effectiveness counts as a playground.


If I read "good companies" to mean those that have enough excess resources to make lots of foolish mistakes, then yes, totally agree.

"Keeping devs happy" by itself is a terrible reason to introduce a new language. If your developers cannot find happiness by working together and building something great, you have bigger problems.


> If your developers cannot find happiness by working together and building something great

Just a friendly reminder that there are lots of different kinds of people who approach their work differently. Not being "business-driven" isn't necessarily a bad thing.


sure, but wouldn't you agree that being in business at all, and not unemployed, is a bigger thing, and that the luxury to play is earned by success? if you can't pay people, including yourself, I suppose you can always stay at home and play/experiment all you want.


It depends on your perspective. Different people are motivated by different aspects of engineering – lots of folks simply like "playing" as you put it, and the business's bottom line isn't important (except insofar as the business can employ them).

From the business's perspective, they want the best engineers they can get, regardless of the engineer's motivations — as long as they can get them to provide enough value to the business, it can be worth some trade-offs like having hackweeks, etc. It's the business's job to align the incentives of their workers, and sometimes this includes throwing them a bone in choosing a preferred language, even though it may not be the most efficient.


Making foolish mistakes aka learning?


Talent retention needs to be considered as well. Engineers want opportunities to learn new things and experiment. If you're too draconian you might see top engineers leaving, or morale sinking among discontented engineers who might in turn be less productive.


There are always interesting problems to solve and new technologies and solutions to use. That fun definitely never ends. But with a whole new language (not counting small glue/config languages, JSON dialects, etc.), a new source base for the team, one must be cautious (and they always start small, which makes them all the more tempting). It's a potentially huge and maybe not great investment being pitched. Do your due diligence.


> as a back-burner project without official approval.

Can't this be seen as evaluation of a language?


Or you acquire a company that had a different set of approved languages. Do you throw their tech away and rebuild from scratch? No. N+M languages.


Why? Having many languages allows you to pick the sweet spot for each subsystem. Each language has special strengths and weaknesses; there is no silver bullet that excels at everything. Go, C/C++, Lua, Ruby, Perl, Scala, Node.js, Python... each of these is THE best choice for certain classes of problems (and terrible for others). It may be because of language features that elegantly express a solution, or particular efficiency considerations (speed, latency or memory footprint), or integrations with specific libraries (if language X has a lib that does 90% of the requirement, use it instead of reimplementing from scratch in language Y). It may also be a team consideration: the talent pool for front-end engineers has different preferences than the talent pool for some back-end systems.

In a large system like Tumblr, they are likely highly distributed with many different subsystems running in different parts of their infrastructure, each with their own SLA and resilience requirements.

Their team is probably large enough that most engineers focus on only some of the subsystems, and communication between them is via defined interfaces.

Having a diverse ecosystem, though, fosters an openness of mind instead of enforcing "One Metaphor To Rule Them All". You find yourself borrowing efficient patterns from other environments as they eventually start cross-pollinating.


> Each language has special strengths and weaknesses; there is no silver bullet that excels at everything. Go, C/C++, Lua, Ruby, Perl, Scala, Node.js, Python... each of these is THE best choice for certain classes of problems (and terrible for others).

Not quite true... not all languages have a sweet spot in a production environment. Node.js isn't particularly excellent at anything; the attraction is mainly "I know JS, and I don't want to learn anything else right now", which is a terrible attitude for someone who wants to have a career in tech. Knowing more than one tool in the toolbox is a key skill, since there is no one-size-fits-all.

Jury is still out on Scala. It's a big language, but unclear if there's a production sweet spot or not compared to other JVM hosted languages.

The rest of the ones you listed have their definite sweet spots. There are tons of languages though, and most don't have one.


> Node.js isn't particularly excellent at anything

That's not very kind. Yes it has its flaws but it's not bad at being a higher abstraction over async-everything, augmented with a huge module repository. Support for isomorphic javascript can come in handy for some applications.

I've never done Scala, but my understanding is that it could be a good fit for modelling certain types of business logic while being able to take advantage of existing Java libs out there.

> Knowing more than one tool in the toolbox is a key skill, since there is no one-size-fits-all.

Yes, we agree here. That was the point I was making.


Node is adequate at async, but not excellent. As I said in a sibling comment, things that are synchronous still, like GC, make it fall down at scale. Isomorphic JS isn't that useful in most real situations.

My point about Scala is that there doesn't seem to be a clear cut niche that it excels at. There are numerous other non-Java languages that run on the JVM that could be in the running, and they all get to take advantages of the whole Java ecosystem too.


Ruby, Python, Perl, PHP, and (maybe) Lua aren't amazing at handling async operations the way Node.js can.

Go is, in many respects, a great replacement for Python and Node in that it's simple and handles concurrency well. Scala and other JVM languages fall here too, but come with their associated complexity.

Python, Node, and Ruby have tons of battle-tested code, which often makes the language the "strongest" if for nothing else than that the code works and it's not critical enough to rewrite ourselves.

There are definitely sweet spots for each of these languages.


Node isn't great at async, despite the hype. Important things are still synchronous, most notably the GC, so you can very easily wind up with service stalls at scale. The module churn in Node land means that few things are actually battle tested.

BEAM languages are the sweet spot for concurrency right now.


I had a manager that loved Java. He wanted to add a new helper service onto an existing service that was written in Python, but the new service would be Java (because Java is the best, obviously).

This new helper was responsible for transforming, combining, and synthesizing large API responses with somewhat variable and highly nested data structures into other large API responses with somewhat variable and highly nested data structures. It certainly was not Java I was proud of, but it would have been trivial (and FAR more readable) in Python. Ugh.


But there are so many benefits to standardizing on one language (to take the extreme case). You create the business rules, APIs, and utilities in whatever language you choose and that can be shared by everyone. That becomes the core of your business. If you need something new, you create it and submit it for review, and then everyone can use it, not just the folks who choose what was your preferred language at the time of writing.

I've been in organizations where you have the Language/Platform X people over here and the Language/Platform Y people over there, and they duplicate work, don't collaborate, and their libraries don't interoperate. They have different hiring needs, and developers can't easily move between them. Ultimately, everybody with their pet languages moves on, and new people come into a very fractured environment and ask themselves WTF is this? Over the years, developers (especially new ones) are only a fraction as effective as they would be if the company had just paid the relatively small up-front price of nudging everyone into the same ecosystem.

And BTW, I have yet to encounter a problem that is hard to solve in my preferred language, but is way easier to solve in another. I think the key to that is to choose carefully what you're going to commit to.


> And BTW, I have yet to encounter a problem that is hard to solve in my preferred language, but is way easier to solve in another. I think the key to that is to choose carefully what you're going to commit to.

Long-running light-weight servers are nearly impossible to write in PHP, but a breeze in Go. There are tons of problem spaces where the only available libraries are in C/C++ or Java, which means it's going to be easier to solve the problem in those languages than it is to build a new library from scratch. Python has some of the best packages for a number of problems, like machine learning, data science, and image manipulation. Even if Go were purpose-built for web crawling, Beautiful Soup (Python) is still what I'd choose for getting a scraper up and running quickly. And of course, since The Web Is King, everybody has to use JS for something these days.

If you're a large shop doing a lot of things, it's going to be more work to stick to one or two languages than it is to pick the best language for the job.


People don't just move between teams, they also move between companies, even moreso in my experience.

There's this mindset in some companies that they're IBM, that they are the world, but it's not like that anymore. You can't expect people to sit through your company training on how to use your proprietary tools and then spend 30 years working there. You want to be able to use all types of people coming in with all types of experience, because it's very diverse out there. And they want to use and build skills that they can take to their next company or leverage to start their own company. You want highly motivated individuals with entrepreneurial mindsets, right? What kind of environment will attract them?

Go ahead and say, "We only use COBOL here," and see how that works out. Great everything is interoperable. But you've got a team of C players. Maybe that's fine though.


IMO, we should focus more on standardizing on runtimes rather than language. Languages evolve, and new ones come out that solve problems better. We luckily live in an age where many of those languages can interop at the runtime level.

As a developer, and as a team, it sucks to be told "you must use Java and JavaScript for time immemorial." An easy compromise is that any package/library/service/etc. must target the JVM or JavaScript VM and provide interop with existing code.


>Having a diverse ecosystem, though, fosters an openness of mind instead of enforcing "One Metaphor To Rule Them All". You find yourself borrowing efficient patterns from other environments as they eventually start cross-pollinating.

I've had the opposite experience. Having a lot of different technology stacks increases the barrier to entry for anyone who might be interested in working on another part of the ecosystem. This results in increased siloing of people and information.


Totally agree with this. I've had to pick up the pieces on more than one occasion when some key piece of code was written in the flavour of the month and the writer shortly left. I've heard other devs say you should pick the right language for the job, but I think there should be a heavy bias toward what the rest of the code base is already using.


None of those, except perhaps Go, are flavor-of-the-month languages. "Right tool for the job" is also important.


And Go is a relatively "safe" FOTM language in that basically anybody ought to be able to learn enough about it to work with it in a short amount of time. When Scala was the FOTM, it probably did a lot more damage.


> Why do organizations allow this?

While I generally agree with you on principle, I also feel you need to be pragmatic too. It’s important to also use the right tool for the job as long as you have sufficient engineers who can support that alternative language/framework/platform. Sometimes a language can also be favored at one point, but lose favor later, and in some cases it doesn’t make sense priority wise to rewrite it because it still works.


In the case of my current org, we've used Ruby/Rails and Elixir/Phoenix, but we just got bought by a bigger org that has been mainly Microsoft for years. We can either drop everything (ignoring current paying customers) and rewrite it all in C#, or we can have a mix of MS/Ruby/Elixir.


Why?


The more languages you have, the more you have to know just to know what the software knows. This makes things hard. Heck, jumping between what I call back-end languages like Go, Java, and C# requires a mental switch if you've been doing one of them for a week or more. I'm using Go now, but I've got 13 years of Java experience with a Java Ring (https://images.techhive.com/images/idge/imported/article/jvw...) on my hand. When I switch back to Java I mutter the phrase, "Nope. Nope. Nope. That's Go, not Java."

You should use the right tool for the job. Python is good for AI stuff. Java is good for performance and maintainability. Go is interesting and might give Java a run for its money.

You should limit adding tools because they're cool. Do that for bow ties not code.


Big enough companies may have lots of different projects and many individuals won't work on more than two or three languages max on the daily, though.


Given the context of the conversation, I think the company was limited to one team, or perhaps a few. Even within a larger company, multiple languages limit the effectiveness of Human Resources. You can't as easily move people between teams.


The context? Tumblr? They have over 400 employees.


That doesn't matter. I've worked in large insurance companies that had SQL procs, Java, and old COBOL. They had over 1k IT employees across the country. Sure, there are odds and ends in bash, but the bulk of the core business systems were in Java.


My code is well commented. //eslint-disable-line all over the place.


Invoking eslint-disable needs a comment of its own justifying why the line in question is being skipped. What use is the linter if we just disable it every time it complains?


I don't have permission to remove lint rules, but some of them are absurd.

I refuse to remove 'extraneous' parentheses that make the code more readable to junior devs who may not know the language-specific order of evaluation in a logical expression.


For me, the most important part of reviewing any nontrivial change is to actually check out the branch and test every change I see. This keeps a lot of issues from reaching the QA team, and catches issues they could have missed, since they don't actually go through the code.


This is an ideal, but it's hardly scalable if you're doing 3-4+ PRs per day and are expected to do your own coding as well (plus attend to the bureaucracy).

You can effectively do this by checking that every nontrivial change has sufficient automated test coverage. This saves you from having to test changes yourself and saves future devs from having to go through your thought-process when they touch that code next.


Something the author doesn't bring up, but that we started doing about 9 months ago at my company, is synchronous reviews, meaning the committer is on the phone or in person with the reviewer. It's great. We don't do it for all PRs, but anything medium-sized or above, or even small ones if they involve critical logic. The way we usually do it is that the committer walks through the changes with the reviewer. Often the committer will realize their own ways of improving the code. And with the added context, the reviewer can often provide better feedback. Plus the X factor of just two people talking who come up with ideas, improvements, etc. And half our team is remote, so this wasn't a natural outgrowth. We make it happen, but I think it's worth it.


That sounds very similar to Rubber Duck Debugging.


I am a junior developer, and my latest feedback was that one of the main skills I should develop is making better, more well-thought-out, critical and deep code reviews (including of PRs from more senior developers).

Any tips on how to improve this?

Would a checklist help? Have a clear process on what to review first?


A checklist can help. (He says, and then does a mental one because there isn't one nearby.)

What I look for is:

1) What's the problem being solved? Does this look like a reasonable approach? Is the code pythonic (Obv: for python)?

2) What edge cases are there? Does this handle the important ones? Does it punt properly on the less important ones?

3) Look for a short list of bug classes that have come up in the project before and have led to emergency patches, e.g. decref'ing, checking mallocs, any exec sorts of things. (This is a clear application for a checklist.)

4) Are there tests/documentation/other required fixtures and stuff?

5) Does the code generally match the style of the project?

1000) Code formatting and whitespace and line wrapping and all that bikeshedding stuff.

Feel free to short circuit anywhere once it becomes clear that there's more work required.


1000 should really be handled by automated tools. That takes a useless burden off the reviewer, and it's emotionally easier for both sides too.


It can be, especially at a company. (But then watch the bikeshed discussion on the tools. And PEP8 is a guideline, not a set of hard and fast rules; beautiful is better than ugly and all that.)

What I see as one of 5 maintainers of an open source project is that when a review comes back with a bunch of formatting comments, it's because the reviewer didn't step back and see the other parts. Raymond Hettinger has a talk where he discusses it, but it's the sort of feedback that can be given in almost any case and can obscure the more fundamental issues with the code.


Ask for examples of good reviews to emulate. Look at the PR without comments first, and give your own review. Then compare with the original review, and see what areas you emphasized more, and emphasized less than the original review. Talk with the original reviewer, and ask about mindset behind why they asked for the changes they did.

Also, read a lot of code reviews. Just like reading a lot of code is helpful for becoming a better developer, reading a lot of reviews is helpful for becoming a better reviewer.


Unless your current reviews are really superficial probably not. You can only review at your level of understanding, so if they're asking for more depth to your code reviews you probably need to develop a deeper understanding of the architecture and design of your codebase. A checklist would do the opposite of that in most cases.


> I look for code that is well-documented (both inline and externally), and code that is clear rather than clever. I’d rather read ten lines of verbose-but-understandable code than someone’s ninja-tastic one-liner that involves four nested ternaries.

"Clear" and "clever" aren't in opposition, and likewise "verbose" and "understandable" aren't correlated.

I think this characterisation, and especially the example, shows a lowest-common-denominator straw man of "clever one-liners" which seems to miss the reason that some people like them. In particular, it seems to be bikeshedding about how to write branches. The author doesn't say what those "ten lines of verbose-but-understandable code" would be, but given the context I took it to mean "exactly the same solution, but written with intermediate variables or if/else blocks instead".

This seems like an analogous situation to https://wiki.haskell.org/Wadler's_Law where little thought is given to what the code means, more thought is given to how that meaning is encoded (e.g. ternaries vs branches) and religious crusades are dedicated to how those encodings are written down (tabs vs spaces, braces on same/new lines, etc.).

Note that even in this simple example there lurks a slightly more important issue which the author could have mentioned instead: nested ternaries involve boolean expressions; every boolean expression can be rewritten in a number of ways; some of those expressions are more clear and meaningful to a human than others. For example, `loggedIn && !isAdmin` seems pretty clear to me; playing around with truth tables, I found that `!(loggedIn -> isAdmin)` is apparently equivalent, but it seems rather cryptic to me. This is more obvious if intermediate variables are used, since they're easier to name if they're meaningful.

In any case, compressing code by encoding the same thing with different symbols doesn't make something "clever". It's a purely mechanical process which doesn't involve any insights into the domain.

To me, code is "clever" if it works by exploiting some non-obvious structure/pattern in the domain or system. For example, code which calculates a particular index/offset in a non-obvious way, based on knowledge about invariants in the data model. Another example would be using a language construct in a way which is unusual to a human, but has the perfect semantics for the desired behaviour (e.g. duff's device, exceptions for control flow, etc.).

Such "clever" code is often more terse than a "straightforward" alternative, but that's a side-effect of the "cleverness" (finding an existing thing which behaves like the thing we want) rather than the goal.

If the alternative to some "clever" code is "10 lines of verbose but understandable code" then it's probably not that clever; so it's probably a safe bet to go with the latter. The real issues with clever code are:

- Whether the pattern it relies on is robust or subject to change. Would it end up coupling components together, or complicate implementation changes in unrelated modules?

- How hard it is to understand. Even if it's non-obvious, can it be understood after a moment's pondering; or does it require working through a textbook and several research papers?

- Whether the insights it relies on are enlightening or incidental, i.e. the payoff gained from figuring it out. This is more important if it's harder to understand. Enlightening insights can change the way we understand the system/domain, which may have many benefits going forward. Incidental insights are one-off tricks that won't help us in the future.

- How difficult it would be to replace; or whether it's possible to replace at all.

This last point is what annoys me in naive "clever vs verbose" debates, and prompted this rant, since it's often assumed that the only difference is line count. To me, the best "clever" code isn't that which reduces its own line count; it's the code which removes problems entirely; i.e. where the alternative has caveats like "before calling, make sure to...", "you must manually free the resulting...", "watch out for race conditions with...", etc.

One example which comes to mind is some JavaScript I wrote to estimate prices based on user-provided sliders and tick-boxes, plus some formulas and constants which sales could edit in our CMS (basically, I had to implement a spreadsheet engine).

Recalculating after user input was pretty gnarly, since formulas could depend on each other in arbitrary ways, resulting in infinite loops and undefined variables when I tried to do it in a "straightforward" way. The "clever" solution I came up with was to evaluate formulas and values lazily: wrapping everything in thunks and using a memo table to turn exponential calculations into linear ones. It was small, simple and heavily commented; but the team's unfamiliarity with concepts like lazy evaluation and memoisation made it hard to get through code review.

Also, regarding "straightforward" or "verbose" code being "readable": it's certainly the case that any particular part of such code can be read and understood locally, but it can make the overall behaviour harder to understand. Just look at machine code: it's very verbose and straightforward: 'load address X into register A then add the value of register B', simple! Yet it's very hard to understand the "big picture" of what's going on. Making code more concise, either by simplifying it or at least abstracting away low-level, nitty-gritty details into well-named functions, can help with this.

When used well, "clever" code can reframe problems into forms which have very concise solutions; not because they've been code-golfed, but because there's so little left to say. This can mean the difference between a comprehensible system and a sprawling tangle of interfering patches. It may harm local reasoning in the short term, since it requires the reader to view things from that new perspective, when they may be expecting something else.

When used poorly, it results in things like nested ternaries, chasing conciseness without offering any deeper understanding of anything.
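For illustration (an invented example, echoing the "four nested ternaries" complaint quoted earlier in the thread), here's the same classification logic written both ways; the behaviour is identical, but only one version can be skimmed:

```javascript
// Four nested ternaries: compact, but the branching structure is hidden.
const labelTerse = (n) =>
  n < 0 ? "neg" : n === 0 ? "zero" : n < 10 ? "small" : n < 100 ? "medium" : "large";

// Same behaviour, spelled out: each guard reads as a separate rule.
function labelPlain(n) {
  if (n < 0) return "neg";
  if (n === 0) return "zero";
  if (n < 10) return "small";
  if (n < 100) return "medium";
  return "large";
}

console.log(labelTerse(42), labelPlain(42)); // medium medium
```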


An HN gem of a comment!


I don’t know how much time he spends on code review. But if, at the end of the day, he wants anybody to be able to get the complete context of what a change entails by looking at the PR... I would think that code review process is more elaborate and time-consuming than many companies can afford.


> clever code is often difficult to read and understand later

I have seen this many times, and usually they're actually talented developers who just aren't used to working in groups....

but what I have seen more often (back when I did code reviews)... is lazy copy-and-pasting or something analogous.


I don't understand why you need to know the programmer behind the code. You need to be totally impartial when you judge something.

So I think that's the wrong way to review code. You don't need the WHO, but the WHY.


> I’d rather read ten lines of verbose-but-understandable code than someone’s ninja-tastic one-liner that involves four nested ternaries.

One of my pet peeves is the inability to solve a K-Map for 6 variables and do stuff based on the resulting boolean expression. Just because it is utterly unreadable. I've tried this with many reviewers but nope.



