Reflections on 10k Hours of Programming (matt-rickard.com)
421 points by herbertl on Aug 6, 2021 | 215 comments



Author here, thanks for posting.

Here's some more color on the reflection on comments, straight from the Linux Kernel documentation, which captures my intention much better:

"Comments are good, but there is also a danger of over-commenting. NEVER try to explain HOW your code works in a comment: it’s much better to write the code so that the working is obvious, and it’s a waste of time to explain badly written code.

Generally, you want your comments to tell WHAT your code does, not HOW. Also, try to avoid putting comments inside a function body: if the function is so complex that you need to separately comment parts of it, you should probably go back to chapter 6 for a while. You can make small comments to note or warn about something particularly clever (or ugly), but try to avoid excess. Instead, put the comments at the head of the function, telling people what it does, and possibly WHY it does it."[0]

[0] https://www.kernel.org/doc/html/v4.10/process/coding-style.h...


> NEVER try to explain HOW your code works in a comment

Never is a strong word; sometimes the algorithm is just inherently complex and it's worth explaining what you are doing.

The why is much more important, but it's also partially solved automatically by git blame and adding the task number to each commit message (and writing good commit messages, which IMHO is more important than writing good comments). Each line of your code has a commit message associated with it whether you care about it or not - make it useful.


Yeah algorithms in particular are where I'm hesitant to say "never explain how your code works." They can be dense, deeply-nested, and hard to break down into semantically-meaningful variables and function names. Sometimes it's good to explain what's going on there.


That's what people don't understand. We don't think in code; we have a translation layer from human language/thought that we think through when parsing a programming language. As long as the comments are kept accurate, they can greatly improve your ability to parse through a block of complex code, which is crucial when you're trying to hotfix an issue. It also gives programmers newer to the codebase a chance to understand what that block of code is doing, beyond it just being some black box with an input and output.


Yeah, the how is important when auditing the correctness of code. I would say "Never listen to advice about never doing something; there is always a case where breaking the rules is good."


This is generally how I treat comments: docblocks on everything for API documentation, but inside functions, yeah, something clever or non-obvious gets a comment. I also tend to document complex interfaces / abstract classes more heavily to give people a sort of road map for implementing them.
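
To make the "road map" idea concrete, here's a minimal Python sketch (all names hypothetical, not from the thread):

  from abc import ABC, abstractmethod

  class Exporter(ABC):
      """Writes a report to an external destination.

      Implementation road map:
        1. open() acquires whatever handle/connection you need.
        2. write_row() may be called zero or more times between open() and close().
        3. close() is always called, even after errors; it must be idempotent.
      """

      @abstractmethod
      def open(self) -> None: ...

      @abstractmethod
      def write_row(self, row: dict) -> None: ...

      @abstractmethod
      def close(self) -> None: ...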


From the linked post:

> Know when to break the rules.


I have been trying to explain this for some time and I’ve just had a new thought so I will add it here.

I think that we fundamentally start to lose the mental distinction between the language semantics and the idiomatic code. It’s all part of the scenery. So for a seasoned programmer, the what and the how blur together. There is no how to getting a value from an object, a struct, a tuple, or an array. You just do. Unless your language doesn’t have one of those things.

Non-idiomatic code is full of how. So what we seek is code that just tells us "what", and comments are a bargain we make with each other to guarantee a lower bound on that. Meanwhile the self-documenting-code people want a different arrangement, where you have to stoop to documentation less often by putting that energy into sustaining a constant output of idiomatic code.


Most nontrivial algorithms are just a bunch of nested for loops with array indexing and ifs mixed in. There were no advances in computer science that made this kind of stuff less confusing as far as I am aware. We made a lot of advances to make simple boring business code do 30 nested method calls, but at the end of the day the basic algorithm is still a bunch of loops and ifs.

It's actually funny how all the focus in CS seems to be on modeling the business part in more natural ways, while the "solving problems" part is still done like it's the 1970s.

You can make the algorithms clearer with good variable names and extracting some parts to well named methods (and you should, usually) but it's often not enough.

Example taken straight from Wikipedia:

    function BellmanFord(list vertices, list edges, vertex source) is

        // This implementation takes in a graph, represented as
        // lists of vertices (represented as integers [0..n-1]) and edges,
        // and fills two arrays (distance and predecessor) holding
        // the shortest path from the source to each vertex

        distance := list of size n
        predecessor := list of size n

        // Step 1: initialize graph
        for each vertex v in vertices do
            distance[v] := inf             // Initialize the distance to all vertices to infinity
            predecessor[v] := null         // And having a null predecessor
    
        distance[source] := 0              // The distance from the source to itself is, of course, zero

        // Step 2: relax edges repeatedly
        repeat |V|−1 times:
            for each edge (u, v) with weight w in edges do
                if distance[u] + w < distance[v] then
                    distance[v] := distance[u] + w
                    predecessor[v] := u

        // Step 3: check for negative-weight cycles
        for each edge (u, v) with weight w in edges do
            if distance[u] + w < distance[v] then
                error "Graph contains a negative-weight cycle"

        return distance, predecessor

This is pseudocode with some stuff abstracted away, yet they still added "what" comments. You can remove some of them and split it into several methods but for Step 2 I still kinda feel it's not enough information to make it obvious what is happening. What does it mean to "relax an edge"? Why do we need to do it V-1 times? Does order matter?

I don't think fancy programming language features help here. You can change iteration into tail recursion or point-free functional code or whatever is fashionable but the underlying complexity remains.

And there are algorithms out there that are much trickier.

Source: https://en.wikipedia.org/wiki/Bellman%E2%80%93Ford_algorithm...
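
For comparison, here's roughly the same thing as runnable Python (my own sketch, not from Wikipedia); note the comments still end up carrying the "what" and a bit of the "why":

  import math

  def bellman_ford(n, edges, source):
      """n vertices labeled 0..n-1, edges as (u, v, w) tuples, single source."""
      distance = [math.inf] * n
      predecessor = [None] * n
      distance[source] = 0
      # Relax every edge n-1 times: after pass i, all shortest paths using
      # at most i edges are final, and no shortest path needs more than n-1 edges.
      for _ in range(n - 1):
          for u, v, w in edges:
              if distance[u] + w < distance[v]:
                  distance[v] = distance[u] + w
                  predecessor[v] = u
      # One extra pass: any further improvement implies a negative-weight cycle.
      for u, v, w in edges:
          if distance[u] + w < distance[v]:
              raise ValueError("Graph contains a negative-weight cycle")
      return distance, predecessor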


Code doesn't exist in a vacuum, it exists in the context of our general knowledge.

I would argue that the only comment really useful here would be one pointing to the paper/wikipedia article/whatever that explains the algorithm.

If no such article/reference exists and/or your implementation has nuances you want to explain, that's where technical documentation should come in. For instance, check the linux kernel's explanation of circular buffers and how to use them [1].

[1] https://github.com/torvalds/linux/blob/master/Documentation/...

> What does it mean to "relax an edge"? Why do we need to do it V-1 times? Does order matter?

These are questions related to understanding the algorithm, not the implementation. There are much better ways to convey that information (including examples, diagrams, lengthy explanations, etc.). Trying to convey that information through comments in an implementation is an exercise in __not__ using the best tool for the job ;)


To an extent List Comprehensions are, if not a fix, at least an analgesic. But FP is definitely a case of The Future is Here, It's Just Not Evenly Distributed. Calling it an 'advance' might be a stretch, at this point. Lost wisdom?

I would chalk the complexity of the example you provided up to the inability of half of developers to handle four nested conditionals, and for most of the rest to handle five.

I sort of handwaved over the Death By A Thousand Cuts scenario, which I definitely believe is important. It doesn't matter how idiomatic your code is if you have enough of it. I don't maintain that idiomatic code is free by any means. It just doesn't have the factorial surface area of spaghetti code. It scales better, but scale absolutely matters.

If you count conditional branches, you'll see that parts of this code are pushing up against those limits. You're definitely into 'how' territory, but you've also shunted it off to a function that just 'does stuff'. So my code that relies on this for step 1 of a bigger answer doesn't give a shit (as long as the answer is correct, including boundary conditions).


It makes more sense if you emphasise COMMENT as well.

> NEVER try to explain HOW your code works in a COMMENT

Comments are for explaining why, not how. If some code is algorithmically complex enough to need explanation, that should go in documentation, not an inline comment.


Comments are documentation.

I've never seen or heard of a team that keeps documentation for their complex algorithms. It's just not done.

And if it were, I'd bet it would get out of date very fast.


One thing a friend of mine (hi Adrian!) did at my old workplace, that I found incredibly annoying at the time, but later realised was exactly the right thing to do, was set up a wiki for us to use and then refuse to answer any question or acknowledge any information except via that wiki. After a while I just gave in and before telling him about anything, stuck it on the wiki. Years later when I left, every time anyone asked me a question I could honestly answer "it's on the wiki."

People love to hate on Bezos (less here than elsewhere, though, I guess) but this is exactly the tack he used to convert Amazon's IT setup to services, and I think both were shrewd moves.


Don't agree. Longer pieces of code are way easier to reason about. Deep function nesting, which seems clean, is mentally very confusing and also difficult to describe in terms of function names.

Explain *why* you’re doing something a certain way. This also includes a little bit of how and what. Other than that, just make sure your business terms are reflected in the code, or at least have a proper definition of things, which you can easily reason about.


They may be easier in some cases to reason about - which is great until you start writing tests.

Once you start writing tests you find that the size of test functions will reflect the size of the function you're testing. The test functions start to have huge set-ups in order to test a small bit of logic, and you end up spending a lot of time fixing tests every time you make a change to that function.

This is why it's often better to find ways to make functions smaller.

There is a difference, however, between refactoring code into smaller classes / functions and what I'd call "hiding" code in classes / functions. If all someone has done is break a large function into a chain of functions, then of course that is not good.

When people refer to small functions / classes they refer to breaking up the concepts that a class / function represents into smaller concepts that build upon one another. This also increases the reusability of code.
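
A tiny illustration of the testing point (hypothetical names): once the interesting rule is extracted, its test needs no setup beyond the rule's own inputs.

  def parse_discount_code(code: str) -> tuple[str, int]:
      """The extracted concept: a code like 'SAVE15' -> ('SAVE', 15)."""
      prefix = code.rstrip("0123456789")
      percent = int(code[len(prefix):] or 0)
      if not prefix or not (0 < percent <= 90):
          raise ValueError(f"invalid discount code: {code!r}")
      return prefix, percent

  def test_parse_discount_code():
      # No order, user, or database fixture needed - just a string.
      assert parse_discount_code("SAVE15") == ("SAVE", 15)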

So in short - I agree, longer pieces of code can be better for reasoning about, but I disagree that that necessarily makes them better.


I worked with a guy who liked writing little <10 line functions, deeply nested, in dozens of files. Each individual function was nice and easy to read but figuring out what the entire call tree was doing required hours of archaeology.

Edit: He also refused to ever delete any code he'd written "because it might be useful later." I once spent three hours tracing the source of some values, only to discover that the output of the entire call tree I was working my way through was being quietly thrown away at the end.


Eh, to each their own mental models. I have much less trouble understanding nested and, some may say, very class-segregated code than mostly procedural code, as the general shape of the code plus the pattern names used generally gets me 95% of the way to understanding the issue at hand. But then I found out that not a lot of people draw diagrams when they are coding? (Which in my case helps immensely - and after some time you can just draw the diagram in your head.)


I've always felt like this is where IDEs could help. You should be able to "inline" the implementation of a function or even whole class without actually changing the code.


I've always felt the same, so I recently made an interactive mock-up of such an IDE feature: https://emilprogviz.com/expand-calls/?inline,substitution Not mobile friendly, best seen on desktop. To inline a function call, either click the "+" icon, or go up/down with the keyboard and press spacebar when the caret is on that line.

(There's also a video where I explain the reasoning behind the mock-up, but I guess the motivation is already blatantly obvious to most readers of the above discussion, so feel free to skip the first chapters: https://youtu.be/wdf6S6oumH8 )

Then I learned that the Smalltalk editor Pharo has a similar feature! Seems many people have the same idea. Today's major IDE makers should have copied this feature already!


Agreed. When looking at complex code, it can be helpful to expand the functions inline to be able to view all of it in a single screen flow. Switching between too many files and functions sometimes creates friction, making it more difficult.

However, when writing code, if a section is repeated more than two times, consider breaking it out into a separate function with a useful name.


Unless you’re doing something like performance analysis, I think needing to peek into the definition of a function to understand a piece of code is a smell: it indicates that the function is only abbreviating and not providing an abstraction.


Well, a lot of the time when you are looking at something you are trying to diagnose some problem or another, which means you want to be 100% sure of what it is doing. Even if the function provides a good abstraction, you might still want to make sure that it is doing exactly what it says on the tin, because you are desk-checking some weird behaviour and want to make sure each step is acting correctly.


Agreed, but that means you verified that the use site was correct and are now examining whether the abstractions it depends on were implemented correctly.

My point is just that you shouldn’t normally have to keep both sides of a call site in mind at once.


The WHY is also very useful in comments, especially in business/UI/glue code where special cases abound.


I would say the why might be more important than the what a lot of the time


Came here to comment along those lines.

The WHY often will be elaborate enough that it deserves a design document that includes things like: who is the owner of those requisites, what is the rationale behind them, who are the stakeholders who signed off, what interactions it has with other components, what alternative designs were considered, etc.


This. And in particular if there are multiple ways to do it, and you had to choose a particular way for a reason.

Comments should explain the decision process (if one was necessary) that led to this variant. Because if there's no comment explaining it, the next developer looking at it will say "oh, but it's much simpler if I just do it that way", and will run into trouble because they didn't realize the pitfalls.


What's your take on the difference between explaining how code works and describing what it does?

I only ask because bits like the TCP rate estimator sometimes have step-by-step comments that might be interpreted as explaining how the code is doing something.

For example take the app limited detection function: https://github.com/torvalds/linux/blob/master/net/ipv4/tcp_r...


My take is that step by step commented explanations make sense when it's something either extremely complicated or making use of a system or tech that's not widely known.

Otherwise, clearer code, smaller methods and better variable/method names are the way to go imo.


The exception to the WHAT/HOW divide is working around a 3rd party bug. Documenting workarounds can help prevent reintroducing the same bug when someone edits or copies the code. This can be especially helpful if the 3rd party bug gets fixed in such a way that your workaround now fails.
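
A sketch of what that can look like in practice (the vendor issue and names here are hypothetical):

  def fetch_items(client):
      """client is a third-party API wrapper."""
      items = client.list_items()
      # WORKAROUND for vendor issue #1234 (hypothetical): list_items() repeats
      # the last page when the result count is an exact multiple of the page
      # size. Keep this de-dup until the upstream fix ships; if the vendor
      # fixes it by changing the pagination contract, revisit this code.
      seen, unique = set(), []
      for item in items:
          if item.id not in seen:
              seen.add(item.id)
              unique.append(item)
      return unique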


> try to avoid putting comments inside a function body

This makes me sad. Think about others who will have to reverse engineer your code. Not every functionality should be split into a function, unless you want to add 10 layers of abstraction and misdirection.


I much prefer over-commenting to under-commenting.

The only harm is when you miss the important bits amid too much blabla, or when you forget to update comments. There are not many worse things than wrong documentation/comments.

But I agree very much with this: "you want your comments to tell WHAT your code does, not HOW".


I've seen this interpreted by smart engineers in a counter-productive way, in that they believe their code is so good that none of it ever needs any comments and hence they never write them despite the fact that their code isn't all that clear. It's more hubris than anything.

Otherwise, there are many techniques you can use to make the code self-explanatory. I've taken a comment before and restructured the code to literally read like the comment.
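
For example (a made-up before/after, not from any particular codebase), turning the comment into names:

  # Before: the comment does the explaining.
  #   # keep users who verified their email within the last 30 days
  #   users = [u for u in users if u.verified_at and (now - u.verified_at).days <= 30]

  # After: the code reads like the comment did.
  def verified_recently(user, now, max_age_days=30):
      return user.verified_at is not None and (now - user.verified_at).days <= max_age_days

  def active_users(users, now):
      return [u for u in users if verified_recently(u, now)]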


It’s an old assertion of the Refactoring community that a commented block in the middle of a function is a separate function trying to break free. If the code is interesting in its own right, move it.

That also extends to how. If the algorithm is interesting or complex enough you should move it as well, as the point of the host function is not to show how clever you are, but to accomplish a task.

Empirically I’ve found that I get much less pushback on algorithmic tweaks when I isolate them to a separate function. The reader doesn’t have to include them in their local reasoning, and I suspect but cannot prove that they tell themselves that they can always revert that code cleanly, so let’s just leave it be for now and see what happens.


Interesting reading about RuneScape in your post. I got into programming because of RuneScape and writing bots! Great memories.

I enjoyed reading this post. Cheers!


Writing increasingly sophisticated bots and trying to figure out what exactly would earn you a ban was my own first practical experience with coding.

I mostly managed to dodge bans on my most used accounts at the time, but interestingly when I checked back a year or two ago the majority had been permanently suspended. I wonder if they have some method of retroactively determining if someone was botting?


Thanks! Likewise. Truly a great sandbox for budding programmers.


I got into programming at 14 because of writing scripts in Gemstone, a text based MMOMUD. Didn't even realize what I was doing was programming until college though...


Gemstone! That game changed everything for me


I think sometimes you want to put why in a comment. Just today I wrote a two-paragraph comment (usually I don't write many comments at all) that describes the business need for why the code is necessary (in my case, connecting to a 3rd party data source that overshares, describing the nature of the oversharing, outlining the -- very complex -- steps in the pruning algorithm, and noting that there are corner cases we might miss and also that we shouldn't care -- a handful of bad apples in a machine learning algo will probably disappear in the wash).


Yup, I rarely ask "what's this code do" or "how does this code do that".

But "why the fuck" is a daily occurrence.

Self-documenting code explains how and what. If you need a comment to explain how and what, it's a code smell. Sometimes it's ok, but ideally it should be rare.

I try not to overcomment but when I write code that looks confusing or clever I try to explain why I've done it that way.

If I'm optimising, I'll explain why.

If I'm making an assumption about something related but elsewhere, I'll explain that.

If I'm leveraging some context that isn't explicit, I'll explain that.

...

Also, type of code matters, glue code rarely needs exhaustive comments.

Library code often gets line by line commentary so when someone comes in to change something for their use case they have a clear understanding of what they're really changing, and who and what might break because of their change.

I notice library code tests often get more comments explaining what and why I'm adding an assertion.

Glue code rarely needs comments in its tests.


Another reasonable exception is referencing external sources - whether it's a specific library you borrowed code from, a piece of documentation needed to better understand what's going on, or a reference to a SO post explaining some weird workaround you've implemented.


The best comments are about intent. I can read the code and tell you what it does, but I have no idea if that's what it's supposed to do without knowing the intent.


Define "comments".

Like many arguments, disagreements below are mostly not disagreements about facts but disagreements of definitions at heart, only masquerading as disagreements about facts.

Beyond a trivializing definition of "whatever follows a #", each time the word "comments" appears in the discussion below, it seems to actually refer to something entirely different from one person to the next.


"Comments are good, but there is also a danger of over-commenting."

IMHO the only way to tell if you are able to comment well is to do this: work on some project, then in 2 years come back to that project and add a feature to it. You will learn real quick how good your comments are. Spoiler alert: you won't care about "how" - you will want to know "why".


> tell WHAT your code does, not HOW

And also Why.

I personally like an executive summary of the code, i.e., I just got to this file, what am I looking at?


Small typo: it's "Malcolm Gladwell in Outliers", not "Outsiders". I particularly noted it because I learned the word "outliers" in English thanks to that book.


Thanks, fixed!


I agree with almost all of these viewpoints. The comment one though isn't quite right.

Sometimes you just need a comment. For example, I was writing some code to an API and the company was offering a private experimental feature. The API was JSON, but this one feature was enabled by adding a small stringified JSON object to a field of the larger JSON post body. If a developer looks at it, they'll think "this must be wrong" but with a comment that briefly mentions why, everyone will be saved some time.


I find it very helpful, when implementing part of the compiler, to include a link (for D) or a paragraph number (for C) that references the corresponding specification section:

https://github.com/dlang/dmd/blob/master/src/dmd/cparse.d#L5...

Ever since the Intel CPU spec went online, I started doing this with the code generator - providing links to the man page for the instruction being used:

https://github.com/dlang/dmd/blob/master/src/dmd/backend/cgx...

And for bug fixes, reference the issue which often gives a detailed explanation for why the code is a certain way:

https://github.com/dlang/dmd/blob/master/src/dmd/expressions...

Ever since I enhanced the editor I use to open the browser on links, this sort of thing has proven to be very, very handy.


I've done the "reference the spec section" thing before -- the difficulty is that often section numbers or similar things get stale when a newer revision of the spec document is published; so it can work for a slow-moving thing like the C spec but is more awkward if the spec/manual gets a revision every six months.


It's a good point. This is why for C the references are to a specific standard, i.e. C11. For D, I use named links, which we try to not change. The bugzilla numbers do not change, which is a feature of bugzilla.

Even so, the utility of the links far outweighs the possibility of needing to adjust them.

P.S. I've found using links to Microsoft documentation to be extremely frustrating, because not only does Microsoft move them around all the time, but will wholesale permanently delete vast swaths of reference material.

Notice that the github links are to line numbers. Line numbers are ephemeral, I don't know a way around that.


> Notice that the github links are to line numbers. Line numbers are ephemeral, I don't know a way around that.

If you go to your link, under the button with 3 dots there's a "copy permalink option" that creates a link to the specific commit you're looking at, ensuring the line number doesn't get out of date. Example:

https://github.com/dlang/dmd/blob/9003e2ae7e11faed433638e9cb...


Perfect! Thanks!


One way I've found is to quote the thing I'm citing. Context and length dependent, of course. This also has the advantage of showing how the doc looked when you wrote the code, so the comment rots more slowly (and if something changes, you can see by comparing the excerpts).


I've considered that, but was concerned about copyright issues.


Interesting. IANAL but is saying “we have to do this dumb workaround because of xyz” criticism under (US) fair use?


You're probably right, but I don't want to deal with a lawyer on this. It's not worth it.


Looking at the code on GitHub, this extension may be useful: https://greasyfork.org/en/scripts/2709-linkify-plus


Not needed, I fixed my editor to automagically recognize URLs and make them clickable!


Yes I agree with the sentiment, but I think long comments are useful for explaining why a choice was made. Or an idiosyncrasy of the system like you said.


This. Comments are for why.


Whether comments are for why or what is not really important imo; having strong feelings about this probably hurts more than it does good. The goal is for someone else (or you) to understand what's going on at a point in the future, and you include whatever is needed to achieve that goal. Whether that's what or why or something else doesn't really matter. Sometimes you don't know how much you have to include, but experience and asking colleagues will help you get a better idea.


IMO commit messages are for why.


Do you really prefer digging through years of commit messages to find why a particular line exists? It's a very impractical place for those. And good luck finding the origin of that line if there was an extensive refactor in between.

Also, how do you notify a later viewer that there's a "why" they should check? A comment that says "check commit messages for this line"? :)

Not to mention: you're losing the notes as soon as you move repositories, or worse, version control systems. Yes, I've hit the "SVN git migrate" wall way too many times. For that reason I even started leaving issue numbers in comments for when it's particularly important, in case we lose the commit<->ticket link down the line.


Typically, it isn't digging. It's the last commit, maybe the one before it, and I have tools for doing this (e.g., magit, fugitive).

I don't notify anyone, nor do I need to. When I encounter a WTF moment, I look at the commit messages. When there's nothing there, I curse the last developer.

I guess it can be fun spelunking through commit messages, though, to see when the comment was added and if it still applies.


LOL. I'm going to explain this for myself and then never look at this cursed thread again.

What is important is to clearly communicate the intent of code to the next developer (incl., yourself) who comes across it. Comments are one tool provided to do this. I write comments. I approve pull requests containing comments. But comments are not the only tool, and they're not the best one.

Now most of the time there's no particular difficulty establishing the intent of code. Sometimes, though, there's something weird, something that looks "wrong" but isn't. In those cases, I isolate the weird change using abstractions (at least a variable, but probably a function, a method, a class); I name it and any components well -- long, descriptive, using appropriate conventions; I put automated tests around it, which will fail if anyone changes it without understanding it, providing helpful and explicit error messages (where tools allow); and I write a developed, full explanation in a commit message (which IME will not be too hard to track down, if you've isolated the "weird" change using an abstraction).

That will almost always communicate the intent. If it doesn't, I (reluctantly) comment it up.


Commit messages are the worst place for that. Someone should write a post once and for all about this myth so we can refer to it in the future.


.... sometimes, yeah. When "why" is highly temporal, it's a great fit, since that doesn't often have anything to do with the code or behavior itself - put that in the commit message, and/or in your ticketing system. It's a waste of space and a distraction / source of confusion in code.

For many other things though, it's so easy to "cover up" commit history. Changing how you indent things (or splay one line into multiple), correcting a spelling error, etc all make it dramatically harder to follow than an in-code comment, even with good tool support (which is usually mediocre at best). Of course, you can keep those changes separate, and add them to an "ignore these commits" list for git... but few teams are capable of maintaining that in the long run.

(I assume/hope you just got downvoted in a burst of emotional bikeshedding. certainly doesn't seem negative-reputation-worthy to me)


Until a few years later when you have to move the source code to a new repository or totally different source code control system and lose all the commit messages. In theory there should be a way to migrate the entire tree and keep the commit history but in practice from what I've seen that seldom happens.


Or refactor something and git no longer thinks the new files are related to the old one since there is too little overlap.


Yeah but you (or another developer) probably won't see the commit message later in the future when you come across that segment of code.


I think vs code has an extension that lets you see git info (committer and commit message) inline if you hover over a piece of code.


GitLens


By using "should" and "probably" it's acknowledging that there are exceptions.

Though I don't think your example is an exception. I would refactor that part out to a separate function, and give it a docstring like "json in json, because the api is weird, [link to api docs]." Basically, weird things like that should be refactored into their own spot, instead of called out with a comment in the middle of something else.
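
Something along these lines, say, in Python (field name and details invented for illustration):

  import json

  def embed_experimental_feature(body: dict, feature_opts: dict) -> dict:
      """json in json, on purpose: the vendor's private experimental feature
      is enabled by putting a *stringified* JSON object into this one field
      of the otherwise ordinary JSON post body. [link to api docs]
      """
      body["experimental"] = json.dumps(feature_opts)  # not a bug
      return body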


10 years later I am very happy when I find I left myself a little note explaining to my future self my thinking at the time… more often than not, though, I'm left trying to remember those busy days building all the shiny new things.


The only side projects that even have a chance of being rescued from certain death are the ones my past self was kind enough to explain.


sometimes a comment is needed:

   /* DO NOT DELETE the above test
      If you're on localhost and opening/closing sockets
      at high speed, you might end up being connected 
      to yourself by accident. That's what you get 
      when everything is a file and all files are
      represented with integers
   */
For example.


Exactly: I think it's almost a rule that you must comment code that does something out of the ordinary for a good reason.


I agree - I think I needed to add a bit more nuance, which I did in https://news.ycombinator.com/item?id=28087165


Appreciate the clarification. Comments can (and are) definitely overused and abused, but they absolutely have their place.


With comments it strongly depends on what kind of project you are doing. Is it code which will run in a production environment? What are the capabilities of the other people working on the project? Is it a learning project, in which you soak up every line and explain it for yourself and your future self? Is it your own code, which you are writing? Or is it code you found online and only want to comment? And comment for whom? ...

I find that especially in learning projects, where I might write some quite dense code, it helps to describe what each line does and answer all the "idiot questions" - the ones that arise when you have to cold boot your brain to pick up the project again or use the code in a real project later on, and that you can answer a few minutes or perhaps hours later with a facepalm, once everything is clear again.


For a larger scale one-man project, you can also use mnemonics to keep track of what code's doing what, but that is unlikely to work out smoothly in a team-work setting.


Even in situations like that one, I think it's possible to avoid the comment:

- setting up a type (with a comment linking to that API's documentation)

- setting up a shape validator & a test that asserts that that field is a string with the correct shape

- Using a variable name that makes it clear that something intentional is happening: `const weirdlyStringifiedJSONObject = JSON.stringify(...)`

A comment might still be the simplest thing to add, but it doesn't mean it's the only way to document that weirdness. Comments can be fragile compared to the same documentation expressed in code

[edit: list formatting]


Those definitely work, but adding a comment is the easiest and clearest way. No need to avoid comments like the plague, just avoid overuse.


I agree that these are all good things, but they cover the what and not the why. I find design documents useful to cover the why--something that results in a long comment in my experience is usually indicative of some non-trivial design decision.


>Know when to break the rules. For rules like "don't repeat yourself," sometimes a little repetition is better than a bit of dependency.

Glad to see DRY called out here. I've seen so much crazy code written simply to avoid breaking The Rule.


What's frustrating is that DRY is often paired with something like "The Rule of Three". Do repeat yourself about three times until you actually understand what it is that's being repeated (or whether it's just coincidental) so it can be refactored. But a lot of people forget that part and focus only on DRY.

Is the magic number 1024 the same in all instances, for example? Or is it merely a coincidence that it's the same in a few places? Is the apparently repetitious code:

  context = create_context();
  context.action(params);
In several places really the same? Or is it a coincidence that right now none of them pass any parameters to create_context and use the same params in their call to action?

Avoiding the repetition makes it hard to even ask that question. And disentangling the different cases later is riskier than keeping a few pieces of repeated code around for a while to understand the situation better.


The way I see it is as:

> The DRY principle is stated as "Every piece of knowledge must have a single, unambiguous, authoritative representation within a system".

I don't avoid copy-pasting as a rule, but would deduplicate a block of code if they represent different occurrences of the same "knowledge" of a shared procedure, but create separate functions or duplicate code if each one encodes "knowledge" of a different process to be followed. Of course it's hard to define and codify "knowledge" in an objective form that different people can use the same way, but in my experience, I still feel this rule (and the rest of my gut feeling for refactoring) hasn't provided bad guidance in the situations I've encountered so far (though it's not always applicable, takes domain understanding and experience to identify "knowledge", and it's sometimes difficult nonetheless).


I like this formulation, because it calls you to answer the question of whether several snippets of repeated code represent the same knowledge or different knowledge.

GP's example of the constant "1024" present in multiple places doesn't necessarily mean they represent the same knowledge; one could be a bit mask, another could be the default number of items to show on a page, and another could be a buffer size. DRY-ing these different pieces of knowledge could be disastrous if the requirements for one of the uses changes. New requirement: only display 10 items on a page. Good luck with your bit mask reading the second and fourth bits now and filling your 10-byte buffer 100 times more often. New requirement: We added more flags, the bit mask is now 0x80000000. Good luck serving pages with 2 billion items and your memory usage randomly spiking by 2GB.
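
One way to keep those pieces of knowledge separate even while the value coincides (constant names invented for the sake of illustration):

  # Three different pieces of knowledge that merely coincide at 1024 today.
  ADMIN_FLAG_MASK = 1 << 10      # bit mask: which bit means "admin"
  DEFAULT_PAGE_SIZE = 1024       # UI: items shown per page
  READ_BUFFER_BYTES = 1024       # IO: socket read chunk size
  # Each can now change independently (e.g. DEFAULT_PAGE_SIZE = 10)
  # without silently breaking the others.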


Instead of calling it 'The Rule of Three', call it WET (Write Everything Twice); makes it easier to remember (although I think that is one less repetition than suggested in the parent.)


The problem isn’t the acronym, it’s the dogma that grows around it. WET will be wrong in certain circumstances too.

Our industry is full of dogma and false prophets. It’s unbelievably frustrating.


Nobody should always DRY or WET; my context is I wrote some code for a greenfield part of a project and was told to DRY multiple times. Was easy to reply 'I prefer WET here' and link to an article on WET rather than explaining the redundancy.

Interestingly, I just looked at the wikipedia page for DRY (https://en.wikipedia.org/wiki/Don%27t_repeat_yourself) and see they also mention another acronym - AHA (avoid hasty abstractions); I might use that in the future.


In that case, you could keep the acronym the same, but just explain it as Write Everything Thrice.


The name is largely irrelevant, though WET may be more memorable when attached to DRY. The important element is the deliberate introduction or allowance of repetition with the intent of examining and refactoring later (and ideally actually doing it). Later is often not even that much later, IME. Usually it's the next step I take after putting in the repetition, because I can see the common structure properly while it's all fresh in my head, but I couldn't see it when I initially started writing it (or saw the wrong part as the common structure, which bit me enough that I committed to the rule of three idea).


Write Everything Thrice?


just stay MOIST and you're gravy


"Three or more, use a `for`."


Trade-offs are the heart of this profession. Developers who stick blindly to DRY, SOLID, any dogma really, come off as inexperienced.


Every few months there is a comment complaining about DRY, and a counter comment complaining about misunderstanding of DRY. So allow me:

DRY originated from The Pragmatic Programmer, and is not about avoiding duplicated code. In fact, quite a bit of duplicated code doesn't break the DRY rule as originally defined.

If I make my own list, I'll be sure to put "Understand what DRY is before complaining about it." :-)


We should refer to removal of accidental syntactic repetition because of an overapplication of misunderstood "DRY" as "Huffman coding."


Same here. Furthermore, I wrote so much crazy code early on to avoid breaking The Rule that, looking back, I'm doubly horrified: at what I did, and at knowing that a lot of junior programmers are going through that phase now.


I like WET: "Write Everything Twice [before factoring it out]"


> Syntactic sugar is usually bad.

I disagree with this one. It can be bad, but if it helps syntax get out of the way to reveal the intent of your code, I think it's pretty good


I'm actually on the opposite side of this. I know HN is more web-focused, so embedded isn't as well represented, but I'm scared every time I try to learn Javascript or even Rust.

Syntactic sugar complicates code for me. It may help to reveal the intent of what the code is doing, but I *have to* know how the code is doing it. Of course you can just learn what the compiler will do in every one of those instances, but that's a lot to learn and it increases exponentially. On the other hand, in C you have assignments, conditionals, loops and function calls - nothing else.

In the end the code will do the same thing, but I'm willing to spend a few more keystrokes and maybe an added comment just for the peace of mind of knowing what is actually happening. (One may say that the assembler output is still a blackbox, but the general structure of the computation will remain intact.)

It may be outdated, I agree, but it's just something I can't shake off


Idk, I understand what you are saying, but in my experience you just end up building an intuition around what the sugar is doing, and you can de-sugar mentally to understand roughly what the code compiles down to. But I can understand, if you're really working on HPC or something like that, how you would want to make it completely unambiguous.

With Rust I think it can be a bit tricky, because a lot of it is context dependent, and the compiler will often make a lot of things easier for you until it can't, at which point the error which got you into a situation may be pretty far from the code you just changed. That's why I think syntactic sugar is best when it's quite dumb and local. I.e. X is just another way of writing Y.
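
The "dumb and local" kind is easy to de-sugar in your head. A Python example of what I mean by "X is just another way of writing Y":

  squares = [n * n for n in range(10) if n % 2 == 0]

  # ...is just another way of writing:
  squares = []
  for n in range(10):
      if n % 2 == 0:
          squares.append(n * n)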


> a lot of it is context dependent

I think you nailed here what I couldn't put into words. I don't mind context-dependency when it's in the code I'm writing (call tree, variables, flags etc.), but for some reason if it's context-dependent in the language itself, hidden from me, it feels uncomfortable because I cannot change it, have to learn it and then have to debug it.


I've been thinking about it a bit more and I think I've gotten to the bottom of the point I'm trying to make. But again, it's very subtle and unimportant 99% of the time; it's just the way I tend to think about code.

First example (C):

  for (i = 0; i < N; i++) { t = data[i]; ... }

Second example (python):

  for t in data: ...

The difference is that in the second example I'm telling the compiler what I want to *have*. It will give me exactly that ('t' will contain consecutive values from 'data') and it will do its best automagically. In the first example, however, I'm telling the compiler what to *do*, and it will just and only follow orders (iterate 'i', copy the value from data+i to 't').

Insignificant difference most of the time, but an important one to me. And sugar makes it more complicated to distinguish what I told the compiler to do vs. what I wanted to have. That distinction helps me with debugging and getting the cycle count under control.


Do vs. have, Python vs. C, interpreter vs. compiler.

But then who is the doer and who is the haver? C vs. asm, compiler vs. assembler.


Any thoughts on Ruby? As someone who's worked with it a great deal of the time, I find it extremely simple and concise to write in.


I haven't tried Ruby at all to be honest. If I need to do some quick'n'dirty scratchpad calculations on x86 I'm using python but I find myself still using 'C-like' approach instead of idiomatic python. I.e. even 'for x in X:' gives me heebie-jeebies for some unexplained reason (though I use it). I think it's because I've been conditioned for so many years to think about data/code as just that, data in memory. So every time I need to access something I'm thinking *where* it is first, not about getting it - that's secondary and, paradoxically, unimportant. I'll use it in the next stage/line but by then I already have it (I don't know if that makes any sense at all).

It makes no sense to do it most of the time in most circumstances, and I think the downvotes I'm getting reflect that (though I'm not preaching it or anything, just commenting about my point of view), but if you have a hard deadline of X cycles you think about code differently.


I totally understand. Having recently transitioned to embedded C/C++, I can tell you that the way I think about memory and memory access in this world is very different than my thinking about it in python for over a decade. I really like iterators, they help eliminate a whole class of bugs, but after the last couple of years, I'm not sure I'd be so comfortable with them if I'd been counting cycles for most of my career as well.


> I can tell you that the way I think about memory and memory access in this world is very different than my thinking about it in python for over a decade

Would you be able to shed some more light on this in the opposite direction? I would appreciate it because I'm moving to a higher role and I would like to develop tools my team can use to streamline development, but I can't implement them myself. I'm very anxious about taking over the tools team because my brain is wired in a very different way (I know what they need but not how it should be written), and what I think is important is probably last year's snow there, so I'll be slowing things down unnecessarily. Any advice?


If I understand you correctly, the main difference is to stop thinking in terms of bits and bytes and think more in terms of amorphous data. Like, for x in X: is really more appropriately expressed as for value in values:

Values is just a container, and we'd like to grab one of what it contains one at a time to do something with it (I'm assuming native English speaking here, other language's pluralization might not be as natural to this convention). It might even make sense to start with some kind of notation like, for value in valueList:, to help internalize, but this can be a crutch. Because of the ducktyping nature of python, values might be a list today, but the same code can easily handle any iterable (I've done enough maintenance programming where it was clear some variable had never been renamed, it was once a list, and was now something else. Type annotations on newer codebases help here). The other thing to realize, is that heterogeneous collections is one of the things that Python excels at, and something that is likely to make your head hurt in C (I'm not even sure how I'd implement mixed type arrays in C without doing something gross with void*). I'm not saying you necessarily want to mix types since it can often lead to a bad time, but it still helps to think about just how amorphous the data can be.

Another thing that will feel really gross is that Python is going to be orders of magnitude slower than you're used to in C (with similar memory bloat). Just get over that, and know that if you absolutely need to speed up parts of the code, you can write those parts in C fairly easily.

Anyways, good luck.


If you're used to writing embedded code then I'm not surprised - you want to be able to fully introspect everything that is happening since understanding performance is so important.

Syntactic sugar is often better in high-level languages because it provides a level of indirection and allows for more flexibility in API design.


It complicates code until it makes it easier, sometimes.


Yes, maybe usually was too strong of a word - I think this varies with the language. In Go and JavaScript (where I mostly work), it's usually not that useful.

In languages like C# and Java, I've found syntactic sugar to be very useful (e.g. lambda notation).


I'm under the impression that, in Javascript,

> async/await is syntactic sugar on top of the promises and provides a way to handle the asynchronous tasks in a synchronous manner.

as is stated on this website I just found with one second of googling to support my thesis: https://dmitripavlutin.com/javascript-async-await/

If that is true, I believe it's incredibly useful syntactic sugar.


Fair enough, there are clearly lots of useful syntactic shortcuts. I'm referring to things like prefix/postfix increment/decrement, object spreading vs. object rest, and ternary expressions.

I'd avoid cases where extending the language is done in a non-obvious and ambiguous way.


I love me some spread operators and ternary expressions.

…mumbles something about prying those from my cold dead hands.


Yeah I think there's a time and a place for ternaries, but when used correctly they're just so concise.

Like if you have super long expressions in your ternaries, or you have a bunch of nested/chained ternaries it can be hard to parse, but in a lot of cases it's so nice to get something onto one line instead of needing an if/else


Yes, in C/C++ it also allows you to use const.

  const int foo = bar ? 1 : 2;

Without the ternary you wouldn't be able to use the const specifier on foo.


Love the ternary operator in gawk. Here's some extreme examples, from a text formatter I wrote over 20 years ago. It is over 6,000 lines of awk, and I haven't used it in over 15 years.

Note: These lines should be split, but many of them are not. How do I do that? I read formatdoc, but that did not help.

  ; # see where text starts;
  mut = ((substr($0, lf1-length(lf1str), length(lf1str)) != lf1str) \
      ? 1 \
      : ((substr($0, lf1, 1) == " " ) \
          ? (lf1+1) \
          : lf1) \
      ); # start of string;

---------------------------

  # return file and line number with optional text;
  function filineg(ffilename, ffilerec, fltext) {
      ; # first version always gives a line number, second does not for line 1;
      return (ffilename \
          (ffilerec == "" ? "" : (":" ffilerec)) \
          ((fltext == "") \
              ? "" \
              : (" in \"" fltext "\"") \
          )); }

------------------------------------

  outstr[2] = (head \
        ( (lnum > 1) \
         ? (substr(bigsep, 2, lnum-1) " ") \
         : (substr(spaces, 1, lnum)) \
         ) \
        (header2==""?" ":header2) null("output space if null - 5-20-98")  \
        substr(bigsep, 1, \
        ltocout-lhead-lpn-1 \
        -(lheader-lheadchop)-lnum \
        -((null_loc > 0) \
    ? lnullstring \
    : 0 \
          ) \
        ) " " pn); # parens make it line up;
----------------------

  function telldepth(deptxt, dep1, dep2) {
      ; # change depth from dep1 to dep2;
      ; # good luck figuring this out in a month!;
      ; # tell how depth changes;
      recnumprint("{" deptxt " " \
          ((dep1=="") \
              ? ((dep2=="") \
                  ? ("leaves depth as is") \
                  : ("sets depth " dep2)\
                  ) \
              : ("changes depth " dep1 " to " dep2) \
          ) " in " filine() "}"); }


Prepend each line with two spaces to format it like code.


I have to say, I really love the back and forth between you and the various posters in this thread. Lots of good discussion and points being made, and reasonable compromises being agreed on - or even polite disagreement with thoughtful explanations why.


Lambda isn't syntactic sugar, it's a language feature.

In Java it's kind of syntactic sugar (a lambda is a shortcut for an anonymous class with one method) for historical reasons, but I'd prefer it if that weren't the case TBH.

It was always a dirty hack.


C is just syntax sugar on top of asm.

We can go on and on. Syntax sugar is a compiler-level abstraction.

It’s good


I see it as a language/api/environment smell, but also sometimes pragmatic. A good example would be the old MFC - somewhere between 'not ergonomic' and 'ungodly mess' if you were using it manually, but paired with VC++ 5.0 it was pretty good for bashing out general business and productivity apps.


Can you share the ones you like? I agree with OP that usually “sugar” hides hideous performance issues like allocs


There's a really good book called "Code Complete" from 1993 (revised several times) that I think captures a lot of this sentiment in a more organized and illustrated way.

A lot of stuff on this list has been tribal knowledge for decades (well, except for the part about build pipelines, language choices, and Stack Overflow, etc.).

My first code mentors in the 80's & 90's said some of the same things in this list, and I passed them down to my mentees (?) as well.

https://www.powells.com/book/code-complete-9780735619678


This is a fine list, but not unlike many others. What I find missing are the deeper elements rather than the rules of thumb or general observations. Things that I find valuable after 10k+ hours are ways of transforming a conceptual understanding of a problem into abstractions that produce a comprehensible and maintainable implementation. Talking about naming variables in depth alone would be more valuable than this listing, and I do hope there will be follow-up posts that go into some of these points as well as cover topics that don't neatly summarize as one-liners.


Great idea. Some in-depth follow-ups are on my radar (I've been trying to publish one post a day and haven't run out of ideas yet).


Might disagree with the author on what counts as deliberate practice. I would say that implementing code at work is not deliberate practice in the sense that it's defined in the original 10k hours thesis.

I think in the strictest sense deliberate practice is picking something that you’re weaker at, and spending time deliberately working on that.

I imagine that some part of the author’s 10k hours was working on projects outside their comfort zone, while others were within it. The argument would be doing stuff that you’re comfortable with or don’t have to think too hard about would not be deliberate practice.


Agreed - tough to nail down a definition of deliberate practice for software engineering. I credit three sources:

- Working on a large open-source gave me access to high-quality reviewers from around the world.

- Likewise, at Google, I was surrounded by some exceptionally smart people.

- Finally, like you said, working on projects outside the scope of my work and expertise.


Thanks for your comment. I enjoyed your article. Deliberate practice is something I’ve thought about before in the context of programming and it’s hard to nail down. Especially because of how some of the skills are not programming at all. E.g. How might you deliberately practice skills like requirements gathering or managing stakeholders.


As an R&D engineer my grandfather used to say that if you're working on something new everyday for 20 years that's a lot of experience. But if you're working a job you can master in a year, is it actually 20 years of experience, or 1 year of experience 20 times?

I'm not saying you're wrong - just implementing code at work by itself isn't deliberate practice - but I think it can be if you work just outside your comfort zone. Personally, I find work incredibly dull when I'm not learning something new.


Good list, and the litmus test is I would have disagreed with a lot of it when I started, or at least I'd have wondered why it mattered.

My summary, and it won't be useful at all, is that a lot of programming decisions come down to judgement. Comment or not? Config file or DSL? Is it ugly? Is it a rare feature of the language? All of these things are the kind of thing that you could argue about if you wanted, but an experienced programmer will likely have better arguments.

One thing I still don't agree with is the first one. There's a lot of things, mostly trivial, that are easier to find on SO than in the source code. In fact something like "what's the idiomatic way to concat a string in $lang?" is best found on SO rather than the source code, because the source code will allow more than one way to do it.

For number 2 problems ("In many cases, what you're working on doesn't have an answer on the internet."), the answer is again judgement. For this you want to have a network of programmers you've built up over the years that you can ask. I have a couple of good friends on chat that I can just pop a question to, and it saves a heck of a lot of time. Hard to find though, they have to be someone who is basically gonna work for free for you, and you have to provide a similar level of service when they have a question for you.


>In many cases, what you're working on doesn't have an answer on the internet.

>That usually means the problem is hard or important, or both.

Really? If a problem is important, someone has likely already tackled it. I mean, 15 years ago this was less likely, but these days there's so much work in the open and search is very good. When I find there aren't any references for my problem, it usually means I misinterpreted the problem or I'm doing something very niche.


You're implying the reverse causation. The reflection is "can't find answer" -> "might be hard", not "might be hard" -> "can't find answer".

Surely there is copious documentation on a significant number of hard problems.


When you get above the 40th percentile in problem difficulty or novelty, there are no answers or even discussions on the internet.


I feel like your argument is a little along the lines of: "Everything that can be invented has been invented."


No, just that CS/programming is mature enough that the important stuff has already been explored, so if I'm hitting a dead end, either I'm doing something very specific that hasn't been investigated yet or (more likely) I'm not framing the problem correctly.


After 30k/40k/??? hours of programming, I wish I had spent a chunk of the time learning how to play the piano well or something.


Ouch. Well, never too late to take up piano or woodworking or something. After 40k hours of programming, rock climbing might be out though... your poor hands!


Luckily I managed to pop out of the other end of the machine with my limbs intact.


There's not as much money in piano as in programming.


Looking back, that shouldn't have been so much of a driving force.

Now that I think of it, the right answer would have been several 10k hour professions. I can't say that the products I designed in 2020 were more complex or amazing than those in 1980. It would have been a chance to achieve high levels of proficiency in multiple fields.


The "10k hours rule" is of course just a rule of thumb, with a lot of exceptions depending on the skillset. I believe the number of hours required for mastering the violin was 25k.

I think the programming field is large enough to learn and improve your whole life. You probably have to change domains, programming languages, frameworks, architectures, companies, and switch from web to mobile, from frontend to backend, from serverless to embedded -- but there are always new ways of experiencing the beginner's mind.


I will say, I think the one point in this article I disagree with is the one about finding answers on the internet. There have certainly been many times I've needed to dig into the source of something I'm working on, but much more often than not I've been able to find an exact (or close enough) answer with a couple of different Google searches.

Granted, this may be because the areas I work in are less technically complex than, say, writing low-level code or IoT instructions.


And probably because you work with one of the most popular languages and frameworks in the world? Try an old, obscure niche language, and the experience will change completely.


10,000 hours programming is not specific enough of a target to equate to Gladwell’s examples. It’s like saying “I practiced playing sports for 10,000 hours, so I am a professional athlete.”

Now, if you developed using a single language targeted on a specific platform for 10,000 hours while challenging yourself at a high level, you would have a very strong level of expertise in that area.

Furthermore, Gladwell made the distinction that the hours spent should be deliberate and tailored to improve skills. Working on tasks handed down to you by your superiors at Google is not deliberate practice.


I could not agree more. By this logic, every young professional is incompetent and every older one is very competent. But reality is very different. I would say age and hours of experience bring wisdom, but not necessarily better thinking.


Too many of the items are deeply influenced by Golang principles or quotes I have seen from Gophers. I think this is a very interesting set of Go-oriented reflections on 10k hours of programming.

I think it boils down to this: each 10k hours of programming in (or mostly in) a certain language will give you different reflections that we then think are general.


"And the disciples came, and said unto him, Why speakest thou unto them in parables? He answered and said unto them, Because it is given unto you to know the mysteries of the kingdom of heaven, but to them it is not given. For whosoever hath, to him shall be given, and he shall have more abundance: but whosoever hath not, from him shall be taken away even that he hath. Therefore speak I to them in parables: because they seeing see not; and hearing they hear not, neither do they understand."


lol... and... relevant


At least where I work, the DNS rule is "Batten's Law" because that guy's said it so many times, and been right so many times.

https://www.battenworks.com/

  It's DNS
  And, when you're sure it's not DNS, it's DNS


> Well, I'm certainly not a world-class expert

> Most recently, I worked as a professional software engineer at Google on Kubernetes

If this is not a world-class expert, who is?


... "which allowed me to have my code peer-reviewed by some of the best engineers."

The "I'm not an expert" mindset probably comes from working with many other people who, collectively, are certainly 'more expert' than the single author.


Don't be afraid of simplicity: the simpler it is, the better.


Agreed!


For me the most helpful general guideline has been: make the code easy to change.

I can break rules like DRY if repeating myself makes the code easier to change.
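A minimal Go sketch of what I mean (the function names are made up for illustration): two near-duplicate functions kept separate on purpose, because they're expected to diverge.

  package pricing

  import "fmt"

  // FormatInvoiceTotal and FormatQuoteTotal are deliberately near-duplicates.
  // Invoices and quotes are expected to diverge (tax lines, disclaimers, ...),
  // so hiding them behind one shared abstraction today would make tomorrow's
  // change harder, not easier.

  func FormatInvoiceTotal(cents int64) string {
    return fmt.Sprintf("Invoice total: $%.2f", float64(cents)/100)
  }

  func FormatQuoteTotal(cents int64) string {
    return fmt.Sprintf("Quote total: $%.2f", float64(cents)/100)
  }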


Yes, I often feel like 'refactorability' is way undervalued. The focus is on writing abstractions that can be easily extended to handle more cases, but as soon as some case doesn't fit, it gets hacked in, because the code is abstracted in a very narrow way and cannot easily be refactored to accommodate the new case 'natively'.

Do that for a couple of years in a single codebase and the beautiful abstraction design has turned into a state machine with a transition between almost every pair of states.


Yep, modularity is immensely valuable. Requirements change. Technology changes.


Configuration is hard. In most of the cases I see, people eventually crank out their own DSL and "teach" other people to use it.


Author here, ironically, guilty of this https://matt-rickard.com/virgo-lang/


You may be implying this, but DSL is the root of all evil, imho. Unless it is code, and compiles, and can be debugged or inspected via IntelliSense, you have to "throw it over the wall" and cross your fingers.


Reflections on 40k hours of programming. Developers are a pain in the ass. But I still love making things work!


> While rare, sometimes it's a problem with the compiler. Otherwise, it's always DNS.

I don't quite understand this one? Does it refer to domain name server? So if you're having network issues it's always DNS problems?

> Corollary: Most code out there is terrible. Sometimes it's easier to write a better version yourself.

I never cut-and-paste directly from Stack Overflow. I cut from several sources, combine the best practices, and adapt everything to make something novel, and hopefully "better". But I think it's important that you understand what it does before you use it. This might seem self-evident, but my experience tells me not everyone agrees with this methodology. If I find inspiration for a solution, I usually leave a link to the original source or Stack Overflow page.


The DNS part is a joke (I guess?) about how unreliable and surprising DNS issues can be. You could do the same with caching.


Oh certainly. In fact, caching issues (on multiple levels) are a constant pain in the ass in my line of work anyway. It can be so subtle sometimes that you spend hours looking at something that shouldn't be an issue. But it is, and I think we all know why.


This is a great read. Posts like this make me not fear being “just a developer” for the rest of my life.


Software engineering is one of the highest leverage things you can do. Keep it up!


Good list, and indeed not really advice for beginners.

I'm not sure reading the standard library for language X is necessarily a good way to learn the conventions and practices of a language. Typically a standard library contains a lot of complex corner cases that don't need to be worried about when writing ordinary code.

(Expanding on that.) The standard library authors don't know the users of the code, whereas when you're writing a piece of code with a limited number of developers (could even be thousands) you might, for example, say "all our accessors for class X return const results", while the standard library authors have to handle the non-const case.


The advice is "learn from the best", not "learn from the standard library". The Golang standard library just happens to be among the best, according to the author.


Good point. If you're writing your own library, maybe. If you're writing business logic mostly, probably not. Thanks for the perspective.


> In many cases, what you're working on doesn't have an answer on the internet. That usually means the problem is hard or important, or both.

For me, it’s usually meant I’m doing something so wrongheaded it’s never come up before.


10K hours??! Try 100K hours (55 years x 50 wks/yr x 40 hrs/wk).

I still don't know much.


What would you expect to be able to do after 100k? Coding and debugging any codebase in any of the 10 most used languages without internet? God knows if I'll still be alive after 55 years


A lot of time was spent acquiring problem domain knowledge: life insurance, manufacturing, transportation, sales, packet-switching networks, typography, printing, food industry, graphics, GIS, etc.


That definitely broadens your horizon. The more I think about it the more I think that your value as a programmer comes from your knowledge of the domain.


> 4. Syntactic sugar is usually bad.

I agree now, after switching from Python to Go.


> If it looks ugly, it is most likely a terrible mistake.

I agree with this. Looking back at my first year of professional coding, I realise how important code review is. Even if you are working for a small software agency / consultancy, you have to force them to do code review before merging. Otherwise, you will spend a long time trying to make your code look better, which you mostly won't be able to do because of a lack of experience.


I also recently reached 10k hours, just this week. (I've coded more than 10k, but only tracked time for 10k.) Here's my language usage for those 10k hours:

https://twitter.com/alanhamlett/status/1423738961550184449


I don't understand the 11th one; can someone explain it to me? Also, the given link seems broken.

> If you have to write a comment that isn't a docstring, it should probably be refactored. Every new line of comments increases this probability. (For a more nuanced take, the Linux Kernel Documentation)


One way that I was taught is via the HTDP "design contract". So in this case a "docstring" translates to defining the method contract (its inputs and outputs) and the purpose of the function (what the function does).

https://course.ccs.neu.edu/csg107/design-recipe.html is a random link that describes all the steps, with something like 2. being the one mentioned (I think, I'm not OP.)
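Roughly, in Go terms (just a hypothetical example function, not from the article), the "docstring" is a doc comment that states the contract and the purpose rather than the implementation:

  package temperature

  // CelsiusToFahrenheit converts a temperature from degrees Celsius to
  // degrees Fahrenheit.
  //
  // Contract: accepts any float64 Celsius value and returns the equivalent
  // Fahrenheit value; it never fails and has no side effects.
  func CelsiusToFahrenheit(c float64) float64 {
    return c*9.0/5.0 + 32.0
  }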



He means that you should avoid commenting single lines but instead write comments on whole functions.


I appreciate these, but in my experience it's not that relevant. The harder problems in programming are more about knowing your ecosystem: knowing the design patterns necessary to create well-meaning, understandable code in the first place. This is especially cumbersome when working on projects with complex data patterns.

For example, working with GUI applications in C/C++ can make your code a big pile of garbage really quickly, because representing your data in something like an ORM is not standard and you can't do many of the tricks like getters/setters that you can in Python, or C#, etc. The benefits of VM languages are, in my opinion, not appreciated enough.

In my opinion, knowing the right tools for the job is far more important than how you comment your code, how you name your variables, or anything else. I wish there were more posts where people implement an application with multiple tool sets and compare them, giving insight into what some things are good for and not. "Tool Benchmarking" might be a good term for it.


Lucky for you #6 is "Have a wide variety of tools and know which ones to use for the job." :)


> I wish there were more posts where people implement an application with multiple tool sets and compare them, giving insight into what some things are good for and not. "Tool Benchmarking" might be a good term for it.

I recently started working on a project like this. It's hard work.

Building a complete tool in one language takes a long time. Not to mention 2, or 5.

Moreover, many (most?) programmers are only "fluent" in one or two languages. Even if you "know" a language, you might not know the full build toolchain, or you might not know common idioms, etc.

Most people probably want to work on building new/interesting/fun stuff, not reimplementing the same toy program in several different languages.


I've actually found that knowledge diffusion between different languages and different types of developers can be significant - e.g., game developers work much differently than distributed systems engineers.

Not always, but sometimes there's an interesting knowledge-arbitrage opportunity when fields start to collide.


> 23. While rare, sometimes it's a problem with the compiler. Otherwise, it's always DNS.

Is DNS here just referring to Domain Name System, as in to reference problems with systems out of our control?


9 out of 10 times when I would troubleshoot people's problems with Active Directory / Exchange / Lync / Office 365, it was... DNS!


A lot of organizations control their own name servers. There are also heaps of problems that can manifest on the client.


Domain Name System. As a developer working on Kubernetes: it's always DNS.


Each point is true to me. Very nice job articulating these ideas.


> Syntactic sugar is usually bad.

When I read this, ES6 classes came to my mind :)

Great post btw!


Let me compare the ones I might've disagreed with as a newbie (out of college, maybe a few tens of hours total programming) and what I think of them now, after 17 years as a programmer.

- "Browsing the source is almost always faster than finding an answer on [the web, Stack Overflow didn't exist yet]." This would have taken a very long time as a newbie, not being familiar with C-style languages (I was working in PL/SQL and XForms, mostly) and the architecture of big software. Nowadays, depends how many levels of dynamic dispatch (the devil) the code uses. If it's more than one, the code is so unreadable as to be worthless to try to read.

- "Know the internals […].", same as above.

- "Syntactic sugar is usually bad." Depends how much more intuitive the sugar is than the salty version. Still the same opinion nowadays.

- "While rare, sometimes it's a problem with the compiler. Otherwise, it's always DNS." Disagree both as a newbie and now, but then I'm probably not working on anything similar to OP.

- "Some programmers are 10x more efficient than others." Certainly I was a <1x programmer as a newbie, but I don't think I've ever seen a 10x programmer. I've seen programmers which get features "done" by committing so many programming horrors that we were still dealing with the tech debt years later while they were at a FAANG, and I've also seen programmers which can whip up excellent code quickly but are unable to treat colleagues as adults.

- "There's no correlation between being a 10x programmer and a 10x employee (maybe a negative one)." I wouldn't have thought so as a newbie, but this rings true now. Visibility, agreeing with the boss on whatever they think is cool, being able to serve up banter on request, and joining all the "social" events are important because nobody is able to gauge programmer productivity yet.

- The "Heptagon of Configuration" is an interesting observation which I don't think I ever agreed with, but for different reasons. As a newbie because for most systems whatever we were using was usually decent enough, and now because I don't think this trend is cyclic but instead chaotic. We go from environment variables to Bash to INI to flags and so on. Usually this change is because we adopt some language or framework which staunchly refuses to treat anything but the Chosen Language as a valid configuration format, and so the existing configuration has to be adapted to work with N+1 opinionated (for the wrong reason) systems with as little pain as possible.


For me, the most valuable insight has been to understand the problem at hand with its intrinsic complexity, constraints and requirements. Only then can you construct efficient and clear abstractions with as little accidental complexity as possible.

Sometimes this can mean writing a very small amount of glue code calling external libraries. Sometimes it can mean avoiding a library/framework and rolling your own solution which solves a specific subset of the problem, enabling a smaller footprint and fewer dependencies. No silver bullet, really.


> If you have to write a comment that isn't a docstring, it should probably be refactored.

Yikes. Can't read past that.


Instead of a comment informing a user how something works (or how something does not work), write a test demonstrating and documenting its use.

The test will not forget, and has a docstring explaining what and why.

But crucially, it is automated, and does not rely on mistake-prone humans to check (as is the case with a comment).
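For instance, a tiny Go sketch of the idea: instead of a comment asserting an edge case, a test pins it down and CI keeps it honest.

  package example

  import (
    "strings"
    "testing"
  )

  // TestJoinEmptySlice documents behaviour that might otherwise live in a
  // comment: joining an empty (or nil) slice yields "", not the separator.
  func TestJoinEmptySlice(t *testing.T) {
    if got := strings.Join(nil, ", "); got != "" {
      t.Errorf("strings.Join(nil, \", \") = %q, want empty string", got)
    }
  }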


There's a lack of proof to back up the 10k hours. I get that he has worked 15 years at prestigious companies, but I have to take his word for it. Some aphorisms gelled with me but others didn't, so we're again at a "he said; she said" impasse.


He’s an ubermensch to be perfectly clear - most things that work for him aren’t going to work or apply to the rest of us because we aren’t tall, white, chiseled Adonises that went to Ivies and worked at Google.

I think most people should take the advice of experts like him - people that didn’t start from the bottom - with several grains of salt.


How is reading a library’s source code FASTER than StackOverflow?


Sounds like the author is closer to 30,000 hours.


> 5. Simple is hard.

Yes.


Simple is easy to maintain, so this is more of an inane statement. It's supposed to make sense by reaffirming something vague in the context of the reader's experience. This is an example of the worst kind of bullet points, pervading the "lessons learned" blogspam.


Good post, and a good list. Here's a mind dump on some of the items:

Right off the bat:

> Well, I'm certainly not a world-class expert, but I have put my 10,000 hours of deliberate practice into programming.

Okay, I don't know the author's exact history here, but I seriously doubt that they've had anything that's even close to 10k hours of deliberate practice of programming. Why? Because programmers basically /never/ do any deliberate practice. We don't have anything that's even close to what a piano player does when they practice fingering, chord transitions, scales, or the like. I'm not even sure what this would look like, but I suspect that it is fundamentally different. Most musical instruments have a physical or mechanical side to them that is completely disjoint from the "musical" part of it. For instance, knowing how to play a cool solo on a guitar (meaning knowing which notes to pick and how long to hold them for) and being able to have your left-hand fingers in the right positions at the right times and your right hand picking the right strings have almost nothing to do with each other. The way it usually works (I'm an amateur player who hasn't played in years, so maybe some professionals can take over this analogy) is that you practice very slowly and ramp up quicker and quicker until "magically" your muscle memory takes over and it all kind of just happens. The result is that you don't really pick each individual note anymore, you kinda just play that part (when practicing you've decomposed the whole piece into small parts), and each part is atomic to you. Starting in the middle of a part doesn't quite work.

The only programming analogy I can think of (off the top of my head) is how effortless it feels to write `for (int i = 0; i < n; i++)`, or something like it. But we only spend a negligible amount of time actually writing out the lines of code when programming, so being able to effortlessly write out a bunch of standard `for` loops doesn't really move the needle. For a performing musician, of course, this is very different, because they /have/ to play it right! Imagine having to write all your `for` loops in sync with the beat of the music.

> 4. Syntactic sugar is usually bad.

It's hard to really get exactly what this means, because if taken literally it is obviously not true. Pretty much any loop or control-flow structure is just syntax over goto, but I assume the author doesn't really think we should manually write out the gotos. Having recognizable patterns in a codebase is good. Cramming some esoteric syntax quirk of your language into your code because it technically fits is not good.
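To illustrate (a throwaway Go sketch, since Go actually keeps goto around): the sugared loop and a hand-rolled goto version do the same thing, and nobody sane would argue for writing the latter.

  package main

  import "fmt"

  func main() {
    // Sugared: a plain for loop.
    for i := 0; i < 3; i++ {
      fmt.Println("loop", i)
    }

    // Roughly desugared: the same control flow spelled out with goto.
    i := 0
  loop:
    if i < 3 {
      fmt.Println("goto", i)
      i++
      goto loop
    }
  }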

> 5. Simple is hard.

Again, difficult to get something concrete out of this since it reads as a zen mantra. If they mean that finding a simple solution to a difficult problem is hard then I agree. If they mean that a system being simple means it is hard to use then I disagree.

> 7. Know the internals of the most used ones like git and bash (I can get out of the most gnarly git rebase or merge).

Knowing the internals of e.g. `bash` is too far for me. I'm happy if I even know the surface of it! I guess they mean that cargo-culting these tools usually doesn't lead you anywhere good and that investing the time into them will pay off, as opposed to "cookbooking" it, like with git in that xkcd.

> 9. Only learn from the best. So when I was learning Go, I read the standard library.

I think this is important, but it is very difficult for programmers to do.

> 10. If it looks ugly, it is most likely a terrible mistake.

I don't think this is good advice because it suggests that all code should "look nice" and if it doesn't it's just wrong. In my experience the number one metric (if you can even call it that) one should be concerned about is flexibility. If your code is "rigid" and difficult to change it really doesn't matter a whole lot how nice it looks, because, presumably, you're not really sure exactly what you're doing, and the code you write will probably not last very long in the codebase. It's hard to put a number on this, but I'm pretty sure most of the code I write does not last a month. All the time I'd spend worrying about whether this code was "ugly" is effectively wasted, because in a month it will not exist any more. Make code easy to replace.

> 11. If you have to write a comment that isn't a docstring, it should probably be refactored.

I also don't agree with this. I think it's completely reasonable to have comments in function bodies explaining what a part of the function does. See Jon Blow's related blog post[0]. Docstrings are for users of a function, but you might also want comments for the people reading the function body. Sometimes this can be avoided by writing Good Code, sometimes not.

> 12. If you don't understand how your program runs in production, you don't understand the program itself.

Very true; I'd phrase it as "understanding the problem", in Mike Acton lingo [1]. If you don't understand The Problem, then you don't understand what your program, which is (a part of) the solution to The Problem, does either.

> 14, 15, 16

This is good advice, and I imagine it's difficult for beginners (or novices) to make sense of what to do here; some circles advocate for always pulling in dependencies, others for always inlining 3rd-party trees in your own codebase. I actually think this is one of the parts of programming as a field that has the potential for some kind of a paradigm shift. I'm not convinced that in 25 years we'll still be trying to write generic "libraries" for others to use, with other people downloading semver'd releases of them to ensure compatibility. There has to be a better way.

> 17. Know when to break the rules. For rules like "don't repeat yourself,"

Thank you! DRY is probably the worst of the n-letter programming mantras. DO repeat yourself! Do whatever you need to quickly get a better understanding of your problem!

> 18. Organizing your code into modules, packages, and functions is important. Knowing where API boundaries will materialize is an art.

This is also very true. Given an API, filling in the function bodies is often close to trivial, and coming up with the API in the first place is indeed an art.

[0]: http://number-none.com/blow/blog/programming/2014/09/26/carm...

[1]: https://www.youtube.com/watch?v=rX0ItVEVjHc


> Okay, I don't know the author's exact history here, but I seriously doubt that they've had anything that's even close to 10k hours of deliberate practice of programming. Why? Because programmers basically /never/ do any deliberate practice. [...]

Well, there is "learning new stuff(practices, tools, approaches) by applying it on a toy project" or "read the source code of $popular project", which I would classify as deliberate practice. But I agree, getting to 10k hours with only that is hard.

The feedback loop for noticing and weeding out behavioral errors is simply too long and even determining what led to a success/failure is too hard to be able to practice in a classic way.


> Okay, I don't know the authors exact history here, but I seriously doubt that they've had anything that's even close to 10k hours of deliberate practice of programming.

Yes.

10,000 hours of code kata might be "10,000 hours of deliberate practice".

http://codekata.com/

And in any case: "Malcolm Gladwell got us wrong: Our research was key to the 10,000-hour rule, but here's what got oversimplified"

https://www.salon.com/2016/04/10/malcolm_gladwell_got_us_wro...


I don't think that's true. As software engineers, our job is to create working software that fits the needs of the clients. Raw coding is a part of that, but not the only part. Collaboration is important, knowing your tools is important, talking to people is important.

Code kata are a good example of people trying to stretch metaphors from other domains to fit software engineering. It usually doesn't really work. For athletes and musicians, the "performance" part is a small part (in terms of time) of their job. They spend way more time training. For software engineers, that's not the case. There is a spectrum of things more or less important, for sure, but nothing as clear cut as with music or sports.

The difference between deliberate practice and "mindless practice" is usually taking a moment to reflect on what you did and how things went. In scrum, this is often the sprint review. So by that definition, if you do your sprint review correctly, you're doing at least some form of deliberate practice. Same thing with code reviews: you have the opportunity to have other people look at your code and evaluate it. You can of course add your own form of review to that. But in general, as an industry, I'd say we're very focused on practicing on the job.


> … stretch metaphors from other domains to fit software engineering…

The OP's "Reflections on 10k Hours of Programming" inappropriately stretches popular metaphors from elsewhere.


Don't learn the internals of git; it's a waste of time. If you feel you need to learn the internals of a tool, then it's not a good tool.


I thought that "internals" means read the Git code and understand how each module interacts. In that case, it is probably a waste of time unless you are really passionate about it.

I would just phrase it as "understand the tools' fundamental concepts and philosophy really well".



