I have the same problem (i.e. terrible short-term memory, though my long-term memory is fine), and I've picked up a number of compensating behaviors over the years. By far, documentation is my #1 go-to strategy. I document code extensively, even if I'm the only person who is ever going to read it again. People have mocked me for this, but I believe it's a superpower.
Most programmers believe a number of blatant falsehoods about documentation, with the most prevalent being "comments go out of date quickly, so there's no point in investing in them". Maybe I'm just hyper-aware of it because my short-term memory sucks, but code comments have saved me on so many occasions that they're simply not optional.
You can document your code. You can keep it up to date. It isn't that hard. You just don't want to.
I practice something I could only describe as “comment-driven development”. Whenever I need to implement something non-trivial, something that might have subtle edge cases or has to be done in a certain way because of the external dependencies, I first write the comment before writing the code. I might iterate on the comment, not unlike one might iterate on a design document, and basically use it as a mental tool to fully understand the details of what needs to be done. After I do that, I use the comment to remind me about all these details as I go through the implementation.
All this is doubly important when designing an API, where the comment (also) serves as the externally visible documentation for that API. It helps me put myself into the shoes of the caller reading that documentation without knowing the implementation details, which I believe helps me design a better API.
I hear that “comments go out of date” a lot, but in my case, if a comment doesn’t match the code, it’s usually the code that needs fixing.
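To make the comment-first workflow concrete, here is a minimal sketch in Python. The function, its name and its rules are all hypothetical, invented only for illustration. The comment block is what would be written and iterated on first; the code is then filled in against it, rule by rule.

```python
def parse_retry_after(header_value, default_seconds=30):
    # Written BEFORE the code, then used as a checklist while implementing:
    #
    # Turn a "Retry-After"-style header value into a number of seconds to wait.
    #   1. A missing or blank value means there was no hint: use the default.
    #   2. Surrounding whitespace is tolerated.
    #   3. Only non-negative whole seconds are accepted; anything else
    #      (dates, negatives, garbage) falls back to the default instead of
    #      raising, because a bad hint should not crash the caller's retry loop.
    #   4. The wait is capped at one hour so one bad response cannot stall us.
    if header_value is None:
        return default_seconds                      # rule 1
    text = header_value.strip()                     # rule 2
    if not (text.isascii() and text.isdigit()):
        return default_seconds                      # rules 1 and 3
    return min(int(text), 3600)                     # rule 4
```

If the code and the numbered rules ever disagree, the rules are the statement of intent, which is the sense in which "it's usually the code that needs fixing".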
"The literate programming paradigm, as conceived by Donald Knuth, represents a move away from writing computer programs in the manner and order imposed by the computer, and instead gives programmers macros to develop programs in the order demanded by the logic and flow of their thoughts.[4] Literate programs are written as an exposition of logic in more natural language in which macros are used to hide abstractions and traditional source code, more like the text of an essay."
I would go so far as to say it is literate programming. People tend to think literate programs require an extensive system to support them, such as org-mode's Emacs integration, or weird meta-programming facilities, but the truth is that a program written in any language that supports comments can be a literate program, so long as the content of the comments dominates and dictates the structure of the code -- which is exactly what the GP commenter's method sounds like.
> so long as the content of the comments dominates and dictates the structure of the code
The key feature of literate programming is separating the order that code is written in from the order that the compiler requires, not comment to code ratio.
"The key feature of literate programming is separating the order that code is written in from the order that the compiler requires"... in the languages that were popular at the time Knuth wrote the definition, which were very rigid and uncompromising on layout.
We don't need specialized literate programming tools today because modern programming languages are already capable out-of-the-box of sufficient flexibility in organization that the additional marginal benefit of a superspecialized higher-layer toolset to weave pieces together is no longer worth the effort of learning and propagating out to teams. Yes, they technically can do things programming languages can't, but I could even honestly quibble on whether those things are a good idea.
This is why strict Knuth-ian "literate programming" has never taken off on a technical level and why it never will; by the time it might have gotten enough attention to really take off, programming languages had already largely incorporated the necessary additional flexibility that made it unnecessary to use specialized tools to write good code.
The human reason, of course, is that merely making documentation easier or nicer is not generally enough to get people to write it, and arguably literate programming makes it harder to write. Not because of the tooling being bad or anything, but because of the bar being raised really high to have "proper" documentation under its dictates. People can write plenty of well-documented, high-quality code in any modern language right now. They just generally don't.
You are certainly correct that older languages had more constraints on ordering, but I completely disagree that modern languages have removed all restrictions. I assume in whatever language you are imagining you have things like methods in classes or namespaces. This forces you to write them all together. The ability to separate those out is powerful.
As evidence of this, compare a well-organized C codebase to one in whatever language you have in mind. Does it really look that different? Literate programs, on the other hand, do look quite different.
Furthermore a big programming project is still just a list of files. What do you read first? What are the important concepts? There is nothing to tell you this information in a typical program. A literate program has a page 1 and you start there. You can think of it like the difference between API documentation, and a well-written tutorial. Modern code files are great API documentation, but are lacking in that other form of presentation.
Another feature modern languages don't have (and probably shouldn't) is the ability to integrate other media types into the code (images, math notation, etc). This is another feature facilitated by literate programming systems, although not the essence of it.
"I completely disagree that modern languages have removed all restrictions."
So do I. That's why I said the marginal advantage makes it not worth it, and said already that literate programming does things that conventional doesn't.
(If you don't know what a marginal advantage is, you may want to poke around on the internet a bit. It's a very valuable concept. And I see a lot of people using the word in a way that shows they don't really know what it means; it is commonly thought that it means something "small" or "insignificant". That's not what it means.)
"There is nothing to tell you this information in a typical program."
This is true, but not because nobody uses strict Knuthian literate programming; it's because nobody writes that kind of overview. Languages generally do have a place to put a top-level summary and organize all the rest of the documentation. I know; I use them and write them. All my major systems have an official "top level" location, and generally I put it in the local system's language-specific documentation format (godoc for most of my systems). People just don't write them. Handing them better tools won't solve that problem.
> The key feature of literate programming is separating the order that code is written in from the order that the compiler requires
But what's the purpose of divorcing code execution order from the machine, and instead tying that to prose? It's simply to facilitate the reader's understanding of the logical structure of the program without the superficial parts.
Just because you do discuss those superficial features with the "literate-programming as code comments" paradigm doesn't make the program less literate, it just means the prose has to be less efficient.
I do agree it's better to use a system like org-mode for literate programming because it is better to separate the description from its execution, but I'd argue it isn't really the essential element of literate programming.
I do this too, and it’s great. I’d much rather rewrite a comment six times and then write a function once than rewrite the function and re-modify all affected code six times.
Also, difficulty writing a clear, concise comment is a strong code smell indicating that you don’t sufficiently understand the following block of code. Much better to realise this at the comment stage than at the debugging stage.
Yes! Rewrite that comment until it's sound! Being able to write in spoken language what the code is doing is so often related to the writer's ability to write that code succinctly.
I did this early in my career then stopped when I realized it generally made the code less readable. Generally, not always. Sometimes a comment about what the code does is invaluable if it’s complex, but usually it is redundant with reading the code directly. I feel like coming across code littered with comments about what the next line does reflects someone not cleaning up after themselves; a string of trivial comments would not pass code review on most teams I’ve been part of.
This applies to comments about single statements in the code, unless they have an esoteric side-effect or there is a particular why that needs to be explained.
But comments that describe a paragraph/block of code have a separate value. Being able to skim code faster because of a series of well-chosen sentences is an aid beyond clear code. As the OP described, it is also a good way to design before the code is written.
Very similar situation here. I have always approached my code, my code's documentation, and the actual user documentation with this attitude: I assume that someone in the future looking at it will be on their first day on the job, and may not be familiar with the context I am so familiar with.
Comments on non-trivial code are invaluable. I do it constantly and appreciate traversing previously made comments, especially when beyond short-term memory. Regexes, multi-conditionals, all those weird edge cases. How does any of that go "out of date"? As time progresses, comments become _more_ valuable, not less. What is the use case where comments are bothersome or out of date?
For figuring out how to build complex systems I’ve started doing something like this too, but in prose form, in a separate document, purely as a form of thinking through how the pieces will all fit together. Most recently, it was for building a multitrack video player, and working out how to organize the various parts (decoders, mixer, compositor, player, handling seeking, etc.). I drew out a TON of block diagrams while constantly feeling like they were missing the how and the why. Then I sat down and wrote out my thoughts, and it helped a lot with getting clear on how it would work.
I do the same thing. I then replace the comment with a method name that describes exactly what the comment described, e.g. _getLowestFooIdFromBarCollection(). I then implement the code inside this private method.
I do pseudo code sometimes too. I notice that your functions/methods/whatever are verbs that are assigned to nouns. It reminds me of:
Verbs in Javaland are responsible for all the work, but as they are held in contempt by all, no Verb is ever permitted to wander about freely. If a Verb is to be seen in public at all, it must be escorted at all times by a Noun.
I’m not too lazy for comments, but I hate it when the flow of explanation/logic gets piled up with details, all in the same visual style. A source file becomes a lengthy book with randomly discussed topics mixed with code. Instead of comments I’d like to emphasize some parts and write irrelevant details and technicalities in a smaller font with a lower contrast. And/or collapse parts of it, but with manual collapse statements, not wherever program syntax allows (I find this completely useless). Also abbreviate expressions into comments that by default replace them completely. E.g.
<cycle through params and start a service for each one>
…
<set up an ocr subprocess and handle errors>
…
if (<is a numeric string>) {
<trim whitespace>
<extract integer part>
<send it to /path/to/route>
}
expands into a corresponding full-page code and more details when you click on these parts. Can’t simulate fonts and colors on HN, sorry. This isn’t exactly literate programming, but more like a programming markdown. I wish it was every language’s way to go. I know you can simulate that with functions and IDE, but having a continuous context has its own benefits (you don’t have to pass it around, which is tedious and again technical), and also naming things properly and in a unique non-polluting way is hard.
I ran into this issue and have settled on very descriptive naming for variables, functions, classes, objects, what have you. I still comment a lot, but it's things you might not infer from the naming conventions. Even when I don't think anybody will ever see the code I do it this way because when I open it up a year later to expand, between the git logs and my comments I'm ready to rock right away.
This is slightly different from my idea. I also comment a lot, create sections of code and use names which at least I can pick up later. My issue is not with names (words), but with a structure of text. A wall of it is “either read or go away”. I’d like to see through it as if it was a high-level description and only dig into details when required. This comes from an observation that you perceive best what’s on the page when you don’t have to jump around and collect knowledge. A page is a best perception unit. Everyone talks about good descriptive names, but then you open any project and see verbAdjNouns at best, which implies that you are familiar with their pretty local jargon.
There are TwoHardThings. I'll take longer descriptive names any day. There is a debate about whether a prefix like "get" (e.g. `getName()`) is necessary. Please describe the _type_ of an object in its name as rarely as possible; avoid names like `MyDataDTO`, `ListOfPeopleArray`, `ApiRequestObject`.
I definitely do this as well when I have a complex feature or pattern I need to implement. It’s best to break down the problem into solvable chunks and writing the comments out first helps you break it apart and also make sure you haven’t missed anything.
This is the stuff that a software engineering professor will tell you to do, you ignore it for years, and then later on, after you’ve matured, you find yourself picking it up again because it’s just one of those incredibly simple and responsible ways to break down complex problems.
I've also hit on doing this as well. I find it really helps to organize my thoughts, uncover edge cases, etc before I start to write code.
Then I might start sketching the implementation with function signatures docs, types, loops and calls, and gradually start filling it in beginning with the parts that are less clear to me.
Usually, when I do this, I immediately end up with a comment that doesn't reflect the code, and is cut and moved around so many times that it's not legible anymore. So I try to replace the comment with code, at once or piecewise (in which case the comment doubles as a TODO list) during the process.
I believe "TODO" comments are valuable and should be left in; it shows the author also understands there's a better way, once it's viable. Most of my professional colleagues do not agree. I swear it's an ego thing.
I believe this is due to how IDEs have introduced useful features to elevate things like specific comment structures like TODO into something "official" and recognized and reported on.
So things like a quick "let's improve this part later, but this is fine for now" become items that are highlighted as not "DONE" and then are seen as eyesores or unprioritized tickets, which leaves a bad feeling for devs.
While I think these features of automatically tracking TODOs can have their uses, like most things, if they become metrics they start to lose their value. Sometimes it's better to just have a keyword you can easily grep for when you do find some time. Having these results show up as cases against your ability to finish the work is unhelpful, but that's what the manager and maybe coworkers see, unfortunately.
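The "keyword you can easily grep for" approach needs no tooling beyond a text search. As a rough sketch in Python (the function name and the `# TODO` comment convention are assumptions, and a string literal that happens to contain `# TODO` would be matched too):

```python
import re

# A '#' comment marker, optional spaces, the whole word TODO, then the note.
TODO_PATTERN = re.compile(r"#\s*TODO\b[:\s]*(.*)")


def find_todos(source_text):
    """Return (line_number, note) pairs for every '# TODO' comment in source_text."""
    found = []
    for line_number, line in enumerate(source_text.splitlines(), start=1):
        match = TODO_PATTERN.search(line)
        if match:
            found.append((line_number, match.group(1).strip()))
    return found
```

Run on demand, this stays a personal reminder list rather than a metric reported to anyone.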
Hum... Your comment reads like "I have this very specific and subjective process that works for me; people that do things differently are too egocentric to notice their incompetence". Your colleagues may react badly to that style.
I believe (but I'm not sure) I do get what you're trying to say, but not every TODO reflects some eternal trade-off. Some are for real clear improvements (like the "there isn't any code here" into "the code was written" in my example) whose history won't add any value after they got done.
Code comments are implicitly associated with code that already exists. Sounds like you're making a presumption about what code exists and potentially missing the forest for the trees.
On larger teams the likelihood that comments get out of date is probably higher - particularly if the original comment was very long and detailed and the original author has moved on.
The number of times I've seen obscure code that would have been elucidated by a good comment is vastly higher than the number of times I've seen outdated comments.
Funnily enough, most of the outdated comments I can remember were dead docs links despite that being a very common 'solution' to the problem of outdated comments.
One way to cap the cognitive load of adding and maintaining comments is to limit them to explain why a piece of code does what it does, rather than how. Rely on the fact that the how is self-documenting to other programmers who may need to maintain your code later (including you), but the why is not always so.
Eg, were other options considered and this one chosen? Why? Is this piece of code non-standard or particularly clever? Why? Is this piece of code more complex than it seems like it needs to be? Why? Does this piece of code have a non-local dependency that may not be obvious at first glance? Are there security concerns with this piece of code which are non-obvious to a novice or an outsourced worker? Etc.
There's the old saying, "since debugging code is more difficult than writing it, then if you write the most clever code you are capable of, then you are by definition incapable of debugging it". Pretend you're explaining your code to future you who has forgotten why you did what you did and isn't clever enough to debug it.
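A tiny illustration of the difference, as a Python sketch. The function and the product requirement in the comment are hypothetical.

```python
def average_rating(ratings):
    # HOW comment (redundant, it only restates the code):
    #   sum the ratings and divide by the count
    #
    # WHY comment (not recoverable from the code):
    #   An empty list returns None rather than 0.0 because, in this
    #   hypothetical product, "no ratings yet" must be displayed
    #   differently from "rated zero stars".
    if not ratings:
        return None
    return sum(ratings) / len(ratings)
```

Someone tidying this up later might be tempted to return 0.0 for the empty case; only the "why" comment tells them that would be a behavior change.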
I think "why" comments are the most useful, but "what" comments are a bit undervalued. A few types of "what" comments are, I find, very useful:
1. Comments that are effectively headers:
# If the user is foobar, do thing
user = User.find(current_user_id)
user.preload_bizbaz!(cache: false)
is_foobar = user.bizbaz.foobar.status == "yes"
do_thing(user) if is_foobar
Yes, you can read the 4 lines of code to see the "what" - the comment is duplicative. But when I'm skimming through a large codebase trying to find or understand something, these headers can be invaluable.
2. Comments on dense code
# Matches any email like foo@bar.com (only .com, .net, .org email addresses)
return email.match(/^.*@.*\.(com|net|org)$/)
Yes, I can read that regex, but I can read that comment a lot faster.
I love header comments, but many people fail to get their usefulness, and I get the "good code does not need comments" mantra recited to me pretty often.
Here is the thing: Header comments are not really about explaining the code that follows them. They are about reducing the lines you need to read to navigate through the code by a factor of 5 to 10. That is a huge time-saver, and it helps newcomers a ton to become productive quickly.
// The following block should have the same
// result as this (simpler) code:
// <code>
// but for <reason> this version is better
This is great for things like opaque optimizations, things that mimic a library function except for one critical difference, blocks that become really convoluted to deal with special cases, and similar.
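A concrete instance of that comment template, sketched in Python with a made-up function:

```python
def count_multiples_of_three(limit):
    # The return line below should have the same result as this (simpler) code:
    #     return sum(1 for n in range(1, limit + 1) if n % 3 == 0)
    # but for large limits that loop does work proportional to limit, while a
    # single integer division does not, so this version is better.
    return max(limit, 0) // 3
```

A side benefit is that the simpler code in the comment can be pasted straight into a test as the reference to compare against.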
> I think "why" comments are the most useful, but "what" comments are a bit undervalued. A few types of "what" comments are, I find, very useful:
I agree. However, "what" comments need a little more skill and judgement to know what to write and what to leave out.
Having a terrible memory helps develop that skill, because you kind of get a sense of what you'll want explained to you again in six months and what's hard to decode.
> And the latter is a perfect example of what goes wrong, when someone adds "gov" to the regex but doesn't notice the comment to update it.
That always gets trotted out, but it's sort of like "don't write code unless you can guarantee the code won't have any bugs, so never write any code."
The trick is write code and comments, but be a little skeptical of both. If you're disciplined (especially with commit messages), it's usually pretty easy to figure out a mistake or omission after the fact.
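One cheap way to be "a little skeptical of both" is to turn the examples from the comment into checks. This is a sketch of that idea using the email regex from upthread, ported to Python; the check itself is my addition, not something the earlier commenter proposed, and the sample addresses are made up.

```python
import re

# Matches any email like foo@bar.com (only .com, .net, .org email addresses)
EMAIL_PATTERN = re.compile(r"^.*@.*\.(com|net|org)$")

# Examples derived from the comment above. If someone adds "gov" to the regex
# without updating the comment and these lists, the check below fails.
SHOULD_MATCH = ["foo@bar.com", "foo@bar.net", "foo@bar.org"]
SHOULD_NOT_MATCH = ["foo@bar.gov", "foo-at-bar.com"]


def comment_and_regex_agree():
    """True when the regex accepts and rejects exactly what the comment promises."""
    accepts = all(EMAIL_PATTERN.match(address) for address in SHOULD_MATCH)
    rejects = not any(EMAIL_PATTERN.match(address) for address in SHOULD_NOT_MATCH)
    return accepts and rejects
```

It does not stop the comment from drifting, but it makes the drift show up as a failing check instead of a surprise.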
I am 99% in agreement with everything you said, so of course I'm going to focus on the 1%! If you have a 4 line stanza that needs a comment to explain what it's doing then I'd suggest that needs breaking out into a function so that your code then becomes `if (is_foobar?(current_user_id)) {do_thing()}`. Just as readable with no comments needed.
There are definitely places where that's hard or impossible and your regex code is a good example of that. I write a fair amount of C at the moment and you can have some pointer arithmetic that's not even that complicated, but a comment on it makes scanning through the function much simpler.
> If you have a 4 line stanza that needs a comment to explain what it's doing then I'd suggest that needs breaking out into a function
I'm mostly in agreement, so I'm going to focus on my disagreement too! ;)
This view is pretty common in the ruby community. So much so that I think they take it too far at times. I sometimes find myself reading a class that could have been a single 24-line method but is instead one method with method calls, each of which calls 0-2 more methods.
This style is good for when I'm trying to quickly understand what that class is supposed to do - I just read the top-level method that looks like:
But it's terrible when I need to understand what that function is actually doing - eg to debug it or to find some underlying function call I'm looking for. I have to bounce all over the file to mentally reconstruct a linear sequence of code which could have just been one function with some headers.
Of course, this style is a reaction to impenetrable 500-line functions which are also terrible. I'd definitely prefer many small functions to that! I think it's a matter of judgement and experience to know whether some code is better as small stanzas with comments or small functions.
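For comparison, here are the two styles side by side on a deliberately small, hypothetical example (an order total in integer cents; all names are invented). This is Python rather than Ruby, but the trade-off is the same.

```python
# Style 1: one function, header comments mark the stanzas.
def order_total_inline(items, coupon_percent):
    # Add up the line items
    subtotal_cents = sum(item["price_cents"] * item["quantity"] for item in items)

    # Apply the coupon, never going below zero
    discount_cents = subtotal_cents * coupon_percent // 100
    return max(subtotal_cents - discount_cents, 0)


# Style 2: the same steps as small named functions.
def _subtotal_cents(items):
    return sum(item["price_cents"] * item["quantity"] for item in items)


def _apply_coupon(subtotal_cents, coupon_percent):
    return max(subtotal_cents - subtotal_cents * coupon_percent // 100, 0)


def order_total_extracted(items, coupon_percent):
    return _apply_coupon(_subtotal_cents(items), coupon_percent)
```

Style 2 reads well from the top; Style 1 reads well when you need the whole sequence in one place. At this size either is fine, which is why it ends up being a judgement call.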
For your first example, use a method with a fitting name.
Second one is not a good example, because I think it's a very common pattern and email.match with com|net|org provides enough context by itself. It's just bloat imo.
The "why" is the most underrated aspect to good documentation.
I generally prefer the word "context".
The trickiest aspects of writing and editing software tend to boil down to three subjects:
1. What's the underlying system we are trying to manipulate? How is its data structured, and what kind of loops and logic branches can be hijacked for our new functionality?
2. Where and when is our new code interacting with the underlying structure?
3. To tie the first two together, what data/functionality are we exposing to the user? How and when? What is the user?
Most contemporary programming over the last two decades has been focused on nouns. The last 5-10 years has seen a shift in popularity to verb-based (functional programming) paradigms.
We can find insight from noun-based programming to answer "what?", and verb-based programming is good at answering, "how?".
Neither really does a good job of answering "where?" or "why?". Those have to be pieced together like evidence in a crime scene, and by the time you figure it out, you've probably learned everything there is to know about the entire codebase.
This is funny to me because we seem to be complete opposites! I have a great short-term memory, but my long-term memory is sadly completely trashed. I don't care if someone wants to document their code, but I never actually trust that documentation to be accurate or up-to-date. I expect it to be misleading and time-wasting, and so I only trust the code itself. I personally only write comments for bits of code that I find confusing, and I think it's a bit of a failing when I have to do that -- I'd much prefer to put in the extra time and effort to rewrite the code to not be confusing.
(I don't write code for a living these days btw, only for myself. It's much more fun, and I get to not care anymore whether other people think I'm wrong about things like these).
You may not remember, but you and I have had this discussion while working together. You were probably one of the people mocking me. ;-)
> I never actually trust that documentation to be accurate or up-to-date. I expect it to be misleading and time-wasting, and so I only trust the code itself.
Two things here:
1) I don't implicitly trust comments. Obviously, the code is the definitive source of reference, and where the comment differs from the code, it only means that something interesting happened. But...
2) I believe in commenting "why" or "who", and less often, "what" (and rarely "how"). My experience is that "why" comments age well -- even if the code drifts, the intent of a method or class changes infrequently.
For example:
# I (timr) wrote this method because I needed a way to
# invert the index for {situation}, and {method a} and
# {method b} didn't work because {reason}.
is a better, more evergreen comment than:
# this method inverts the index for {situation}.
which is far better than:
##
# inverts index.
# @args foo, bar, baz
#
Unfortunately, nearly all doc generation software encourages the latter, and so many comments are pretty darned useless. The first example is great because even if {method a} and {method b} and {reason} fail to be true in the future, some other programmer can come along and read it and say "ok, I understand why this was written this way, and the preconditions motivating it are no longer valid. maybe I can refactor."
> some other programmer can come along and read it and say "ok, I understand why this was written this way, and the preconditions motivating it are no longer valid. maybe I can refactor."
I appreciate having this kind of information available, but it often gets too verbose for my taste to keep as inline comment. For this, I typically push this kind of documentation to the commit messages. This however requires very disciplined use of git: you need to "massage" your commits so each of them is a self-contained change, and OFC avoid squashing during PR merges. Then, a git blame + git show will bring up the relevant information.
Oh wow! No, sadly I don't remember that, nor much else from that long ago. That's funny (assuming I wasn't a dick about mocking you!)
I actually completely agree with your ordering of least-useful to most-useful comments. I still tend to view these most-useful comments as just kind of interesting historical tidbits, rather than something that really helps me do my job, but at least we agree on the order :)
This so much. I really don't trust most comments at all. I think we also need to be explicit about the kind of comment.
There are comments that are categorically always bad. These are the "this code does X" kind of comments. Those go out of date really quickly in a shared code base. They don't even make sense for myself on a project I'm doing myself. The code should read like the comment would. Almost like prose. If it doesn't, I haven't made the code good enough yet. We no longer live in a world where the code needs to only be machine readable and unreadable to most mortals. We have the luxury of being able to use good abstractions and extract functions without worrying about code running slow because of too many indirections, stacks that are too big etc. We can optimize for direct code readability!
Then there are the "why" comments. Those can be invaluable. If I assume I have the kind of good code I can just read to see what it is doing, I can sparingly add information about things that might seem unusual, weird or inexplicable.
Same with tests mentioned in some sibling posts. Tests should be written in a self-documenting way. I like to name my tests after what they're testing. Like "should behave in X manner when doing Y to Z" from a user's point of view, not on a technical level. User not necessarily meaning end user, say if you're testing an API or library function. Different languages make this easier or harder but I do it in all of them. Armed with that documentation of what my test's prerequisites are and what the expected outcome is, I can write the actual test. I should be able to deduce the expectations of the test from the test's "name", thus double checking that I am testing the correct thing. Many tests I find in shared code bases are utterly unreadable, have way too many expectations and side effects and test too many things at once. With the above technique there's usually only one or very few expectations. If I parsed out the test names from all my tests and just gave them to you as a document, it should almost read like documentation of all of the expected behaviors of my piece of software.
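Here is a rough sketch of the "test names as a document" idea using Python's standard `unittest` module. The function under test (`slugify`) and the helper `behavior_document` are hypothetical, written only for this illustration.

```python
import unittest


def slugify(title):
    """Hypothetical function under test: lowercase, hyphen-separated words."""
    return "-".join(title.lower().split())


class SlugifyBehavior(unittest.TestCase):
    # One expectation per test, named from the caller's point of view.
    def test_should_lowercase_the_title(self):
        self.assertEqual(slugify("Hello"), "hello")

    def test_should_join_words_with_single_hyphens(self):
        self.assertEqual(slugify("Hello   big world"), "hello-big-world")

    def test_should_return_empty_slug_for_blank_title(self):
        self.assertEqual(slugify("   "), "")


def behavior_document(test_case_class):
    """Turn the test method names of a TestCase into one sentence per line."""
    names = unittest.TestLoader().getTestCaseNames(test_case_class)
    return [name[len("test_"):].replace("_", " ") for name in names]
```

Calling `behavior_document(SlugifyBehavior)` gives a list of plain sentences, one per test, in the alphabetical order the loader uses.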
> but I never actually trust that documentation to be accurate or up-to-date
This is like saying food is bad, because it can spoil. Sure, but we have ways of preventing that and figuring out when it happens.
A simple git blame or history (or the equivalent) should quickly answer a number of questions. Is the code significantly newer than the comment? Who can I ask for verification? etc.
It's not perfect, but significantly better than the alternative.
Similarly, presumably code changes are approved by reviewers, who should be preventing the merging of code that invalidates its own inline documentation without an update to the comments.
Yes, I am also in this camp. I document very rarely, only when I'm sure something is inherently complicated and needs to be written down. A great example: odd bug fixes related to the evolution of features, underlying services and data structures often deserve callouts, usually linking to the bug in question. I do take quite a lot of care in naming things, and in making clear how things operate and what the key data structures are.
We need to write code that is optimized for communication and clarity. We have a number of tools we can use to craft communicative code.
1. Variable and function names. These should be descriptive and never deceptive. For example, I've seen metric tons of code like this `const json = makeSomeApiCall(params)` where the contents of `json` is the decoded response data and not a JSON string. This code is deceptive and obfuscatory. But if you write `const decodedFooResponse = makeSomeApiCall(params);` then you are accurately describing what the value is. This is particularly important in untyped/loosely typed languages where the value of the variable could be anything.
2. Code structure and layout. When writing prose, we achieve clarity through structure. Code is no different. Keep related things together. Avoid run-ons. In other words, write code that has smallish functions with smallish interfaces; it's easier to reason about how such code will behave. Avoid unnecessary mutation of values--repeated mutation of the same value is particularly pernicious. Use white space to create visual separations of distinct ideas. Use abstraction and encapsulation to keep code focused. Avoid deeply nested conditionals, flattening the tree whenever possible. And a million other strategies that can all help.
3. Tests. Tests can be documentation, but they certainly aren't always. They also aren't convenient documentation--what they offer isn't going to show up in your editor when you inspect a method. Unit tests are a nice way of recording verifiable expectations for behavior. But if the test code is complicated or poorly organized, all it does is compound the misery of working on a messy codebase.
4. Comments are distinct from documentation here. Comments are little side notes like "this implementation is a bit of a hack. It is brittle because .... but we decided to keep it because X. See someURL for contemporaneous discussion." They tell us why a thing is the way it is or about some sort of risk or unexpected detail.
5. Documentation is absolutely a part of making code understandable. These docs are primarily going to be used in an IDE/editor when viewing function signature data. Ideally you can write JavaDoc/TSDoc/POD or similar inline docs. Examples are worth their weight in gold.
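To make a few of the tools in this list concrete, here is a minimal Python sketch (Python rather than the JavaScript of the inline snippet above, purely as an assumption for illustration). The function names, the sample payload, and the "why" comment about a cache key are all hypothetical; the only point is how honest names, an inline docstring with an example, and a why-comment each carry a different kind of information.

```python
import json


def fetch_foo_response_text() -> str:
    """Stand-in for a real API call; returns the raw JSON text of a response."""
    return '{"id": 7, "tags": ["b", "a"]}'


def decode_foo_response(raw_json_text: str) -> dict:
    """Decode the raw JSON text of a Foo response into a dict.

    Example:
        >>> decode_foo_response('{"id": 1}')
        {'id': 1}
    """
    return json.loads(raw_json_text)


# The names say what each value is: one is text, the other is decoded data.
raw_foo_json = fetch_foo_response_text()
decoded_foo_response = decode_foo_response(raw_foo_json)

# Why-comment (hypothetical reason): tags are sorted because a downstream
# cache key depends on their order, and the API guarantees no ordering.
sorted_tags = sorted(decoded_foo_response["tags"])
```

The docstring is what an editor would show on hover; the why-comment is only for whoever edits that line next.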
We should be using all our tools to write communicative code. We don't need every technique for every line or block, but over the scope of a package or project, we should judiciously employ them all. Every line added comes with a concomitant maintenance cost--we must ensure that every line we add is worth that cost.
Comments are absolutely necessary when explaining complex logic that is hard to read for the casual reader (or yourself 6 months later).
Comments are not necessary when the code itself explains what it does by being clearly named, isolated and organized.
Comments are not documentation and shouldn't be used as such. They are tools to improve code clarity for the person who is changing the code. They are not tools for someone who is using the code. If the intent is to teach someone what the code does so they can use it somewhere else, write documentation in a wiki with examples and link the wiki URL to that point in the code.
I have the same problem. Rather than fight it, use it as your superpower. I find it helps me write easier to understand abstractions because I can't physically hold enough in my head to understand anything more complex. Same is true for documentation!
We have some blatant falsehoods about how documentation should be organized as well. Virtually everyone seems to think that people know which file to open, and so we organize files as if the person is always in the right place.
This doesn't scale, because as the size of the project and the number of docs increases, we start refining generic knowledge into more refined slices of the problem domain. At first the docs are so far apart that you rarely miss, but as time goes on you miss more and more. So while your outline might suggest that finding docs grows logarithmically, it's linear at best.
The first line of any doc should answer "where am I?" and "why do I care" because the odds that they don't care go up over time, and people put a time limit on self-service. After 2-3 wrong pages in a row they start getting impatient, unless they got through those wrong pages in 5 seconds apiece.
If I could, I would upvote this 10 times. Always write documentation (especially for your own code). Future you will thank today's you. At least I have done so multiple times. And I always feel I'm cheating myself when I'm too lazy to write documentation.
One thing that I believe is under-documented in code is _business decisions_. The team arrives at a decision in a meeting or informal discussion, and it leads to some code that might be tough to understand without context. I'll usually add a comment briefly describing the decision, and initial and date it.
Initially I would just link to a wiki page in a comment, but occasionally these links break, so in my experience it's better to include the notes directly.
It is often also useful not only to document the decision the team arrives at, but also briefly write down what was decided not to implement and why. This is very helpful when people come in a few weeks later with a "new" idea that has already been discussed, or in onboarding new people to the project.
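A rough sketch of what such a comment might look like in practice. The rule, the amounts, the initials, the date, and the rejected alternative are all invented for illustration; only the shape (decision, who and when, what was ruled out and why) comes from the comments above.

```python
from decimal import Decimal

# Hypothetical business rule, for illustration only.
# DECISION (J.D., 2024-03-12): orders under 20.00 pay a flat 4.99 shipping
# fee; everything else ships free. Agreed in the pricing meeting.
# REJECTED: weight-based shipping. Considered and dropped because the
# warehouse system does not report package weights reliably.
FREE_SHIPPING_THRESHOLD = Decimal("20.00")
FLAT_SHIPPING_FEE = Decimal("4.99")


def shipping_fee(order_total: Decimal) -> Decimal:
    """Return the shipping fee for an order total, per the decision above."""
    if order_total < FREE_SHIPPING_THRESHOLD:
        return FLAT_SHIPPING_FEE
    return Decimal("0.00")
```

Keeping the note next to the constants means it travels with the code even if the wiki link rots.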
The commit message should contain all the unusual context needed to understand the change. In practice however, most people write terrible commit messages.
It's not just that. Commit messages have a property that they get overwritten. When I refactor some code for performance reasons, I don't want to remove documentation - which is exactly what happens if I make a commit. True, older commit messages are available in the history, but... not conveniently available.
Oh gosh, this 100 times. I comment everything, even the rationale behind decisions. Lately I've been writing a book as I write a compiler[1], because some parts confuse me so much that I just can't solve them, and can't go back to work on them later, if I don't document them.
Even if things go out of date, it does not matter, because it still is somewhat close to what used to be there and can still help the future reader (me or someone else) figure out what might have happened since then. It's a million times better than no documentation.
I feel like nobody goes to look at changelogs, or PRs, or commits. Probably because they don't ever expect anything good from it. Also they're not really searchable.
But still, how would you search through commit history to figure out one thing? Comments are right here in the code, and books/external doc/rfcs refer to concepts.
I feel like commits are only good if you're spelunking, which is usually for a single reason: you're bisecting looking for a bug.
If you write down in comments the history of why the code is written the way it is now, not just the current code but all the things that were tried before and why they had to be changed, you'll have too many comments and it will be hard to read the code. That's why it's rarely done.
> I feel like commits are only good if you're spelunking, which is usually for a single reason: you're bisecting looking for a bug.
I've read history to see why things are the way they are, but I agree that most of the time it's looking to see when a bug is introduced and what was known about it at the time. That's a pretty important use, though. If all you get is a commit with no useful message, you can see what line of code was changed but not what the reasoning and investigative data behind it was. Many bugs show up from changes that were themselves supposed to fix bugs or make something subtle work in a particular way, so the reasoning behind them is relevant when a new bug is discovered.
With a good issue tracker, a commit message like "fixed #386" is theoretically enough because the information is in issue #386. But tbh it's still friction to see lists of commits which contain nothing more than # references to pages somewhere on GitHub and no useful description. I prefer to summarise the issue and the fix in the commit message (and PR message) in those cases.
(To an extent it depends on whether you're using Git itself, or GitHub/equivalent, as the latter expand # references to include the one-liner description when displaying the messages. I find GitHub extremely slow compared with Git, and it has awful commit history tools (won't show the graph for example), so I use Git and see # references by themselves. When colleagues produce a lot of these, it's like a sea of unexplained changes, as if nobody can be bothered to say what their code does at all.)
Another completely different reason I've grepped through git history with the Linux kernel and other widely used projects like Glibc and GCC, is to see every change to an API or subsystem or function throughout its history, in order to write "portable" code that will work with every version across a large time range. Occasionally I've even written a short document listing every change that's relevant to what I'm building, to help me build the thing.
This is particularly important with system calls, library functions, and internal APIs (e.g. for kernel modules). Although it's rare for an external API change to break existing code (though it does happen), it's common for an API feature which works today to be missing or buggy in the past, in versions which are still being used by someone. Internal APIs change more often, so finding the changes is even more essential. Writing portable code means finding the history of all those changes, including bugs and feature additions, to write code that works correctly when it's running on any version.
For example when I was writing code to use io_uring, a large part of the work was going through every change to the kernel io_uring subsystem to check every change affecting those parts of the API I was using, so I could avoid using them on buggy kernel versions, and so I could adapt to API changes that occurred. (This was also useful for future-proofing the code in that my test environment wasn't able to run the latest kernel, but in examining the history I'd also see "future" changes that my code would need to work with when shipped.)
The explanatory commit messages were essential for that. There's no way I could have understood the purpose of relevant changes in a useful timescale without those messages. Particularly for things which affected performance or thread correctness in subtle ways only with some machines and some applications, that you simply could not see from the code.
You might argue that comments should be there to explain all non-obvious aspects of the current code, but for code like this, which contains thousands of "Chesterton's fences" at high density, that style would be very comment-heavy, and that style is generally discouraged. In effect, there's more to the code than meets the eye. At least with the Linux kernel, the culture evolved to expect explanatory Git commits (before Git it was the mailing list, go back far enough and there were more comments in the code), so everyone knows to look at Git and lists now, keeping the code itself relatively clean as a result.
"Comments go out of date quickly" is true -- it's the "there's no point in investing in them" that's incorrect. In addition to being valuable in the short term (to both you and anyone else who ends up looking at the code), "comment just became outdated" is a highly valuable signal that a prior assumption made by other code could now be false, and you should look into that.
I have worked on teams where the code is very well documented (basically every public method, parameter, property or class gets a comment) and it's not at all hard to keep up to date. Sometimes people forget, but when you look at a PR and see the usage or meaning has changed with no corresponding comment change, it's easy to flag.
I feel like many people take the "comments will go out of date" adage the wrong way. To me, it doesn't say "don't write comments", it says "give your comments some love while you code".
It's meant to warn against letting comments become false, not against writing them. To me, at least.
> Most programmers believe a number of blatant falsehoods about documentation, with the most prevalent being "comments go out of date quickly, so there's no point in investing in them".
More broadly, a problem with our cognition is that when we remember something right now, we often think it is unlikely or even not possible that we wouldn't remember it a day, a week, a month from now.
IMO there is some common sense to comments that is often not followed. Example: I work on a project that has a pre-submit script that requires comments for tests. This leads to ridiculousness where I have code like
// Test that uploading data with XYZ function works
TEST(UploadingDataWithXYZFuncWorks) {
...
}
Like WTF! The function name says all that needs to be said. There is no more info needed, but I had to appease some overzealous programmer.
Meanwhile, in an existing test in the same file I see code like this
// Set the dimensions
SetDimensions(0, 0, width, height)
Again, WTF!
Comments are important for explaining why something exists, what its assumptions are, edge cases, etc..., but you can also go overboard with comments and you can make your code more readable by using readable names for functions and variables.
I agree it's a lame excuse not to document code, however there are some parts that get changed way too often, by multiple people, where outdated comments can actually lead you astray. The stuff I touch doesn't change often so I'll comment to explain the weird and ambiguous looking parts.
I'm not saying you have to document each line of code with equal rigor. In fact, there's an art to knowing what to document, and like most arts, you get better at it over time. Sometimes I feel silly about what I've documented and what I've neglected, but that's OK. I learn.
The "code changes way too often" excuse is basically a restatement of the myth that I wrote above. Yes, code changes over time. It's pretty easy to change the comment with the code, when necessary. The team members who don't do this aren't doing their jobs.
But at the end of the day, even if code deviates from comments that's still fine. Having an outdated, well-written comment is a historical record of what the code was supposed to be doing. That's useful.
The added bonus to writing good comments is you’re practicing technical writing directly associated with your code.
It’s similar to writing your own flash cards, but by rephrasing the code into a comment you’re visually and linguistically using a form of repetition to stick the code inside your head with a human readable explanation. This can make you a better technical communicator when discussing any code with others.
There is a lot more than “I will read this comment in 6 months and it will save my ass” going on when you practice writing good comments.
I really like this because in my opinion, half of my job is good communication, another quarter is planning/design, and the final quarter is actually writing code.
> You can document your code. You can keep it up to date. It isn't that hard. You just don't want to.
Documentation that is close to the things it is talking about isn't hard to keep up to date. What is hard is knowing, when I make a change here, that someone (maybe me) talked about this thing over there without any pointer from here to there. Can I do a careful search of everywhere there might be documentation every time I make a change? Yeah, I guess; it'll definitely slow me down in the short term and I'll still miss things.
Does "documentation" mean separate documents completely disconnected to the codebase? Isn't that the problem? Code gets updated according to necessity whereas documentation is for human consumption and not required to make the thing work.
> Does "documentation" mean separate documents completely disconnected to the codebase?
At worst, yes. I've increasingly seen docs in Notion, but Confluence was common before and probably still is. I agree that a big part of the solution is "get it all in the repo" but I don't think that's enough.
Even within a repo, writing can be pretty distant from the code it discusses. In my experience, programmers and reviewers are great about comments on or adjacent to changed lines (not coincidentally, what shows up in git diff and code review software). Both are still pretty good about comments that they can structurally expect to be present, like function documentation at the top of a function.
It falls off a lot for programmers when it's a comment talking about something more than maybe 20 lines away, whether that's a quick sketch at the top of a loop or a block comment somewhere in the file about invariants or caveats or gotchas. At that point, reviewers simply will not catch that the programmer didn't update the documentation, unless they happen to be unusually familiar with the file in question.
Comments that reference code in other files are thankfully rare if you have reasonable modularity, but will also be missed by both programmer and reviewer unless the task in question touches both files in a relevant way.
Documentation in the repository that is not a part of code is hit and miss. Developer setup guides are often brittle to particular system configurations, but usually get some effort every time the new hire trips over something. Runbooks similarly, and less limited to the new hire, and hopefully you look them over proactively on occasion. Large scope architecture documents are doomed; I am more optimistic about something like ADRs which should be relatively narrow and aren't intended to be updated beyond deprecation.
Much of this can be helped with tooling, but where it exists it's poorly standardized.
Docs don't go stale like bread. They atrophy from disuse.
I always force the onboarding process to go through our docs, and I spend a little time with each new person observing their progress looking for regressions in the docs. You can't get that with old-timers because of the echo chamber/curse of knowledge effect.
This breaks down when you have a place that never hires new people. And rather than thinking that's a flaw in my process, I'm starting to think that's a flaw in the business itself. Without fresh ideas and feedback a project stagnates.
Depending on the nature of your memory and thought process, the insights gained from potentially-false statements in the documentation when they are true might be outweighed by the blind alleys and misunderstandings generated when they are false. You'd have to be very good at remembering where things came from and tagging them with their level of certainty. An important skill! But not a trivial one.
> You'd have to be very good at remembering where things came from and tagging them with their level of certainty.
I wonder why not more people do this? Not just for code, but for everything. I remember where I learned everything I know, I don't trust anything unless I remember the source. How do other people think, do they think that whatever pops up in their head is the truth without any source? Then how do they know it is a fact and not just a hunch or a guess?
It would be neat if your code editor could easily highlight the code newer than the comment(s) and tests based on your git history (without jumping out of your work, of course). Even something as simple as "highlight all code newer than the current line", for example, would be quite useful.
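As a very rough sketch of the data such an editor feature would need: `git blame --line-porcelain <file>` prints an `author-time <unix seconds>` header for every line, followed by the line's content prefixed with a tab. Assuming that output has been captured as text, the two hypothetical helpers below (the names are mine, not from any editor or Git tool) find the lines last changed after a given comment line. The sample input is synthetic.

```python
from typing import List


def parse_author_times(line_porcelain: str) -> List[int]:
    """Return the author-time of each source line, in file order.

    Expects text shaped like `git blame --line-porcelain <file>` output:
    each blamed line has an `author-time <unix seconds>` header, and the
    line's content follows on a row that starts with a tab.
    """
    times = []
    current = None
    for row in line_porcelain.splitlines():
        if row.startswith("author-time "):
            current = int(row.split(" ", 1)[1])
        elif row.startswith("\t") and current is not None:
            times.append(current)
            current = None
    return times


def lines_newer_than(times: List[int], comment_line: int) -> List[int]:
    """Return 1-based numbers of lines last changed after `comment_line`."""
    comment_time = times[comment_line - 1]
    return [n for n, t in enumerate(times, start=1) if t > comment_time]


# Synthetic sample: line 1 is the comment, line 2 was edited later.
SAMPLE = (
    "aaaa 1 1 1\nauthor-time 100\n\t# scale by two\n"
    "bbbb 2 2 1\nauthor-time 200\n\tx = value * 3\n"
    "aaaa 3 3 1\nauthor-time 100\n\treturn x\n"
)
print(lines_newer_than(parse_author_times(SAMPLE), comment_line=1))  # [2]
```

A real integration would still have to decide which comment "owns" which lines, which is the hard part this sketch skips.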
Of course comments stay in sync with the code... if you're the only one working on that code. As soon as you have multiple developers, you can forget about it.
"Of course comments stay in sync with the code... if you're the only one working on that code."
If I am in a rush to implement something and then am sidetracked because of a stupid small bug, then I just fix that small bug in the code. And then another one. In these states I only pay attention to code, and not text. If I would also have to read the text, then I would forget the original task.
So sadly no, comments do not automatically stay in sync with code for me, unless I put in the extra effort of cleaning up afterwards.
What if the small bug got introduced because of lack of time to cater to all the verbose comments?
But seriously, in the cases I remember - not really. Most bugs are caused by a lack of higher level understanding of a certain module. Some clear comments can help with that, but better is proper higher level documentation and the time to read and maintain them.
> Most bugs are caused by a lack of higher level understanding of a certain module. Some clear comments can help with that, but better is proper higher level documentation
Agreed. However I've found that the farther away the documentation is, the less people will use it.
If it's external, it's very hard to get people to use it.
I'd not approve your PRs. "Cleanup later" virtually never happens, and it's too likely someone will re-fix your rushed fix incorrectly because of misleading comments.
And you are free to be as slow as you want, but my flow state works a bit different and context switches are expensive. Which is why I want the minimum of comments and rather have self explaining code and proper higher level documentation.
> You can document your code. You can keep it up to date. It isn't that hard. You just don't want to.
Yeah, I can, or an individual can, but that misses the point of the advice, which is intended for groups on large projects with deadlines. Like once a manager says get this done now and we'll find time later for the rest... That never happens.
If you are diligent about documenting AND updating, sure, seems very reasonable. But most people aren't. You know what the best documentation is? Working tests. It's much harder to have wrong tests than wrong documentation. Even some of the largest projects like Ruby on Rails have had incorrect internal code docs.
No. Tests are not documentation. Tests are tests, written in code, which must be explained. You should document your tests, too, because I guarantee the next programmer won't understand your tests as well as you (think you) do. Also: the "next programmer" will be you in a year.
This is maybe a close #2 on the list of documentation falsehoods that programmers believe.
Yes. Tests _can_ very much be documentation if you write them as such, although they aren't complete documentation. Tests are the specification, i.e. the how; documentation provides the why.
Tests can be written to tell a story but most people aren't taught this way (or simply don't buy it or don't write tests).
No, tests are not specification, they are examples, specifically of the form "do this, and this happens".
Imagining a specification out of this is like solving those problems of "what is the next number in this sequence: 1 18 31". It simply cannot be done; you can guess something, but you will never know if it's the real answer.
They _can_ be examples but I prefer to write them as specs. I mean, there are entire libraries centred around writing tests as specs. Just because you don't do it that way or buy into the idea doesn't mean it's not possible or valuable.
Specs are complete. If you create real world software with complete tests, I will really want to read your Turing Award receiving discourse. It should be great.
You are right about commenting tests, but it seems wrong to categorically reject their value as documentation for users. I use tests all the time to get code examples and understand what's actually supported. One of the great things about open source projects is that you can see the tests.
> You should document your tests, too, because I guarantee the next programmer won't understand your tests as well as you (think you) do. Also: the "next programmer" will be you in a year.
I hear this a lot, but I haven't found the same with myself. I can read and understand old code I wrote.
I attribute it to my IQ being mostly held up by reading and writing comprehension skills. The way I process information makes it easier for me to remember and understand old code (I believe), compared to a lot of programmers whose inherent skills align more closely to things like math and logic.
Documentation can be a rabbit hole for me, and I often don't feel I benefit from it within the code.
I appreciate this is a good tactic for many people, but implying it's for everyone is too dogmatic.
I wouldn't describe it as a "falsehood that programmers believe", just one of those perhaps unfortunate realities that exist for some codebases that have a decent set of unit tests but little else in the way of up-to-date documentation that explains all edge case behaviour etc.
The only documentation you can really trust is that which 100s of others rely on regularly, but a significant percentage of the code most of us work with isn't going to fulfill that criteria.
Explaining edge case behavior is one use-case for comments, and not the most valuable one in my estimation. Aside from that, often incorrect edge case handling is in both the code and the unit test because the problem is that the developer didn't understand the requirements. In my experience, in an undocumented and difficult codebase the tests will be as mysterious or unreliable as the code itself, which makes sense since they are usually written by the same people.
You'd be surprised. There are generations of programmers now who think nothing of writing 20+ unit tests with quite clear names demonstrating what the behaviour should be under a variety of conditions, but with virtually no other documentation. Especially true in dev shops that have high coverage requirements for a successful ci build.
Tests are often a pretty good way to learn how a given piece of code behaves. Particularly, if the author is being nice and provides a "usage example" type test.
Tests verify what the code is doing. And I agree: they are a great source of insight when understanding an unfamiliar codebase.
However, comments are typically better at answering the why.
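A small sketch of that split, using Python's standard `unittest`. The function under test and the reason given in the comment are hypothetical; the test names describe the what, and the one comment carries a why that no assertion could express.

```python
import unittest


def normalize_username(raw: str) -> str:
    """Hypothetical function under test: trim whitespace and lowercase."""
    return raw.strip().lower()


class NormalizeUsernameBehaviour(unittest.TestCase):
    def test_surrounding_whitespace_is_ignored(self):
        self.assertEqual(normalize_username("  alice "), "alice")

    def test_case_is_folded_to_lowercase(self):
        # Why (hypothetical): the login table stores lowercase names only,
        # so "Alice" and "alice" must map to the same account.
        self.assertEqual(normalize_username("Alice"), "alice")
```

Read as a usage example, the tests show how to call the function; the comment explains why the behaviour must not be "simplified" away.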
> If you are diligent about documenting AND updating
I don't understand why this is viewed as challenging. Writing a sentence or two here and there is orders of magnitude easier than writing the code itself. And any code review process should help to prevent situations where the code and comments are out of sync.
Lastly, I feel like there's a larger human issue. I write comments to explain certain why's in the code because I care about my teammates and I care about the project.
If others don't do the same, I think it speaks to a lack of care for their fellow engineers and the work itself. I think, "I just spent five hours figuring out something that you could have explained with a single 30-second comment?"
I'm baffled that some engineers think that this is okay.
> You know what the best documentation is? Working tests
Tests do have a slight overlap with documentation, but it's that, only slight.
If a piece of code has some weird non-obvious behavior, the presence of a test for that particular behavior is a signal that it's actually intentional, not a random bug.
But, it doesn't tell me anything why that design choice happened. That's what the documentation is for. So facing such code, I sure hope it's well documented.
Show me a set of tests that I can use to build something more easily than using the documentation? It sounds like just kicking the can down the road to the next person.
Same issue here. It's one of the reasons I've become so enamored with Ada.
With Ada, it's not only easy, but encouraged, to encode so much information about how things are modeled into the program itself. Not only does it function somewhat like documentation, it also lets the compiler helpfully yell at me when I still manage to forget how things actually work. It's saved me so much stress and debugging time.
Now if only any of these 'safer' languages would add even just strong typedefs. Even if they don't particularly encourage their use, it'd be something.
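For a feel of what a strong typedef buys you, here is a loose approximation in Python using `typing.NewType`. This is an assumption-laden stand-in, not Ada: the distinction is enforced only by a static checker such as mypy, and at runtime both types are plain floats. The unit names and the bridge check are made up.

```python
from typing import NewType

# Two distinct "typedefs" over float; a static checker treats them as
# incompatible, although at runtime both are ordinary floats.
Meters = NewType("Meters", float)
Feet = NewType("Feet", float)


def feet_to_meters(length: Feet) -> Meters:
    """Convert feet to meters (1 ft = 0.3048 m)."""
    return Meters(length * 0.3048)


def clearance_ok(height: Meters) -> bool:
    """Hypothetical check: does the height fit under a 4.0 m bridge?"""
    return height < 4.0


truck_height = Feet(12.0)
# clearance_ok(truck_height)  # a static checker would reject this call
print(clearance_ok(feet_to_meters(truck_height)))  # True
```

The types document the model (which numbers are feet, which are meters) and a checker complains when you forget, which is the effect described above, just with much weaker guarantees.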
> "comments go out of date quickly, so there's no point in investing in them"
The key difference is where the described thing lives.
A comment describing the next few lines of code or some loop that follows, or similar, won't go out of date since it's easily updated together with what it describes.
A comment living in another file, describing loosely some far-reaching but still-evolving aspect will likely be outdated soon, if the thing it describes changes and that comment is out of sight and needs extra effort to be remembered and updated.
I maintain (and should maintain better) my little compendium of shell 'one liners' that do all sorts of incredibly useful things and that were hard won to get right the first time.
Having that handy you can seem like a god at times to your colleagues when they are trying to solve something. "Oh, did you know you can do this? I'll message it to you."
I think my working memory isn't very good, but there is surely something in there that is very good, because how else would I have been able to work with computers for this long?
Usually when writing some code, which deals with something new for me, I get many "idiot questions" in my head. I try to write comments in a way, which will answer my future self's "idiot questions". Answering all those questions, I feel more like I truly understand, what I am doing.
I often don't comment code, especially in personal stuff, but when I occasionally do, which happens mostly when things get overwhelming, I find bugs or fix things that I was stuck on. Writing forces you to understand better, name better, and almost feels like providing you with another perspective, all without leaving your own self.
Joking aside, documenting as a habit alongside coding is like a superpower. I find that writing can act as validation against my understanding of a problem; if I struggle to write about it, then it's likely that I don't understand the problem as well as I thought.
I'm in exactly the same position. My memory is awful so I write notes to my future self. It's not hard to keep comments up to date if you keep them near the code and don't use a lot of boilerplate.
> It's not hard to keep comments up to date if you keep them near the code and don't use a lot of boilerplate.
Yep, exactly. It only gets "complicated" when you start having these heavy-handed doc generators that are parsing your code and breaking the build.
I feel pretty strongly that sphinx, rubydoc, python docstrings et al. are fine tools, but you have to have a light touch with how they're applied. Autogenerated external docs are a separate problem, and shouldn't discourage developers from commenting their code.
There are some “tricks”. For example, blending what you want to remember with something memorable, creating a mental picture, and recalling it a few times. That’s what I got from the book Moonwalking with Einstein.
Write things down. Use reminders. Place things where you will notice them when you need to be reminded of them, like the proverbial string tied around your finger.
Code is already a formal specification of what the machine is set to do.
All your documentation can do is make it more ambiguous. Usually the documentation is wrong as well, but that might be because the programmer didn't know how to specify clearly what he wanted.
> Code is already a formal specification of what the machine is set to do.
I take back my other comment where I said that "tests are documentation" is the #2 falsehood about documentation that programmers believe. This is the #2 falsehood.
Yes, "formally" the code is the spec. That doesn't help you very much though, squishy human, because you're not a computer.
// there is a method overload which accepts an array,
// but it's buggy and crashes. workaround: cast to tuple first.
...
then you seem to be assuming people only want to write and read the first kind. Because no matter how well you 'know all the fine details of the programming language', comments can tell you things the code can't.
Even aside from the pointless gatekeeping of "competent programmers" - ok what about people who aren't employed as programmers but still need to read and write code? What about the people who just have to deal with it being an unfamiliar language because nobody who knows the language is available?
Substitute any comment that tells you something which the code cannot tell you[1], to see why "competent programmers understand the language thoroughly" is not a good justification for being against comments.
[1] e.g. why the code was written this way instead of another way, or why it exists at all.
A competent English reader would understand what I wrote without any further comments needed, right? Apparently understanding of the language isn't enough. That's the point I was trying to make, but meta.
In this scenario you can't fix or remove it, it's third party library code which is outside your (scope, remit, time or effort constraints).
First you wouldn't write such stuff in comments, that's useless and not what comments are for. Comments are to explain non-obvious things in code, or clarify assumptions that are made, not to narrate that you found bugs but are too lazy to fix them.
If a third-party library is broken, you either fix it or stop using it.
As a developer, you're responsible with ensuring your application works well and is easy to maintain. It doesn't matter if someone else wrote some of the code.
> First you wouldn't write such stuff in comments, that's useless and not what comments are for. Comments are to explain non-obvious things in code,
First, that is a non-obvious thing in code. You look at code someone else wrote, you use your "competent programmer's understanding of the language", you see the overload they "should" have used, you are about to rewrite their function call to cut out the unnecessary cast, and the comment tells you the non-obvious reason why you shouldn't do that. Thus making it a useful comment.
> "If a third-party library is broken, you either fix it or stop using it. As a developer, you're responsible for ensuring your application works well and is easy to maintain. It doesn't matter if someone else wrote some of the code."
If touching any piece of code written by anyone makes it "your application" and immediately mandates that you rewrite all of it to your standards, that is "not how software development works".
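To make the cast-to-tuple example above concrete, here is a minimal Python sketch. `third_party_total` is a hypothetical stand-in for the buggy library function (it only simulates the crash), and `order_total` is an invented caller; the point is just where the comment sits and what it prevents.

```python
def third_party_total(values):
    """Hypothetical stand-in for a third-party function whose list path is buggy."""
    if isinstance(values, list):
        # Simulates the crash in the library's list-accepting code path.
        raise RuntimeError("simulated crash in the list overload")
    return sum(values)


def order_total(prices):
    # third_party_total() also accepts a list, but that code path is buggy
    # and crashes. Workaround: convert to a tuple first. Do not "simplify"
    # this call until the library is fixed.
    return third_party_total(tuple(prices))


print(order_total([1, 2, 3]))
```

Without the comment, the `tuple(...)` call looks like dead weight to anyone who knows the library accepts lists, which is exactly the situation described above.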
I've noticed as I've gotten older that I lean more on remembering pointers to information rather than the specific information. I know something exists and can find it quickly, but may not know it off the top of my head.
For example, how to get the length of an iterable in a given language. I may not remember the function name (or if it's a top-level function vs a method) but I know I can search "string length in $LANGUAGE" and find it. This scales better than memorizing every language feature I'll ever use.
---
PS: Dash is a great mac app, highly recommend. Your job will likely cover it if it's something you think you'll use.
I've stopped worrying about remembering details, and just assume that my neural network brain will figure out what is worth retaining. Sometimes it does, sometimes it doesn't, but life goes on. Not having anxiety about forgetting things feels wonderful.
Very often it's a lot easier to remember the journey to a piece of information than the information itself. I remember reading about some Greek scholar who'd imagine his long speeches as walks, which helped him memorize them. I think we are more suited to learning journeys than destinations. Maybe there's simply more concepts to connect neurons to.
Tangentially, I've gotten better at remembering walks and drives as well, not through any conscious effort, but simply cultivating a sense of curiosity about the world around me. The more I learn about how the world fits together, the more interesting tidbits I notice in a location, and when I see the same place again it reminds me of the previous time I was there.
> For example, how to get the length of an iterable in a given language. I may not remember the function name (or if it's a top-level function vs a method) but I know I can search "string length in $LANGUAGE" and find it.
This is actually a really good example of something I've started using GPT-3 or GitHub Copilot for.
If I'm in a Rust program and I don't know Rust, I can type
// set a to the length of the items array
And Copilot or the GPT-3 Playground (if I don't have Copilot handy) will write the next line of code for me, without me having to go and look up how to do length-of-array.
This is the basis of transactive memory. You maintain a transactive memory with a collection of tools some call the "external brain" like text pads, calendars, search engines and more.
I think one of my greatest improvements as a developer over time is my ability to sift and sort information very quickly. I don't need to remember a lot of details, but like you said: remembering pointers to information.
Combine these pointers with google and an intuition for finding relevant information, I will nearly always arrive at the correct information/solution.
I wrote Private Comments[1] specifically to address this problem in code. My coworkers can maintain context about how a given thing works months after the fact. I can't. So, I leave private comments throughout the code. Things they'd never want committed, but that save me hours of re-learning when I next encounter a given piece of code.
Currently has plugins for Vim (proof of concept) and Emacs (actually good). It'd be lovely if one of you folks would make a VSCode plugin for it. I've thoroughly documented the API and diagrammed the code flow you'd need[2], so that this would be as easy as possible to add to your favorite editor.
I am on the opposite end of this. The older I get the more I see the value of rote memorization.
Just taking the time to actually learn what the standard library of the programming language you are using provides can dramatically increase your productivity. The same goes for important libraries you are using. Yes, it is too much to memorize everything, but just knowing what is actually there will help you a lot. You cannot search for something you don't know exists.
I also really love working on solo projects because I tend to memorize the general shape of the code I am working on. This makes me so much more productive. I might not remember every single line of code in detail but will have a pretty clear idea of the shape of the program. This means I can plan new features or refactors in my head and see if they would work. I don't need to sit in front of the computer. I can do the hard mental work while taking a shower or going for a walk and can type in the code afterwards.
Well, memorizing algorithms lets me know what is possible sure, but memorizing the actual api methods and arguments seems like a colossal waste of time for my specific circumstances. If it’s possible in one language it should be possible in another. I haven’t ever reused _exactly_ the same stack twice in my career, so it seems hard to optimize for the exact incantations rather than lazily find out what is in the library vs community packages. Looking them up is cheap enough, and I save shower time for high level thinking about what kinds of programs could be useful.
I agree, knowing all details of the language/framework you're using is the key to amazing productivity. In IT terms you could compare it with the performance gains you're getting from holding data in cache instead of querying them on demand over the network.
However it isn't always possible. Nowadays a typical dev team uses 10 different frameworks or more, and after a couple of years some of them will have changed.
The problem with rote memorization is that it makes life a pain when something changes under your feet: you have to unlearn and relearn, which takes double the effort. However, it's good to rote memorize more static info and references, tables of contents and so on. The rest I leave to use: what is referenced multiple times becomes permanent.
I tried this when I was learning Go; I just took a few hours (if that) to write really low-level code by hand, instead of doing what I usually do and copy / paste code from somewhere else and edit it. It helped me memorize basics like creating a slice or map and things like that, and improved my confidence in the language by a lot in a short amount of time.
I should do that more often with whatever I work with. However, my current job is so random (lots of odd jobs left right and center in between ad-hoc meetings and interruptions) I have no need for it.
> Your mind is for having ideas, not holding them.
I will have to disagree. New ideas are rarely totally new, and often build on old ideas. Keeping old ideas in your brain will help you come up with new ideas.
One of the big advantages of having stuff in memory is that you can do background processing.
I can't tell you the number of problems I have figured out while going for a long walk, or driving in my car, or even sleeping (all of a sudden I wake up with the solution).
> Keeping old ideas in your brain will help you come up with new ideas.
Agree! Though, "Your mind is for having ideas, not holding them" doesn't mean that you should literally never remember anything. It's just not necessarily clear from the context-less quote.
> I can't tell you the number of problems I have figured out
> while going for a long walk
The point is not to forcibly resist remembering things. The point is to free yourself from the burden of remembering things that can simply be filed away (ie, reference documentation) so that you can be more present, focus on what's important, and free your mind up for more creative/productive thinking - including those walks where many of us get our best thinking done. :)
> Your mind is having ideas, it's just not holding them.
which is still quite accurate, especially given the theme of the article. Augmenting your memory is important, especially since age can make memorization harder, and even impossible.
> Having all API docs one key press away is profoundly empowering.
> While Dash is a $30 Mac app, there’s the free Windows and Linux version called Zeal[1], and a $20 Windows app called Velocity[2]. Of course there’s also at least one Emacs package doing the same thing: helm-dash[3].
Thanks, Zeal looks interesting until one tries to use it.
I've installed Zeal and installed the Python documentation. I can see the Python tree (among others that I've installed) in the sidebar. But searching for "python append to list" or "append to list" returns 0 results. In the preferences I enabled fuzzy search, still no results.
So I turned to the tree to see how easy it would be to find the answer. I opened the Python leaf, and from the 15 subleaves I opened Structures. In there I see "PyCompilerFlags", "_frozen", and "_inittab".
> So I turned to the tree to see how easy it would be to find the answer
Python3 -> Classes -> list
I agree it's not easy enough to use the search for this particular case (it's actually very difficult to find the list "add" -- list.append). I'm going to stick it out with zeal, though, and see if I get better at finding what I want.
Some similar searches in other languages gave back what I wanted, so maybe it's got to do with how the python docs were imported or something.
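For reference, the entry that search was looking for is `list.append`, and Python ships its own offline lookup in the standard `pydoc` module, which can serve as a fallback when a docset search comes up empty. A small sketch (the variable names are just for illustration):

```python
import pydoc

items = [1, 2]
items.append(3)  # list.append adds a single element to the end, in place

# render_doc returns the text that help(list.append) would display;
# the plaintext renderer avoids terminal bold/overstrike characters.
doc_text = pydoc.render_doc(list.append, renderer=pydoc.plaintext)
print(doc_text)
```

This does not fix the search problem in Zeal, it only shows that the same information is available offline without it.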
I made something similar to have a key-shortcut quickly show a 'cheatsheet' for whatever application window that is in focus in i3(window manager). If no cheatsheet exists it opens an empty document so I can fill it in and save it.
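A rough sketch of the "open the cheatsheet, or create an empty one" part, in Python. It assumes the focused window's class name has already been obtained from the window manager by some other means; `cheatsheet_for`, the `.md` extension, and the directory layout are all made up for illustration and are not the commenter's actual implementation.

```python
from pathlib import Path


def cheatsheet_for(window_class: str, sheet_dir: Path) -> Path:
    """Return the cheatsheet path for a window class, creating an empty file if absent."""
    sheet_dir.mkdir(parents=True, exist_ok=True)
    # Normalise the class name so "Firefox" and " firefox " share one file.
    sheet = sheet_dir / f"{window_class.strip().lower()}.md"
    # Creates an empty file if none exists; existing content is left alone.
    sheet.touch(exist_ok=True)
    return sheet
```

A key binding would then call a small script that passes in the focused window's class and opens the returned path in an editor or viewer.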
I'm a computational physics student; my work involves using multiple software packages with varied options. Frequently, I need to check to make sure all my parameters are correct, and having these docs at hand is important for me. Using offline documentation is always faster than Google. Since the docsets for these special pieces of software for computational physics or quantum chemistry are lacking, I build these docsets myself. Up till now, I have written code (and sometimes scraped web pages) to build these docsets myself:
Remember back when Linux systems just came with huge packages of documentation? You could do anything offline by reading man pages or looking through /usr/doc/, /usr/share/, etc.
Improving your own brain with memories and knowledge is like greatly increasing the size of a CPU L1 cache and the improvement in execution performance is similar
CPUs don't have logic or reason, so it's not a good analogy. E.g. you can replace an infinite set of memories by distilling them into a few bits of knowledge that underpin that given set of experiences.
I don't think an analogy is meant to be factually equivalent, it is meant to convey meaning. I perfectly understood the meaning of the parent comment, even if it is technically inaccurate.
We aren't trying to understand minds or caches, we're trying to encourage a practice that has little to do with the inner workings of either.
Fair point, but on the other hand: I need to format strings and use basic regular expressions at least a couple of times per week and the execution performance increase and convenience from finally having learnt those by heart (simply by using them enough) is definitely worth it for me and not at all similar to having to look them up each time.
Sure, but if you use like five languages then the differences between s.length, len(s) and length(s) will blur when, at the beginning of the day, you need to refresh the cache for today's language.
I could memorize them all but it is less interesting than reading papers about optimizing algorithms for cloud economics or other more valuable (to me) ideas.
Why are you using five languages? I mix them up when I switch, but I rarely work in more than two languages at once, and keeping two separate isn't that hard.
When I'm working in my preferred environment, I need to work in its language, but the language/VM is built in something (usually C), and the kernel I'm running on is built in something (usually also C), and I might be making a webpage where I need a bit of Javascript. And sometimes I need to poke at a shell script or something that's overgrown shell and is now Perl. And sometimes my preferred environment isn't really the right fit for the problem at hand, so I've got to use something else. Oh, and maybe I need to compile stuff, so here comes Make.
Context switching between Java, SQL, Javascript and a build scripting language isn't _that_ unusual an experience. Throw in a templating language there and you're at 5.
> Improving your own brain with memories and knowledge is like greatly increasing the size of a CPU L1 cache
Yup... and as the size of L1 cache grows so does its latency. Eventually you reach the latency of main RAM and then... wait I forgot why you increased L1 cache size. Can you just reset it to "fast" please?
“Normally, flies will remember very well and will get an A if they are tested a few minutes after learning. But if there is long time between learning and testing, which for a fruit fly is 1 day, they will forget and get an F”
A fruit fly lives for 40 to 50 days, so 1 day isn’t a long time to remember things for them.
“Flies form a memory of locations they are heading for. This memory is retained for approximately four seconds. This means that if a fly, for instance, deviates from its route for about a second, it can still return to its original direction of travel.”
I couldn’t find how many things they can remember simultaneously.
My memory problem is not with documentation. Usually my IDE takes care of it.
As soon as I hit 40, I couldn't remember what I did last week or before that. Did I solve that problem? Which customer complained about it? Where did I put that PDF?
Fortunately I am very organized, and I leave clues for myself all over the place. Readme files in named folders, details in commits, email myself with some information.
But more often than not I am asked "was X display compatible with XYZ board?" and I can't give a direct answer anymore.
I can relate to that. Things get worse with age: I once googled for a piece of code (I2C driver) for an embedded platform, only to find out that I already wrote one myself, and managed to completely forget about it. Turns out it's a good habit to publish what you wrote.
Oh, yes. I know the feeling of finding forgotten code. When looking at the code I say to myself "this is actually good" (I always think my old code sucks), followed by the question "how did I manage to find time to write all this?".
Several times I've been searching for something that led me to a stackoverflow question or github issue in which I had apparently written the accepted answer years before.
A few years ago I was discussing some bit of Apache configuration, the other dev pulls up the Apache docs to prove I'm wrong. It turns out that I was wrong - because when I wrote those docs on the Apache wiki my phrasing was ambiguous. It was funny showing him the commit history of that page but I had to concede that by a certain interpretation of the phrasing he was not incorrect.
Write a short paragraph after each day and a longer one at the end of the week, describing what was done.
Also maybe write down a short paragraph of what you plan on doing the day after each day, and a longer one at the end of the week about what will be focused on next week.
Review each day and you'll keep a good mental map of what has been done.
I’ve started taking screen recorded video journals of my work. As I explain what I’ve done and what still needs to be done, I am navigating the codebase and highlighting key elements, the way I would during a zoom call with a colleague.
I was really excited because I have the memory of a goldfish but am the most hyper-productive person I know. I was curious about the techniques and a write up on various techniques that this person uses to compare them with my own.
Then 2 paragraphs in I run face first into a 5 meter high stainless steel marketing pitch. Wtf.
Tell me you have ADHD without _telling_ me you have ADHD ;)
Sometimes I'm not sure what's worse, the not remembering, or the people claiming that you having forgotten means you didn't care enough, or that you're not trying hard enough. grrrr
The "Time flies..." quote is commonly ascribed to Groucho Marx.
But, according to QuoteInvestigator.com[0], that is not correct:
QI has traced the core of the quotation to the work of an early researcher in artificial intelligence, Anthony Oettinger, who was trying to get a computer to manipulate the English language.
Which is another interesting detail of artificial intelligence: this attribution is basically the Mandela Effect.
Groucho Marx SHOULD have written that joke. It's good, the punch word is at the end, it's the kind of logic-mocking braintwist Groucho was exceptional at making, and yet he didn't write it. So we ascribe it to him anyway, because that's who SHOULD have written it. Thus is intelligence: we try to make things make sense, and we try to relate things to the big areas of 'sense' we already have in our minds. We have a space for Groucho and for humor, but who's Oettinger and where can we watch his comedy routines? What, he doesn't do comedy routines? And so, sense beats reality…
Arguably programmers who have great memories are much more likely to write unmaintainable code, 'just edit these fifteen files to add a new financial type'
Once and only once is a requirement for people that can't remember as much but also for good code. Consider it a requirement for code organization and tools: if it takes longer to look up than remember then the tooling needs a bump. Obviously during heads down green fields coding, the cache will get filled efficiently with whatever is needed, but by the time v1.1 rolls around a good reread and rethink might be a good idea.
Also, the key skill isn't memory but learning new things quickly and well. My C, perl, and jquery knowledge is well forgotten and replaced with python, go and terraform, and I am considering trying out Rust.
> Arguably programmers who have great memories are much more likely to write unmaintainable code, 'just edit these fifteen files to add a new financial type'
Very true. Some programmers have higher tolerance for complex code. Which is often why there are disagreements over refactoring.
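One way to picture the alternative to "edit these fifteen files" is a single registry, so adding a type is one edit in one place. This is only a sketch: the financial types, fee rules, and names (`FEE_RULES`, `financial_type`, `fee_for`) are invented for illustration.

```python
from typing import Callable

# Single registry: the only place that knows which financial types exist.
FEE_RULES: dict[str, Callable[[float], float]] = {}


def financial_type(name: str):
    """Decorator that registers a fee rule under the given type name."""
    def register(rule: Callable[[float], float]) -> Callable[[float], float]:
        FEE_RULES[name] = rule
        return rule
    return register


@financial_type("savings")
def savings_fee(balance: float) -> float:
    return 0.0  # no fee for this (made-up) type


@financial_type("brokerage")
def brokerage_fee(balance: float) -> float:
    return balance * 0.01  # flat 1% for this (made-up) type


def fee_for(type_name: str, balance: float) -> float:
    # Callers never branch on the type; they only look it up.
    return FEE_RULES[type_name](balance)
```

Adding a new type means writing one decorated function; nothing else has to be remembered or hunted down.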
This article is great for me since I discovered Zeal, which is something I knew I needed but did not have a name for.
Now I want to be honest, I have been trying for a bit over an hour to add ANY documentation that is not included by default. It's way too complicated. Has anyone found any success?
Docs from pyspark, azure data factory, databricks, anything big data related would be a gift for me. But the steps are overly complicated.
Specifically, the "Building documentation" part of the article is where you are supposed to use doc2dash to add new info. If someone or even the author has managed to add any unsupported doc, and wants to share, it'd be much appreciated.
Otherwise I am left with the question, how many hours are you supposed to spend in order to save time with each search? Because these programs seem unbalanced for people without the know-how to even start.
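For anyone stuck at the same step, it may help to see how little a docset actually is. As far as I understand the Dash docset format (which Zeal also reads), it is a folder of HTML files, an `Info.plist`, and a SQLite file with a `searchIndex` table. The sketch below builds that skeleton with the Python standard library only; `build_docset` and the sample entry are hypothetical, the plist keys are my best understanding of the format rather than something taken from the article, and the HTML pages themselves still have to be copied into the `Documents` folder.

```python
import plistlib
import sqlite3
from pathlib import Path


def build_docset(root: Path, name: str, entries: list[tuple[str, str, str]]) -> Path:
    """Create <root>/<name>.docset and index the given (name, type, path) entries."""
    docset = root / f"{name}.docset"
    documents = docset / "Contents" / "Resources" / "Documents"
    documents.mkdir(parents=True, exist_ok=True)  # the HTML pages go in here

    # Minimal metadata (assumed key names, see the note above).
    info = {
        "CFBundleIdentifier": name.lower(),
        "CFBundleName": name,
        "DocSetPlatformFamily": name.lower(),
        "isDashDocset": True,
    }
    with open(docset / "Contents" / "Info.plist", "wb") as fh:
        plistlib.dump(info, fh)

    # The search index is a plain SQLite database with one table.
    db = sqlite3.connect(docset / "Contents" / "Resources" / "docSet.dsidx")
    with db:  # commits the inserts on success
        db.execute(
            "CREATE TABLE IF NOT EXISTS searchIndex("
            "id INTEGER PRIMARY KEY, name TEXT, type TEXT, path TEXT)"
        )
        db.executemany(
            "INSERT INTO searchIndex(name, type, path) VALUES (?, ?, ?)", entries
        )
    db.close()
    return docset
```

Each entry's `path` is relative to the `Documents` folder, so an entry like `("solve", "Function", "api.html#solve")` would point at an anchor inside a copied `api.html`. Tools such as doc2dash automate this for documentation generators they understand.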
Never heard of Zeal before in my life; glad to have read this blog. Also thankful to this guy for making it easier to get docs into Zeal. People are the best.
Same here - never knew Zeal existed 'til today! I also often have numerous browser tabs open to a few different doc sites. I know there are websites out there that provide something similar (e.g. https://devdocs.io , etc.), but Zeal is *offline* which makes it a wonderful thing! I just downloaded it on my work machine, and love it! I'll be setting this up on my home machine too! Kudos to the creators of Zeal, and to today's promoters (TIL)!
Unfortunately it's not a patch on Dash. I used to use it, but found too many cases where it was missing docsets and glitching on existing ones. It didn't handle different screen resolutions well. The Fedora package for it is often broken (I just installed it to see if things have improved & all I get is an empty window with an odd partial-width window title bar).
There's a gap in the market for a good cross-platform docs browser, preferably supporting user-added notes.
Much of IQ is basically memory in disguise. They are almost one and the same. Pattern matching is related to memory, as you have to remember the patterns. For creativity, the more primitives you have to build a creative entity, the better... and how do you have more primitives? You remember them.
In fact, Programming Interviews directly test for Memory. Do you remember the correct algorithm? If not can you pattern match the correct algorithm? If not can you creatively build the correct algorithm from primitives you remember?
It is my personal belief that if you hold memory equal, the variability in IQ will be significantly reduced. I will welcome any evidence for or against this belief.
I personally think that there are only two types of stupidity, and both can be adjusted to. You either a) have a bad memory, which means you need to take your time, take a lot of notes, maybe do Anki, or b) you refuse to accept the truth of some basic logical/statistical primitives out of willfulness, which means you need humility.
Same. Unfortunately, I came to this paradigm too late - my mid-thirties - to make use of it in my first career. But it’s radically changed the way I approach my second.
Zeal looks cool, I'm going to use it, but it has a lot of problems.
- the last release was October 2018. The repo shows activity up to 2 months ago, and I'm sure the last release works fine, but it doesn't seem very active. That wouldn't be a problem, I believe software can be done, except...
- the installation documentation is very much lacking. The "make -B build" on the README doesn't work, the build instructions on the wiki are empty except for per distro instructions, the package release for Debian doesn't exist, and the ppa points to an old IP.
That said, I got it working. It seems pretty cool, haven't used it yet, but from what I can see, it is very mouse centric and no keyboard controls, has popup windows and a system tray icon (I loathe them), so for a tiling wm not quite the best. But it does what I need it to do: download docsets, keep them up to date via the RSS feeds for Dash, search them.
I'd love a tool like this but more keyboard oriented, specifically vim like keybinds but any keybind system would work. I know there's an emacs package "helm-dash" but I don't use emacs. I might try it just for this tool.
All in all I think this is going to make my life a lot better. I like to disconnect from the world, but I want to be able to code while disconnected and zeal could enable me to do that.
It's very cool, when I was learning PHP on a slower computer I downloaded the documentation as a Windows help file. Nowadays I just DuckDuckGo it and it's still a much slower process, having the docs offline with Zeal is much faster...
Not for me. Maybe I need to dig in, I'm on bullseye, maybe it's only in the older build repositories?
Anyway, I found this https://github.com/qwfy/doc-browser that I'm compiling right now to see how it works, looks keyboard focused, simpler and supports DevDocs, and bonus it supports Hoogle if you're a Haskeller.
Dash looks like a great project but the community doc contributions are a security concern. Docs are uploaded to git as Tars, nothing is stopping someone from adding malicious code to these tar uploads which developers will download unknowingly when they add a community doc.
Even if we did have a "modern" replacement it would probably say something like "there are a lot of features and you should explore the interface to discover them!"
I started having this problem around 40. A lot of it came from a heavy marijuana habit that seemed to reduce my working memory with the tradeoff that I became more clever at small-scale problems(I call it a tactical vs strategic tradeoff).
Quit smoking pot and things got a lot better for a while, but then things seemed to resume their downward slide. Sleep apnea didn't help, and treating it made things better for a while, but that slide continued.
Now I just try to manage it with lots of notes, scripting everything I can, and trying to simplify my workflow wherever possible. Hopefully things stabilize in my 50s or I don't think I'll be good for much engineering work beyond that. I have a number of dopamine-related movement disorders though so there may be some special issues in my case.
Yeah the problem is "dopamine" means many things in the brain. You can have too much one place and too little in another. Or you could have too much along with a declining population of dopamine neurons in certain areas. I'm type II bipolar which is theorized to be an issue with dopamine regulation and also suffer from essential tremor and spasmodic torticollis. So in my case I probably have issues in several areas.
I can still remember a lot of my early code (and much of it is still running in some form 15 years later). As I’ve grown and gained more responsibility I spend a lot more time on concepts and design than on particulars, and I think that actually makes the particulars harder to remember. This is doubly true because I also spend less overall time building/iterating on those specific pieces. That plus the sheer number of things I work on in a given day, at this point I can forget code I wrote last week… but it doesn’t really matter because I know how to read it later if I want, and I can always build it again anyway.
I have been using post-it notes and a pen on my desk lately and it feels like virtual RAM for my brain. Instead of holding X variables in my head I can hold X+5 or whatever fits on the post-it note. The ephemeral nature and small footprint make it easy to scratch out brain-extension-only notes without being tempted to start making a reference document like I would in a notebook. The note is extremely valuable to me for its lifetime, and a day later I'll glance at it, say "I have no idea what this means," and toss it.
I tend to work on a lot of ML projects in my personal time, even if that means I am simply benchmarking something. Which means I generate a lot of data (in form of results) and conclusions based on them. Documentation, lavish commenting, proper file organization (also documented), READMEs at different levels are necessary for me to manage such breadth and scope. My memory is fine (unless I don't know what I'm forgetting ;-)), but there's only so much I can pack in for the amount of things I want to do.
Despite the excellent initial intention, the article itself shows the reasons for the low adoption: if you actually need to kind of hack every new piece of documentation, this is overwhelming. I can have more than 30 external packages for each project. It would be too painful to handle all of them, and particularly frustrating for the ones I can't, and still need to browse from my browsers.
This kind of project could have a better adoption rate with standards for documentation.
I actually think that a terrible short-term memory makes developers write better code. Because it forces them to write easy to read, well-commented code so that they can figure out how to improve it later.
I regularly have to modify code I wrote years ago and, more often than not, I have completely forgotten ever writing it. So I have to read and learn my own code from scratch. That taught me how to write code that is easy to read and learn for myself and others.
Using third-party libraries may be more costly than developing your own.
It's much easier to modify and maintain your own software which is purpose-built to your needs than it is to maintain somebody else's code that was built to their needs, and was probably overcomplicated to make it as a generic library.
For a compiled language, you have the extra complexity of integrating its build system into yours.
So I'd say you should still weigh carefully whether you want to pull an extra dependency or not.
Your own code is as foreign as a third party library to a developer who hasn't been part of the writing process.
Third party libraries which have a handful or more users are generally speaking relatively bug-free for standard use cases.
Third party libraries are generally a productivity win for languages which have decent build systems, which is most of them notably excluding C and C++.
If only there was a universally available program and format for programming APIs we could reach for as a single command. We could call this man (short for manual).
What happened? In the C ecosystem this is a no-brainer. Why can't my frameworks offer a simple man page I can download into a folder so that it works offline? Why does modern documentation require such fancy things as a web browser?
> On the other hand, the information for everything that we need to write code today is usually no more than one click away.
Sure! If you just assemble a page with about 500 buttons or links, and know which one to click on to get the needed information, everything is a click away!
Haha. Same. I would fail a trivial pursuit-style magical but useless details interview or an A+ certification exam. I've been using Dash for about 10 years. Sadly, it lacks Rust crates support and really hasn't moved anywhere in 4 years.
Dash looks excellent. Insta-buy for me. I only wish it officially supported Unreal Engine (cf, Unity is supported). Epic doesn’t make offline docs available and the existing community projects tend to be out of date.
IMO this makes a good case for TypeScript and the code completion features associated with it. Though personally I have the memory of an elephant so I don't need it.
I've never used TypeScript, and your comment has me intrigued. Please clarify something:
As I understand it:
1. TypeScript is a language (super-set of JS).
2. Code completion is a feature of an IDE or editor.
So how does this language enforce code completion? Or is this something in VSCode? If I were using Notepad++ to code, it won't have code completion, correct? Thanks.
Even if someone can remember every second in any given day with picture-perfect detail, taking notes, documenting stuff will allow him to do MUCH more than he normally could.
The rate at which you assimilate and synthesize new information on a particular subject is proportional to how much other stuff you already know about that subject.
It's also only really worth learning things that matter or might matter to you in the future, so there is a degree of speculation and risk involved.
But not remembering helps that. I have a good sense of what exists and what can be done and the general outline of many things. Keeping the details in your head makes it harder to be creative and think expansively.
Copilot has got me writing some side projects in Java again. I love the fast compilation speed, good enough type system, strong and seamless IDE support, and the abundance of libraries.
The new language versions with things like records take a lot of the tedium away, and Copilot gets rid of most that is left.
Well, I suppose since the source code is simply a repo on GitHub, have you tried submitting these as feature requests (via GitHub)? Worth a shot, eh? (I do like the idea of the zoom level adjuster.)
Zeal seems, complicated. It hasn't seen a release since 2018 (https://github.com/zealdocs/zeal/discussions/1308 they're switching browser engines at some point, which affects default zoom level, though customizable CSS injection should be doable if they keep static CSS injection), and commits are a bit sparse (https://github.com/zealdocs/zeal/issues/1336), so I'm not sure my feature requests will be fulfilled.
Worryingly, while updating my 5 docsets downloaded months ago, I hit a segfault, and trying to debug it in coredumpctl repeatedly hangs gdb and makes it eat all available RAM. I suspect it's a lifetime or multithreading issue.
If your code reads like English, you don't need much of a memory.
A codebase that requires large working memory typically means it's written poorly. Lacking encapsulation, poor modularization/separation of concerns, unnecessarily succinct variable names
Most programmers believe a number of blatant falsehoods about documentation, with the most prevalent being "comments go out of date quickly, so there's no point in investing in them". Maybe I'm just hyper-aware of it because my short-term memory sucks, but code comments have saved me on so many occasions that they're simply not optional.
You can document your code. You can keep it up to date. It isn't that hard. You just don't want to.