Justified Variables: Words of the same length with related meanings (github.com/timvieira)
83 points by polm23 on March 7, 2021 | 38 comments



People/formatters that justify their structures drive me completely crazy (I'm looking at you golang et al). It's like they've never tried using revision control before. It just takes a single change in the longest entry and something that was a 1-byte change becomes a 20-line diff. Noisy for code review and 20 times more likely to create a merge conflict.
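To make the diff-noise point concrete, here is a minimal Python sketch. The field names and the `changed_lines` helper are made up for illustration; it only counts how many existing lines a rename touches when a block is kept aligned versus left unaligned.

```python
import difflib

def changed_lines(old, new):
    """Count lines of `old` that a line-based diff would mark as changed or removed."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    return sum(i2 - i1 for tag, i1, i2, _, _ in matcher.get_opcodes() if tag != "equal")

# Hypothetical block, written once with aligned "=" and once without.
aligned_before = ["name   = 'a'", "size   = 1", "colour = 'red'"]
plain_before = ["name = 'a'", "size = 1", "colour = 'red'"]

# Rename the longest entry; the aligned version must re-pad every line.
aligned_after = ["name       = 'a'", "size       = 1", "background = 'red'"]
plain_after = ["name = 'a'", "size = 1", "background = 'red'"]

print(changed_lines(aligned_before, aligned_after))  # every line is touched
print(changed_lines(plain_before, plain_after))      # only the renamed line
```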

Your obsessive compulsions are something to be overcome, not embraced, and certainly not something to force on other people.


As for golang, I think the pros outweigh the cons here. Code is read more than it's written. Cleaner code means you can literally identify code structure from afar, which in turn makes it easier to focus on the part you care about. If you have to pick a length for automatic formatting, the same length makes sense. So do trailing commas.

Even if only half of engineers think this is true, the other half should be pragmatic and let the go formatter do its thing. The best thing about go is that it has one standard formatting.

As for equal-length variable names, the majority of these examples you would use naturally anyway, so I am not sure what you are so upset about (top/not instead of true/false is a bit overkill).


> Even if only half of engineers think this is true, the other half should be pragmatic and let the go formatter do its thing.

Why should a group of tolerant people bend to the wishes of a group of intolerant people? That's how I see the whole formatting debate.

Why should people comply with a formatting that is demonstrably worse in concrete ways (let's not even start talking about how it makes the need for ugly line-breaks occur much sooner), and to many is harder to read?

I don't understand why people feel the need to force their obsessions on other people. If you find it harder to read without a specific formatting, please go and develop some plugin in your fancy editor to display it to you that way. Just deal with it on your end and don't make it my problem.


To be fair, you could argue that your mentioned behaviour is an undesirable feature of current revision control software and code editors. The formatting could be declarative, so that compiler-meaningless indentation is not on disk. And, compiler-meaningless changes could be ignored by revision control (that's probably provided if meaningless changes aren't persisted to disk in the first place).


This implies an amount of language-understanding from the revision control system I'd rather avoid, especially when I unfortunately spend so much of my life writing mixed source-in-source-in-source monstrosities these days.


How much would it have to know? I’d expect a `significant_leading_whitespace=true` config would probably do the job?
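One reading of that hypothetical `significant_leading_whitespace=true` setting, sketched in Python: keep indentation significant but collapse every other run of whitespace before comparing lines. The function names are invented here, and the sketch deliberately ignores string literals, where inner whitespace does matter.

```python
import re

def normalize(line):
    """Keep leading whitespace as-is; collapse all other whitespace runs to one space."""
    stripped = line.lstrip(" \t")
    indent = line[: len(line) - len(stripped)]
    return indent + re.sub(r"\s+", " ", stripped.rstrip())

def same_modulo_alignment(old, new):
    """True if two lines differ only in non-leading whitespace."""
    return normalize(old) == normalize(new)

print(same_modulo_alignment("    x   = 1", "    x = 1"))  # alignment-only change
print(same_modulo_alignment("    x = 1", "x = 1"))        # indentation change
```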


It maybe depends on who's going to be interacting with the code in what ways.

If you expect the code to be changed infrequently but read frequently, and if you think the justified version is easier to read or harder to miss errors in, then that could easily outweigh the cost of noisier diffs.

(I'm not sure whether I actually believe the following argument, but:) The increased likelihood of merge conflict is not necessarily a bad thing. If you've got a bunch of variables, structure entries, or whatever, that are related to one another, then if A changes one of them and B separately changes another this very well may be the sort of thing you need to look at carefully and explicitly, which is what getting a merge conflict forces you to do. And (maaaaybe) the more closely related nearby variables (etc.) are, the more likely people who like justified code are to justify them.

Full disclosure: I sometimes justify things in code, in cases where I think it helps to clarify relationships between nearby lines. I don't remember ever getting a merge conflict as a result, or having colleagues complain about noisier diffs. But I've generally worked in small teams and I've often been the only person working on a given bit of code, and what works well in that context is not necessarily the same as what works if you have a much larger group all poking at the same code.


I used to hate it until I worked on a team that preferred it. Now I find more than a handful of lines without alignment hard to read. I’m not saying everyone would change their mind, but I suspect a lot of style preferences are more rooted in familiarity than we necessarily realize.


IMO this is one of those unintuitive things that clicks when you see it and you go "well of course that makes sense". Really cool.

And now I'm reminded of http://bash.org/?406381:

    <Axe> I
    <Axe> do
    <Axe> not
    <Axe> know
    <Axe> where
    <Axe> family
    <Axe> doctors
    <Axe> acquired
    <Axe> illegibly
    <Axe> perplexing
    <Axe> handwriting;
    <Axe> nevertheless,
    <Axe> extraordinary
    <Axe> pharmaceutical
    <Axe> intellectuality,
    <Axe> counterbalancing
    <Axe> indecipherability,
    <Axe> transcendentalizes
    <Axe> intercommunications'
    <Axe> incomprehensibleness.
    <JediHobbes> woah
    <JediHobbes> *blinks*


I might be in the minority, but I think using variables of equal lengths with opposite meanings is an anti-pattern. I like using clearly different words (diff lengths, diff starting characters) that still pair well together. IMO this makes it much easier to scan through code, and much less likely to accidentally use the wrong function/variable somewhere.


> I think using variables of equal lengths with opposite meanings is an anti-pattern.

Perhaps the worst example of this is Kotlin’s var/val. I much prefer var/const.


I get the sentiment, and I'd go further to say that I prefer Rust's `let` vs `let mut`, i.e., the longer & visually noisier option discourages mutability by default, and draws attention to when it is used.


I use a CLI app that is used like this:

`command get/set [long directory]`

Because of that long directory part, I use the command history instead of retyping the whole thing. I have probably already lost a cumulative total of hours by executing `set` instead of `get` by mistake.


Equal-length variable names help to see differences in the remainder of the line. You are more likely to spot mistakes in code where stuff that should be the same is aligned the same.
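A small illustration in Python, using min/max as an assumed equal-length pair (the function and names are hypothetical, chosen only for this sketch). Because the two assignment lines have the same shape, a stray `xs` on the `y` line would stand out.

```python
def bounding_box(points):
    """Return (min_x, min_y, max_x, max_y) for a list of (x, y) pairs."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    # Equal-length names keep the columns lined up without manual padding.
    min_x, max_x = min(xs), max(xs)
    min_y, max_y = min(ys), max(ys)
    return (min_x, min_y, max_x, max_y)

print(bounding_box([(0, 5), (2, 1), (-1, 3)]))
```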


I lament the lack of support for elastic tabs. In a better world the pleasure of these justified variables could be found everywhere.


Maybe, but there’s no amount of elastic tabs that are going to make the suffix in

    nextAncestor
    previousAncestor
line up the same way as

    nextAncestor
    prevAncestor
so I admire the author’s time spent collecting these word groupings.


Why not though? It reminds me of this project: https://nas.sr/قلب/. It’s an Arabic programming language and apparently their script allows for elasticity, even within words. Perhaps the Latin alphabet just sucks for aligning?


It could work by adding a new alignment tab character, like & in TeX. TeX tables look like

    name & value & comment \\
    long name here & 1 & one\\
    x & 2 & two \\
and the columns line up at the & characters by adding suitable whitespace. Imagine some control character that you could insert in source code:

    next&Ancestor &= current->next;
    previous&Ancestor &= null;
and the corresponding parts would line up in consecutive blocks, like elastic tab stops but even inside words by stretching the existing whitespace around the words.
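A rough Python sketch of how a renderer might treat such a marker, assuming a visible `&` stands in for the control character. The `align_at_markers` name and the padding rule are assumptions of this sketch: the first column is padded on the left so the extra space lands in the indentation, and later columns are padded on the right.

```python
def align_at_markers(lines, marker="&"):
    """Drop the markers and pad cells so that marker positions line up."""
    rows = [line.split(marker) for line in lines]
    n_cols = max(len(row) for row in rows)
    # Column width only counts cells that are followed by a marker.
    widths = [
        max((len(row[i]) for row in rows if i < len(row) - 1), default=0)
        for i in range(n_cols)
    ]
    out = []
    for row in rows:
        cells = []
        for i, cell in enumerate(row):
            if i == len(row) - 1:
                cells.append(cell)                   # last cell: no padding
            elif i == 0:
                cells.append(cell.rjust(widths[i]))  # stretch the indentation
            else:
                cells.append(cell.ljust(widths[i]))  # stretch trailing space
        out.append("".join(cells))
    return out

for line in align_at_markers([
    "next&Ancestor &= current->next;",
    "previous&Ancestor &= null;",
]):
    print(line)
```

Padding the first column on the left is what keeps `nextAncestor` a single unbroken identifier on screen; padding it on the right would split the word.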


OK but what are the legitimate reasons to spell out Greek letters? Usually, they're already a shorthand for something else that has a name, so you should use that name instead.


In geometry, it's conventional to use Greek letters for angles. If you have two angles, theta/phi or alpha/beta are reasonable names for them. Probably better than angle1/angle2 (in which the names are mostly "noise"), but if the angles play specific roles then it may be better to use informative names: azimuth/altitude or whatever.

In finance, there are lots of traditional Greek-letter names. Using gamma/delta/theta is probably _more_ immediately comprehensible to your audience than, say, secondOrderUnderlyingSensitivity/underlyingSensitivity/timeDerivative. (Of course, in this case you don't get to choose which Greek letters you use.)

There are other contexts where particular Greek-letter names are traditional. Again, usually you don't get to choose which ones you use (at least, not if you want to get any familiarity advantage from using them).

When doing algebraic manipulations, sometimes it's convenient to give one set of variables Latin-letter names and another "parallel" set corresponding Greek-letter names. E.g., if you're doing plane geometry, maybe one point is (x,y) and another is (xi,eta). In software, this is probably a strictly worse strategy than calling them (x1,y1) and (x2,y2) or something of the sort.


I see. I had blinkers on for engineering which has symbols like nu for Poisson's ratio and epsilon for strain and I've developed some annoyance working with source code that's full of spelled out Greek letters where it's not even clear what they represent. Now I think about it, there are cases like you identified from finance where the Greek letter becomes the mainly known name, such as alpha and beta for Rayleigh damping coefficients. When implementing algorithms from literature, I've sometimes named variables with both - the description to aid understanding and the letter to make it clear how it corresponds to the original material. Still don't get to choose the letter though.


Two examples of times I have used Greek letter variable names: (a) Oftentimes the Greek letter is the standard way of referring to a concept, e.g. "beta" in a linear regression. In these situations, the Greek letter would be used in spoken language to refer to the concept and is clearer than trying to describe the underlying concept in English. (b) If you implement a non-trivial mathematical equation, it is clearer to use the variable names found in the published equation than to try to rename them. To understand the equation, people will need to read the published description anyway, and it is easier to map one to the other if you don't change the names. Likewise, if you are implementing your own equation, you will usually use Greek letters in the paper, and for consistency, you should use the same variable names in the code as you do in your paper.


One nice thing about Julia is being able to use Unicode characters in variable/constant names, e.g. π and θ, instead of the usual pi and theta, in programs.


Most popular programming languages have that ability. The problem is that most people don’t know how to enter characters that aren’t on the keyboard and don’t have software installed/configured that makes it easy to do so (e.g. using a Compose key).


Ok, good to know. Well, Julia doesn't have those problems – in the REPL, out of the box, type "\pi[TAB]" for the greek letter, etc. And not needing an explicit multiplication sign in some places also makes mathematical code look more like maths, e.g.

    julia> 2π
    6.283185307179586


Same with Emacs if you select the TeX input method: Alt-x set-input-method TeX

\pi → π

That works for any programming language. I’ve used it occasionally for my own programs but use plain ASCII if I expect others to work on the code.


You can also do this in Python, Perl, and C (if you use GCC or Clang).
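For the Python case, a minimal sketch (the `arc_length` function is just an illustrative example): Python 3 accepts Unicode letters such as π and θ in identifiers.

```python
import math

# Greek letters are valid identifier characters in Python 3.
π = math.pi
θ = π / 2

def arc_length(radius, angle):
    """Length of a circular arc for an angle given in radians."""
    return radius * angle

print(arc_length(2.0, θ))
```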


Surprisingly, it’s missing src/dst (source/destination).


    doing
    words
    fixed
    style
    takes
    moxie


I only knew of the word Moxie in the context of the Pokemon ability[1] and it never occurred to me until just now that it's a real word with a real meaning haha

> moxie

> n. The ability to face difficulty with spirit and courage.

Thanks!

[1] https://bulbapedia.bulbagarden.net/wiki/Moxie_(Ability)


Somehow bot, opposite of top, doesn't seem satisfactory. top/bum, albeit a bit snickerable, does make a good bit more sense.


In maths it’s very common to refer to top and bottom in an ordered structure. I often define a TeX macro \bit as \perp.


I'm tempted to do a pull request to add "remove" to "lookup", "insert" and "update". Of course that could be "delete" as well in many languages...

I also think there should be a place for "many" in the "some" and "none" grouping.


This is neat! I often find that the structure introduced by doing this makes it easier to spot some mistakes. One I've used in ML: train & valid for the training and validation sets.


I've been using ifp and ofp and ths/nxt for some time. stdin and stdout are a shame in this regard


Those of us who use only single-letter variable names have this problem solved from the start.


Well the next step is to select language primitives based on their character length: > vs >=, + vs ++, == vs ===, var vs const, map vs filter. Or you could select the appropriate function argument count based on total character count. So many opportunities to improve readability.


I bet people who find this worth optimizing are especially perplexed by variable-width fonts in programming.



