Wow, he's still got it. I read the whole thing, without realizing it was Guido van Rossum, thinking "hey, that's a really good point". Then I got to the end and realized it was the inventor of the Python language. So I wasn't impressed as I was reading it because of who it was; I was impressed because he was making a really good point.
Obviously, the fact that he made my favorite programming language was no fluke.
I had exactly the same experience and I was prepared to criticize because I normally think talk of operators in programming languages tends to add little but confusion and is mostly supported by programmers eager to do clever (i.e. incomprehensible) things.
Ramblings through technology, politics, culture and philosophy by the creator of the Python programming language.
This was literally the first thing I read when I hit the link.
Don't mean to sound rude, but how did you miss that? Does it render differently on the web? I saw this on the mobile version of the site on my Android.
For me that was covered up by some popover I didn't read. Besides the tagline of a blog is the sort of thing I for one gloss over. I've had decades of training how to skip straight to the content.
Seems like it's more an argument about prefix notation vs. infix.
z = Add(x,y) is just a form of prefix notation in my opinion. I would say that z = x + y feels better because that's how we are taught in school. But in English at least, it's reasonable to say the following:
Z is X plus Y.
To get Z, Add X and Y.
---
It's also an argument about inconsistent syntax. For example there is not a big gap in lisp for (+ a b) and (add a b). It's a bigger difference in python.
---
I find infix for math easier to read because it's how I learned it. But I also use lisp, and appreciate being able to say (+ x y) and (+ x y z a b c) where plus might be more accurately read as sum.
That and in college I used an RPN calculator, and going from postfix to prefix isn't as odd as infix to prefix.
Smalltalk also doesn't have this problem. Since all message-sends are essentially infix, operators (binary message sends) are simply special keyword messages.
1 add: 2.
1 + 2.
This is helpful when you have "operators" that aren't quite as obvious, for example raising to a power.
2 raisedTo:3.
Since there are no operators, there is no operator precedence. However, there is precedence between different message types: unary binds tightest, then binary, keyword last. Otherwise evaluation is uniformly left to right.
While this is a question on Smalltalk job interviews ( What is 2 + 3 * 5. ?), that's only useful for filtering out people who simply have never seen Smalltalk before (in case they claim knowledge). It doesn't seem to be an issue in practice, because the evaluation rules are otherwise so simple and uniform.
The other surprising effect is that binary message sends don't seem to have the same tendency to be confusing that operator overloading does. I don't really understand this effect, because the mechanism has effectively the same power.
The mnemonic value of "operators" as reminders of properties such as associativity or commutativity interestingly extends to diagrammatic reasoning. A diagram is really just a generalized expression, and this becomes quite useful when one has to deal with more than one "type" or "domain" of operation, but in a consistent way that preserves the compositionality-like properties OP talks about. A recent book exploring this topic is "Seven Sketches in Compositionality; An Invitation to Applied Category Theory" https://arxiv.org/abs/1803.05316 . (Despite the obvious reference to CT in the title, the work is quite accessible and the math involved is not much more complicated than that found in the linked blogpost. Importantly, and perhaps unlike some people in other programming-language communities, the author does not assume any pre-existing knowledge; the work is rather about using concrete, real-world examples to gently guide the reader's intuition.)
A “dict” operator would be vague because there is more than one way to combine dictionaries. Why should I have to guess, from a bare symbol, whether duplicate keys are being skipped or replaced, when two different well-named functions will always make it clear?
Of course it’s not the same. There is one sensible result expected when adding two lists or strings together (since they’re linear, it would not make sense to encourage inefficient search/replace operations with an operator; and also, the containers do not require unique keys, you can simply extend them). A dict must deal with conflicting keys, and arguably each situation requires different treatment of those keys.
Given the semantics of {}.update({}) and {**{}, **{}}, it is clear what the preferred behavior is, or at the very least the most commonly known one in Python.
Hence there is nothing hard or mysterious about what the + operator should do. People are nitpicking on this.
> Given the semantics of {}.update({}) and {**{}, **{}}, it is clear what the preferred behavior is
Sorry, I honestly don't understand what's the meaning of that (I know a little bit of Python, C/C++, ASP.Net, etc...).
Concerning adding dicts: thinking it over, I am more or less neutral, as I think that somebody who dared to do it would feel the pressure to read the docs to know what the operator does (e.g. merge distinct keys and overwrite-left/concatenate/ignore-right values of duplicate keys?).
On the other hand, in Python I keep trying from time to time to e.g. cast integers into bytes using "bytes(my_integer)", which results in a bytes object of my_integer zero bytes (ha-ha). So whatever gets implemented to add dicts using an op might be good for a large part of users but at the same time bad for the other part (who are inexperienced, have different ways of thinking, etc.), which might in turn weaken the language's acceptance.
They're the standard forms of doing dict merge in python, and generally the first answer whenever someone asks (both overwrite with right dict key/value on collision, concatenate keys otherwise).
{**d1, **d2} -- unpack the two dicts and feed them both into a dict literal to construct a new dict.
d1.update(d2) does the same merge strategy, but modifying d1 in-place.
Sure, there are five different ways to go about handling collisions, but there's already a well-defined, commonly used methodology, so it's fair to implement syntax sugar for it. It's also not too difficult to understand, and easy to look up (particularly compared to {**d1, **d2}).
And I'm not actually aware of any other merge strategy being included in the Python stdlib, which implies that this strategy was settled on as the most useful default long before this operator came into question.
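To make the two forms above concrete, here is a minimal sketch (d1 and d2 are just illustrative values):

    d1 = {"a": 1, "b": 2}
    d2 = {"b": 3, "c": 4}

    # Unpack both dicts into a new dict literal; on a key collision the
    # right-hand dict wins, so "b" ends up as 3.
    merged = {**d1, **d2}
    print(merged)   # {'a': 1, 'b': 3, 'c': 4}

    # Same merge strategy, but mutating d1 in place instead of building a new dict.
    d1.update(d2)
    print(d1)       # {'a': 1, 'b': 3, 'c': 4}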
What happens when you “merge” a pile of six pennies with a pile of three pennies? You get a pile of nine pennies. Six plus three is nine. Adding three pennies to six pennies results in a single group of nine pennies.
My point: don’t just pick words, explain why those words were chosen.
Penny piles can’t be merged together and remain pennies. If you merge a pile of 6 and a pile of 3, you get one metal lump. I actually really like the point you are making, just feel like your illustration works against you.
The problem with pennies as your example is that it's hard to see how they work in a key-value context.
If Penny is a class, then maybe you're asking what happens when you combine two arrays (or Python lists) of Penny objects? You get one array (list) containing all the Penny objects.
Perhaps you're getting at something like what if you have two "wallet" dictionaries you want to merge?
Let's say each wallet dict has a list of Penny objects, Dime objects, $5 notes, etc. If you are merging the wallets, maybe you don't want to overwrite the first wallet's Penny list with the second wallet's Penny list and instead you want to combine them.
This is where a dictionary comprehension can come in handy. Just iterate on the second wallet's items and add each item's value to the first wallet's corresponding value at the same key to create a new wallet object (or update one of your two existing wallets by setting the wallet equal to the comprehension). You would have to add additional logic if you had any nested dictionaries in your wallet dictionary or another type that doesn't combine with the + operator, such as sets.
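A rough sketch of that comprehension, assuming each wallet maps a denomination name to a list (the wallet contents here are made up for illustration):

    wallet_a = {"penny": [1, 1, 1], "dime": [10]}
    wallet_b = {"penny": [1], "five": [5, 5]}

    # Combine the wallets by concatenating the lists at each key, rather than
    # letting the second wallet's list overwrite the first's. Keys missing
    # from either wallet fall back to an empty list via .get().
    combined = {
        k: wallet_a.get(k, []) + wallet_b.get(k, [])
        for k in wallet_a.keys() | wallet_b.keys()
    }
    print(combined)  # e.g. {'penny': [1, 1, 1, 1], 'dime': [10], 'five': [5, 5]}
                     # (key order may vary because of the set union)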
The other case is something like when your wallet gets sent off to the thief api. To make this more Pythonic, let's say you have a cached idea of your wallet's contents before your wallet itself is sent over the api. Once your actual wallet comes back and you pull it out to pay for something and realize it's empty, you merge the wallet you're holding with the cached idea of the wallet in your head, effectively updating the wallet to empty. If the keys are still in your returned wallet's dict, just now with a bunch of empty lists as values, this will work fine:
cached_wallet.update(my_wallet)
However, if the key value sets are removed entirely from the wallet returned by the thief api, you'd probably be better off doing:
cached_wallet = my_wallet
This is because in Python the first example would have no effect on your cached idea of the wallet: an empty dict passed to the update method will not modify the dict you're attempting to update. So actually the second approach is much more robust here, unless of course it's problematic for some other part of your system to have a keyless wallet floating around (although I'd suggest fixing those other parts of your system by having defaults in place).
You could also use a dictionary comprehension in this case, making use of the get method with defaults while iterating on the emptied wallet you're holding like this maybe:
cached_wallet.update({
    k: my_wallet.get(k, type(v)()) for k, v in cached_wallet.items()
})
If the thief put a new key value pair in your wallet, you wouldn't get it in the above comprehension, so you might want to do this in some cases:
cached_wallet.update({
    **{k: my_wallet.get(k, type(v)()) for k, v in cached_wallet.items()},
    **my_wallet,
})
Although that may be called out as more expensive/redundant than necessary, you may want to combine the .keys() from both dictionaries, cast them as a set to dedupe then iterate through the actual wallet on those and fuck all, you're tossing the dirty tissue he stuffed into your wallet in the garbage anyway ... but then actually you decide to keep it, there may be DNA evidence here ... you have no idea how you're going to actually parse and apply this evidence and it's kinda disgusting and, ugh, get a hold of yourself, toss that out and go wash your hands and then focus on making sure the credit cards that are missing are canceled and hope to God that updating your card number on your Fubo TV streaming account doesn't invalidate your legacy subscription that lets you watch the Barcelona game each weekend for $10/mo because there's no way in hell you're going to start paying them $40/mo, that's bullshit.
Commutativity does not need to always hold. Multiplication is not commutative on matrices, but we still use it often and write it as a product nonetheless.
In mathematics, the ‘multiplication’ operation is not assumed to be commutative, whereas the ‘addition’ operator is used specifically to indicate commutativity. So, for example, using the plus sign for string concatenation goes against this tradition.
The even more sensible thing is to take an optional argument: a function that is passed the conflicting keys and their values, and must resolve the conflict. If `None` is passed, then a documented default is taken.
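A minimal sketch of that idea (merge_dicts and its signature are hypothetical, not an existing stdlib API):

    def merge_dicts(d1, d2, resolve=None):
        # Merge d2 into a copy of d1. On a key collision, call
        # resolve(key, left_value, right_value); if resolve is None,
        # fall back to a documented default (here: the right value wins).
        result = dict(d1)
        for key, value in d2.items():
            if key in result and resolve is not None:
                result[key] = resolve(key, result[key], value)
            else:
                result[key] = value
        return result

    # The caller decides what a conflict means, e.g. keep the larger value:
    merged = merge_dicts({"a": 1, "b": 5}, {"b": 2, "c": 3},
                         resolve=lambda k, left, right: max(left, right))
    print(merged)  # {'a': 1, 'b': 5, 'c': 3}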
That's not obvious and breaks a commutative property. This observation about operators seemed poorly crafted to support a decision he already made. The existing example was unnecessarily complicated... most languages just do a single function with a return value. Why be obtuse in the example? Preconceived agenda. The operator isn't compelling for dicts/hashmaps in any language.
> You can't "break" a commutative property that doesn't exist because the operator is not yet defined.
That's the point (e.g. modulus). There's an entire preamble that's not relevant to the decision. The coverage of the inane "insights" is some hero-worship hype.
Documentation really isn’t a panacea. Quick, what exactly does this do: “UploadContent(false, true, true, false, false, false)”? Oh, all those flags are documented? Would you want to look them up every time?
Code should be more convenient to read before it is made more convenient to write.
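To make the flag example above concrete, a small sketch (the function and flag names are hypothetical, taken only from the comment): keyword-only parameters force the call site to document itself.

    def upload_content(*, overwrite=False, public=True, compress=True,
                       notify=False, dry_run=False, verify=False):
        # The bare * makes every flag keyword-only, so callers must name them.
        ...

    # Compare with UploadContent(false, true, true, false, false, false):
    upload_content(overwrite=False, public=True, compress=True, dry_run=False)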
In Ruby, we have Hash#merge, e.g. { a: 1, b: 2 }.merge({ b: 3, c: 4 }) gives { a: 1, b: 3, c: 4 }, which is typically what you want, e.g. args.merge(reset: false) to have reset set to false in the args even if present.
That's how dict.update works in python also. The other poster is saying that they don't like the + operator having the property of using the values from the right operand to update the left operand vs the clear semantics of the method call.
JavaScript deals with this well. The equivalent of the proposed
d3 = d1 + d2
is
let d3 = {...d1, ...d2};
In fact, it seems like you can already do this in python:
d3 = {**d1, **d2}
This seems to have most of the benefits of being a dedicated syntax (rather than just a function call), without the downsides of breaking the commutativity of the + operator.
> To create a new dict containing the merged items of two (or more) dicts, one can currently write:
{**d1, **d2}
> but this is neither obvious nor easily discoverable. It is only guaranteed to work if the keys are all strings. If the keys are not strings, it currently works in CPython, but it may not work with other implementations, or future versions of CPython.
> It is also limited to returning a built-in dict, not a subclass, unless re-written as MyDict(d1, d2), in which case non-string keys will raise TypeError.
Operators can certainly be nice. And I do like allowing the programmer to create new ones too, though I'd tend to prefer the Haskell approach of creating new ones out of existing symbols like >< or such rather than the C++ approach of letting the programmer redefine an existing operator for a new use, << in the streams library being one of the worst common examples.
Python discourages using operators for things that don't have anything to do with their original use. If you try to be too clever with it, you run into issues. For instance, comparison operators are chained, so that `x > y > z` is equivalent to `x > y and y > z`, not `(x > y) > z`.
The most radical use of operators in the standard library that I know of is in pathlib, where `Path('/usr') / 'lib' == Path('/usr/lib')`, and I think that got a lot of pushback. It's certainly an outlier.
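Both behaviours are easy to check interactively; a tiny sketch using only the stdlib:

    from pathlib import Path

    x, y, z = 5, 3, 1
    print(x > y > z)    # True: evaluated as (x > y) and (y > z)
    print((x > y) > z)  # False: the bool True is compared with 1 as an int

    print(Path('/usr') / 'lib' == Path('/usr/lib'))  # True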
It fits well with Python's overall approach to readability. If the meaning of your operator isn't immediately apparent from its existing meanings, then it should probably just be a regular old named function or method instead. Python doesn't like DSLs very much.
Programmer-defined operators are certainly useful, but Python went in the other direction, which has its own advantages. C++'s choice to have a fixed set of operators but overload them in a lot of arbitrary ways is probably the worst of both worlds.
You should look at the Construct library, which overloads the '/' operator as syntactic sugar to make s-expressions in its DSL less paren-heavy. I'm not saying I agree, but it was wild when I first saw it.
> Python discourages using operators for things that don't have anything to do with their original use.
Python doesn't really discourage anything, and especially not overloading operators in weird ways. You even pointed out the pathlib insanity in the stdlib. Python isn't the shiny bastion of consistency and obviousness that the zen claimed it was years ago.
Every C++ programmer knows that << is used for streaming, that is only a problem for purists or nitpickers.
So you really did pick the worst common example to illustrate your point... A good example is how "&" is used by boost serialisation to allow both serialisation and deserialisation of a value using the same expression:
obj & value;
Now that makes no sense at first sight. Still, not even this example can be used as an argument against operator overloading, because operators are just functions with predefined names and some expected behaviour.
The above could have been called as obj.serializeOrDeserialize(value) and it wouldn't have been much better. The problem is with the programmer that can't pick proper function names (where +, -, etc are also function names).
> formulas written using operators are more easily processed visually has something to do with it: they engage the brain's visual processing machinery, which operates largely subconsciously, and tells the conscious part what it sees (e.g. "chair" rather than "pieces of wood joined together").
This is why I prefer ML-style syntax over Algol-style syntax.
ML-style syntax (with sugar for pattern matching) engages my visual process machinery better, such that I can scan over function definitions more at a glance, than the "literary" style of Algol syntax
But, this argument defeats itself. At least, in practice.
If I look at how 'operator overloading' is used in practice, _rarely_ do you get commutativity, or even anticommutativity, associativity, or distributivity.
Take list addition, where operator overloading often shows up:
someList += someElement
is shorthand for:
someList.add(someElement)
add is not commutative; the elements used in the operation aren't even the same type.
The point being, the manipulation and simplification that is, according to GvR, much easier to spot if operators are used aren't even relevant here.
If we're talking about, say, introducing a class 'Complex' representing complex numbers, and wishing the language let you use mathematical operators on it (observing that [A] it is common in the domain of complex numbers to use the symbol `+` to indicate addition, and [B] properties of operations such as commutativity apply to many mathematical operations one might want to perform on complex numbers), then I'd agree: yeah, the language kinda sucks if you can't write `Complex(a, b) + Complex(c, d)`.
But how often does that actually occur?
Separately, if you go down this route, the abstraction should be as complete as it can be. Therefore, this:
`Complex(2, 3) + 5`
Should work, and should evaluate to `Complex(7, 3)`. But.. this:
`5 + Complex(2, 3)`
should also work. Which requires either Scala's `implicit` system or Python's take on this, involving `__radd__`. These mechanisms introduce their own complications.
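A rough sketch of what the Python side of that looks like (this toy Complex class is purely illustrative; Python's built-in complex type already covers it):

    class Complex:
        def __init__(self, re, im):
            self.re, self.im = re, im

        def __add__(self, other):
            # Handles Complex + Complex as well as Complex(2, 3) + 5.
            if isinstance(other, (int, float)):
                return Complex(self.re + other, self.im)
            return Complex(self.re + other.re, self.im + other.im)

        # Called for 5 + Complex(2, 3), after int.__add__ returns NotImplemented.
        __radd__ = __add__

        def __repr__(self):
            return f"Complex({self.re}, {self.im})"

    print(Complex(2, 3) + 5)  # Complex(7, 3)
    print(5 + Complex(2, 3))  # Complex(7, 3)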
Thus, the debate on operator overloading boils down, as language feature debates usually do, to a cost vs. benefits analysis. The costs are heavy: there is a lot of proof out there that even experienced coders just insist on abusing operator overloading (or, to be a bit less judgemental: that opinions are rather divided on how they ought to be used, given the amount of complaints about `cout << somestring` and the like). Also, the language complexity is considerable, given that you need to solve the `5 + Complex(2, 3)` problem.
Are the benefits worth it? Possibly.
But I don't think GvR's article takes the cost side seriously, and the benefits stated, at least in my experience, tend not to apply at all there where op overloading tends to end up.
`someList += someElement` raises an exception. It's `someList += someSequence` which just does the same as `list.extend`, that is, `someList = someList + list(someSequence)`
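A quick demonstration of that behaviour (illustrative values only):

    some_list = [1, 2]

    some_list += [3]    # extends with the elements of the sequence
    some_list += "ab"   # strings are sequences too, so 'a' and 'b' are appended
    print(some_list)    # [1, 2, 3, 'a', 'b']

    try:
        some_list += 5  # an int is not iterable
    except TypeError as exc:
        print(exc)      # e.g. "'int' object is not iterable"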
And you can call it in both infix or prefix forms. Though Julia is a math focused language and the community tries to stay consistent with the correct notation. You can't add an element to a list using the "+" operator, but you can add each element of two lists if their dimensions match.
> should also work. Which requires either Scala's `implicit` system or Python's take on this, involving `__radd__`.
Or the way C++ handles this: being able to define operators as free functions (though you could alternatively define an implicit conversion).
In my opinion, requiring commutativity from a +-operator is maybe too much. Associativity and a neutral element seem to be enough (i.e. forming a monoid).
> requiring commutativity from a +-operator is maybe too much
The + operator is the only one that can be realistically required to be commutative. All other operators need not. If we are to have any commutative operator, it should definitely be +.
I mostly get stuck using Javascript because I want to target people’s web browsers. I write a pretty good amount of numerical code. The lack of operator overloading for number-like objects is extremely annoying for me.
There are always hacky workarounds, e.g. https://github.com/enkimute/ganja.js#the-inline-function but in this particular case the code to make it work ends up a bit confusing, and the hack is somewhat brittle and inflexible.
An interesting example is C's pointer-integer addition. p + n == n + p, true, but this is a purely syntactic fact. The actual semantic question of commutativity, whether switching the order of the arguments leaves the value unchanged, cannot even be asked of pointer-integer addition since the arguments, having differing types, cannot be switched.
This is much less confusing than (2), and leads to the observation that the parentheses are redundant, so now we can write "x + y + z"
This is a non-problem if you're using Lisp. (+ x y z) accepts an arbitrary number of arguments, and the operator precedence problem does not exist since there is no operator precedence.
If this is so obviously preferable, why have mathematicians so obdurately not adopted this style?
Mathematical notation is not an archaic practice that is followed out of a respect for tradition, or a doctrine that has been developed from first principles, it is something that has evolved (and continues to do so) because it has been useful.
We do have f(x, y) in mathematics notation, as well as notations like { a, b, c }, ( 1, 2 ). The inventor of vector and matrix notation finally had the brilliant idea of dropping the silly commas [ i j k ]. Think of how ugly matrices would look with commas.
Now in mathematics, there is no equivalent of a 200,000 line piece of software. The number of identifiers a mathematician works with in any given work is small, even if the work is large (many pages of derivation).
Mathematics notation is ambiguous and inconsistent. One letter names are preferred so that xy can denote the product of x * y. So much so that mathematicians reach for other scripts like Greek rather than make a multi-letter name. Then, inconsistently, we have things like sin(x).
Consider that xy(z + w) is likely interpreted as the product of x, y and z + w. It has the same syntax as ln(z + w), the natural logarithm of z + w, not the product of l and n.
Professionally printed mathematics resorts to the use of special fonts to visually resolve these things. This problem is so serious that everyone who is anyone in mathematics goes to the trouble of professional typesetting, even in minor papers.
In software, the program-defined functions form vast vocabularies. They cannot all be assigned to infix operators without creating mayhem. Many functions have 3 arguments. Seven argument functions are not uncommon. Those also cannot be infix.
In spite of what the author may say, real Python code is chock full of "import foo" and "foo.bar.baz(this.or.that(), other.thing())".
Calling people’s use of LaTeX to type their homework “professional typesetting” seems like a stretch. Professional typesetting would be something like: send your hand-written manuscript to a full-time typesetter, and wait for them to do the work.
A better description would be “goes to the trouble of using math typesetting software designed by experts”. But is this really so strange? People use even more sophisticated software than that for making image collages of cats with mustaches, for modeling platonic solids, for adding their favorite song to a frivolous home movie, ....
Well mathematics notation largely follows speech. People say “one plus two” — largely because speech doesn’t have closing parentheses, so we need to speak in a way that makes it clear when we’re done talking — so that’s how we write it. But for a computer, prefix notation is great because it’s unambiguous and clear even without knowledge of PEMDAS. Similar to how Americans write MM/DD/YYYY because that’s how we say dates, but we can still acknowledge that YYYY-MM-DD is the best format for computers.
I wonder how much is the reverse: we now tend to say mathematical expressions as they are written, but before this was standardised, you would just explain the steps.
Probably not "one plus two" -- I think + is essentially a variant of & which is a ligature for "et", and I guess most languages put "and" between the things being combined. But I'd be surprised if (x/y)^2 was said "x over y all squared" by many people before this notation. But the notation is clearly more designed for thinking on paper than for explaining down a phone line.
1. Mathematicians have different priorities than programmers, and they use different tools. Working with an equation on a whiteboard, it's easier to write "a+b+c" and then cancel terms as needed. When writing a formula on my computer, cancelling terms is something I almost never do, so it would be silly to use a notation that's been optimized for that.
When I am doing algebra on my computer, I hope I have a tool like Graphing Calculator (not "Grapher"!) that lets me simply drag a term from here to there, and automatically figures out what needs to happen to keep the equation balanced.
2. They have, except they use Σ for the prefix version. When it's more than a couple terms, and there's a pattern to it, Σ (prefix notation) is far more convenient than + (infix notation).
If programming languages look like they do because they're taking the useful notations from mathematics, why doesn't your favorite programming language have a Σ function? Who's being stubborn here?
Most programming languages do have some variant of `sum(sequence)`. Python certainly does. Or, like, loops, which do the same thing.
But they're optimized for different things. Using the same tool for infinite length sequences and fixed length sequences doesn't make a whole lot of sense. We often have different types for them (tuple/record vs. list/array) too.
Having done addition in both infix and prefix varieties on my computer, over the past few decades, I don't understand why prefix notation is considered 'optimized' for indefinite (not 'infinite') sequences and infix notation is considered optimized for definite length sequences.
What exactly "doesn't make a whole lot of sense" about (+ a b)? (It doesn't look the same as you wrote it in school? Neither does "3+4j", or "math.sqrt".)
Being able to use the same tool for different types of data is precisely what makes high-level programming so powerful. As Alan Perlis said, "It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures." Having only one way to add numbers (in languages that do that) is a great feature.
Python's insistence on these pre-selected groupings of functionality has always made it frustrating for me to work with. The two ways to add numbers look and work nothing alike. Does "TOOWTDI" not apply to math?
(Yes, I'm also frustrated and confused that Python has built-in complex numbers, and a standard 'math' module with trigonometric functions, but math.cos(3+4j) is a TypeError. What possible benefit is there of having a full set of complex-compatible functionality, but hiding it in a different math module, with all the same method names?)
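For what it's worth, the complex-aware versions do exist in the stdlib under the separate cmath module, which is exactly the split being complained about; a quick check:

    import math, cmath

    print(cmath.cos(3 + 4j))  # works on complex arguments, roughly (-27.03...-3.85...j)
    try:
        math.cos(3 + 4j)      # the math module only accepts real numbers
    except TypeError as exc:
        print(exc)            # e.g. "can't convert complex to float"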
The zen never says TOOWTDI, it says TO(APOO)OWTDI. (That's "there's one, and preferably only one, obvious way to do it.")
`reduce(op.add, [x, y])` works. Python could remove its infix ops and use only prefix ones. But prefix ones aren't obvious. And as Guido says, readability matters.
This really isn't a good argument; it just looks like one on the surface. In fact, while computer science is a branch of math, it's pretty clear that most applied software development has substantial differences from theoretical math.
Further, there is no reason to suppose that any group of humans acts optimally in this area when they clearly act, on the whole, so illogically and suboptimally in every other area of human endeavor.
It asks respondents to answer the much more complex question of why different groups may prefer a particular notation, instead of taking the much simpler and more direct route of explaining why the poster prefers a particular notation.
Mathematics uses infix notation literally everywhere. Just because you prefer some notation doesn't mean people who have used infix notation since they were 5 years old will like it. Also, I avoid lisp mostly because S-expressions are almost unreadable to me.
Mathematics uses prefix, infix, suffix, circumfix, and some more complex notations; basically any pattern any computer language uses probably is inspired from math even if math doesn't use the notation for the same operation or operator symbol.
I think the last one makes the relationships quite clear.
Writing legible equations is an art form, as is writing sexps.
Edit: If you think lisp is ugly, compare with regexps. For bonus points, explore writing regexps with lisp, e.g. with Emacs's `rx` macro. I think you won't find a more easily maintainable way to write them.
Some would argue that the fact that you have all of those alternative forms is part of the problem. Lisp is one of the (if not the) most individualistic programming languages around. Lisp makes it easy for a programmer to create their very own impenetrable, arcane, domain specific languages. This causes large organizations to avoid it like the plague.
Large teams don't want artists, they want replaceable parts.
The tragedy is that the "large organization" then goes on and writes multiple bad DSLs to solve problem X, which include several code transpilers and a varying amount of custom syntax. In the end, they do the same thing as the Lisp folks: they write a DSL, but because the language XYZ they chose is less capable than Lisp, the solution is hacky and difficult to understand (and can't be, in comparison to Lisp, easily extended).
This is true for a lot of "frameworks" and especially true for modern frontend web frameworks (which have a compile/build step).
If it's about formatting, given that this is what this thread is about, you can use code formatter for lisp languages exactly like for any other language, in fact they are even easier to write for lisp because of the consistency of S expressions.
If you are talking about macros, then you're in the same boat as other languages that have macros: C, Rust, etc. And remember that the first rule of macros is to not use macros, except for when you absolutely have to, and in those cases it is the most elegant solution. If you have devs inventing DSLs for everything, then Lisp isn't your problem.
I think most people would prefer to be an artist, but art rarely pays the bills, hence the cliche "starving artist." My post was not meant to rip on artists, though that's how it comes off now that I read it again. The truth is that society wants more cogs but needs more artists.
I'd prefer it if the product of my engineering effort were maintainable and clear. I succeed if someone with no context can understand the system I built without help.
Your comment is begging the question. The reason that Lisp “+” can accept a list of arbitrary length, rather than a pair, is that the underlying addition operator is associative.
But this observation does not change my point: that the parent comment is saying "Lisp already has the ability to do + on lists", but the reason "+ on lists" makes sense is because Lisp is using the underlying associativity of mathematical +. And the latter associativity property, for abstract mathematical "+", is what the blog post is describing/exploring.
The computing + is not associative for inexact types like floating-point. That's why it's important for the Lisp + to be consistently left-associative; the result could vary if that were left to the implementation to do however it wants.
In addition/relation to floating-point, another way in which addition is not associative in computing is when there are type conversions. (+ 1 1 0.1) does not perform the same operations as (+ 1 (+ 1 0.1)). The former will do an integer addition to produce 2, and then that is coerced to 2.0, which is added to 0.1. The latter adds 1.0 to 0.1, and then adds that to 1.0: two floating-point additions.
In languages that don't have bignums (which could include some Lisp dialects) whether overflows occur can depend on the order of operations, even when all operands are integers.
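The floating-point and coercion effects are easy to see in Python; the particular numbers below are only illustrative:

    # Floating-point addition is not associative:
    print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False

    # Grouping also changes which operations run, mirroring (+ 1 1 0.1) vs
    # (+ 1 (+ 1 0.1)): the first does an integer addition and then one float
    # addition, the second does two float additions.
    print((1 + 1) + 0.1, 1 + (1 + 0.1))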
The reason we can have an n-ary + is that three or more arguments can be reduced through a binary +. The concept of + is defined as a binary operation.
Lisps have variadic functions that are blatantly non-associative, like, oh, list. (list 1 2 3) isn't (list 2 3 1).
Try writing out typical mathematical formula derivations using only s-expressions. I tried for a period and abandoned the pursuit. It's just not comparable to established mathematical notation.
Two solutions. Threading macros, wherein instead of nested parens like (x (y (z))) one writes

    (->> (z)
         y
         x)
In Clojure there is an interesting package https://github.com/rplevy/swiss-arrows which allows one to perform successive operations with explicit placement of the result of the prior evaluation by placing a <> in the form

    (-<> (z)
         (y <> 7)
         (x <>))
In practice it seems like there is often less need to do so, as many similar functions of the same variety have the same argument ordering, and other options like as-> exist too.
There is also the idea of processing math expressions infix as expected when desired.
The Lisp community has literally tried exactly this, on and off, for the past half century -- and they always come back to s-expressions. Every new Lisp programmers says "I know, I'll make a macro to let me write infix math!", and then abandons it 2 months later. It's not like Lisp programmers aren't aware of how schoolchildren write (+ 2 2).
I've written tons of code in both language families. In infix/prefix (i.e., Algol-family) languages, I frequently wish for a nice consistent prefix syntax. In prefix-only (i.e., Lisp-family) languages, I can't say I've ever wished for infix notation.
I don't understand what the perceived issue is with prefix notation, except for unfamiliarity -- and that passes soon enough.
So you trade simplicity in one problem (precedence) and gain complexity in another (variadic arguments and (lang-dependent) multiple variadic function implementations)
Yup. In addition to ignoring the usual Lisp convention for associative operators, I think the article muddies the waters here by using words like "add" and "mul" in all of the non-infix examples, making them unnecessarily verbose. After using prefix notation for years, it seems very natural for pattern recognition and formula manipulation. Never having to think about precedence and associativity is really nice.
he doesn’t make any coherent argument as to why one is more clear or preferred than the other. he simply states it and moves on with this bias. for example, when he compares 2 to 2a, i feel he doesn’t really address anything and just states his preference as the more clear one.
plus, he of course seems to know nothing about lisp (or just ignores it) where you might have:
(+ a (+ b c)) = (+ (+ a b) c) = (+ a b c)
all three are valid lisp syntax and represent associativity of addition well, including “dropping the parentheses”. the thing is, procedure notation as opposed to infix operators actually make it more explicit by what is meant by associativity. this is because it makes explicit about what comes first because of how procedures are evaluated. this is much more explicit than conventions of what parentheses mean in infix operator expressions. even his python procedure example in (2) demonstrates this property.
the distributive law is:
(* a (+ b c)) = (+ (* a b) (* a c))
that seems pretty clear to me. and with this, you can of course extend the language such that the +, *, etc. procedures operate on new data types or heterogeneous data types like (* c v) where c is a constant and v is a vector and you get scalar multiplication.
it is also funny to me that he considers readability over performance. well, i do too, but you can have both. see lisps, schemes, and sml dialects, all languages he seems to ignore the existence of.
another thought is that i recall gerald sussman saying in a talk that mathematical notation is impressionistic, and in general, i think he is right. his point was that procedure notation is much more explicit. he also mentions that prefix notation is inconvenient for small expressions versus infix notation but is much more preferred when you have very large expressions with many, many terms.
Isn't "...once you've learned this simple notation, equations written using them are easier to manipulate than equations written using functional notation..." his argument why operators are more clear?
I don't necessarily agree with that, but I haven't made a world class programming language, either. I just assume he knows something I don't.
but programming isn’t about manipulating equations. if you want programs to manipulate equations, then again, the procedure notation, in particular lisp notation, wins out.
Maybe, but given that it was the opening to the comment with no reference to anything else, it just seems like an unqualified opinion in the vein of the commenter's criticisms of the post. It's gone now so I guess OP agreed.
i agreed it was unnecessary, but i don't think it's a huge stretch. most of the stuff i have read from him has come across as someone who has made up their mind and that's that. it's like when you ask someone why something is and the answer you get back isn't thoughtful, insightful, revealing, etc. but instead beats around the bush such that it reads like the person either doesn't understand or its because that's simply the way they want to do it. which is fine. python is (was?) his language, but i don't have to enjoy the perspective and approach. he no doubt knows way more than i do, but i never learn anything from him.
almost every answer is basically "i like to be pragmatic" but neither the design of python or his answers reflect that.
so now that python is so popular, we have people completely unaware of the existence of languages like lisp/scheme and ml dialects, which are much more powerful and can be just as clean as or cleaner than python, and a stunted language taking over every project.
> doesn’t really address anything and just states his preference as the more clear one
Maybe he doesn't understand why one is less confusing to him than the other one. Doesn't mean it's not like that for most people though. The reason it's less confusing is familiarity and simplicity (it's a simpler form and it's already familiar).
but isn’t implied operator precedence and using parentheses to group and control order of operations one of the things people often find confusing in mathematics, despite it being familiar?
I've grown to dislike operator overloading quite a bit, and thus developed a general skepticism with regard to operators in general, and I think this article actually reveals why partially.
First off, I'm surprised this article doesn't touch at all on the fact that operators are usually the only "blessed" functions in languages to be able to be written in infix form, as I think that's actually where many of the "benefits" being described here come from. For example, I believe the first example about how operators highlight the lack of ambiguity in the expression "x + y + z" has little to do with operators and more to do with prefix vs. infix notation. Notice that this is not apparent if we use operators but also require prefix notation instead: "+ + x y z" vs "+ x + y z". Here too you may never realize how the binding order doesn't matter, because binding order is non-existent and unnecessary. The fact that in most languages non-operator function calls must be in prefix form shows the true concern here. If you for example imagine working in Haskell, where you can apply the opposite experiment, non-operator functions in infix form, this discovery could also arguably have been highlighted: "x `add` y `add` z". All this to say, I do not see this particular point as a great defense of operators.
But the real problem in my mind comes from the eventual overloaded meaning. This is actually displayed in this very blog post. Near the end, the author mentions how * is not always commutative, but in math, + is. And yet, in the simplest non-math example use of operators, string concatenation, we already find that + is not commutative. So what we're really doing is creating very terse function names that "often" imply a set of rules, but also often don't. In other words, in one specific domain they have a specific meaning, but when put into a general purpose programming language, they actually are just... "un-user-friendly one letter names".
And this gets to the heart of the issue. I agree that operators probably trigger a different part of the brain, but I think they do so simply because we use a reserved part of the character set to express them that we don't use for anything else. So they stand out. If "+" was replaced with "a", I don't think that "b a b" would really be that visually more helpful than "a(b,b)" -- so it's not the magic "operator-ness" of the function, it's not even the infixness, it's the fact that we're basically putting a weird character in there, the same as how a smiley face emoji would stick out, or perhaps how "bolding" text would stand out (if that were possible in your programming language).
The problem is that these magic special limited-set characters are very attractive, and once you have a concept you firmly understand, you want to map them onto the closest version of the standard math forms. Case in point, string concatenation. But these new applications very rarely map exactly, and soon we end up with incredibly terse names that trick you into thinking they behave like something familiar. Most people would agree that "add" isn't the best name for a string concatenation function, and would hold a lot of weird meaning baggage, but the abstractness of the "+" operator (and the fact that we read it aloud as "plus") fools us into using it.
The best part is that string concatenation is a perfectly cromulent "multiplication", forms a free monoid, and in line with some notations for concatenation of tuples.
Elixir's string module documentation isn't half bad, but there is more practical Unicode thinking and less mathematical thinking (which makes sense, given their domains).
I happen to think baking Unicode into your concept of a string is fundamentally misguided, so that all string operations following from that premise are inherently wrong. The very first example, contrasting encoded byte length with String.length("é")=1, calling the latter the "proper length", walks into a shibboleth which puts Elixir on the side of String.length("ﷺ")=1, even with the grapheme clusters concept, for which the only salvation is integrated font rendering.
It's practical and informative, but I can't consider it well-thought-out.
ed: to clarify, ﷺ is an Arabic ligature which represents many more than one (linguistic) characters. A more accessible example might be "ffi".
I could be wrong, but I think the reason why String.length is one is to have a consistent idea of what happens when you have monospaced console output. Things in the elixir standard library exist "when you need them for elixir itself", and monospaced console output formatting working is needed in a few parts of elixir. If you care about bytes only, you can use byte_size, as indicated in the docs.
No, codepoint length is totally useless for monospaced console output, see the third example. Grapheme clusters are closer, but still wrong in the presence of wide characters.
I've written a fuzzing library that tests random Unicode inputs and the width of the output was sensible on three platforms (Linux, Mac, and powershell).
Forgive me as I use similarly extreme language. I 100% disagree. Imo Guido's throwaway comment uses the same rhetorical form as hate speech. Your use of "100% agree" and "I actually really like Perl" are mutually contradictory.
Now let me unpack that. I'm curious to see if you reply and whether we arrive at middle ground.
=================================================
> in practice [Perl's use of operators] makes Perl programs more difficult to maintain than they should be
This is an old saw but do you agree that it is a misstatement of what's actually going on:
* Some Perl programmers, especially experienced ones, have made and continue to make Perl programs, including large ones, EASIER to maintain than they could be and would be if written by a similarly strong programmer using the deliberately dumbed down language Python.
* Some Perl programmers, especially beginners, have made and/or continue to make Perl programs, mostly small ones, more difficult to maintain than they should be and would be if written by a similarly weak programmer using a deliberately dumbed down language like Python.
Perhaps your response to that is something like "Oh sure, I just meant most programs I see because they're mostly small and written by inexperienced programmers."
If not, how modern are the Perl programs of which you speak? Much has changed since 2000 and especially in the 2010s: Perl 5 has stabilized around a wise view of the language, interpreter, and how to be respectful of each others' failings; Perl 6 has radically dialed back the complicated built in availability of terribly cryptic symbol aliases as part of a complete rethink of what it means to be a Perl language.
================================================
As for the "hate speech" aspect:
>> Of course, it's definitely possible to overdo this -- then you get Perl.
> That's the Guido we know and love.
The above was the entirety of by far the highest voted comment in the python reddit post about this article.
I posted the following in sardonic response, which also got upvoted:
Guido:
> Once you've internalized the simple properties which operators tend to have, using + for string or list concatenation becomes more readable than a pure OO notation, and (2) and (3) above explain (in part) why that is.
Larry:
> But it contradicts property (1). That's (in part) why I did not use + for string and list concatenation in Perls. Then again I used it for adding floats, even though that contradicts property (1) too. And changed my mind about what symbol to use between Perl 5 and Perl 6. Heck, who cares about consistency?
Guido:
> I care. I consistently hate on Perl and Python users consistently love me doing so.
I generally find Guido's and the Python community's intolerant attitude consistent with the above and with this exchange about the same article on twitter:
> Random nit: you've got a "font-size: 85%" in the CSS for your posts. Mind removing that, so those of us with our default font-size set deliberately don't have to squint?
Guido:
> I'm sorry, I have no interest in CSS hacking. The template is what it is. Deal with it.
Guido's riposte also got lots of hearts (presumably to soothe or applaud their beloved ex-BDFL).
I note the follow up got nothing from Guido:
> It's a single-line deletion, and would improve the accessibility of your blog, but sure ok.
There's zero chance Larry would have thought and spoken as Guido did and he absolutely would have made that single line deletion out of respect for the person whose eyesight wasn't as good as others'. The same attitude applies to improving the language and culture.
I see a new Perl emerging, one that will unfold throughout the 2020s. I see a well entrenched self-reinforcing tolerant attitude in Perl culture, even of those who like to use too many operators for my liking.
I appreciate the lengths you went to attempt to articulate your position, but I don't think we'll arrive at a good spot in comments on HN. The "hate speech" analogy is very incendiary.
The only comment I would add is clarification on your comment about 'Your use of "100% agree" and "I actually really like Perl" are mutually contradictory.'
Someone can really like a language, but not agree with all the language designer's choices. A language designer needs to make tradeoffs, and I hope we can agree that not every choice needs to be universally accepted. Tolerance != uniformity and blind acceptance.
Beyond that: I'm very familiar with Perl 5, Perl 6, and the direction the community is taking the languages, and the Perl culture. I just disagree with some of the language choices.
> Someone can really like a language, but not agree with all the language designer's choices.
Yes, of course.
When writing about `+`, Guido might have chosen to gently chide himself for deciding to break the very first principle he wrote about in his article.
Instead he arbitrarily focused on a couple later principles he didn't break -- and then rudely dissed Perl which deliberately did not break any of the principles he listed in his article.
I really like Python but I do not agree with all Guido's choices in human language and that inappropriate cheap shot at Perl was an example.
> A language designer needs to make tradeoffs
Yes, of course.
As I wrote, Guido chose to write "The template is what it is. Deal with it." in response to someone with poor eyesight requesting he make a one line change.
Twitter is a constrained medium so he presumably felt he needed to tradeoff civility for making his point clear. But:
> I hope we can agree that not every choice needs to be universally accepted. Tolerance != uniformity and blind acceptance.
That was the point of my first reply to you.
I was curious to see if you would agree that Guido's decision to be rude toward Perl in the very context in which he'd just described a flaw in Python according to his own stated principles did not warrant the essentially universal acceptance it got in the reddit python thread.
And, likewise, his rudely dismissive twitter response to a random innocent person reading his article which received far more hearts than those sympathizing with the tweeter with less than perfect eyesight.
> Beyond that: I'm very familiar with Perl 5, Perl 6, and the direction the community is taking the languages, and the Perl culture. I just disagree with some of the language choices.
To recap, I wasn't speaking about disagreement. And not really about Perl either. I was speaking of agreement with Guido's rudeness.
I'm amazed (and horrified) how come such an intelligent mathematician and designer of a beautiful programming language made such an obvious fuckup of using a commutative operator for string concatenation. Really, I hate python just for this single idiotic notation.
I mean, it's right there in front of your eyes. He talks about the convenience of using a visually commutative operator like "+" for commutative operations, and then a few lines later he says that it is a convenient notation for string concatenation. What. The. Fuck.
I understand the confusion it may cause to a new Python developer who comes from the mathematical background.
Does it cause any problems beyond that? For example, does it interfere with some elegant patterns, or result in some bug prone code, etc?
FWIW, the single main annoyance I've had with python was its treatment of strings as iterables of single character strings. While not wrong in any obvious theoretical sense, it conflicts with the mental model of many developers (both new and experienced) and has probably caused more bugs than any other feature of Python (and even more ugly type checks to avoid such bugs). When people realized it was a problem and discussed removing this feature, Guido said he had tried but (a) it was too much work and (b) he felt the benefits are too great to give up (https://mail.python.org/pipermail/python-3000/2006-April/000...).
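A minimal illustration of how that string-iteration behavior bites (the flatten helper here is hypothetical, just to show the failure mode):

    def flatten(items):
        # Naively flatten one level of nesting.
        out = []
        for item in items:
            if hasattr(item, "__iter__"):
                out.extend(item)
            else:
                out.append(item)
        return out

    print(flatten([[1, 2], 3]))         # [1, 2, 3], as expected
    print(flatten([["a", "b"], "cd"]))  # ['a', 'b', 'c', 'd'] -- the string "cd"
                                        # got split into characters, which is why
                                        # isinstance(x, str) checks creep in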
> I understand the confusion it may cause to a new Python developer who comes from the mathematical background.
I'm confused mainly because a person with a mathematical background created a language with such an obvious blunder. When I have to use the language (and I do often) I am not confused about string concatenation, I am ashamed of having to use this ridiculous notation.
While it makes sense to think of them as sequences, this has bitten me more than I care to admit.
OT: Is there a name for things we love to complain about, but wouldn't change?
Hmm it does look like the other comment responding to you is correct: + is not always commutative in math: http://mathworld.wolfram.com/OrdinalAddition.html. If mathematicians are fine with that, I'm not sure what argument you may have against this operator being non-commutative in a programming language.
I for one feel that + is a very intuitive choice for concatenating strings and lists. Is there any other common operator you would recommend instead? Because I feel that using a less common one would harm beginner-friendliness significantly.
Also, to the best of my knowledge there is no place in the bible of maths that defines + to be commutative. In particular, addition of (infinite) ordinals is not commutative.
> I for one feel that + is a very intuitive choice for concatenating strings and lists. Is there any other common operator you would recommend instead?
Literally, any other operator would be better than addition. A space, a dot, a product, two dots, a minus sign, whatever. Anything except the visually commutative plus.
> Because I feel that using a less common one would harm beginner-friendliness significantly.
Usage of + for string concatenation is fairly new, probably invented by C++ (which, by the way, uses bit shift operators for output, so it is not really an example of sane language).
> Because I feel that using a less common one would harm beginner-friendliness significantly.
Actually, given that concatenating sequences (strings are an example) is a fairly common operation and for some kinds of sequences, so is adding them elementwise, it would be better not to use a common mathematical operator for concatenation, but to give it its own.