The 2,500 year old history of why Python’s all([]) returns True (carlmjohnson.net)
189 points by earthboundkid on March 18, 2020 | 210 comments



  sum([1,1]) = 1 + 1 = 2
  sum([1]) = 1
  sum([]) = 0

  0 + n = n + 0 = n
Zero is the unit with respect to addition, so when we add together a list of no numbers, we ought to get back zero.

You can `from math import prod` in Python 3.8+.

  prod([2,2]) = 2*2 = 2
  prod([2]) = 2
  prod([]) = 1

  1 * n = n * 1 = n
The same way, one is the unit with respect to multiplication. So, multiplying no numbers together gets you one.

  all([True,True]) = True && True = True
  all([True]) = True
  all([]) = True

  True && x = x && True = x
You get the idea.


An alternative way of viewing this is: find the recursive definition of that function, and think of its base case. For sum, it is something like sum([Head:Tail]) = Head + sum(Tail), which works only if sum([]) is 0. For prod, we have prod([Head:Tail]) = Head * prod(Tail), which again works only if prod([]) is 1. And for all, we have all([Head:Tail]) = Head && all(Tail), which again works only if all([]) is True.
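Those recursive definitions translate directly into Python (hypothetical names `rsum`/`rprod`/`rall` here, since the actual builtins are iterative):

```python
# Recursive definitions whose base cases force the empty-list answers.
def rsum(xs):
    return 0 if not xs else xs[0] + rsum(xs[1:])       # sum([]) must be 0

def rprod(xs):
    return 1 if not xs else xs[0] * rprod(xs[1:])      # prod([]) must be 1

def rall(xs):
    return True if not xs else xs[0] and rall(xs[1:])  # all([]) must be True

print(rsum([1, 1]), rprod([2, 2]), rall([True, True]), rall([]))
# 2 4 True True
```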


I like the algebraic point of view as a way of spotting essential patterns like this :) The essential missing step here is to point out that conjunction is multiplicative, not additive, and this is why `all` gives the unit under conjunction, rather than the unit under disjunction. Corresponding to `sum` is `any`, which gives the unit under disjunction for the empty list, since disjunction is additive.

That said, I do think there is something to the perspective presented in the article. Something which is not accidental to the algebraization of logic...


There's no need to think in terms of rings here ((+,*) and (|,&) together) - monoids are precisely the algebraic structure we need (just one associative binary operation with an identity, considered one at a time). All of these binary operations form monoids.


I was going to comment this, but you got here before me. So, I would just like to ramble about type theory and constructive logic a bit.

According to the Curry-Howard correspondence and the BHK interpretation of logic, there is a syntactic correspondence between types and logical formulas and between terms and proofs.

Here, the unit type corresponds with truth because it is trivially inhabited with a single inhabitant. The product type A * B is interpreted as logical conjunction because you need a proof of A and a proof of B to inhabit it.

The unit type (true) is the identity of the product type (and). Unit * A = A * Unit = A from an isomorphism perspective. From the category theory point of view, a Cartesian monoidal category is a monoidal category where the tensor product is the categorical product and the unit object is the terminal object. It's no coincidence that the terminal object is written as 1.

So, an n-ary sequence of propositions connected by ANDs is like an n-ary tuple type, and an empty sequence of propositions connected by ANDs is like the nullary tuple type, or in other words the trivially inhabited unit type.


⊤ ∨ ⊥ ?

I don't know.


I was going to write unit in my comment, but unit is the wrong word. The word unit applies only to multiplication. In the natural numbers there is only one unit: 1. Some other rings have multiple units. The integers have 1 and -1, and in the rationals every nonzero element is a unit.

The correct term is an identity.


You could also look at it like this:

We want this general property to hold:

sum([ sum(x[0:n]), sum(x[n:]) ]) == sum(x)

define sum([]) (n = 0) so that it holds

It works the same for prod, all, and any, and gives a hint as to what the answer should be if there were things like union(sets) (the empty set) or intersection(sets) (the universe, or union of all sets)
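A quick Python sketch of that property, plus the set-fold analogy via `functools.reduce` (sample values chosen arbitrarily):

```python
from functools import reduce
from math import prod  # Python 3.8+

x = [2, 3, 5, 7]
for f in (sum, prod, all, any):
    for n in range(len(x) + 1):
        # Splitting anywhere, reducing each half, then reducing the pair,
        # must agree with reducing the whole list -- including n == 0,
        # where f([]) is forced to be the identity.
        assert f([f(x[:n]), f(x[n:])]) == f(x)

sets = [{1, 2}, {2, 3}, {2, 4}]
union = reduce(set.union, sets, set())   # identity: the empty set
common = reduce(set.intersection, sets)  # no initial value supplied, because
# intersection's identity would be "the universe of all elements"
print(union, common)
```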


Okay, but intuitively, I also want the property !(all(x==1) and all(x!=1)) to hold. Now what?


That should be:

  prod([2,2]) = 2*2 = 4
Right?


I hope so!

Edit: could have picked better examples too, imo... I'd have used 1 + 0 = 1 and 2 × 1 = 2... basically you add 1s for free when multiplying, 0s for free when adding, and Trues for free when "all-ing".


In mathematics, for binary operations this element is called the neutral element or identity element. https://en.wikipedia.org/wiki/Identity_element


Hmmm...

An identity is something that does not change the left operand

a * i = a

0 for + and 1 for x pass this test.

But for Boolean && neither True nor False pass the identity test because,

(True as identity)

  False && True = False
  True && True = True

(False as identity)

  False && False = False
  True && False = False

Perhaps there is more to it....


I think your post indeed shows that

a && True = a


it makes sense if you look at it as a && True = a. Thanks.


GP is suggesting that in this setting, the set is Booleans, the operation is &&, and the identity is True. So taking your calculation of

    a * i = a
the operation (denoted * above) is &&, and the identity (denoted i above) is True. Lets see how it holds out for all possible values of a:

    a && True =?= a
    let a := True
    True && True =?= True
    correct!

    a && True =?= a
    let a := False
    False && True =?= False
    correct!
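The same exhaustive check, as runnable Python (using `and` for &&):

```python
# True is the identity for "and": combining with it leaves a unchanged,
# on either side, for every possible value of a.
for a in (True, False):
    assert (a and True) == a
    assert (True and a) == a
print("True is the identity for and")
```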


I'm not clear how true doesn't work as the identity for AND.


I may be blinkered by my mathematical training, but to me this seems like the obvious thing to do. Here's an attempt to justify it based on common sense, no math stuff required. Suppose you do something like

  if(all(requirements)) { proceed; }
If there are no requirements that need to be satisfied, then you should be able to proceed.


Another way to make it seem more intuitive: "all of these are true" is the same as "none of these are false" (assuming values must be booleans), and clearly nothing in an empty list is false.


> I may be blinkered by my mathematical training

Yes, precisely the point of the article. We mostly find it natural to think like moderns today, but civilization went along for thousands of years thinking differently.


Epicycles. Experience leaves me inclined to believe the code precedes the definition.

  def all(iterable):
    for element in iterable:
      if not element:
        return False
    return True
https://github.com/python/cpython/blob/master/Python/bltinmo...
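For comparison, the pure-Python equivalent of `any` given in the docs is the mirror image, which is why its empty-iterable answer is False:

```python
def any(iterable):
    for element in iterable:
        if element:
            return True
    return False  # no truthy element found -- and an empty iterable has none
```

An empty iterable never enters the loop, so `any` falls through to False just as `all` falls through to True.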


But that doesn't tell you _why_ that choice is a good idea!

My favourite explanation comes from property based testing.

In general, you have the following property: For any lists xs and ys,

  all(xs) and all(ys) == all(xs + ys)
If you want to keep that property as simple as possible, you have to define

  all([]) == True
If you want to go with the opposite, your property becomes more complicated.
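A randomized spot-check of that property, in the spirit of property-based testing (plain `random` here; the `hypothesis` library would express it more directly). The left side is parenthesized to avoid Python's `and`/`==` precedence:

```python
import random

def check_concat_property(trials=1000):
    for _ in range(trials):
        # Short random lists of booleans; empty lists show up often,
        # exercising the all([]) == True case.
        xs = [random.random() < 0.9 for _ in range(random.randint(0, 5))]
        ys = [random.random() < 0.9 for _ in range(random.randint(0, 5))]
        assert (all(xs) and all(ys)) == all(xs + ys)
    return True

print(check_concat_property())  # True
```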

Mathematical definitions are somewhat arbitrary. It's up to us to choose definitions that make sense in our contexts. A general desire to make the resulting systems simple and uniform has worked out well for many, many contexts.

Slight detour: that's also why defining 2 - 5 to have an answer is a good idea! Negative numbers were seen as a bit suspect, but they behave perfectly sensibly. Same for complex numbers.

But: still we generally leave 0 / 0 undefined, because no answer would lead to a satisfying system.


I love this. But I feel like there are people (like you and me) who find an explanation like this awesome and compelling, and then there are others who have a hard time getting on the same wavelength, let alone to try to agree. Like for example, a similar sort of reasoning leads to min(x, y) breaking ties in favor of x, but max(x, y) breaking ties in favor of y, e.g. because [min(x, y), max(x, y)] should stably sort the objects. (There were discussions of how C++'s definition is broken a while back, if you Google hard enough.)
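Python's own builtins illustrate the trap: both `min` and `max` return the *first* of tied arguments, so `[min(x, y), max(x, y)]` can silently lose an element:

```python
x, y = 1, 1.0  # equal under ==, but distinct objects of distinct types

pair = [min(x, y), max(x, y)]
print(pair, [type(v).__name__ for v in pair])
# [1, 1] ['int', 'int'] -- both calls return the int; 1.0 has vanished
# from what should be a stable 2-element sort.
```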

But whenever I've tried to explain things like this to people, I've found so many just don't seem to see the issue, or to remotely care if they do. Their intuitive responses tend to be:

- It's an obscure issue, it's never gonna come up.

- Oh, well I mean it's obviously your fault for calling it that way with that implicit assumption. Why would you think that?

- Well it's the caller's responsibility to read the documentation/look at the source/test the code before they use this function, not my problem to try to fit it into every mathematically possible use case.

I feel like the notion of putting some more thought into what seems like an elementary API so that these issues don't come up and trip up users in the first place seems like a completely unjustifiable academic effort to them, if not an outright foreign one. Have you found any effective way to try to communicate stuff like this and hopefully actually convince people?


The very concept of "breaking ties" with min/max suggests that something woolly and ill-defined is going on; surely x and y are either equal or they are not?


That's everyone's instinctive reaction, but it's completely wrong, and everyone will know this too, if you just give the right examples. Like say you compare people by their age. That's a perfectly fine operation sufficient for sorting, min, max, etc. Now you find two people have the same age. Are they the same person?

Or let's say you're sorting strings case-insensitively. "a" might be neither less than nor greater than "A", meaning they'd be unordered with respect to each other. But that doesn't mean you suddenly wouldn't care if one got duplicated and the other got deleted.

Fundamentally "unordered" just means "neither comes before nor comes after". It doesn't imply "equal to". You can wrap this in a lot of fancy math terminology about partial and total orders, but it's not that complicated conceptually; the idea is just that the lack of an ordering doesn't imply equality, just like how you and your twin aren't the same person. This is one aspect of why C++ is introducing more sophisticated comparisons in C++20 (I would suggest looking it up).


> Like say you compare people by their age. That's a perfectly fine operation for sorting, min, max, etc. Now you find two people have the same age. Are they the same person?

They have the same age, which might be the minimum age of the group. I don't think it makes sense to talk about the minimum person of the group; minimum/maximum imply that it's the thing itself which is ordered.


The reaction to this point seems to have been dismissive (ironic in a thread that started with a complaint about dismissing small distinctions!), but I think the distinction you're making is an important one. In Haskell, it's the difference between `minimum` and `minimumBy`.

https://hackage.haskell.org/package/base-4.12.0.0/docs/GHC-L...

https://hackage.haskell.org/package/base-4.12.0.0/docs/Data-...

The point about strings is different. It doesn't make sense in general to speak of the minimum string in a list, or, more generally, the minimum of a partially ordered set; but it does make perfectly good sense to sort a partially ordered set, with the idea that there are multiple correct sorts when elements are incomparable. That, of course, is where the notion of a stable sort becomes important.


How about a priority heap? It's much more useful to sort things by their min priority, but just telling me the priority at the end isn't helpful. You'll want the thing as well.

(Even more concretely consider a min heap used in pathfinding, where you store the (distance_so_far, (x,y)) pairs, and they sort by distance_so_far.)
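That pathfinding shape, sketched with `heapq` (tuples compare lexicographically, so `distance_so_far` decides priority and the cell rides along; the coordinates shown are arbitrary):

```python
import heapq

frontier = []
heapq.heappush(frontier, (5, (2, 3)))
heapq.heappush(frontier, (2, (0, 1)))
heapq.heappush(frontier, (2, (4, 4)))  # same distance, different cell

# Popping yields the entry with the smallest distance -- and you get the
# cell back too, not just its priority.
dist, cell = heapq.heappop(frontier)
print(dist, cell)  # 2 (0, 1)
```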


> How about a priority heap? It's much more useful to sort things by their min priority, but just telling me the priority at the end isn't helpful. You'll want the thing as well.

Sure, but again, at that point you're not taking the minimum thing, you're taking the thing with the minimum priority. And it wouldn't be strange to say that two things have the same priority (even if they're different things).


> I don't think it makes sense to talk about the minimum person of the group; minimum/maximum imply that it's the thing itself which is ordered.

Uh, it totally makes sense, I just gave you a string example too. The comparator doesn't have to distinguish unequal elements, and a min (and argmin and sort et al.) is perfectly fine and sensible with such a comparator.

Do you code in Python at all? Do you find the fact that 1.0 and 1 are both simultaneously equal in some sense and unequal in another sense to be just complete nonsense? Do you expect min(1.0, 1) to error...? Or for sorted([1.0, 1]) to return [1, 1] just because the elements are "equal" so obviously what right do you have to care which one is returned?


> Uh, it totally makes sense, I just gave you a string example too. The comparator doesn't have to distinguish unequal elements, and a min (and argmin and sort et al.) is perfectly fine and sensible with such a comparator.

Sorting is not the same thing as taking the minimum.

> Do you code in Python at all? Do you find the fact that 1.0 and 1 are both simultaneously equal in some sense and unequal in another sense to be just complete nonsense? Do you expect min(1.0, 1) to error...?

Yes, and this is a large reason why I prefer to work in languages that let me manage such distinctions more carefully.


Not if you're trying to do a stable sort.

If you've got a list of people that is already sorted by name, and you want to sort them by age but still preserve the name sorting for each age (ie, if Alice and Bob are both 30 years old, I still want Alice to appear before Bob in the list), then having a defined way to break ties greatly simplifies the code.


You wouldn't use min/max as part of sorting though.


min and max can also be comparing one part of a more complex object:

    In [1]: l = [(19, "John"), (19, "Jack"), (20, "Jim")]

    In [2]: min(l, key=lambda x: x[0])
    Out[2]: (19, 'John')


0/0 isn't the best example, because every answer leads to a satisfying system—i.e., it'd be perfectly fine to have a multiply valued quotient here; we'd just have to let it infect all the rest of our arithmetic operations. It's 1/0 and other such fractions that have no sensible answer.


I'm a bit confused.

You can make 1/0 return infinity, and preserve some properties of fields. Though that works best, if your zeroes are also signed.

But for 0/0, I don't see nearly as many properties you can rescue.


Whatever value x you assign to 1/0, it had better be true that multiplying it by 0 gives 1; and the properties of a field give that 0x = (0 + 0)x = 0x + 0x, so that 0x = 0. You thus would have to give up either distributivity or additive inverses, which are pretty dear to me!

If, on the other hand, you regard y = 0/0 as a "multi-valued variable" standing for any element of the field, then it behaves perfectly fine, since 0y = 0. The only problem is that it's infectious, so that just about any arithmetic computation involving it, like y + 1 or 2y, also suddenly has to stand for every element of your field. (If you work over a ring, then y + 1 stands for any element and 2y only stands for elements of the ideal generated by 2, etc.) Computations like 0y and y + (-y) both still yield 0.

This is not very useful—at least I can't see anything useful to come of it—but, aside from the infectious multi-valuedness, I don't see what problems result.


  all(xs) and all(ys) == all(xs + ys)
If I understand you correctly, the "and" should be an "implies" instead:

  all(xs) => (all(ys) == all(xs + ys))
Otherwise this law cannot be true since there exist xs for which !all(xs).


I think they just got their operator precedences mixed up. They meant:

    (all(xs) and all(ys)) == all(xs + ys)
This is why you should always use parentheses unless it's absolutely obvious! I certainly had to check the docs to find out which way round it goes. I guess normally you're not dealing with Boolean variables so having == bind tighter makes more sense:

    y is not None and x == 7
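The difference is easy to see with a counterexample where one side is False:

```python
xs, ys = [False], [True]

# == binds tighter than "and", so without parentheses this reads as
# all(xs) and (all(ys) == all(xs + ys)), which short-circuits to False.
unparenthesized = all(xs) and all(ys) == all(xs + ys)
parenthesized = (all(xs) and all(ys)) == all(xs + ys)

print(unparenthesized, parenthesized)  # False True
```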


Yes, indeed! Sorry, I was deviating from the Python order of precedence.


> But that doesn't tell you _why_ that choice is a good idea!

That wasn't necessarily a choice, it may have just been an oversight not to deal with the "empty list" case.

The fact that this blog post exists is a testament to this behavior being surprising.

I don't think any of the arguments - philosophical or otherwise - are very good from a practical standpoint, that is: isn't this behavior more likely to cause harm?

My gut feeling says that this behavior is somewhat unsafe, but then again Python is probably not the language to use when safety is a big concern.

A hypothetical safer language might be better off defining "all/any" only for non-empty lists.


Haskell, the prototypical careful language, defines its equivalents of 'any' and 'all' exactly like Python does. And that's not an accident.

> That wasn't necessarily a choice, it may have just been an oversight not to deal with the "empty list" case.

If your general code gives you an answer for a corner case, it's a good hint that this might be a reasonable answer. Not a guarantee, though.


Functional languages tend to elevate some mathematical ideals before concerns of practical safety, performance or just common sense.


On my cynical days, I'd be inclined to grant the latter two. But do you have an example of the first?


It boils down to the position that side-effects are actually totally fine when they allow you to write a simpler, more straightforward program, where state is easy to observe and debug.

Such code is less likely to contain obfuscated logical errors arising from an attempt to not break "functional purity". It has a chance to be practically more safe.

Another aspect is the way most functional languages treat memory. If you need a garbage collector, you effectively have to throw out hard realtime constraints. Latency and memory usage can be safety concerns. Most functional languages do not let you reason about that.


Thanks for the answer!

Garbage collectors are used in most modern languages. Most implementations of garbage collectors don't give any real time guarantees. And neither do typical free/malloc implementations.

You can do functional programming without garbage collectors. Eg via linear types / uniqueness typing.

But the more general answer here is: whether your tool is appropriate depends on your problem.

Hard real time constraints are rare for most software. Most languages in widespread use don't support those out-of-the-box. Eg C certainly doesn't.

> It boils down to the position that side-effects are actually totally fine when they allow you to write a simpler, more straightforward program, where state is easy to observe and debug.

You know that eg Haskell allows you to write imperative code just fine? It's actually one of the more pleasant languages for imperative programming.

And it's pretty easy to get proponents of FP to admit that imperative programming has its uses. Much easier at least, than to get them to say positive things about Java-style OOP.

> Such code is less likely to contain obfuscated logical errors arising from an attempt to not break "functional purity". It has a chance to be practically more safe.

As I said most functional languages support imperative programming styles just fine. But, of course, your complaint remains more valid if you are redirecting it against certain functional programmers. Languages are tools to be used by people. Some of them more flexible, some more boneheaded.


> Latency and memory usage can be safety concerns. Most functional languages do not let you reason about that.

Most languages of any kind do not let you reason about that. You're not wrong, but neither is this some specific issue with functional languages. Indeed I'd say functional languages are at the forefront of trying to bring in ways to reason about those, with things like Linear Haskell or https://www.microsoft.com/en-us/research/publication/coq-wor...


> Most languages of any kind do not let you reason about that.

I might as well generalize my statement to:

"Programming languages tend to elevate some ideals before concerns of practical safety, performance or just common sense."

It just wouldn't make as much sense in-context.


> "Programming languages tend to elevate some ideals before concerns of practical safety, performance or just common sense."

That statement would be difficult to substantiate, particularly in the context of functional languages. Practical safety, performance, and common sense are absolutely priorities; the ideals that functional language design follows are servants of those goals, not the master.


Maybe it's because I studied mathematics, but to me the `all([])==True` seems so obviously true and logical, and that any other behavior would be extremely surprising, and unsafe, for the opposite reason.


It's pretty obvious from any standpoint.

all([True, True] + []) is equivalent to `True and True and ?`. If you replace ? with False, then all() will never return True for any list, because given any list you can always create an equivalent list by appending [], which would taint the list so that all() always returns False. Obviously nobody wants all() to return False on a valid list, and the concept of tainting a list is pretty stupid, so True is the only option. If you replace ? with True, you can append as many empty lists as you like, which is the desired behavior.

all([True, True] + [] + [] + []) == True and True and True and True and True


> It's pretty obvious from any standpoint.

If it was obvious from any standpoint, we obviously wouldn't be having this blog post or this discussion thread.

Natural language - the natural way to think about things - does not always intuitively translate over to predicate logic. If you disagree with that, you should try teaching it to a class of high schoolers.


Lol, I used to believe that logic was an intrinsic faculty of human reason, and then I taught a bunch of freshmen.


It probably is because you studied mathematics.

If you think of "all" as "in this set, no element exists which is False" then it is "obviously" the right behavior.

However, consider this more practical example:

    x = {1,2,3}
    y = {}

    all(e in x for e in y)
    > True
This doesn't strike me as obviously right.


That's mathematically equivalent to "y is a subset of x", which is obviously right. (Except for the wrinkle that y is a dict, not a set.)
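With actual sets, the generator expression and the subset test agree, empty case included:

```python
x = {1, 2, 3}
y = set()  # note: {} would be an empty dict, not an empty set

# The vacuous truth of all(... for e in y) is exactly "the empty set
# is a subset of every set".
assert all(e in x for e in y) == (y <= x)            # both True
assert all(e in x for e in {2, 3}) == ({2, 3} <= x)  # both True
print(y <= x)  # True
```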


Replying to the sibling comment,

We invented mathematics because practical thinking leaves us without answers at some points. You are free to axiomatically choose "all([]) = false", but then you'll find out that many mathematical equivalences won't hold as they all assume only the ZF axioms.


I don't disagree. Try to mute your "mathematical thinking" for a second and read it as an ordinary person, with a focus on practicality. It just doesn't look right.

If it did look right to me, I wouldn't be sitting here typing this all out. I would be silently agreeing with the mathematical point of view.

That's what tells me this is a likely source of confusion and error.


I simply don’t have the intuition, at all, no matter how I think about it, that it should be false.

Is there any language that works the way you’re suggesting is more practical/intuitive?


I'm not saying it should be false. It's the law of the excluded middle that forces us to make a true/false decision, in which case I can agree that, for mathematical reasons, it should be true.

For practical reasons, it might be better for it to be false. That would require some empirical research on whether this could actually prevent harmful behavior in practice.

From a design perspective, I would argue that "all" should not be defined for empty lists, because that forces people to consider whether the often-forgotten "empty list" edge case might cause an issue.


It's the right choice for practical reasons as well as theoretical/mathematical ones. It means that you generally don't have to special-case the empty list; you can just let it happily do nothing. If you want exceptional behavior then you should probably use some other value (e.g. None).


> It means that you generally don't have to special-case the empty list; you can just let it happily do nothing.

The consequence of "if (True)" is usually to do something, not nothing. The consequence of doing something by overlooking the "empty list" special case may be harmful.

Preventing potential harm is more important than not having the programmer jump through a couple of extra hoops to account for the special case. That makes the special case impossible to overlook.

This kind of reasoning is generally accepted when talking about implicitly nullable types vs explicitly nullable types (optional types), but somehow in this context people are more stubborn.


> This kind of reasoning is generally accepted when talking about implicitly nullable types vs explicitly nullable types (optional types), but somehow in this context people are more stubborn.

Not sure that's the same? Haskell has no implicitly nullable types. Everything not explicitly marked as optional is non-nullable.

But still, if your algorithm at hand can treat nulls the same as non-nulls, it's a good idea to do so. (Eg in a sorting algorithm where you just feed the elements to the user supplied comparison function, and let that function handle comparing of optional elements.)


Can you give a practical example of a case where that answer would actually be wrong for an empty set, though? Like, what are you trying to check, and what would be inside the "if"?


This was definitely not accidental. Here is the commit that introduces `all` and `any`: https://github.com/python/cpython/commit/96229b19181

It includes a test for the empty case: `self.assertEqual(all([]), True)`. They knew what they were doing.


Or - and bear with me for this - they wanted to make sure that future changes stay compatible since clients might rely on this behaviour.


What do you mean by "Epicycles"? Are you suggesting the code is written like this by random?


"Epicycles" may refer to ancient astronomers who started with the assumption that Earth is stationary and objects in the sky circle around it.

When observed planetary motions were found to be incompatible with the assumption, epicycles were introduced to keep the initial assumption working.


Yeah, sorry, I do know what epicycles are. I just don't understand the connection to this issue. It is not like predicate logic was invented to explain the behavior of Python.


I obviously failed to illustrate the connection the author maybe was referring to: it's an after-the-fact rationalisation, just like epicycles were.


I see many complicated explanations for "why all([]) behaves this way" that justify the behavior as good because of logical or mathematical purity. I am implying that these explanations are so long-winded because they're wrong. I think they are good reasons for keeping the behavior as-is, but the code was probably written first, written to be simple, and just happens to produce mathematically consistent behavior at an edge case. It would be interesting to see the history of the function, to see if the implementation has changed.


> I am implying that these explanations are so long-winded because they're wrong.

Are you my Intro to Operating Systems professor? Because he would take points away from answers on the exams for being more long-winded than necessary, even if the answer was correct.

Joking aside, your perspective is bizarre to me. You haven't offered any reasons to propose that all([]) should be False, but reject the given answers for it being True simply on the basis that the explanations are long-winded?


The question isn't "what should all([]) return", it's "why does all([]) return True". I think the answer is "because that's what the simplest implementation returns", not "because someone considered the mathematical implications".


Perhaps it happens to check for any false in the list and otherwise returns true. Sometimes mathematically pure is equivalent to simple and vice versa. The explanations are long because they try to explain the implications and make analogies, which could be infinite.


Another reason we must have `all([]) == True` is that `any([])` should obviously be `False`.

`all(x) == not any([not v for v in x])`

so `all([]) == not any([])` -> `True`
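That De Morgan relationship is easy to spot-check, including the empty case:

```python
def check_duality(xs):
    # all and any are De Morgan duals of each other.
    assert all(xs) == (not any(not v for v in xs))
    assert any(xs) == (not all(not v for v in xs))

for xs in ([], [True], [False], [True, False], [True, True]):
    check_duality(xs)
print("all/any are De Morgan duals, empty list included")
```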


That's a strong hint that those definitions would make our lives easier. But it's not forcing anything?

Also, I don't see why `any([]) == False` would be any more obvious than `all([]) == True`?

I've seen not-so-carefully designed APIs that violate your 'obvious' logic for 'any':

We had a system that you could send a list of topics, and it would return you all the datasets associated with those topics. The designer of the system decided that an empty list of topics would return all datasets, instead of nothing.

I suspect their reasoning was that giving people an option to get all datasets was useful, and that predictably returning nothing was useless.

But what they should have done in this situation for a clean API was to take an optional list of topics.


If you view any and all as reductions

  any([x, y]) == x or any([y]) == x or y or any([])
  all([x, y]) == x and all([y]) == x and y and all([])
then any([]) must be false (otherwise x and y would never matter), and all([]) must be true.
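This reduction view is exactly what `functools.reduce` with an initial value expresses: the initializer is what you get back for the empty list (hypothetical names `my_any`/`my_all`):

```python
from functools import reduce
import operator

def my_any(xs):
    return reduce(operator.or_, xs, False)  # any([]) -> the initializer, False

def my_all(xs):
    return reduce(operator.and_, xs, True)  # all([]) -> the initializer, True

for xs in ([], [True], [False], [True, False]):
    assert my_any(xs) == any(xs) and my_all(xs) == all(xs)
print(my_any([]), my_all([]))  # False True
```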

In the query API you're describing, does {topic1, topic2} find datasets on (topic1 OR topic2), or is it (topic1 AND topic2)? That's what determines whether more topics means more or fewer results, and whether {} should correctly find nothing or everything.


> [...] then any([]) must be false (otherwise x and y would never matter), and all([]) must be true.

Yes, I know. My problem is with:

> Another reason we must have `all([]) == True` is that `any([])` should obviously be `False`.

Argument is of the form 'A, because B'. But I don't see how B is any more obvious than A here. They seem about equally obvious.


`all(x + y) = all(x) and all(y)` even when `x, y` can be the empty list.


The identity element of the conjunction (“and”) monoid is True. Any confusion here seems like a good reason we should teach students functional programming - this is straightforward if you think in terms of folds over algebraic objects.

I can’t actually think of a coherent argument for returning anything besides true.


I'm in your camp. But the best argument I found is:

"This is a confusing situation, and no boolean answer we could give would satisfy the principle of least surprise, so we should throw an error."

I don't think there's a good argument to be made for returning False.

As an aside: I don't think there's anything particularly related to 'functional programming' about the arguments used here. And yet, experience with functional programming tends to make people more careful about finding general principles that give you answers for corner cases like this, too.

The imperative code used in the Python standard library is just as compelling an argument for the Right Answer also being the simplest as any based in FP:

  def all(iterable):
    for element in iterable:
      if not element:
        return False
    return True
In general, when I'm reviewing imperative code, I'm encouraging the authors to find ways to handle corner cases like empty lists organically within their code instead of with a special case check at the top. The organic handling seems to have a higher chance of producing the Right Answer.


> "This is a confusing situation, and no boolean answer we could give would satisfy the principle of least surprise, so we should throw an error."

Common folk tend to disagree about whether zero is an even number. Calculating 0 mod 2 is a confusing situation, and no numeric answer we could give would satisfy the principle of least surprise, so we should throw an error.


Indeed. I didn't say the argument was a good one. Just that it was the best I found.


> no boolean answer we could give would satisfy the principle of least surprise

I think True follows the principle of least surprise.

> I don't think there's anything particularly related to 'functional programming' about the arguments used here

It imparts a better understanding of algebraic abstractions, including what it means to recursively reduce a data structure.

I'm not sure how that python code is supposed to be especially simple or elucidating. I think the following Haskell code is reasonably elucidating, since it elides any need to manually define a special case or base case:

    all = getAll . foldMap All
Where "All" is the conjunction ("and") monoid, as defined in Data.Monoid.


> I think True follows the principle of least surprise.

Me too. I was just quoting the best argument (I found so far) for a different position.

> I'm not sure how that python code is supposed to be especially simple or elucidating. I think the following Haskell code is reasonably elucidating, since it elides any need to manually define a special case or base case:

The Python code wasn't for comparison to any Haskell implementation. If you can read Haskell, you'll likely prefer the Haskell version. But that's not the point.

The point is that amongst all imperative attempts, the most straightforward ones also naturally suggest what to do about the empty list.

Btw, in Haskell notation most normal people's mental model of how 'all' works is probably closer to:

  all = foldl1 (&&)
Most normal people don't think about neutral elements of operations, and that they suggest a natural answer for folds over empty collections.
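Python's `reduce` makes the difference concrete: with no initial value it behaves like `foldl1` and fails on the empty list, while supplying the neutral element handles the empty case naturally (a sketch using `operator.and_` on booleans):

```python
from functools import reduce
from operator import and_

# foldl1-style: no neutral element, so the empty list is an error
try:
    reduce(and_, [])
except TypeError as e:
    print(e)

# fold seeded with the neutral element of "and": the empty list yields True
print(reduce(and_, [], True))  # True
```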


idk why everyone is treating this like it's some complicated mystery. it's defined to be consistent with universal quantification (ie, ∀x) that most of us should have learned in intro to discrete. std::all_of works exactly the same way.


> most of us

Who are "most of us"? People on HN? People who do programming?

I'm a professional programmer, but horrible at math. I don't have any formal education, and even less education in discrete math. So for me this was educational and I thank the author for putting it down in writing and sharing the knowledge.

I'm sure I'm not alone on HN or in the software industry with this.

I'm also sure I'm sitting on bunch of knowledge you have no idea about, but when others share that, I don't shoot them down for it, I'm happy we're all helping each other understand things.


the article itself is a decent intro explanation. I'm mainly talking about all the comments in this thread getting into monoids and abstract algebra when a sufficient explanation can be found in first-year cs material. it's a consequence of how all_of and any_of relate to each other.
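That relationship is De Morgan duality, and it pins down both empty-list values at once; a quick check:

```python
def all_via_any(xs):
    # all(xs) == not any(not x for x in xs); with xs == [], the inner
    # any(...) is False, so all([]) must be True for the duality to hold
    return not any(not x for x in xs)

for xs in ([], [True], [False], [True, False]):
    assert all_via_any(xs) == all(xs)
print(all_via_any([]))  # True
```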


I see. Since you made a top-level comment, it seems you're commenting about the article itself. If you're replying to comments, I suggest you use the reply functionality and it all becomes a little bit clearer for everyone.

I, for one, also like the comments here, as they expand and dive into more topics that, again, "some of us" have not read or heard about before, so I'm grateful for that.


You're not alone, don't worry with the gate keeping here.


> idk why everyone is treating this like it's some complicated mystery. it's defined to be consistent with universal quantification (ie, ∀x) that most of us should have learned in intro to discrete. std::all_of works exactly the same way.

Lewis Carroll didn't know this, and treated universally quantified statements over an empty domain as false. Of course that's wrong; but I think it's over-specialising to claim that this is something obvious, or that everyone knows.

(Also, as others have said, if someone learns from this, then that's great! That's a positive impact, whereas saying "you should have known it already" is purely negative, so why bother?)


It really just goes to show how far you can get without knowing the fundamentals these days.


I really didn't mean this as a gatekeeping comment, but yes, you are right.

I do think most programmers would benefit a lot from learning the fundamentals of logic, or at the very least boolean algebra. after seeing some of the tortured expressions they write, I certainly wish my coworkers were more familiar with boolean simplification rules and identities.


I don't mean it as gatekeeping either. It's just an observation. Although I've clearly offended someone...

I've lost count of the number of times I've factored out the "not" from horrid expressions in my colleagues' code like "not this and not that" using De Morgan's law. Funny thing is I learnt De Morgan's in electronics a long time before I got to computers.

I think the problem is it's "boring" and people get hooked on doing stuff with programming. I can't blame them. But I wish more people remembered the fundamentals.


I suspect the Greeks are being mischaracterized here.

Would they, for example, have considered the statement "all unicorns have a horn" to be false?

In fact, the post mentions the statement "all stonemen are made of stone" which is assumed to be true under the greek logic, but according to the explanation should be false because there are no stonemen.


Your last example should be "all stonemen are made of wood". That would be true under one of the modern interpretations, but false under the Greek one?

Your example ""all stonemen are made of stone" would be true under both modern and supposed Greek interpretations.


I think Boethius would say all stonemen are made of stone is false.


Hmm, I guess I have to re-read.


I believe the consensus literature today says Aristotle was open to indeterminate values (there’s a huge debate about the meaning of his “there will be a sea battle tomorrow” example which is taken to be indeterminate), but Boethius botched the Latin translation and said all unicorns have horns is false. I could be wrong though.


I'd like to note that Aristotle's syllogisms only address a small fragment of predicate logic. They coexisted and competed with the propositional logic of the Stoics. Predicate logic did not achieve its power as we recognize it today until Frege. http://hume.ucdavis.edu/mattey/phi112/history.html


Works as expected I'd say.


Unfortunately, the min and the max of empty python lists are not +INFINITY and -INFINITY, as they should be for similar reasons.


That wouldn't make any sense.

  >>> min(['a', 'b'])
  'a'
  >>> min(['a'])
  'a'
  >>> min([])
  float('-inf')  # ???
An empty list is not a list of numbers. It's not a list of anything.

(You can use type hinting to indicate the element type of an empty list, but type hints are ignored by the interpreter.)


An empty list is a list of anything. If it were not a list of anything, it would not be a list at all.


You could make an artificial min/max of all types object that gets returned on empty lists


min/max(X) is expected to return some Y for which (Y in X) is true.


I'd rather argue that "min" only makes sense for numbers and not for other objects.


No, min makes sense for any set with a total order. Sometimes partial order too.


Right, `min` makes sense for lattices too, not just total orders.


Typically you would use "least" and "greatest" (for a partial order) or "first" and "last" (for a total order). It sounds very strange to talk about "min" and "max" for strings, for example. Do you call the first word in a dictionary the minimal word?

But then again, what can you expect from a language that uses "+", the quintessential notation for commutative operations, for the (very non-commutative) operation of concatenation?

Doubly sad when you realize that the language was designed by an actual mathematician.


> Typically you would use "least" and "greatest" (for a partial order) or "first" and "last" (for a total order).

This is just absolutely not true. It is certainly "min" and "max" for any total order, they definitely don't have to be numbers or even numbers with the usual ordering (as confusing as that would be with a set of numbers but the reverse ordering). "first" and "last"? I've never heard that.

I haven't worked with partial orders so much that I can be as confident about that, perhaps "least" and "greatest" are the correct terms. But "min" and "max" are certainly used sometimes by non-specialists (i.e. professional mathematicians that don't necessarily spend a lot of time with posets) without any confusion.


I was trained as a mathematician and I can confirm that min and max (for minimal and maximal, or minimum and maximum) are mathematically accurate terms. See https://en.wikipedia.org/wiki/Maximal_and_minimal_elements


> I can confirm that min and max (for minimal and maximal, or minimum and maximum) are mathematically accurate terms.

This is true, but they do not have exactly the same meaning. On a poset you do not speak of "the min", but of "a min". You still have "the least", which is a different notion.


Which is why I said "min makes sense for any set with a total order. Sometimes partial order too." There's no ambiguity for total order whatsoever.


> what can you expect from a language that uses "+", the quintessential notation for commutative operations, for the (very non-commutative) operation of concatenation?

Is this any worse than using `+` for floating-point addition, which isn't even associative?


But floating point addition is commutative!

Even if it wasn't, floating point numbers are very often regarded as approximations of real numbers, where addition is associative. String concatenation cannot be construed to have any resemblance of commutativity.


Integers have an (intrinsic & natural) total order. Do characters? I'd say no. We represent them as integers, then expose those underlying integers because it's extremely useful for efficiency purposes, but I'd pretty strongly argue that that's an optimisation only and not an intrinsic to the alphabet.

'a' < 'b' only by accident of history. It could just as easily have been the reverse.


To give some more examples:

Tuples have a reasonable natural total order: just compare component wise. And minimum of a list of tuples makes sense.

You could argue that alphabetic ordering is artificial, but it's hard to argue that it ain't well established and non-surprising.

Interestingly, not all numbers have intuitive orders. Eg complex numbers don't really have a good preferred ordering. And the ordering of p-adic numbers is well defined and unique, but really weird.
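For what it's worth, Python orders tuples lexicographically (the first unequal component decides), which yields a total order whenever the components themselves are comparable:

```python
# lexicographic comparison: first components decide, ties fall through
print((1, 2) < (2, 1))                # True
print((1, 2) < (1, 3))                # True
print(min([(2, 1), (1, 3), (1, 2)]))  # (1, 2)
```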


And Python does reflect this for complex numbers - you'll get an exception if you try to compare two for order.
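A quick check of that behavior:

```python
try:
    (1 + 2j) < (2 + 1j)
except TypeError as e:
    # ordering comparisons between complex numbers raise a TypeError
    print(e)
```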


> Tuples have a reasonable natural total order

No they don't. (1, 2) > (2, 1) = ???

To compare tuples you need a generalisation of comparison which includes 'incomparable' as an outcome. See lattices https://en.wikipedia.org/wiki/Lattice_%28order%29

> but it's hard to argue that [alphabetic ordering] ain't well established

it is, agreed...

> and non-surprising

...agreed again, only because it's a convention you know so well (had it been reversed since your birth you'd say exactly the same thing).

I agree with both points, but that does not invalidate what I said about alphabetic ordering being artificial.


Oh, I think I wasn't careful enough with my language.

I meant that the common ordering on tuples is reasonably natural. Not that it is necessarily the One True Natural Ordering.

About lattices and partial orders: if there's a choice about what to pick as the default ordering for some built-in datatypes for your language and there are multiple reasonable choices, going with the total order seems like a good idea.

To give a counterexample: functions from natural numbers to an orderable set have a well defined, reasonably natural order: just compare them as if they were a list of elements of the orderable set. But that would be a bad idea in a programming language, and it is indeed better to just treat functions as incomparable by default.


I think we can certainly agree on this. Thanks.


The set of tuples over a totally ordered set has a fairly natural total order as long as you accept the concept of precedence.


So evaluate (1, 2) > (2, 1) = ??? and tell me what ??? is, and why. I claim '>' is natural for 1 apple vs 2 apples, it is not natural for tuples unless you, as you say, suppose that the first in the tuple is more 'important' than the second WRT comparing. I mean I can accept that as a handy add-on, but it's not intrinsic - that's why lattices have 'incomparable', for just such a case.


0 and 1 are characters. The language of integers has an ordered 10 character alphabet. This is not unique among languages because it is a useful construct.

Similarly we could have said b < a is true but then it would always be true and a < b would be false and the world would be exactly the same because this is all a made up convention for the benefit of humans.


You're confusing numerals with numbers. Numerals are what we write down: "0" and "1" are numerals, just marks on a sheet of paper or phosphor until interpreted by the human mind.

Numbers are the underlying conceptual thing.


"betaalph" doesn't quite have that ring to it


Also, it's worth noting that in Judaism and some other religions these things have religious significance stretched to them. In this particular example, Aleph is considered to equal 1 and Bet is considered to equal 2.

https://en.wikipedia.org/wiki/Gematria


Dates objects are a good example of non-numbers that have an obviously well-defined order.


True. What's your point?


It's pretty nice to have min work on tuples, so I can make my min-heap take in (distance_so_far, (x,y)) when doing pathfinding.


`min` and `max` operate on semigroups, not on monoids. They don't have to have a default value, and any nicely strongly typed language should prevent them from acting on empty lists.

You could invent a monoid instance of `float` for `<` which includes `inf` as the identity element and, if you ignore NaN, it kinda sorta works in isolation. However, adding infinity breaks a lot of other nice properties of the number system.

For example, the following identity of `min` is valid in the finite numbers.

  n < min A ⟺ (∀ a ∈ A) n < a

The right part of the equation is always true if `A = ∅`, and according to our new monoid the left part is true if and only if `n ≠ ∞`.
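A sketch of that breakage, using `min` with a `default` to stand in for the hypothetical monoidal min (`min_inf` is a made-up name):

```python
import math

def min_inf(xs):
    # hypothetical monoidal min: the empty list yields +inf, the identity
    return min(xs, default=math.inf)

A = []
n = math.inf
left = n < min_inf(A)          # False: inf < inf is False
right = all(n < a for a in A)  # True: vacuously true over the empty set
print(left, right)             # the identity fails exactly at n = inf
```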


The min() function doesn't know the type. The minimum possible value for a list of strings is "", for example.

To get what you want, use min(lst, default=math.inf) (and max(lst, default=-math.inf)).
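The `default` keyword (available since Python 3.4) makes the empty case explicit:

```python
import math

print(min([], default=math.inf))         # inf
print(max([], default=-math.inf))        # -inf
print(min([3, 1, 2], default=math.inf))  # 1 -- default only applies when empty
```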


Agreed.

In Python 2, you had

  float('-inf') < ""
So you could sort of get away with that. But that positive infinity wasn't bigger than any string. And in Python 3 comparing strings and numbers just results in an error.


Max returns a string with an infinite number of U+1FFFFs for an empty collection. True story.


Good point. I was a bit surprised when I found out that Javascript's `Math.min(...[]) ` and `Math.max(...[])` return +Infinity and -Infinity, respectively. But after thinking about it, it makes a lot of sense. It does the right thing when the argument is an empty list.


Maybe the justification is that you do not know the type of an empty python list. But numpy at least could assume that the numbers are floats and give the correct result instead of raising an error (and thus complicating the interface).


> Maybe the justification is that you do not know the type of an empty python list.

Excellent point, thanks!


Interesting point. You got me thinking and opened a can of worms in my mind. I guess True and False are in the set of Booleans, but infinities are, very strongly arguably, not in the domain of numbers. But still, but still.


There are many different "domains" of numbers, and many of them contain infinities. For example, the affinely extended real line, or the IEEE754 floats, are two widely used domains of numbers that contain +inf and -inf.


Those infinities are conceptual only. They don't exist. If infinity were part of the integer domain, then

(infinity + 1) > infinity

would have to hold, which it doesn't. I know about IEEE infs and nans, and neither are numbers, just useful conventions.


I do not understand what your words mean. All numbers are conceptual. None of them "exists", they are all just useful conventions.


> All numbers are conceptual

Agreed, we need to be precise here, so let's de-abstract away from numbers. Which is greater, 3 apples or 8 apples?

Now give me infinity apples. Finite apples exist, infinite apples don't except as an idea. Personally, infinities make me uncomfortable.


The apples example is bad, because while I can agree that 3 apples and 8 apples exist in a very real sense, I would not agree that Graham's number of apples exists any more than an infinity of apples. Also, I can't have pi apples or i apples.

As for mathematical notions of numbers, there are of course mathematics with all kinds of infinite numbers. You have the transfinite ordinal and cardinal numbers, with the famous value 2^Aleph0 (2^the number of natural numbers = the number of real numbers; whether that equals Aleph1 is the continuum hypothesis). You have infinitesimals as well. These are all just as well defined as the naturals and rationals, and exist just as much.


Maybe you should be dubious about Graham's number. It seems like you should always be able to add an apple to a pile of apples and get a pile with more apples in, but that's not actually the case; at some point, well before Graham's number, your pile would collapse into a black hole.


If we start defining numbers by physical quantities ('what really exists') we'll have a pretty hard to work with numerical system. Because MAX_APPLES is of course smaller than MAX_DUST_MITES, so they define two different number systems. And I can probably pretty easily subdivide until maybe a hundredth of an apple, but a hundredth of a dust mite is much less physically plausible (at human scales).

Not to mention that you could say: well sure, I can imagine a trillion trillion apples, but they don't really exist, they're just a fantasy. In reality we only have MAX_REAL_APPLE apples, and damn it, Johnny just ate one so the largest 'really existing' number just went down.

Overall, we can play this game a lot, but the reality is that pi is as real as Aleph0 which is as real as 1 and 2 and 0.0000...1 and +Inf etc.


> If we start defining numbers by physical quantities ('what really exists') we'll have a pretty hard to work with numerical system. Because MAX_APPLES is of course smaller than MAX_DUST_MITES, so they define two different number systems. And I can probably pretty easily subdivide until maybe a hundredth of an apple, but a hundredth of a dust mite is much less physically plausible (at human scales).

Sure. So a simplified model is probably the right tradeoff to make. But we should not forget that it is a tradeoff; by working in a simplified world of notation we run the risk of forming constructions that can't actually be carried back to the real world of counting apples or dust mites. The map is not the territory.

> Overall, we can play this game a lot, but the reality is that pi is as real as Aleph0 which is as real as 1 and 2 and 0.0000...1 and +Inf etc.

"All models are wrong, but some are useful" - but not all models are equally wrong, and not all are equally useful. Including pi gives us a model that's easier to work with, but introduces a certain risk of forming a construction that doesn't translate back to reality. Including infinities brings a substantially bigger version of that risk. It may still be the right choice, but it's a choice that should be made carefully and deliberately; blindly assuming that all mathematical models are appropriate to a given situation is quite unwise.


"3 apples" is just a rough (though often useful!) and very context-dependent abstraction anyway. real-world apples differ in size, so you might want to count a small apple as half an apple; and does a big "twin"/"conjoined" apple count as two apples? at best you're using some arbitrary size treshhold. it gets hairy/edge-casey pretty fast...

i know this is nitpicking, but i think when you really look at it, "infinity apples" isn't that much more of an "idea" than "3 apples". (even if the latter is more useful in day-to-day life)


How do you know that 3 apples is less than 8 apples? How do you know when I give you a basket of apples that it contains 3 apples or 8?


Well, 3 big apples are greater than 8 small apples. If you think otherwise, don't get into business of farm produce retail.


In logic non-existing objects have all properties, including contradictory ones. This follows from the fact that in propositional calculus from a contradiction (or from False) everything can be concluded. I.e., P /\ ~ P -> Q.

One could conclude that the definition of -> is not very logical, but if one is to attach a formal meaning to -> it seems hard to avoid this. One would need a condition like: for P -> Q to hold, P and Q must be related in some way, but that does not sound like something that is easy to formalize.

It also does not matter whether one uses intuitionistic logic. P /\ ~ P -> Q still holds.

In type theory False is the empty type and proving something about it can be done with case analysis. Since it is the empty type there are zero cases and one is immediately done and can prove anything.


This is in fact the logically meaningful thing to do. All must be monotone decreasing in set inclusion.

Other similar definitions are that the min of an empty set is infinity and the max of an empty set is minus infinity. Min is monotone decreasing over set inclusion, hence min of the empty set must be the maximum possible value. Max is monotone increasing in set inclusion, hence max of the empty set must be the smallest possible value.


I think the very same is happening in Scheme. Scheme's functions naturally work on lists: +, *, and, or. Applying them to empty lists gives us:

  > (and)
  #t
  > (or)
  #f
  > (+)
  0
And most surprising (although it's very convenient on second thought):

  > (* )
  1
Thus multiplication on empty list gives 1.


I believe that’s likely acting as a reduction with an initial value. When you sum you start with 0, and when you multiply you start with 1. These are the identity elements for addition and multiplication.

1 * x = x

0 + x = x


The reason it's like this in Python has nothing to do with philosophy and everything to do with algebra. The mathematical terminology sadly escapes me, but essentially for each operation you want there to be a value such that application is a no-op. For addition this is 0, for multiplication 1, for "and" it is True and for "or" it is False. You want the n-ary versions of the operations to return those values for the 0-arity case. So sum([]) = 0, all([]) = True, and prod([]) = 1 (math.prod exists as of Python 3.8). This makes the mathematics very uniform and beautiful.
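All four empty-case identities are visible directly in Python 3.8+ (`math.prod` being the product function mentioned elsewhere in the thread):

```python
import math

print(sum([]))        # 0     -- identity of +
print(math.prod([]))  # 1     -- identity of *
print(all([]))        # True  -- identity of and
print(any([]))        # False -- identity of or
```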

Lispers know this implicitly. You know how they say learning Lisp makes you a better programmer? This is an example of that. Raymond Hettinger and many of the Python authors are Lispers at heart, and this is obvious to them.


The identity element is the term you’re looking for


Yeah. Multiplicative identity, additive identity etc. Not sure what the name of the "andity" identity would be. Maybe just "identity of the and operation".


Logical "and" is also called "conjunction", so you would call it the "conjunctive identity". "False" would be the "disjunctive identity".


The answer: otherwise the duality with the existential quantifier wouldn't work (since ∀x P(x) can be transformed into ¬∃x ¬P(x)). When you use the existential quantifier, it makes sense. I'm not sure how this is relevant to Python specifically; this question is purely in the domain of first-order predicate logic.


Is there any language where this isnt true?


English.


Qualitatively, one is reminded of dead-beat controllers[1].

As a theoretical matter, "all unicorns are blue" is like a transfer function for a controller that can move an elevator from the bottom of the Empire State Building to the top in 0.01 seconds. Connecting words is like solving a math problem.

Trying to locate that unicorn, or bringing that controller out of state space and into reality, one quickly realizes the non-existence of unicorns and that the energy applied to that elevator in order to meet the time requirement would vaporize the equipment.

Requirements and metadata need not meet the realities that apply to data.

[1]https://en.m.wikipedia.org/wiki/Dead-beat_control


That's intriguing, but I'm a bit confused and don't really see the connection. Could you expand?


My point was that as long as we're cranking formulas or stringing words together, much can seem plausible.

Hard reality does not always agree.

Nor do HN moderators, apparently.


They may have simply not understood. I’ve read your post half a dozen times and I’m still not sure what you’re trying to say.


Maybe I just suck like the blue unicorn.


I'm not sure there's any reality involved here?

But when you are building arbitrary systems, some design choices give you simpler and more applicable systems than others.


Shouldn't all() simply throw an exception in case of an empty list?


True is the unit of conjunction, viewed as a monoid. You know math is useful.


In Swift, I would imagine I could define this method as throwing an error in such a case, nixing the debate entirely.

Why Swift specifically? This would be in the calling signature, implying there are inputs you can give said method that are nonsensical.


You'd just open another debate.

Your client code has to work with the behaviour of your function.

If you have a perfectly sensible answer for a corner case, just give that answer. Otherwise, you are forcing unnecessary extra case handling on your client code.

And that principle also tells you which answer your code should be giving: whatever makes the client code simpler.

Another example, perhaps more inspired from C conventions:

Imagine you have some functions that take a timeout as an argument. After the time is up, they should fail, if they haven't produced an answer yet.

Now, in these kinds of situations you often find that people choose 0 to mean an infinite timeout.

That's a bad idea, because it violates simple invariants like: if you run two statements with timeouts of t0 and t1, after at most t0 + t1 seconds, you should either have an answer or get an error.

(You can of course complicate the invariant by adding special cases for t0 or t1 being zero. But who likes special cases?)


This is in no way Python-specific.


Monoid morphism.


Maybe it should throw an exception


Why? All unicorns are blue, aren't they?


I’ve seen a lot of experimental evidence for that. "All unicorns are blue" is logically equivalent to "all non-blue things are not unicorns".

Over the past few decades I’ve seen tens of thousands of non-blue things, and none of them were unicorns!

https://en.wikipedia.org/wiki/Raven_paradox


Contrapositives are one of my favorite things in simple logic to demonstrate the difference between rigorous logic and our human heuristic logic. Mathematically, contrapositives are identical to the original statement, but extremely frequently, if not almost always, they sound different to us humans.

If more people understood them, there's a number of dumb internet debates that could finally die, where X -> Y isn't immediately obviously wrong to our human brains and may even sound correct but !Y -> !X is immediately obviously wrong.


> All unicorns are blue, aren't they?

Indeed they aren't! All unicorns are not blue.


True


All unicorns are blue, indeed.

The surprising thing is that all unicorns are red, too.


I'm not sure, but all unicorns I've seen have been blue.


all([]) returns True but any([]) returns False, which seems like a contradiction.


It follows from exactly the same logic. The following are theorems, at least as long as x and y are lists of Booleans:

    all(x + [True]) == all(x)
    any(x + [False]) == any(x)
    all(x + y) == (all(x) and all(y))
    any(x + y) == (any(x) or any(y))
    (not all(x)) == any(not xi for xi in x)
    (not any(x)) == all(not xi for xi in x)
Any other values for all([]) and any([]) would cause some or all of these theorems to be correct for all values of x and y except the empty list. In practice this would mean you would have a bug in your program on most occasions where the argument of one of these functions was empty, unless you included a special case for empty sequences.
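These identities can be spot-checked exhaustively over short boolean lists, including the empty one:

```python
from itertools import product

# every boolean list of length 0, 1, or 2
lists = [list(bits) for n in range(3)
         for bits in product([True, False], repeat=n)]

for x in lists:
    assert (not all(x)) == any(not xi for xi in x)
    assert (not any(x)) == all(not xi for xi in x)
    for y in lists:
        assert all(x + y) == (all(x) and all(y))
        assert any(x + y) == (any(x) or any(y))

print("all identities hold, including for []")
```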

For example, suppose you're processing a request that includes some tasks and some attachments. You might write something like this:

    if any(task.is_failed() for task in request.tasks):
        raise TaskFailed(request)

    if any(x.is_too_large() for x in request.attachments):
        raise AttachmentTooLarge(request)
Clearly for this code to do the right thing in the case where a request has tasks or attachments but not both, we need any([]) to be false. And the corresponding requirement applies to all([]) if our task interface is a little different:

    if not all(task.is_succeeded() for task in request.tasks):
        raise TaskFailed(request)
Now, there are occasionally programs where you do need a special case for empty sequences; for example, if you have three search results, you might want to return a page containing just those results, but if you have zero, you probably don't want to just return an empty page — the user might think the software is just broken. And in the above example, if there are neither any tasks nor any attachments, it might or might not be useful to report that the user accidentally submitted an empty request. But it is unusual for such a special case to just fall out of a possible alternative semantics of all() and any().


The way I think about it is the base case of a recursive function to evaluate the two. In order to preserve folding Boolean AND across a list your base case must be true. Folding Boolean OR across a list only works if your base case is false.
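Sketching that recursion in Python (illustrative only; the builtins are iterative):

```python
def all_rec(xs):
    # base case is the identity of "and": all_rec([]) must be True
    if not xs:
        return True
    return bool(xs[0]) and all_rec(xs[1:])

def any_rec(xs):
    # base case is the identity of "or": any_rec([]) must be False
    if not xs:
        return False
    return bool(xs[0]) or any_rec(xs[1:])

print(all_rec([]), any_rec([]))  # True False
```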


False is the identity element for Or, similar to how Zero is the identity element for addition.

True is the identity element for And, similar to how One is the identity element for multiplication.


Hardly. As described in the article `all([])` is equivalent to `∀x Fx` or "all unicorns are blue". `any([])` is equivalent to `∃x Fx` or "a blue unicorn exists".


How so? Any means "at least one is true" and all means "all in the array are true".


all: "there is none in the array that is false".


>> all means "all in the array are true"

> all: "there is none in the array that is false"

You... are aware that those statements are exactly the same, right?


They're not. In fact, that's what this entire article is about.


I don't know what to say to this. They are exactly the same. It's not a controversial point. You could write an article on the theme "why apples fall upward in the winter", but that wouldn't make apples fall upward in the winter.

See e.g. https://en.wikipedia.org/wiki/Universal_quantification#Negat...


> It's not a controversial point

It's not important to the discussion at hand, but I enjoyed your wording here, because even though you're right, in a sense this is one of the most controversial points in mathematics: https://en.m.wikipedia.org/wiki/Constructivism_(philosophy_o...


> It's not a controversial point.

I wouldn't have thought so personally, but alas, here we are ;)


Perhaps you two are not using precise enough language?

The common definitions for 'all' and 'any' as used in math are certainly on thaumasiotes' side. But English as a natural, every-day language is a bit more fuzzy.


Correct. That is the point this article is making.

"all in the array are true" is not specific about how the empty case ought to be interpreted. That's not at all to say all([]) ought to be False; the wording is simply under-specified.

Whereas "there is none in the array that is false" is clearly True when all([]) is encountered.

Thus the statement cannot be simplified to the English statement "all in the array are true", without introducing ambiguities.

As a general rule, this is why we don't perform mathematics (or programming) in English.

EDIT: Just to clarify why this is ambiguous, in case anyone reading is still not following (although I'd suggest reading the article!)

English is an (informally) consensus-based language. Understanding boolean logic is not a prerequisite for interpreting common language, in particular English. Just because one has knowledge of logic (mathematics) and we happen to have given some operation the English name "all", that does not mean the English word "all" in general speech refers to this logical operation. Especially since, as the article describes, the mathematical consensus has only largely shifted to this interpretation in the last several hundred years. The English word "all" predates the current mathematical consensus (which isn't even universal) by thousands of years.

Additionally, if for some reason you are interpreting the written word "all" (no matter the context) as the logical operation, then "all in the array are true" is not a viable definition for this word, it's self-referential and thus ambiguous. "there is none in the array that is false" on the other hand is clear in its meaning (if not slightly grammatically incorrect).


> The English word "all" predates the current mathematical consensus (which isn't even universal) by thousands of years.

This is at the edge of possibility, but it's certainly not a slam dunk. "All" is not an Indo-European word; it's restricted to Germanic languages.

> Just because one has knowledge of logic (mathematics) and we happen to have given some operation the English name "all", that does not mean the English word "all" in general speech refers to this logical operation.

This isn't a good argument; all you and the article are saying is that people have vocally objected to the meaning of "all" for a long time. They still do. But this isn't sufficient to claim that the objections reflect a disagreement over the meaning of "all". If you carefully explain to people how you want them to interpret sentences like "every cat is in a box", and then you show them pictures and ask them whether or not every cat is in a box, most people will screw it up. That's not because they didn't understand you; it's because they are very bad at the task.

It is quite clear that the vernacular meaning of the English word "all" coincides with the formal meaning of the logical universal quantifier. You can poll any number of people and they'll give you a definition that just so happens to match the logical quantifier. People don't object to all([]) being true after thinking about what "all" means to them -- they object before thinking about it.


If you poll people asking for a definition of "all", you're doing it wrong. Instead, you should ask them to use the word to describe various cases, and derive the actual definition from that.

I doubt it'll match formal logic.


Mostly agreed about the vernacular. Though a minor nitpick: mathematics has been around for longer than English.

As for eg Python: when in doubt, going with the mathematical usage of the term is the Right Choice here.

For another interesting example: the negation of must and may in English vs their German equivalents.

In English 'You must not do X.' means that refraining from X is obligatory. In German the equivalent 'Du musst nicht X-en.' means that engaging in X is not obligatory.

Both usages make sense in their respective languages. And both usages have translations into formal logic. Just different ones.


They are not. If there are zero entries, the first one isn't true.


This is also consistent with Boolean algebra where the empty sum is 0 (any), and the empty product is 1 (all).


I think of it as: the minimum of an empty set is +inf as the min of a union must be the min of the mins. Similarly, the max of empty must be -inf.


Those will get you the right result if you use them as an initial value in a fold and the set is not empty. They're not correct on their own; an empty set has no minimum or maximum.


It depends. You _can_ define minimum and maximum over empty sets exactly as described and get a consistent mathematical system.

But eg Python's minimum and maximum functions don't do that. They throw an error by default.

In mathematical terms, if you have a lattice https://en.wikipedia.org/wiki/Lattice_(order) you can always just arbitrarily introduce a least and greatest element, and all the relevant laws stay valid.

Whether that's a good idea for your application or not, is a different question.

In programming, types often give you trouble. Eg the integers don't have least or greatest element, but you want your minimum function over list of integers to return an integer. So throwing an error is the only choice, if you can't represent something like minus infinity in your return type.
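For what it's worth, CPython exposes both behaviours: `min`/`max` raise `ValueError` on an empty iterable, but since Python 3.4 they accept a `default` keyword, so a caller can opt into the lattice-style answer explicitly:

```python
# min()/max() raise ValueError on an empty iterable by default.
try:
    min([])
    raised = False
except ValueError:
    raised = True
assert raised

# The `default` keyword (Python 3.4+) lets the caller supply the
# identity element of the min/max monoid explicitly.
assert min([], default=float("inf")) == float("inf")
assert max([], default=float("-inf")) == float("-inf")

# `default` is ignored when the iterable is non-empty.
assert min([3, 1, 2], default=float("inf")) == 1
```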


> You _can_ define minimum and maximum over empty sets exactly as described and get a consistent mathematical system.

You can define infimum and supremum that way. But I was under the impression that wherever the concept of a "minimum element" of a set is used, the minimum element of any set must be an element of that set.

So e.g. the positive reals have an infimum, 0, but no minimum. The page you link doesn't do much to dispute this idea; it uses "minimum" and "maximum" to talk about the largest and smallest elements of the lattice, but never to talk about a property that a subset of the lattice might have.


I guess you are right.

Of course, the Python function named 'min' could very well implement an infimum in this case, and it would be OK.

In general, there are some conventions, like using the words 'minimum' vs 'infimum', or stressing the property 'the result of min should be an element of the input set' vs 'min on lists should behave like repeated application of min on two arguments, starting with the neutral element'.

The right choice depends on context. In lots of programming contexts, I find stressing the connection with the minimum monoid useful.


> Of course, the Python function named 'min' could very well implement an infimum in this case, and it would be OK.

Using min(∅) = +inf and max(∅) = -inf would violate the otherwise valid constraint that min(x) ≤ max(x). I'd be a little uncomfortable with it for that reason.

> The right choice depends on context.

Definitely.


> Using min(∅) = +inf and max(∅) = -inf would violate the otherwise valid constraint that min(x) ≤ max(x). I'd be a little uncomfortable with it for that reason.

Excellent point!

Funny enough, we have the same with 'any' and 'all':

For most lists of booleans: all(l) ≤ any(l). Or equivalently all(l) implies any(l).

But that property is broken for empty lists.
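A quick Python check of that broken implication:

```python
# For non-empty lists of booleans, all(l) implies any(l)...
for l in ([True], [True, False], [False, False]):
    assert (not all(l)) or any(l)   # all(l) => any(l)

# ...but the implication fails for the empty list:
assert all([]) is True
assert any([]) is False
```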

Hmm, I think I might understand the case for `all([]) == False` a bit better now. Though I still disagree with it.


Read up on DeMorgan’s law.


De Morgan's law isn't the right one for this. You want the identity ∀x.P(x) ⟺ ¬∃x.¬P(x).

∀x.P(x) is obviously true when the universe is empty -- no counterexamples exist. ∃x.P(x) is obviously not true in that case -- no examples exist.
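The identity, including its behaviour over an empty universe, can be checked directly in Python (`forall`/`exists` are hypothetical helper names, using the article's blue-unicorn example):

```python
def forall(pred, xs):
    # ∀x.P(x) — vacuously True over an empty universe.
    return all(pred(x) for x in xs)

def exists(pred, xs):
    # ∃x.P(x) — False over an empty universe.
    return any(pred(x) for x in xs)

def blue(unicorn):
    return unicorn == "blue"

# ∀x.P(x) ⟺ ¬∃x.¬P(x), even when the universe is empty.
for universe in ([], ["blue"], ["blue", "red"]):
    assert forall(blue, universe) == (
        not exists(lambda x: not blue(x), universe)
    )

assert forall(blue, []) is True    # all unicorns are blue
assert exists(blue, []) is False   # no blue unicorn exists
```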


That is De Morgan if you expand the quantifiers.


You have to make some assumptions if you want to "expand the quantifiers". In particular, you have to assume that the universe is non-empty, which is a really bad idea if you're commenting in this thread.

