An alternative way of viewing this is: find the recursive definition of that function, and think of its base case. For sum, it is something like sum([Head:Tail]) = Head + sum(Tail), which works only if sum([]) is 0. For prod, we have prod([Head:Tail]) = Head * prod(Tail), which again works only if prod([]) is 1. And for all, we have all([Head:Tail]) = Head && all(Tail), which again works only if all([]) is True.
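Those recursive definitions and their forced base cases can be sketched in Python (the `rec_*` names are just hypothetical illustrations):

```python
# Recursive definitions: the base case on the empty list is forced
# to be the identity element of each binary operation.
def rec_sum(xs):
    return 0 if not xs else xs[0] + rec_sum(xs[1:])       # sum([]) must be 0

def rec_prod(xs):
    return 1 if not xs else xs[0] * rec_prod(xs[1:])      # prod([]) must be 1

def rec_all(xs):
    return True if not xs else xs[0] and rec_all(xs[1:])  # all([]) must be True
```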
I like the algebraic point of view as a way of spotting essential patterns like this :) The essential missing step here is to point out that conjunction is multiplicative, not additive, and this is why `all` gives the unit under conjunction, rather than the unit under disjunction. Corresponding to `sum` is `any`, which gives the unit under disjunction for the empty list, since disjunction is additive.
That said, I do think there is something to the perspective presented in the article. Something which is not accidental to the algebraization of logic...
There's no need to think in terms of rings here ((+,*) and (|,&) together) - Monoids are precisely the most powerful algebraic object we need to consider here (only one binary operation at a time). All of these binary operations form a monoid.
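That "one binary operation plus its identity" structure is exactly what a fold needs. A minimal sketch in Python with `functools.reduce` (the `mfold`/`my_*` names are made up for illustration):

```python
from functools import reduce
import operator

# A monoid is (binary operation, identity element); folding any list,
# including the empty one, needs exactly those two pieces.
def mfold(op, identity, xs):
    return reduce(op, xs, identity)

# The same fold gives sum, prod, all, any -- only the monoid changes:
my_sum  = lambda xs: mfold(operator.add, 0, xs)
my_prod = lambda xs: mfold(operator.mul, 1, xs)
my_all  = lambda xs: mfold(lambda a, b: a and b, True, xs)
my_any  = lambda xs: mfold(lambda a, b: a or b, False, xs)
```

On the empty list each fold returns its identity, which is where `all([]) == True` and `any([]) == False` come from.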
I was going to comment this, but you got here before me. So, I would just like to ramble about type theory and constructive logic a bit.
According to the Curry-Howard correspondence and the BHK interpretation of logic, there is a syntactic correspondence between types and logical formulas and between terms and proofs.
Here, the unit type corresponds with truth because it is trivially inhabited with a single inhabitant. The product type A * B is interpreted as logical conjunction because you need a proof of A and a proof of B to inhabit it.
The unit type (true) is the identity of the product type (and). Unit * A = A * Unit = A from an isomorphism perspective. From the category theory point of view, a Cartesian monoidal category is a monoidal category where the tensor product is the categorical product and the unit object is the terminal object. It's no coincidence that the terminal object is written as 1.
So, an n-ary sequence of propositions connected by ANDs is like an n-ary tuple type, and an empty sequence of propositions connected by ANDs is like the nullary tuple type, or in other words the trivially inhabited unit type.
I was going to write unit in my comment, but unit is the wrong word. The word unit applies only to multiplication. In the natural numbers there is only one unit: 1. Some other rings have multiple units: the integers have 1 and -1, and in the rationals every nonzero element is a unit.
It works the same for prod, all, and any, and gives a hint as to what the answer should be if there were things like union(sets) (the empty set) or intersection(sets) (the universe, or union of all sets)
Edit: could have picked better examples too imo... I'd have used 1 + 0 = 1 and 2 × 1 = 2... basically you add 1s for free when multiplying, 0s for free when adding, and Trues for free when "all"-ing.
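The union/intersection hint above works out the same way in Python, except that for intersection there is no finite "universe" to use as an identity, so the empty case has to be left undefined. A sketch (hypothetical function names):

```python
from functools import reduce

def union_all(sets):
    # The identity for union is the empty set, so union_all([]) is well defined.
    return reduce(set.union, sets, set())

def intersect_all(sets):
    # There is no "set of everything" to serve as an identity here,
    # so the empty case raises instead of guessing.
    if not sets:
        raise ValueError("intersection of no sets has no identity element")
    return reduce(set.intersection, sets)
```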
I may be blinkered by my mathematical training, but to me this seems like the obvious thing to do. Here's an attempt to justify it based on common sense, no math stuff required. Suppose you do something like
if(all(requirements)) { proceed; }
If there are no requirements that need to be satisfied, then you should be able to proceed.
Another way to make it seem more intuitive: "all of these are true" is the same as "none of these are false" (assuming values must be booleans), and clearly nothing in an empty list is false.
Yes, precisely the point of the article. We mostly find it natural to think like moderns today, but civilization went along for thousands of years thinking differently.
But that doesn't tell you _why_ that choice is a good idea!
My favourite explanation comes from property based testing.
In general, you have the following property: For any lists xs and ys,
all(xs) and all(ys) == all(xs + ys)
If you want to keep that property as simple as possible, you have to define
all([]) == True
If you want to go with the opposite, your property becomes more complicated.
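That property (with the parentheses made explicit) can be brute-force checked over all small boolean lists:

```python
from itertools import product

# Check: (all(xs) and all(ys)) == all(xs + ys) for all boolean lists
# up to length 3. This only holds unconditionally because all([]) is True.
for n in range(4):
    for m in range(4):
        for xs in product([True, False], repeat=n):
            for ys in product([True, False], repeat=m):
                xs, ys = list(xs), list(ys)
                assert (all(xs) and all(ys)) == all(xs + ys)
```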
Mathematical definitions are somewhat arbitrary. It's up to us to choose definitions that make sense in our contexts. A general desire to make the resulting systems simple and uniform has worked out well in many, many contexts.
Slight detour: that's also why defining 2 - 5 to have an answer is a good idea! Negative numbers were seen as a bit suspect, but they behave perfectly sensibly. Same for complex numbers.
But: still we generally leave 0 / 0 undefined, because no answer would lead to a satisfying system.
I love this. But I feel like there are people (like you and me) who find an explanation like this awesome and compelling, and then there are others who have a hard time getting on the same wavelength, let alone to try to agree. Like for example, a similar sort of reasoning leads to min(x, y) breaking ties in favor of x, but max(x, y) breaking ties in favor of y, e.g. because [min(x, y), max(x, y)] should stably sort the objects. (There were discussions of how C++'s definition is broken a while back, if you Google hard enough.)
But whenever I've tried to explain things like this to people, I've found so many just don't seem to see the issue, or to remotely care if they do. Their intuitive responses tend to be:
- It's an obscure issue, it's never gonna come up.
- Oh, well I mean it's obviously your fault for calling it that way with that implicit assumption. Why would you think that?
- Well it's the caller's responsibility to read the documentation/look at the source/test the code before they use this function, not my problem to try to fit it into every mathematically possible use case.
I feel like the notion of putting some more thought into what seems like an elementary API so that these issues don't come up and trip up users in the first place seems like a completely unjustifiable academic effort to them, if not an outright foreign one. Have you found any effective way to try to communicate stuff like this and hopefully actually convince people?
The very concept of "breaking ties" with min/max suggests that something woolly and ill-defined is going on; surely x and y are either equal or they are not?
That's everyone's instinctive reaction, but it's completely wrong, and everyone will know this too, if you just give the right examples. Like say you compare people by their age. That's a perfectly fine operation sufficient for sorting, min, max, etc. Now you find two people have the same age. Are they the same person?
Or let's say you're sorting strings case-insensitively. "a" might be neither less than nor greater than "A", meaning they'd be unordered with respect to each other. But that doesn't mean you suddenly wouldn't care if one got duplicated and the other got deleted.
Fundamentally "unordered" just means "neither comes before nor comes after". It doesn't imply "equal to". You can wrap this in a lot of fancy math terminology about partial and total orders, but it's not that complicated conceptually; the idea is just that the lack of an ordering doesn't imply equality, just like how you and your twin aren't the same person. This is one aspect of why C++ is introducing more sophisticated comparisons in C++20 (I would suggest looking it up).
> Like say you compare people by their age. That's a perfectly fine operation for sorting, min, max, etc. Now you find two people have the same age. Are they the same person?
They have the same age, which might be the minimum age of the group. I don't think it makes sense to talk about the minimum person of the group; minimum/maximum imply that it's the thing itself which is ordered.
The reaction to this point seems to have been dismissive (ironic in a thread that started with a complaint about dismissing small distinctions!), but I think the distinction you're making is an important one. In Haskell, it's the difference between `minimum` and `minimumBy`.
The point about strings is different. It doesn't make sense in general to speak of the minimum string in a list, or, more generally, the minimum of a partially ordered set; but it does make perfectly good sense to sort a partially ordered set, with the idea that there are multiple correct sorts when elements are incomparable. That, of course, is where the notion of a stable sort becomes important.
How about a priority heap? It's much more useful to sort things by their min priority, but just telling me the priority at the end isn't helpful. You'll want the thing as well.
(Even more concretely consider a min heap used in pathfinding, where you store the (distance_so_far, (x,y)) pairs, and they sort by distance_so_far.)
> How about a priority heap? It's much more useful to sort things by their min priority, but just telling me the priority at the end isn't helpful. You'll want the thing as well.
Sure, but again, at that point you're not taking the minimum thing, you're taking the thing with the minimum priority. And it wouldn't be strange to say that two things have the same priority (even if they're different things).
> I don't think it makes sense to talk about the minimum person of the group; minimum/maximum imply that it's the thing itself which is ordered.
Uh, it totally makes sense, I just gave you a string example too. The comparator doesn't have to distinguish unequal elements, and a min (and argmin and sort et al.) is perfectly fine and sensible with such a comparator.
Do you code in Python at all? Do you find the fact that 1.0 and 1 are both simultaneously equal in some sense and unequal in another sense to be just complete nonsense? Do you expect min(1.0, 1) to error...? Or for sorted([1.0, 1]) to return [1, 1] just because the elements are "equal" so obviously what right do you have to care which one is returned?
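For what it's worth, here's what CPython actually does with those calls (a quick check, not an endorsement):

```python
# 1 == 1.0 is True, yet the two objects are distinguishable by type.
assert 1 == 1.0

# min does not error on "equal" arguments; it keeps the first one it saw.
assert type(min(1.0, 1)) is float
assert type(min(1, 1.0)) is int

# sorted is stable: "equal" elements keep their original order,
# so neither gets duplicated or dropped.
vals = [1.0, 1]
assert type(sorted(vals)[0]) is float
assert type(sorted(vals)[1]) is int
```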
> Uh, it totally makes sense, I just gave you a string example too. The comparator doesn't have to distinguish unequal elements, and a min (and argmin and sort et al.) is perfectly fine and sensible with such a comparator.
Sorting is not the same thing as taking the minimum.
> Do you code in Python at all? Do you find the fact that 1.0 and 1 are both simultaneously equal in some sense and unequal in another sense to be just complete nonsense? Do you expect min(1.0, 1) to error...?
Yes, and this is a large reason why I prefer to work in languages that let me manage such distinctions more carefully.
If you've got a list of people that is already sorted by name, and you want to sort them by age but still preserve the name sorting for each age (ie, if Alice and Bob are both 30 years old, I still want Alice to appear before Bob in the list), then having a defined way to break ties greatly simplifies the code.
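Because Python's `sorted` is stable, this works directly (the names and ages here are just made up for illustration):

```python
people = [("Bob", 30), ("Alice", 30), ("Carol", 25)]

by_name = sorted(people)                       # Alice, Bob, Carol
by_age = sorted(by_name, key=lambda p: p[1])   # stable: ties keep name order

# Alice and Bob are both 30, and Alice still comes first.
assert by_age == [("Carol", 25), ("Alice", 30), ("Bob", 30)]
```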
0/0 isn't the best example, because every answer leads to a satisfying system—i.e., it'd be perfectly fine to have a multi-valued quotient here; we'd just have to let it infect all the rest of our arithmetic operations. It's 1/0 and other such fractions that have no sensible answer.
Whatever value x you assign to 1/0, it had better be true that multiplying it by 0 gives 1; and the properties of a field give that 0x = (0 + 0)x = 0x + 0x, so that 0x = 0. You thus would have to give up either distributivity or additive inverses, which are pretty dear to me!
If, on the other hand, you regard y = 0/0 as a "multi-valued variable" standing for any element of the field, then it behaves perfectly fine, since 0y = 0. The only problem is that it's infectious, so that just about any arithmetic computation involving it, like y + 1 or 2y, also suddenly has to stand for every element of your field. (If you work over a ring, then y + 1 stands for any element and 2y only stands for elements of the ideal generated by 2, etc.) Computations like 0y and y + (-y) both still yield 0.
This is not very useful—at least I can't see anything useful to come of it—but, aside from the infectious multi-valuedness, I don't see what problems result.
I think they just got their operator precedences mixed up. They meant:
(all(xs) and all(ys)) == all(xs + ys)
This is why you should always use parentheses unless it's absolutely obvious! I certainly had to check the docs to find out which way round it goes. I guess normally you're not dealing with Boolean variables, so having == bind tighter makes more sense.
> But that doesn't tell you _why_ that choice is a good idea!
That wasn't necessarily a choice, it may have just been an oversight not to deal with the "empty list" case.
The fact that this blog post exists is a testament to this behavior being surprising.
I don't think any of the arguments - philosophical or otherwise - are very good from a practical standpoint. That is: isn't this behavior more likely to cause harm?
My gut feeling says that this behavior is somewhat unsafe, but then again Python is probably not the language to use when safety is a big concern.
A hypothetical safer language might be better off to define "all/any" only for non-empty list.
It boils down to the position that side-effects are actually totally fine when they allow you to write a simpler, more straightforward program, where state is easy to observe and debug.
Such code is less likely to contain obfuscated logical errors arising from an attempt to not break "functional purity". It has a chance to be practically more safe.
Another aspect is the way most functional languages treat memory. If you need a garbage collector, you effectively have to throw out hard realtime constraints. Latency and memory usage can be safety concerns. Most functional languages do not let you reason about that.
Garbage collectors are used in most modern languages. Most implementations of garbage collectors don't give any real time guarantees. And neither do typical free/malloc implementations.
You can do functional programming without garbage collectors. Eg via linear types / uniqueness typing.
But the more general answer here is: whether your tool is appropriate depends on your problem.
Hard real time constraints are rare for most software. Most languages in widespread use don't support those out-of-the-box. Eg C certainly doesn't.
> It boils down to the position that side-effects are actually totally fine when they allow you to write a simpler, more straightforward program, where state is easy to observe and debug.
You know that eg Haskell allows you to write imperative code just fine? It's actually one of the more pleasant languages for imperative programming.
And it's pretty easy to get proponents of FP to admit that imperative programming has its uses. Much easier at least, than to get them to say positive things about Java-style OOP.
> Such code is less likely to contain obfuscated logical errors arising from an attempt to not break "functional purity". It has a chance to be practically more safe.
As I said most functional languages support imperative programming styles just fine. But, of course, your complaint remains more valid if you are redirecting it against certain functional programmers. Languages are tools to be used by people. Some of them more flexible, some more boneheaded.
> Latency and memory usage can be safety concerns. Most functional languages do not let you reason about that.
Most languages of any kind do not let you reason about that. You're not wrong, but neither is this some specific issue with functional languages. Indeed I'd say functional languages are at the forefront of trying to bring in ways to reason about those, with things like Linear Haskell or https://www.microsoft.com/en-us/research/publication/coq-wor...
> "Programming languages tend to elevate some ideals before concerns of practical safety, performance or just common sense."
That statement would be difficult to substantiate, particularly in the context of functional languages. Practical safety, performance, and common sense are absolutely priorities; the ideals that functional language design follows are servants of those goals, not the master.
Maybe it's because I studied mathematics, but to me the `all([])==True` seems so obviously true and logical, and that any other behavior would be extremely surprising, and unsafe, for the opposite reason.
all([True, True] + []) is equivalent to `True and True and ?`. If you replace ? with False, then all() can never return True: given any list, you can always create an equivalent list by appending [], which would taint it into always returning False. Obviously nobody wants all() to return False on a valid list, and the concept of tainting a list is pretty stupid, so True is the only option. If you replace ? with True, you can append as many empty lists as you like, which is the desired behavior.
all([True, True] + [] + [] + []) == True and True and True and True and True
If it was obvious from any standpoint, we obviously wouldn't be having this blog post or this discussion thread.
Natural language - the natural way to think about things - does not always intuitively translate over to predicate logic. If you disagree with that, you should try teaching it to a class of high schoolers.
We invented mathematics because practical thinking leaves us without answers at some points. You are free to axiomatically choose "all([]) = false", but then you'll find out that many mathematical equivalences won't hold as they all assume only the ZF axioms.
I don't disagree. Try to mute your "mathematical thinking" for a second and read it as an ordinary person, with a focus on practicality. It just doesn't look right.
If it did look right to me, I wouldn't be sitting here typing this all out. I would be silently agreeing with the mathematical point of view.
That's what tells me this is a likely source of confusion and error.
I'm not saying it should be false. It's the law of the excluded middle that forces us to make a true/false decision, in which case I can agree that, for mathematical reasons, it should be true.
For practical reasons, it might be better for it to be false. That would require some empirical research on whether this could actually prevent harmful behavior in practice.
From a design perspective, I would argue that "all" should not be defined for empty lists, because that forces people to consider whether the often-forgotten "empty list" edge case might cause an issue.
It's the right choice for practical reasons as well as theoretical/mathematical ones. It means that you generally don't have to special-case the empty list; you can just let it happily do nothing. If you want exceptional behavior then you should probably use some other value (e.g. None)
> It means that you generally don't have to special-case the empty list; you can just let it happily do nothing.
The consequence of "if (True)" is usually to do something, not nothing. The consequence of doing something by overlooking the "empty list" special case may be harmful.
Preventing potential harm is more important than not having the programmer jump through a couple of extra hoops to account for the special case. That makes the special case impossible to overlook.
This kind of reasoning is generally accepted when talking about implicitly nullable types vs explicitly nullable types (optional types), but somehow in this context people are more stubborn.
> This kind of reasoning is generally accepted when talking about implicitly nullable types vs explicitly nullable types (optional types), but somehow in this context people are more stubborn.
Not sure that's the same? Haskell has no implicitly nullable types. Everything not explicitly marked as optional is non-nullable.
But still, if your algorithm at hand can treat nulls the same as non-nulls, it's a good idea to do so. (Eg in a sorting algorithm where you just feed the elements to the user supplied comparison function, and let that function handle comparing of optional elements.)
Can you give a practical example of a case where that answer would actually be wrong for an empty set, though? Like, what are you trying to check, and what would be inside the "if"?
Yeah, sorry, I do know what epicycles are. I just don't understand the connection to this issue. It is not like predicate logic was invented to explain the behavior of Python.
I see many complicated explanations for "why all([]) behaves this way" that justify the behavior as good because of logical or mathematical purity. I am implying that these explanations are so long-winded because they're wrong. I think they are good reasons for keeping the behavior as-is, but the code was probably written first, written to be simple, and just happens to produce mathematically consistent behavior at an edge case. It would be interesting to see the history of the function, to see if the implementation has changed.
> I am implying that these explanations are so long-winded because they're wrong.
Are you my Intro to Operating Systems professor? Because he would take points away from answers on the exams for being more long-winded than necessary, even if the answer was correct.
Joking aside, your perspective is bizarre to me. You haven't offered any reasons to propose that all([]) should be False, but reject the given answers for it being True simply on the basis that the explanations are long-winded?
The question isn't "what should all([]) return", it's "why does all([]) return True". I think the answer is "because that's what the simplest implementation returns", not "because someone considered the mathematical implications".
Perhaps it happens to check for any false in the list and otherwise returns true. Sometimes mathematically pure is equivalent to simple and vice versa. The explanations are long because they try to explain the implications and make analogies, which could be infinite.
That's a strong hint that those definitions would make our lives easier. But it's not forcing anything?
Also, I don't see why `any([]) == False` would be any more obvious than `all([]) == True`?
I've seen not-so-carefully designed APIs that violate your 'obvious' logic for 'any':
We had a system to which you could send a list of topics, and it would return all the datasets associated with those topics. The designer of the system decided that an empty list of topics would return all datasets, instead of nothing.
I suspect their reasoning was that giving people an option to get all datasets was useful, and that predictably returning nothing was useless.
But what they should have done in this situation for a clean API was to take an optional list of topics.
any([x, y]) == x or any([y]) == x or y or any([])
all([x, y]) == x and all([y]) == x and y and all([])
then any([]) must be false (otherwise x and y would never matter), and all([]) must be true.
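Those identities, with explicit parentheses added, can be brute-force verified over all boolean inputs:

```python
from itertools import product

# Unfold any/all one element at a time, using the claimed base cases.
for x, y in product([True, False], repeat=2):
    assert any([x, y]) == (x or any([y])) == (x or y or any([]))
    assert all([x, y]) == (x and all([y])) == (x and y and all([]))

# The base cases are forced: any([]) is False, all([]) is True.
assert any([]) is False
assert all([]) is True
```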
In the query API you're describing, does {topic1, topic2} find datasets on (topic1 OR topic2), or is it (topic1 AND topic2)? That's what determines whether more topics means more or fewer results, and whether {} should correctly find nothing or everything.
The identity element of the conjunction (“and”) monoid is True. Any confusion here seems like a good reason we should teach students functional programming - this is straightforward if you think in terms of folds over algebraic objects.
I can’t actually think of a coherent argument for returning anything besides true.
I'm in your camp. But the best argument I found is:
"This is a confusing situation, and no boolean answer we could give would satisfy the principle of least surprise, so we should throw an error."
I don't think there's a good argument to be made for returning False.
As an aside: I don't think there's anything particularly related to 'functional programming' about the arguments used here. But, yet, experience with functional programming tends to make people more careful about finding general principles that give you answers for corner cases like this, too.
The imperative code used in the Python standard library is just as compelling an argument for the Right Answer also being the simplest as any based in FP:
    def all(iterable):
        for element in iterable:
            if not element:
                return False
        return True
In general, when I'm reviewing imperative code, I'm encouraging the authors to find ways to handle corner cases like empty lists organically within their code instead of with a special case check at the top. The organic handling seems to have a higher chance of producing the Right Answer.
> "This is a confusing situation, and no boolean answer we could give would satisfy the principle of least surprise, so we should throw an error."
Common folk tend to disagree about whether zero is an even number. Calculating 0 mod 2 is a confusing situation, and no numeric answer we could give would satisfy the principle of least surprise, so we should throw an error.
> no boolean answer we could give would satisfy the principle of least surprise
I think True follows the principle of least surprise.
> I don't think there's anything particularly related to 'functional programming' about the arguments used here
It imparts a better understanding of algebraic abstractions, including what it means to recursively reduce a data structure.
I'm not sure how that python code is supposed to be especially simple or elucidating. I think the following Haskell code is reasonably elucidating, since it elides any need to manually define a special case or base case:
all = getAll . foldMap All
Where "All" is the conjunction ("and") monoid, as defined in Data.Monoid.
> I think True follows the principle of least surprise.
Me too. I was just quoting the best argument (I found so far) for a different position.
> I'm not sure how that python code is supposed to be especially simple or elucidating. I think the following Haskell code is reasonably elucidating, since it elides any need to manually define a special case or base case:
The Python code wasn't for comparison to any Haskell implementation. If you can read Haskell, you'll likely prefer the Haskell version. But that's not the point.
The point is that amongst all imperative attempts, the most straightforward ones also naturally suggest what to do about the empty list.
Btw, in Haskell notation most normal people's mental model of how 'all' works is probably closer to:
all = foldl1 (&&)
Most normal people don't think about neutral elements of operations, and that they suggest a natural answer for folds over empty collections.
idk why everyone is treating this like it's some complicated mystery. it's defined to be consistent with universal quantification (ie, ∀x) that most of us should have learned in intro to discrete. std::all_of works exactly the same way.
Who are "most of us"? People on HN? People who do programming?
I'm a professional programmer, but horrible at math. I don't have any formal math education, and even less in discrete math. So for me this was educational and I thank the author for putting it down in writing and sharing the knowledge.
I'm sure I'm not alone on HN or in the software industry with this.
I'm also sure I'm sitting on bunch of knowledge you have no idea about, but when others share that, I don't shoot them down for it, I'm happy we're all helping each other understand things.
the article itself is a decent intro explanation. I'm mainly talking about all the comments in this thread getting into monoids and abstract algebra when a sufficient explanation can be found in first-year cs material. it's a consequence of how all_of and any_of relate to each other.
I see. Since you made a top-level comment, it seems you're commenting about the article itself. If you're replying to comments, I suggest you use the reply functionality and it all becomes a little bit clearer for everyone.
I, for one, also like the comments here as they expand and dive into more topics that again, "some of us" have not read/heard about before so grateful for that.
> idk why everyone is treating this like it's some complicated mystery. it's defined to be consistent with universal quantification (ie, ∀x) that most of us should have learned in intro to discrete. std::all_of works exactly the same way.
Lewis Carroll didn't know this, and treated universally quantified statements over an empty domain as false. Of course that's wrong; but I think it's over-specialising to claim that this is something obvious, or that everyone knows.
(Also, as others have said, if someone learns from this, then that's great! That's a positive impact, whereas saying "you should have known it already" is purely negative, so why bother?)
I really didn't mean this as a gatekeeping comment, but yes, you are right.
I do think most programmers would benefit a lot from learning the fundamentals of logic, or at the very least boolean algebra. after seeing some of the tortured expressions they write, I certainly wish my coworkers were more familiar with boolean simplification rules and identities.
I don't mean it as gatekeeping either. It's just an observation. Although I've clearly offended someone...
I've lost count of the number of times I've factored the "not" out of horrid expressions in my colleagues' code like "not this and not that" using De Morgan's laws. Funny thing is I learnt De Morgan's in electronics a long time before I got to computers.
I think the problem is it's "boring" and people get hooked on doing stuff with programming. I can't blame them. But I wish more people remembered the fundamentals.
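De Morgan's laws can be sanity-checked over all boolean inputs in a couple of lines:

```python
from itertools import product

# De Morgan's laws: factoring "not" out of conjunctions and disjunctions.
for a, b in product([True, False], repeat=2):
    assert (not a and not b) == (not (a or b))
    assert (not a or not b) == (not (a and b))
```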
I suspect the Greeks are being mischaracterized here.
Would they, for example, have considered the statement "all unicorns have a horn" to be false?
In fact, the post mentions the statement "all stonemen are made of stone", which is assumed to be true under Greek logic, but according to the explanation should be false because there are no stonemen.
Your last example should be "all stonemen are made of wood". That would be true under one of the modern interpretations, but false under the Greek one?
Your example ""all stonemen are made of stone" would be true under both modern and supposed Greek interpretations.
I believe the consensus literature today says Aristotle was open to indeterminate values (there’s a huge debate about the meaning of his “there will be a sea battle tomorrow” example which is taken to be indeterminate), but Boethius botched the Latin translation and said all unicorns have horns is false. I could be wrong though.
I'd like to note that Aristotle's syllogisms only address a small fragment of predicate logic. Syllogistic logic coexisted and competed with the propositional logic of the Stoics. Predicate logic did not achieve its power as we recognize it today until Frege. http://hume.ucdavis.edu/mattey/phi112/history.html
Typically you would use "least" and "greatest" (for a partial order) or "first" and "last" (for a total order). It sounds very strange to talk about "min" and "max" for strings, for example. Do you call the first word in a dictionary the minimal word?
But then again, what can you expect from a language that uses "+", the quintessential notation for commutative operations, for the (very non-commutative) operation of concatenation?
Doubly sad when you realize that the language was designed by an actual mathematician.
> Typically you would use "least" and "greatest" (for a partial order) or "first" and "last" (for a total order).
This is just absolutely not true. It is certainly "min" and "max" for any total order, they definitely don't have to be numbers or even numbers with the usual ordering (as confusing as that would be with a set of numbers but the reverse ordering). "first" and "last"? I've never heard that.
I haven't worked with partial orders so much that I can be as confident about that; perhaps "least" and "greatest" are the correct terms. But "min" and "max" are certainly used sometimes by non-specialists (i.e. professional mathematicians who don't necessarily spend a lot of time with posets) without any confusion.
> I can confirm that min and max (for minimal and maximal, or minimum and maximum) are mathematically accurate terms.
This is true, but they do not have exactly the same meaning. On a poset you do not speak of "the min", but of "a min". You still have "the least", which is a different notion.
> what can you expect from a language that uses "+", the quintessential notation for commutative operations, for the (very non-commutative) operation of concatenation?
Is this any worse than using `+` for floating-point addition, which isn't even associative?
Even if it weren't, floating-point numbers are very often regarded as approximations of real numbers, where addition is associative. String concatenation cannot be construed to have any resemblance to commutativity.
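The non-associativity is easy to demonstrate with the classic 0.1/0.2/0.3 example:

```python
# Floating-point addition is not associative: regrouping changes the result.
lhs = (0.1 + 0.2) + 0.3
rhs = 0.1 + (0.2 + 0.3)
assert lhs != rhs  # lhs is 0.6000000000000001, rhs is 0.6
```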
Integers have an (intrinsic & natural) total order. Do characters? I'd say no. We represent them as integers, then expose those underlying integers because it's extremely useful for efficiency purposes, but I'd pretty strongly argue that that's an optimisation only and not an intrinsic to the alphabet.
'a' < 'b' only by accident of history. It could just as easily have been the reverse.
Tuples have a reasonably natural total order: just compare them lexicographically, component by component. And the minimum of a list of tuples makes sense.
You could argue that alphabetic ordering is artificial, but it's hard to argue that it ain't well established and non-surprising.
Interestingly, not all numbers have intuitive orders. Eg complex numbers don't really have a good preferred ordering. And the ordering of p-adic numbers is well defined and unique, but really weird.
Oh, I think I wasn't careful enough with my language.
I meant that the common ordering on tuples is reasonably natural. Not that it is necessarily the One True Natural Ordering.
About lattices and partial orders: if there's a choice about what to pick as the default ordering for some built-in datatypes for your language and there are multiple reasonable choices, going with the total order seems like a good idea.
To give a counterexample: functions from natural numbers to an orderable set have a well defined, reasonably natural order: just compare them as if they were a list of elements of the orderable set. But that would be a bad idea in a programming language, and it is indeed better to just treat functions as incomparable by default.
So evaluate (1, 2) > (2, 1) = ??? and tell me what ??? is, and why. I claim '>' is natural for 1 apple vs 2 apples, it is not natural for tuples unless you, as you say, suppose that the first in the tuple is more 'important' than the second WRT comparing. I mean I can accept that as a handy add-on, but it's not intrinsic - that's why lattices have 'incomparable', for just such a case.
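For the record, Python answers this question by decree: tuple comparison is lexicographic, so the first component dominates and later components only break ties:

```python
# Lexicographic tuple comparison: the first component dominates,
# later components only break ties.
assert ((1, 2) > (2, 1)) is False
assert ((2, 0) > (1, 99)) is True
assert min([(2, 1), (1, 2)]) == (1, 2)
```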
0 and 1 are characters. The language of integers has an ordered 10-character alphabet. This is not unique among languages, because it is a useful construct.
Similarly, we could have said b < a is true; then it would always be true, a < b would be false, and the world would be exactly the same, because this is all a made-up convention for the benefit of humans.
You're confusing numerals with numbers. Numerals are what we write down: "0" and "1" are numerals, just marks on a sheet of paper or phosphor until interpreted by the human mind.
Also, it's worth noting that in Judaism and some other religions these things have religious significance attached to them. In this particular example, Aleph is considered to equal 1 and Bet is considered to equal 2.
`min` and `max` operate on semigroups, not on monoids. They don't have to have a default value, and any nicely strongly typed language should prevent them from acting on empty lists.
You could invent a monoid instance of `float` for `<` which includes `inf` as the identity element and, if you ignore NaN, it kinda sorta works in isolation. However, adding infinity breaks a lot of other nice properties of the number system.
For example, the following identity of `min` is valid in the finite numbers.
```
n < min A ⟺ (∀ a ∈ A) n < a
```
The right-hand side is always true if `A = ∅`, and according to our new monoid the left-hand side is true if and only if `n ≠ ∞`.
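This is straightforward to check in Python, using the `default` argument of `min` to play the role of the monoid identity (`min_inf` is a hypothetical helper name, not a standard function):

```python
import math

def min_inf(xs):
    # Hypothetical helper: min with +inf as the identity for the empty case.
    return min(xs, default=math.inf)

# The identity  n < min(A)  <=>  all(n < a for a in A)  holds for finite n:
for A in ([], [3.0, 7.0]):
    for n in (-1.0, 5.0):
        assert (n < min_inf(A)) == all(n < a for a in A)

# ...but it breaks exactly for n = +inf on the empty set:
assert all(math.inf < a for a in []) is True
assert (math.inf < min_inf([])) is False
```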
So you could sort of get away with that. But that positive infinity still wouldn't be bigger than any string, and in Python 3 comparing strings and numbers just raises an error.
Good point. I was a bit surprised when I found out that Javascript's `Math.min(...[])` and `Math.max(...[])` return +Infinity and -Infinity, respectively. But after thinking about it, it makes a lot of sense. It does the right thing when the argument is an empty list.
Maybe the justification is that you do not know the type of an empty python list. But numpy at least could assume that the numbers are floats and give the correct result instead of raising an error (and thus complicating the interface).
Interesting point. You got me thinking and opened a can of worms in my mind. I guess True and False are in the set of Booleans, but infinities are, very strongly arguably, not in the domain of numbers. But still, but still.
There are many different "domains" of numbers, and many of them contain infinities. For example, the affinely extended real line, or the IEEE754 floats, are two widely used domains of numbers that contain +inf and -inf.
The apples example is bad, because while I can agree that 3 apples and 8 apples exist in a very real sense, I would not agree that Graham's number of apples exists any more than an infinity of apples. Also, I can't have pi apples or i apples.
As for mathematical notions of numbers, there are of course mathematics with all kinds of infinite numbers. You have the transfinite ordinal and cardinal numbers, with the famous cardinal 2^Aleph0 (2 raised to the number of natural numbers, which is the number of real numbers); whether it equals Aleph1 is precisely the continuum hypothesis. You have infinitesimals as well. These are all just as well defined as the naturals and rationals, and exist just as much.
Maybe you should be dubious about Graham's number. It seems like you should always be able to add an apple to a pile of apples and get a pile with more apples in, but that's not actually the case; at some point, well before Graham's number, your pile would collapse into a black hole.
If we start defining numbers by physical quantities ('what really exists') we'll have a pretty hard to work with numerical system. Because MAX_APPLES is of course smaller than MAX_DUST_MITES, so they define two different number systems. And I can probably pretty easily subdivide until maybe a hundredth of an apple, but a hundredth of a dust mite is much less physically plausible (at human scales).
Not to mention that you could say: well sure, I can imagine a trillion trillion apples, but they don't really exist, they're just a fantasy. In reality we only have MAX_REAL_APPLE apples, and damn it, Johnny just ate one so the largest 'really existing' number just went down.
Overall, we can play this game a lot, but the reality is that pi is as real as Aleph0 which is as real as 1 and 2 and 0.0000...1 and +Inf etc.
> If we start defining numbers by physical quantities ('what really exists') we'll have a pretty hard to work with numerical system. Because MAX_APPLES is of course smaller than MAX_DUST_MITES, so they define two different number systems. And I can probably pretty easily subdivide until maybe a hundredth of an apple, but a hundredth of a dust mite is much less physically plausible (at human scales).
Sure. So a simplified model is probably the right tradeoff to make. But we should not forget that it is a tradeoff; by working in a simplified world of notation we run the risk of forming constructions that can't actually be carried back to the real world of counting apples or dust mites. The map is not the territory.
> Overall, we can play this game a lot, but the reality is that pi is as real as Aleph0 which is as real as 1 and 2 and 0.0000...1 and +Inf etc.
"All models are wrong, but some are useful" - but not all models are equally wrong, and not all are equally useful. Including pi gives us a model that's easier to work with, but introduces a certain risk of forming a construction that doesn't translate back to reality. Including infinities brings a substantially bigger version of that risk. It may still be the right choice, but it's a choice that should be made carefully and deliberately; blindly assuming that all mathematical models are appropriate to a given situation is quite unwise.
"3 apples" is just a rough (though often useful!) and very context-dependent abstraction anyway. real-world apples differ in size, so you might want to count a small apple as half an apple; and does a big "twin"/"conjoined" apple count as two apples? at
best you're using some arbitrary size treshhold. it gets hairy/edge-casey pretty fast...
i know this is nitpicking, but i think when you really look at it, "infinity apples" isn't that much more of an "idea" than "3 apples". (even if the latter is more useful in day-to-day life)
In logic non-existing objects have all properties, including contradictory ones. This follows from the fact that in propositional calculus from a contradiction (or from False) everything can be concluded. I.e., P /\ ~ P -> Q.
One could conclude that the definition of -> is not very logical, but if one is to attach a formal meaning to -> it seems hard to avoid this. One would need a condition along the lines of: for P -> Q to hold, P and Q have to be related in some way; but that does not sound like something that is very easy to formalize.
It also does not matter whether one uses intuitionistic logic. P /\ ~ P -> Q still holds.
In type theory False is the empty type and proving something about it can be done with case analysis. Since it is the empty type there are zero cases and one is immediately done and can prove anything.
This is in fact the logically meaningful thing to do. `all` must be monotone decreasing in set inclusion.
Other similar definitions are that the min of an empty set is infinity and the max of an empty set is minus infinity. Min is monotone decreasing over set inclusion, hence min of the empty set must be the maximum possible value. Max is monotone increasing in set inclusion, hence max of the empty set must be the smallest possible value.
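A small Python sketch of the monotonicity argument, using the `default` argument of `min`/`max` to stand in for min(∅)/max(∅) (an assumption on my part, since Python's bare `min`/`max` raise on empty input):

```python
import math

# min is monotone decreasing (antitone) in set inclusion:
# growing the set can only lower the minimum.
A = {3, 5}
B = {1, 3, 5}  # A is a subset of B
assert min(A) >= min(B)

# Shrinking all the way down to the empty set then forces min(empty)
# up to the top element, +inf (and max(empty) down to -inf):
assert min(set(), default=math.inf) >= min(A)
assert max(set(), default=-math.inf) <= max(A)
```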
I believe that’s likely acting as a reduction with an initial value. When you sum you start with 0, and when you multiply you start with 1. These provide identity elements for the first addition and multiplication.
The reason it's like this in Python has nothing to do with philosophy and everything to do with algebra. The mathematical terminology sadly escapes me, but essentially for each operation you want there to be a value such that application is a no-op. For addition this is 0, for multiplication 1, for `and` it is True, and for `or` it is False. You want the n-ary versions of the operations to return those values for the 0-arity case. So sum([]) = 0, all([]) = True, and if there were one, product([]) = 1. This makes the mathematics very uniform and beautiful.
Lispers know this implicitly. You know how they say learning Lisp makes you a better programmer? This is an example of that. Raymond Hettinger and many of the Python authors are Lispers at heart, and this is obvious to them.
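The uniformity is directly observable in Python, which does in fact ship a product function (`math.prod`, since 3.8):

```python
from math import prod  # Python 3.8+

assert sum([]) == 0      # 0 is the identity of +
assert prod([]) == 1     # 1 is the identity of *
assert all([]) is True   # True is the identity of `and`
assert any([]) is False  # False is the identity of `or`
```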
Yeah. Multiplicative identity, additive identity etc. Not sure what the name of the "andity" identity would be. Maybe just "identity of the and operation".
The answer: otherwise the existential quantifier wouldn't work (since you can transform ∀x P(x) into ¬∃x ¬P(x)). When you use the existential quantifier, it makes sense. I'm not sure how this is relevant to Python; this question is purely in the domain of first-order predicate logic.
Qualitatively, one is reminded of dead-beat controllers[1].
As a theoretical matter, "all unicorns are blue" is like a transfer function for a controller that can move an elevator from the bottom of the Empire State Building to the top in 0.01 seconds. Connecting words is like solving a math problem.
Trying to locate that unicorn, or bringing that controller out of state space and into reality, one quickly realizes the non-existence of unicorns and that the energy applied to that elevator in order to meet the time requirement would vaporize the equipment.
Requirements and metadata need not meet the realities that apply to data.
Your client code has to work with the behaviour of your function.
If you have a perfectly sensible answer for a corner case, just give that answer. Otherwise, you are forcing unnecessary extra case handling on your client code.
And that principle also tells you which answer your code should be giving: whatever makes the client code simpler.
Another example, perhaps more inspired from C conventions:
Imagine you have some functions that take a timeout as an argument. After the time is up, they should fail, if they haven't produced an answer yet.
Now, in these kinds of situations you often find that people choose 0 to mean an infinite timeout.
That's a bad idea, because it violates simple invariants like: if you run two statements with timeouts of t0 and t1, after at most t0 + t1 seconds, you should either have an answer or get an error.
(You can of course complicate the invariant by adding special cases for t0 or t1 being zero. But who likes special cases?)
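A sketch of the alternative, assuming we encode "no limit" as `None` rather than 0 (`add_timeouts` is a hypothetical helper name); plain addition then preserves the invariant without any special case for zero:

```python
from typing import Optional

def add_timeouts(t0: Optional[float], t1: Optional[float]) -> Optional[float]:
    # Hypothetical helper: None (not 0) encodes "no limit", so ordinary
    # addition keeps the invariant "done or errored within t0 + t1 seconds".
    if t0 is None or t1 is None:
        return None
    return t0 + t1

assert add_timeouts(2.0, 3.0) == 5.0
assert add_timeouts(None, 3.0) is None  # unbounded stays unbounded
assert add_timeouts(0.0, 3.0) == 3.0    # 0 keeps its ordinary arithmetic meaning
```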
Contrapositives are one of my favorite things in simple logic to demonstrate the difference between rigorous logic and our human heuristic logic. Mathematically, contrapositives are identical to the original statement, but extremely frequently, if not almost always, they sound different to us humans.
If more people understood them, there are a number of dumb internet debates that could finally die, where X -> Y isn't immediately obviously wrong to our human brains and may even sound correct, but !Y -> !X is immediately obviously wrong.
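The equivalence is easy to verify exhaustively over the truth table:

```python
from itertools import product

def implies(p, q):
    # Material implication: p -> q is (not p) or q.
    return (not p) or q

# A statement and its contrapositive agree on every truth assignment:
for x, y in product([False, True], repeat=2):
    assert implies(x, y) == implies(not y, not x)
```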
It follows from exactly the same logic. The following are theorems, at least as long as x and y are lists of Booleans:
    all(x + [True]) == all(x)
    any(x + [False]) == any(x)
    all(x + y) == (all(x) and all(y))
    any(x + y) == (any(x) or any(y))
    (not all(x)) == any(not xi for xi in x)
    (not any(x)) == all(not xi for xi in x)
Any other value for all([]) or any([]) would cause some or all of these theorems to hold for every x and y except the empty list. In practice this would mean you would have a bug in your program on most occasions where the argument of one of these functions was empty, unless you included a special case for empty sequences.
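These theorems can be checked exhaustively over small Boolean lists, empty list included:

```python
from itertools import product

# Small sample of Boolean lists, including the empty one.
lists = [[], [True], [False], [True, False], [False, True], [True, True]]

for x, y in product(lists, repeat=2):
    assert all(x + [True]) == all(x)
    assert any(x + [False]) == any(x)
    assert all(x + y) == (all(x) and all(y))
    assert any(x + y) == (any(x) or any(y))

for x in lists:
    assert (not all(x)) == any(not xi for xi in x)  # De Morgan
    assert (not any(x)) == all(not xi for xi in x)
```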
For example, suppose you're processing a request that includes some tasks and some attachments. You might write something like this:
    if any(task.is_failed() for task in request.tasks):
        raise TaskFailed(request)
    if any(x.is_too_large() for x in request.attachments):
        raise AttachmentTooLarge(request)
Clearly for this code to do the right thing in the case where a request has tasks or attachments but not both, we need any([]) to be false. And the corresponding requirement applies to all([]) if our task interface is a little different:
    if not all(task.is_succeeded() for task in request.tasks):
        raise TaskFailed(request)
Now, there are occasionally programs where you do need a special case for empty sequences; for example, if you have three search results, you might want to return a page containing just those results, but if you have zero, you probably don't want to just return an empty page — the user might think the software is just broken. And in the above example, if there are neither any tasks nor any attachments, it might or might not be useful to report that the user accidentally submitted an empty request. But it is unusual for such a special case to just fall out of a possible alternative semantics of all() and any().
The way I think about it is the base case of a recursive function to evaluate the two. In order to preserve folding Boolean AND across a list your base case must be true. Folding Boolean OR across a list only works if your base case is false.
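That recursive/folding view translates directly to `functools.reduce` (`my_all` and `my_any` are illustrative names, not standard functions):

```python
from functools import reduce

def my_all(bools):
    # Folding `and` over a list: the base (initial) case must be True,
    # or the empty list would come out wrong.
    return reduce(lambda acc, b: acc and b, bools, True)

def my_any(bools):
    # Folding `or`: the base case must be False.
    return reduce(lambda acc, b: acc or b, bools, False)

assert my_all([]) is True
assert my_any([]) is False
assert my_all([True, False]) is False
assert my_any([True, False]) is True
```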
Hardly. As described in the article `all([])` is equivalent to `∀x Fx` or "all unicorns are blue". `any([])` is equivalent to `∃x Fx` or "a blue unicorn exists".
I don't know what to say to this. They are exactly the same. It's not a controversial point. You could write an article on the theme "why apples fall upward in the winter", but that wouldn't make apples fall upward in the winter.
Perhaps you two are not using precise enough language?
The common definitions for 'all' and 'any' as used in math are certainly on thaumasiotes' side. But English as a natural, every-day language is a bit more fuzzy.
Correct. That is the point this article is making.
"all in the array are true" is not specific about how the empty case ought to be interpreted. That's not at all to say all([]) ought to be False; the wording is simply under-specified.
Whereas "there is none in the array that is false" is clearly True when all([]) is encountered.
Thus the statement cannot be simplified to the English statement "all in the array are true", without introducing ambiguities.
As a general rule, this is why we don't perform mathematics (or programming) in English.
EDIT: Just to clarify why this is ambiguous, in case anyone reading is still not following (although I'd suggest reading the article!)
English is an (informally) consensus based language. Understanding boolean logic is not a precursor to being able to interpret common language, in particular English. Just because one has knowledge of logic (mathematics) and we happen to have given some operation the English name "all", that does not mean the English word "all" in general speech refers to this logical operation. Especially since, as the article describes, the mathematical consensus for this logic has only largely shifted to be interpreted this way in the last several hundred years. The English word "all" predates the current mathematical consensus (which isn't even universal) by thousands of years.
Additionally, if for some reason you are interpreting the written word "all" (no matter the context) as the logical operation, then "all in the array are true" is not a viable definition for this word, it's self-referential and thus ambiguous. "there is none in the array that is false" on the other hand is clear in its meaning (if not slightly grammatically incorrect).
> The English word "all" predates the current mathematical consensus (which isn't even universal) by thousands of years.
This is at the edge of possibility, but it's certainly not a slam dunk. "All" is not an Indo-European word; it's restricted to Germanic languages.
> Just because one has knowledge of logic (mathematics) and we happen to have given some operation the English name "all", that does not mean the English word "all" in general speech refers to this logical operation.
This isn't a good argument; all you and the article are saying is that people have vocally objected to the meaning of "all" for a long time. They still do. But this isn't sufficient to claim that the objections reflect a disagreement over the meaning of "all". If you carefully explain to people how you want them to interpret sentences like "every cat is in a box", and then you show them pictures and ask them whether or not every cat is in a box, most people will screw it up. That's not because they didn't understand you; it's because they are very bad at the task.
It is quite clear that the vernacular meaning of the English word "all" coincides with the formal meaning of the logical universal quantifier. You can poll any number of people and they'll give you a definition that just so happens to match the logical quantifier. People don't object to all([]) being true after thinking about what "all" means to them -- they object before thinking about it.
If you poll people asking for a definition of "all", you're doing it wrong. Instead, you should ask them to use the word to describe various cases, and derive the actual definition from that.
Mostly agreed about the vernacular. Though a minor nitpick: mathematics has been around for longer than English.
As for eg Python: when in doubt, going with the mathematical usage of the term is the Right Choice here.
For another interesting example: the negation of must and may in English vs their German equivalents.
In English 'You must not do X.' means that refraining from X is obligatory. In German the equivalent 'Du musst nicht X-en.' means that engaging in X is not obligatory.
Both usages make sense in their respective languages. And both usages have translations into formal logic. Just different ones.
Those will get you the right result if you use them as an initial value in a fold and the set is not empty. They're not correct on their own; an empty set has no minimum or maximum.
It depends. You _can_ define minimum and maximum over empty sets exactly as described and get a consistent mathematical system.
But eg Python's minimum and maximum functions don't do that. They throw an error by default.
In mathematical terms, if you have a lattice https://en.wikipedia.org/wiki/Lattice_(order) you can always just arbitrarily introduce a least and greatest element, and all the relevant laws stay valid.
Whether that's a good idea for your application or not, is a different question.
In programming, types often give you trouble. Eg the integers don't have a least or greatest element, but you want your minimum function over a list of integers to return an integer. So throwing an error is the only choice, if you can't represent something like minus infinity in your return type.
> You _can_ define minimum and maximum over empty sets exactly as described and get a consistent mathematical system.
You can define infimum and supremum that way. But I was under the impression that wherever the concept of a "minimum element" of a set is used, the minimum element of any set must be an element of that set.
So e.g. the positive reals have an infimum, 0, but no minimum. The page you link doesn't do much to dispute this idea; it uses "minimum" and "maximum" to talk about the largest and smallest elements of the lattice, but never to talk about a property that a subset of the lattice might have.
Of course, the Python function named 'min' could very well implement an infimum in this case, and it would be OK.
In general, there are some conventions, like using the words 'minimum' vs 'infimum'. Or stressing the property 'the result of min should be an element of the input set' vs 'min on lists should behave like repeated application of min on two arguments, starting with the neutral element'.
The right choice depends on context. In lots of programming contexts, I find stressing the connection with the minimum monoid useful.
> Of course, the Python function named 'min' could very well implement an infimum in this case, and it would be OK.
Using min(∅) = +inf and max(∅) = -inf would violate the otherwise valid constraint that min(x) ≤ max(x). I'd be a little uncomfortable with it for that reason.
> Using min(∅) = +inf and max(∅) = -inf would violate the otherwise valid constraint that min(x) ≤ max(x). I'd be a little uncomfortable with it for that reason.
Excellent point!
Funny enough, we have the same with 'any' and 'all':
For most lists of booleans: all(l) ≤ any(l). Or equivalently all(l) implies any(l).
But that property is broken for empty lists.
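Concretely, in Python:

```python
# all(l) implies any(l) for every non-empty list of booleans...
for l in ([True], [False], [True, False], [False, False]):
    assert (not all(l)) or any(l)  # all(l) -> any(l)

# ...but the implication fails on the empty list:
assert all([]) is True
assert any([]) is False
```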
Hmm, I think I might understand the case for `all([]) == False` a bit better now. Though I still disagree with it.
You have to make some assumptions if you want to "expand the quantifiers". In particular, you have to assume that the universe is non-empty, which is a really bad idea if you're commenting in this thread.
You can `from math import prod` in Python 3.8+.
In the same way, one is the unit with respect to multiplication. So multiplying no numbers together gets you one. You get the idea.