You're confused about the meaning of almost surely, and your comparison with dollars and pregnancy is misleading.
The "almost" does not apply to the probability being one, but to the meaning of P(S) = 1 for S a member of your sigma-algebra. So when one says that a real number is almost surely normal, it means P({a : a is real and a is normal}) = 1, but it does not mean that every real is normal. Indeed, there are infinitely many non-normal numbers (for example, every integer is obviously not normal), even though the probability that a number is not normal is 0.
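To make the normality claim concrete, here is a small Python sketch. (It only checks single-digit frequencies in base 10; full normality also requires every block of digits, in every base, to be equidistributed. The sample size and seed are arbitrary illustrative choices.)

```python
import random
from collections import Counter

def digit_frequencies(digits):
    """Return the relative frequency of each decimal digit in a sequence."""
    counts = Counter(digits)
    total = len(digits)
    return {d: counts.get(d, 0) / total for d in range(10)}

random.seed(0)
# Model a "typical" real in [0,1] by its first 100,000 decimal digits,
# drawn uniformly at random.
random_digits = [random.randrange(10) for _ in range(100_000)]
freqs = digit_frequencies(random_digits)
# Every digit frequency is close to 1/10, as normality requires.
print(max(abs(f - 0.1) for f in freqs.values()))

# An integer like 7 has the expansion 7.000..., so its digit
# frequencies are degenerate: the digit 0 dominates completely.
integer_digits = [7] + [0] * 99_999
print(digit_frequencies(integer_digits)[0])
```

A randomly drawn real is normal with probability 1, yet the integers (and uncountably many other reals) are not: probability 1 does not mean "all".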
In the case of finite probability spaces, this usually does not happen, because for two finite sets A and B, #A != #B usually implies P(A) != P(B) [1], which is why your comparison with dollars/pregnancy is misleading.
[1] it is certainly possible to build probability spaces on finite sets where P(A) == P(B) and #A != #B.
> You're confused about the meaning of almost surely ...
Yes, but I'm not the only one. I think most people who read the justification for describing p = 1.0 as "almost surely" will have the same reaction, regardless of their training. That's not to argue that logic must accept responsibility for our confusion.
> your comparison with dollars and pregnancy is misleading.
As to dollars, yes (because it's finite and trivial), but as to the parturient state of a set of women, it depends on the size of the set. For an infinite set of women, describing them all as almost surely pregnant (or not) is a reasonable corollary to describing p = 1.0 as "almost surely" in the general case.
> Indeed, there are infinitely many non normal numbers (for example, every integer is obviously not normal), even though the probability that a number is not normal is 0.
This compares two infinite sets and draws a conclusion based on the outcome (the set of integers is infinite in size, and the set of numbers (reals and integers) is also infinite in size). I'm not sure this is comparable to making a statement about the mapping of a single well-defined probability (for example the appearance of War and Peace) into a single infinite set, like the presumed infinite set of the digits of Pi.
Which leads to an obvious question -- does the "almost surely" qualifier apply to all p = 1.0 probabilities, or only a subset? In other words, does the qualifier depend on the construction of the probability, or does it always apply?
pi is a single fixed number, so it doesn't help to talk about probabilities. Either every finite length sequence appears in pi, or at least one doesn't.
The technical term "almost surely" applies to all p=1 probabilities, even those where it's really certain that the event happens. This does not match the common understanding of "almost surely", because we'd not usually make a statement like "the decimal expansion of pi almost surely contains the digit 4", but mathematically it's a correct statement.
It's important to keep in mind that this is technical terminology, so intuition does not always apply. It also means that it does not make sense to talk about women being almost surely pregnant without first defining a probability distribution. If we define each woman to be pregnant with probability 1%, independently of the pregnancy status of the other women, and we have an infinite number of women, then almost surely at least one of them is pregnant. That is, the probability that at least one of them is pregnant is 1. But it's not certain that at least one of them is pregnant: it could be that none of them is. Similarly, if we choose a number uniformly from the interval [0,1], then the probability of choosing 1/3 is 0, but it's still possible that we chose 1/3. We cannot dismiss it as impossible either; otherwise, whenever we chose a number x, we could always claim "it's impossible that we chose exactly x!".
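The pregnancy example above can be checked numerically: with n independent women, each pregnant with probability 1%, the probability that at least one is pregnant is 1 - 0.99^n, which tends to 1 but never reaches it for any finite n. A minimal sketch:

```python
def p_at_least_one(n, p=0.01):
    """P(at least one of n independent women is pregnant), each pregnant w.p. p."""
    return 1 - (1 - p) ** n

for n in (1, 10, 100, 1000):
    print(n, p_at_least_one(n))
# The values climb toward 1 as n grows, yet for every finite n the
# complementary outcome "none of them is pregnant" has probability
# (1 - p)^n > 0, so it remains possible.
```

In the limit n → ∞ the probability is exactly 1, yet "all not pregnant" is still a possible outcome — which is precisely what "almost surely" flags.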
In probability theory, the statement "event S occurs almost surely" means by definition that the probability of S occurring is 1. This is indeed confusing, but it's completely justified, given the constraints.
When non-mathematicians talk about probability, the situation they usually have in mind is that there are n possible events, each equally likely, as in a fair die throw. This leads to assigning each event the same probability; in the case of a die it's 1/6. We also want to be able to assign a probability to some subset of events occurring: for instance, the event that the number on a die is even corresponds to the subset {2 was thrown, 4 was thrown, 6 was thrown}. By common sense, this should be 1/2, and in general the probability should be (number of events in the subset / number of all events). This is how probability is usually taught in high school.
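The counting rule described above can be written directly; a tiny sketch using exact fractions:

```python
from fractions import Fraction

# The six equally likely elementary events of a fair die throw.
outcomes = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Classical probability: |event| / |outcomes|, each outcome equally likely."""
    return Fraction(len(event & outcomes), len(outcomes))

even = {2, 4, 6}
print(prob(even))       # 1/2
print(prob(outcomes))   # 1 (some outcome surely occurs)
```

Note that in this finite, equally-likely setting, probability 1 really does mean "surely": the only event with probability 1 is the whole outcome set.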
The problem is that this approach, while useful, is totally inadequate for lots and lots of things that people care about. For instance, say we want to investigate the probability of an earthquake occurring. We want to be able to answer the question "what's the probability that we will not have an earthquake in the next t seconds?". Assuming that earthquakes are totally unpredictable, we should expect the answer to stay the same after we've already waited s seconds without one; in probability terms, P(no earthquake in the next t+s seconds, given no earthquake in the next s seconds) = P(no earthquake in the next t seconds). It's really not obvious how to apply the finite-set-of-equally-likely-events method here, if it's possible at all.
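The memoryless requirement described above is satisfied by exponentially distributed waiting times; a quick Monte Carlo check that P(T > t+s | T > s) ≈ P(T > t). (The rate 0.5 and the values of t and s are made-up illustrative numbers, and the estimates are approximate.)

```python
import math
import random

random.seed(1)
rate = 0.5          # hypothetical "earthquake rate" per second; illustrative only
t, s = 2.0, 3.0

# Exponential waiting times satisfy the memoryless property
# P(T > t + s | T > s) = P(T > t) = exp(-rate * t).
samples = [random.expovariate(rate) for _ in range(200_000)]

p_gt_t = sum(x > t for x in samples) / len(samples)
survived_s = [x for x in samples if x > s]
p_cond = sum(x > t + s for x in survived_s) / len(survived_s)

# Both estimates should be near exp(-rate * t).
print(p_gt_t, p_cond, math.exp(-rate * t))
```

Here the sample space is the positive reals (a continuum of waiting times), so the "count the equally likely outcomes" recipe has nothing to count.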
That's why most mathematicians use a similar, but much more general, approach. Just like before, we consider a set of possible elementary events, but this time we don't assume it's finite, and we don't assign each one equal probability. Instead, we assign probabilities to subsets of events, subject to a few simple rules but otherwise in a completely arbitrary way. The rules are: the probability that any event at all occurs is 1, and if A_1, A_2, ... are pairwise disjoint sets of events, we assign to their union the sum of their probabilities.
For instance, say we want to model an archer shooting at a circular target, so we let the events be the points on the target (there are infinitely many of them), and we let the probability of a subset of events be its area (assuming the whole target has area one). Now it makes sense to ask, say, what's the probability of the archer landing an arrow within half a radius of the center of the target? (It's 1/4.) But, since every point has area 0, the probability of the archer hitting any particular point is 0, thus the probability of the archer _not_ hitting that point is 1, meaning that the archer will _almost surely_ not hit that point. Almost, because while for every point the probability of the archer hitting it is 0, the archer will hit _some_ point. That's why mathematicians say almost surely, and not surely, when talking about events with probability 1.
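A Monte Carlo sketch of the archer example (the fixed point and sample size are arbitrary choices): uniform points on the unit disk land within half the radius about a quarter of the time, while any single fixed point is effectively never hit.

```python
import random

random.seed(42)
N = 200_000

hits_inner = 0
exact_hits = 0
n = 0
# Sample uniformly on the unit disk by rejection from the enclosing square.
while n < N:
    x, y = random.uniform(-1, 1), random.uniform(-1, 1)
    if x * x + y * y <= 1.0:
        n += 1
        if x * x + y * y <= 0.25:        # within half the radius of the center
            hits_inner += 1
        if (x, y) == (0.123, 0.456):     # one fixed point on the target
            exact_hits += 1

print(hits_inner / N)   # close to 1/4, the ratio of the two areas
print(exact_hits)       # 0 in practice: each single point has probability 0
```

Every shot does hit some point, yet each individual point has probability 0 of being hit — the "almost" in almost surely.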