
Here's a shot at explaining it in terms similar to your explanation for E_1.

-------------------

As above, we'll have some value of s_2, and a set of points x_1 < x_2 < ... < x_k < s_2 < x_{k+1} < ... < x_n. We define E_2 = \sum (x_i - s_2)^2.

Suppose we move s_2 slightly to the right, obtaining s_2 + epsilon. The first k terms of the sum will increase, while the last (n-k) terms will decrease. Calculus tells us that each term will change by -2 × (x_i - s_2) × epsilon + O(epsilon^2). If epsilon is small (infinitesimal), we can ignore the epsilon^2 term. As expected, this quantity is negative for points to the right of s_2 and positive for points to the left of s_2. We can add these up to find the total change to be

-2 × epsilon × \sum (x_i-s_2),

dropping the higher order term.
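
If you want to sanity-check that linearization numerically, here's a quick Python sketch (the specific points, the guess s_2 = 3, and the step size are all just arbitrary choices of mine):

    import numpy as np

    xs = np.array([1.0, 2.0, 4.0, 7.0])  # arbitrary example points
    s2 = 3.0                             # a guess between x_2 and x_3
    eps = 1e-6                           # small step to the right

    def E2(s):
        return np.sum((xs - s) ** 2)

    actual = E2(s2 + eps) - E2(s2)       # true change in E_2
    linear = -2 * eps * np.sum(xs - s2)  # first-order prediction from above
    print(actual, linear)                # agree to within O(eps^2)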

If s_2 is too far to the left, moving it to the right will decrease E_2, as the decreases from points x_{k+1}, ..., x_n will be larger than the increases from points x_1, ..., x_k. Similarly, if s_2 is too far to the right, moving it to the left will decrease E_2. So where's the point where we can't gain anything by moving left or right?

If we look at our expression above,

-2 × epsilon × \sum (x_i-s_2)

as a function of s_2, it's negative when s_2 is small and positive when s_2 is large. Negative means we can do better by increasing s_2; positive means we can do better by decreasing s_2. And it's just a straight line: expanding the sum gives -2 × epsilon × (\sum(x_i) - n × s_2), which is linear in s_2. So if it's negative on one side and positive on the other side, there has to be some point where it's exactly zero. At that point we can't improve E_2 at all by moving in either direction. To find that point we simply set

-2 × epsilon × \sum (x_i-s_2) = 0

and then, since epsilon is nonzero, this reduces to \sum(x_i - s_2) = 0, i.e. \sum(x_i) = n × s_2, which gives us

s_2 = \sum(x_i)/n,

the mean.
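
And purely as a sanity check (not part of the argument), you can brute-force this: scan a grid of candidate s_2 values and watch the minimizer land on the mean. The points and grid here are again arbitrary choices:

    import numpy as np

    xs = np.array([1.0, 2.0, 4.0, 7.0])  # same example points as before
    grid = np.linspace(0.0, 8.0, 8001)   # dense grid of candidate s_2 values
    E2_vals = ((xs[None, :] - grid[:, None]) ** 2).sum(axis=1)

    print(grid[np.argmin(E2_vals)])      # 3.5
    print(xs.mean())                     # 3.5 -- the grid minimizer is the mean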

-----------------

There's also a purely algebraic way to see that E_2 is minimized at the mean, without requiring any calculus. Whether it's helpful for intuition or not, I don't know.

Suppose we have some value s'_2 that we're interested in, and we think this might minimize E_2:

E'_2 = \sum(x_i - s'_2)^2

I'm going to prove to you that E'_2 is not smaller than \sum(x_i - m)^2, where m is the mean. We can write

E'_2 = \sum(x_i - m + m - s'_2)^2,

since adding and subtracting m in the middle doesn't change anything. Then split it up

E'_2 = \sum((x_i - m) + (m - s'_2))^2

and expand the square

E'_2 = \sum(x_i - m)^2 + \sum(m - s'_2)^2 + 2 × \sum[(x_i - m)(m - s'_2)]

The first term is the one that I said E'_2 was not smaller than. The second term is the sum of n identical squares, so it equals n × (m - s'_2)^2 and is nonnegative. The third term turns out to be zero: we can pull the constant factor (m - s'_2) out of the sum, and \sum(x_i - m) = 0 by the definition of the mean. So we have

E'_2 = \sum(x_i - m)^2 + {nonnegative stuff},

which proves that s'_2 is not better than m. In fact, the second term is strictly positive unless s'_2 = m, so m is the unique minimizer.
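
If numbers help, here's one more small sketch (same arbitrary points) that checks the decomposition directly: the cross term comes out to zero, and the gap between E'_2 and the optimum is exactly n × (m - s'_2)^2:

    import numpy as np

    xs = np.array([1.0, 2.0, 4.0, 7.0])  # same example points
    m = xs.mean()                         # m = 3.5
    s = 5.0                               # any candidate s'_2

    lhs = np.sum((xs - s) ** 2)           # E'_2 computed directly
    rhs = np.sum((xs - m) ** 2) + len(xs) * (m - s) ** 2
    cross = 2 * (m - s) * np.sum(xs - m)  # the third term

    print(lhs, rhs)                       # both 30.0
    print(cross)                          # 0.0 (up to float rounding)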




Side note: you can use this to convert LaTeX to Unicode for easier reading.

https://www.unicodeit.net/



