
I think you're confusing probability of winning with winning margin. His 79.4% number is his estimate of the probability that Obama would win Virginia by any margin. I don't think it's correct to say he's "a little bit right" -- either his prediction is correct, or it isn't. The only way I know to gauge the accuracy of his actual probabilities would be to run the election multiple times -- but there might be a more statistically advanced technique I'm not aware of.


I'm not confusing the two - I'm saying that Florida being so close suggests Nate may have done about as well as possible with the data available in Florida, while Virginia being so close might indicate that he did not do as well as possible with the data there. Neither case is at all conclusive.

It's possible the quality of the data was just different in the two states, due to random factors in sampling or random factors on election day.

And you are right, we couldn't know without running it multiple times with everything the same except these random factors.


Whether the race was close has nothing to do with how likely one side was to win. If there are 100 voters and I know with absolute certainty that 51 of them are die-hard Republicans and 49 of them are die-hard Democrats, I'm probably justified in saying that the Republican candidate has some large probability of winning, 90% or something, depending on the actuarial probability that some of the Republicans die, etc. But that's a 90% probability of winning by a 51/49 margin. Alternatively, if there are 10 Republicans, 10 Democrats, and 80 independents, I may have no idea how the independents will vote, so AFAIK it's equally likely that either candidate wins. The margin of victory has nothing to do with the probability of the outcome.
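To make the two cases concrete, here is a rough Monte Carlo sketch. The 99% turnout figure, the uniformly random lean of the independents, and the function name `simulate` are all assumptions made up for illustration; none of them comes from any real model.

```python
import random

def simulate(n_rep, n_dem, n_ind, turnout, trials=5000, seed=0):
    """Return (estimated P(Republican wins), mean absolute margin in votes)."""
    rng = random.Random(seed)
    rep_wins = 0
    total_margin = 0
    for _ in range(trials):
        # Assumption: each committed voter shows up independently
        # with probability `turnout`.
        rep = sum(rng.random() < turnout for _ in range(n_rep))
        dem = sum(rng.random() < turnout for _ in range(n_dem))
        # Assumption: the independents' overall lean is unknown, so it is
        # drawn uniformly at random once per simulated election.
        lean = rng.random()
        ind_rep = sum(rng.random() < lean for _ in range(n_ind))
        rep += ind_rep
        dem += n_ind - ind_rep
        if rep > dem:  # a tie is not counted as a win
            rep_wins += 1
        total_margin += abs(rep - dem)
    return rep_wins / trials, total_margin / trials

# Case 1: 51 die-hard Republicans vs. 49 die-hard Democrats.
print("51/49 committed:", simulate(51, 49, 0, turnout=0.99))
# Case 2: 10 vs. 10 committed voters plus 80 unpredictable independents.
print("10/10/80 split: ", simulate(10, 10, 80, turnout=1.0))
```

Under these assumptions the first case should show a high win probability with a margin of only a couple of votes, while the second should sit near a coin flip with a typical margin of dozens of votes.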


You could reasonably conclude someone has 80% chance to win by using a large amount of slightly lopsided data, or you could reasonably conclude the same thing with a smaller amount of highly lopsided data.

As such there is a relationship involving the probability of the outcome, the quantity of the polling data, the quality of the polling data, and the margins of the polling data.
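A back-of-the-envelope sketch of that relationship, using a plain normal approximation to a single poll. This is not Nate's method: it assumes simple random sampling and no systematic polling error, and the function name and poll numbers are invented for illustration.

```python
import math

def win_probability(poll_share, sample_size):
    """Approximate P(true share > 50%) from one poll.

    Assumes a normal approximation with standard error
    sqrt(p * (1 - p) / n) and no systematic bias in the poll.
    """
    std_err = math.sqrt(poll_share * (1 - poll_share) / sample_size)
    z = (poll_share - 0.5) / std_err
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Lots of slightly lopsided data (hypothetical numbers).
print(win_probability(0.5042, 10_000))
# A little highly lopsided data (hypothetical numbers).
print(win_probability(0.542, 100))
```

Both calls should land close to 80%, even though the implied margins differ by a factor of ten.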

At the time I wasn't aware that Nate had published his predicted margins. So, seeing he had a high probability for Obama in Virginia, which turned out to be close, I concluded that one of three things was true:

1. Virginia was very heavily polled and there was just a ton of good and consistent data, justifying a prediction of 80%. OR
2. Virginia had a normal amount of good and consistent data, but it was all very lopsided. OR
3. There was an error with Nate's model or with the data.

Had I known that his predicted margins were easy to access, I could have seen right away that it was 1. Without knowing that, I was ruling out 2 because the state turned out to be very close.

My point was that it's not easy to just look at Nate's map and score it against the result map. That's a very shallow and fairly uninteresting way to look at it.

My secondary point was that even if you understand that he had predicted some percentage chance to win in each state, it's still not easy to score that percentage against the result map. Given a prediction like "80% chance Obama wins", there are many different ways he could have arrived at his conclusion, so it would be more accurate to inspect the method by which he arrived at it and compare his predicted margins with the actual margins in the state.

I've put up a text file comparing the two here: http://pastebin.com/0RB5GRjQ and it turns out the results were within Nate's given interval in 49 of 50 states. It's not the shallow and misleading "50 out of 50" you get by just comparing the colors of Nate's map with the results map -- but I think it's much more interesting, more accurate, and more impressive.



