Was Nate Silver the Most Accurate 2012 Election Pundit? (appliedrationality.org)
97 points by kf on Nov 9, 2012 | 59 comments



Thanks. I've been trying to find a good way to understand the accuracy of Silver's predictions.

What's a fair benchmark? This article offers up a "coin flip" for each state, computing that such a coin flip would have a Brier score of 0.25. (The Brier score is the mean squared error between the outcome (1 or 0) and the predicted probability of that outcome. If the model is a coin flip, each state's 0.5 prediction misses the 1-or-0 result by 0.5, so the mean squared error is (1/51) * 51 * 0.25 = 0.25.)

But... that seems like too generous a benchmark. Take the simple model: "assume 100% likelihood that state X will vote for the same party as it did in 2008." That guarantees that deeply red or blue states will vote the same way, so it takes the non-battlegrounds out of the equation.

With this model, there would have been only 2 errors out of 51. This simple, lazy model achieves a Brier score of 0.039, handily beating Intrade and the poll average computed in this article.
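As a sanity check, the two benchmarks above can be sketched in a few lines of Python. (The outcome vectors here are illustrative stand-ins, assuming 51 contests and exactly 2 misses for the 2008-repeat model.)

```python
def brier(predictions, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    assert len(predictions) == len(outcomes)
    return sum((p - o) ** 2 for p, o in zip(predictions, outcomes)) / len(predictions)

n = 51  # 50 states plus DC

# Benchmark 1: a 50/50 coin flip for every state.
# Against a 0.5 forecast, every 0/1 outcome contributes (0.5)^2 = 0.25.
coin_flip = [0.5] * n
print(brier(coin_flip, [1] * n))  # -> 0.25

# Benchmark 2: "each state repeats its 2008 result" with 100% certainty,
# which missed exactly 2 of the 51 contests in 2012.
repeat_2008 = [1.0] * n
outcomes = [1] * 49 + [0] * 2  # 49 correct, 2 wrong
print(round(brier(repeat_2008, outcomes), 3))  # -> 0.039
```

Lower is better for the Brier score, which is why a 0.039 from a lazy deterministic model is a much tougher benchmark than the coin flip's 0.25.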

After working through this, I'm still impressed by Silver and the other quant predictions. But I'm more concerned about media that rely too much on reporting a single poll's result as "news" rather than as part of a larger tapestry.

Then again, it's the maligned media polls that are the raw input to Silver and the other models. Unless the media keeps funding the polls, the quality of these more advanced models will suffer.


Thanks for the suggestion. I've added your benchmark suggestion (along with a bunch of other fixes and new data like a 2008 RMSE benchmark along the lines of your Brier suggestion) into the R document: https://docs.google.com/document/d/1Rnmx8UZAe25YdxkVQbIVwBI0...

(I don't know when the new numbers will go live on the blog; Luke handles that.)


Note how NPR is one of the most right-biased in this result. It's pretty evident from years of listening that the NPR staff are generally progressives and would lean left. So I think this result exemplifies how genuinely 'fair and balanced' NPR really is.


Throughout all the criticism from pundits of Nate Silver et al. as a liberal/commie/whatever for predicting that Obama would win, I couldn't help but wonder if there was even any merit to the idea that an Obama supporter would want to say that Obama is going to win the election. I would think predicting that Romney has a 10% chance of winning (making it tough but a distinct possibility he would win) as Nate Silver did would lead to some of the BEST possible voter turnout for Romney (making it the worst possible prediction strategically, if you want Obama elected). On the other hand, a prediction that Romney will win in a landslide would make many Romney supporters stay home.

Is there actually evidence that higher poll numbers in favor of X lead to higher voter turnout from supporters of X? It seems like everyone takes that for granted but I've never seen any evidence that it's true.


>> Is there actually evidence that higher poll numbers in favor of X lead to higher voter turnout from supporters of X? It seems like everyone takes that for granted but I've never seen any evidence that it's true.

I'd like to see some raw evidence, too.

But since we're forced to speculate at the moment, my bet would be that the relation between voter turnout and a candidate's chances of winning isn't linear. It's probably more of a parabola: the more extreme a candidate's chances of being elected (or defeated), the less likely anyone is to come out and vote, because it feels impossible to make a difference. On the flip side, I'd suspect that the closer and more contested a race is, the more likely people are to feel obligated to vote.


You're absolutely right - I agree that voter turnout "should" be highest at exactly "50% chance my guy wins" and diminish from there in either direction. I think I just got a bit too caught up in what I was arguing when I mentioned that a 10% Romney prediction should be the best one for Romney supporter turnout; a 90% prediction should probably be just as good as a 10% prediction in terms of making you likely to vote.

Under this theory, polls/predictions can't actually skew the result in either direction [1], they can only increase or decrease the total turnout (highest turnout when polls say 50%, lowest turnout when they say 0%/100%). In reality this probably isn't actually true, but we can't say for sure whether or not polls do affect election outcomes, and in which direction, unless there's empirical evidence. I don't even have a guess for which direction it would go (whether you want your supporters to be "concerned" or "optimistic") - I could see either being true.

[1] Edit: I should note that this assumes that each poll/prediction is listened to and taken seriously equally by both sides of the electorate, which almost certainly isn't true. Which actually brings up something kind of interesting - it could be that with all of its "Romney will win in a landslide" talk, Fox News actually hurt Romney's chances because most Fox News viewers are Romney supporters (but they could have done the same amount of damage to Romney's chances by saying "Obama will win in a landslide" - the best way for them to help Romney might be to say that the election is exactly tied). And I suppose it would also lead to a justification for the idea that Nate Silver has a liberal bias, using his relatively "wishy-washy" predictions to energize his mostly young and liberal audience to get out and vote. It seems that I've gone too far with this and started arguing against myself...


There is no guarantee that the effect of an almost certain victory/loss is identical for all parties. On the contrary, there is some data indicating that supporters of some parties are more motivated to vote. See for example http://www.ncbi.nlm.nih.gov/pubmed/22065127, which shows that weather conditions can affect election results. I cannot find data on it, but I think it is not inconceivable that this extends to 'going to the polling station even if it does not make a difference to the result'.

However, in a winner-takes-all election, the effect would have to be huge. Let's say all polls indicate a 51%-49% result. Then at least 4% of the winning party's voters would have to stay home 'because they already won' to change the result (and that assumes none of the other side's voters stay home 'because they already lost'). At a more realistic 60%-40% poll prediction, one in three voters would have to stay home.
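The back-of-envelope arithmetic above can be written out as a one-liner, assuming the trailing side's turnout is unchanged and `p` is the leader's share of the two-party vote:

```python
def stayhome_fraction_to_flip(p):
    """Fraction x of the leader's voters who must stay home so that
    p * (1 - x) drops to the trailing side's share (1 - p)."""
    return 1 - (1 - p) / p

# 51-49 race: about 4% of the leader's voters must stay home.
print(round(stayhome_fraction_to_flip(0.51), 3))  # -> 0.039

# 60-40 race: fully one in three must stay home.
print(round(stayhome_fraction_to_flip(0.60), 3))  # -> 0.333
```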


If you want to go full on conspiracy mode, you could argue that Fox news' owners wanted Obama to win due to the belief that 4 more years of being able to lament the horrors of a having a Communist Muslim in the White House would be much better for their ratings. And that is why they pushed the Romney by a landslide thing so hard.


Just blowing smoke rings here, but it also could be possible that "undecideds" might break towards a perceived winner. If they aren't committed, they may just pick the winning side.

I have no evidence for this whatsoever.


It's worth adding that Nate is on the record as saying he'd have voted for either Romney or Gary Johnson, if he had voted at all.

http://www.mentalfloss.com/blogs/archives/150042


Okay, I found the reference someone cited as "Nate Silver openly rooting for Obama". It's from March 2008, before he joined the NYTimes:

http://www.fivethirtyeight.com/2008_02_24_archive.html


I'm skeptical that that statement should be taken at face value. Nate Silver isn't the smoothest in in-person interviews.


That's weird - I thought I read that he was personally pulling for Obama somewhere on 538. I'll go look for that.


Here he says: "I would describe myself as being somewhere between a liberal and a Libertarian."

http://m.npr.org/news/Books/162594751


Not just for voter turnout. Higher poll numbers are important for raising $$. People (especially those with large bank balances) want to back a winner.


Well, Romney supporters were certainly predicting a Romney landslide, so it's plausible that Obama supporters would predict an Obama landslide. That's independent of whether or not it's a good idea in the game-theory sense, of course.


You're definitely right, and it really didn't make sense to me that they were doing that. I guess it's probably just a human nature thing more than anything else.

I think my point still stands though: this doesn't showcase NPR's neutrality - perhaps they are in fact biased toward Obama and just lie more strategically than most Republican pundits. (Certainly not accusing them of that, only saying that I don't think we can glean much from the fact that they gave Romney higher numbers than he deserves.)


I don't think a prediction of a Romney loss would significantly spur Romney voters. As one example of many in my life, a co-worker voted for Romney because Obama "tanked his 401K". These are not people who (on average) read real news, IMO. They wouldn't see the prediction.


It seems like most of the polling organizations that did the worst during this election did poorly because they were way off-base with their likely voter model. No one in the press seemed to notice the scale of the turn-out machine Obama constructed since 2010. Many organizations (including, apparently, Romney's internal pollsters) assumed a turnout somewhere between '08 and '10 levels, when the Democrats managed to match or beat 2008 turnout in critical areas.

I'm not sure it's really a question of ideological bias so much as filtering the raw data through a backward-looking model.


I think it is a huge mistake to equate "polls that tend to be biased towards Candidate A" with "this institution favors Candidate A's policies".

The former can occur simply because of the assumptions that your model makes. Bias in "statistical bias" does not mean the same thing as "political bias".


I would like to see them add in David Rothschild at Yahoo[1], who's an expert in scoring rules and prediction markets and whose February (!) predictions were almost exactly on the money.

[1] http://news.yahoo.com/blogs/signal/


He works at Microsoft[1]. I have been following his predictions on PredictWise.com for the last four months.

[1] http://www.linkedin.com/pub/david-rothschild/12/651/681


Just to clarify, he recently moved from Yahoo Research to Microsoft Research when the former imploded. Same with David Pennock and others. The Yahoo folks in New York mostly switched to MSR (founding MSR-NYC) and the folks in California mostly switched to Google.

(I used to be part of that research group but I left at the end of 2010 to start Beeminder.)


This compares top-line numbers. I think a comparison of turnout-model accuracy would be more informative. Most of the models that erred predicted that the 2012 turnout would lean less Democratic than the 2008 turnout, based on the 2010 mid-term turnout and a (mis)perceived dampening of enthusiasm among Democrats and increased enthusiasm among Republicans. Based on exit polling, there was a drop-off of 7 million white voters, and I don't think anyone predicted that.


The reasons for the low turnout will be interesting to hear. Nationally, Romney got nearly 3 million fewer votes in '12 than McCain did in '08, and Obama got around 10 million fewer. Polls showed the election to be fairly close, so neither "giving up" nor "overconfidence" would tend to explain why a person didn't vote.


Still a bit early to tell how many people voted in total. Probably need to wait a week or two for the final result.

Actually, that makes this whole exercise premature; the rankings may change once the final data is in.


Does anybody know more about YouGov's methodology? On the face of it, I'm suspicious of their very low margin of error which seems substantially better than any other poll out there, but you can't deny that their polling was accurate.

Another thing that looks odd on that graph: the given polling numbers from Washington Times/Politico/Monmouth/Newsmax/Gravis/Fox/CNN/ARG all look identical despite their differing margins of error (which suggests their source data is different). What's going on there?


Here’s a good article from the YouGov site that describes their methodology: http://today.yougov.com/news/2012/10/23/obama-stays-ahead-ju...


Thanks for the link. In short, it seems like they only poll people they can confirm are actually registered to vote, based on the fact that:

"According to US census data, just 71% of eligible Americans are registered to vote. In 2008, almost 90% of those who were registered did vote. So in any poll, it is vital to know which respondents are on the register."

But there are a few more interesting subtleties in there too, so it's worth a read.


YouGov does online polling, which allows them to get very large sample sizes for national polls. I believe they had on the order of 36,000 for their final poll. Compare that to others with just ~1000.
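The sample-size advantage translates directly into margin of error, since the standard error of a proportion shrinks like 1/sqrt(n). A rough sketch (this assumes a simple random sample, which an online panel isn't, so treat it as a lower bound):

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion p from a
    simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# In percentage points:
print(round(100 * margin_of_error(1000), 2))   # -> 3.1
print(round(100 * margin_of_error(36000), 2))  # -> 0.52
```

A 36x larger sample cuts the margin of error by a factor of 6, which is consistent with YouGov's unusually small reported error bars.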


> Another thing that looks odd on that graph: the given polling numbers from Washington Times/Politico/Monmouth/Newsmax/Gravis/Fox/CNN/ARG all look identical despite their differing margins of error (which suggests their source data is different). What's going on there?

I dunno. IIRC, Drew Linzer of Votamatic even worried on his blog that the polling numbers were too close and that pollsters might be fudging their numbers to be more similar to each other (which would lead to substantial overconfidence in estimates). Still, the final results seem pretty accurate, so...


The article doesn't mention Sam Wang, whose confidence level for an Obama win was 99%.

http://election.princeton.edu/


He mentions in the comments that Wang's website doesn't appear to give the raw numbers he needs to analyze its accuracy.


Right. Wang seems to now have released raw numbers but did so after I finished my numbers and gave OP the go-ahead. I plan to add Wang and some other stuff today.


Update: Wang's Presidential states & shares have been added, but not his Senate races.


Wang's Senate races are now incorporated; they make his performance substantially more impressive.


Oooh, I wonder what "he" might choose for a username on a social media site!


I can guarantee, with a near 100% confidence rating, that I am not Sam Wang. Merely coincidental that we have the same last name and first initial. I mean there has to be at least 100 million people in the world with the surname Wang so the odds that we are the same person are pretty slim!


"I mean there has to be at least 100 million people in the world with the surname Wang so the odds that we are the same person are pretty slim!"

Except this seems to ignore that P(comment about Sam Wang | I am Sam Wang) is much higher than P(comment about Sam Wang | I am not Sam Wang) :-)
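The Bayes-rule point above can be made concrete with made-up numbers (all three probabilities below are illustrative assumptions, not estimates): even a large likelihood ratio barely moves a tiny prior.

```python
# Assumed, purely illustrative numbers:
prior = 1e-6                    # P(a random commenter here is Sam Wang)
p_comment_given_wang = 0.5      # P(comments about Sam Wang | is Sam Wang)
p_comment_given_not = 0.001     # P(comments about Sam Wang | is not Sam Wang)

# Bayes' rule: P(Wang | comment)
posterior = (p_comment_given_wang * prior) / (
    p_comment_given_wang * prior + p_comment_given_not * (1 - prior)
)
print(posterior)  # ~5e-4: still very unlikely, but ~500x the prior
```

So the joke is sound: the evidence genuinely favors "is Sam Wang", it just starts from a prior small enough that the posterior stays small.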


A reply below says that the username is a coincidence. I'm inclined to believe it. Why use your real name to troll HN over a pretty obscure topic that's not even related to the site's userbase?


Are you related?


Nope.


Actually the most accurate 2012 election pundit was Drew Linzer (http://votamatic.org/). Provided Florida goes Obama's way, he correctly predicted the electoral college - Obama 332, Romney 206.


Did you read the article? Drew Linzer is one of the specific people it rates Nate Silver against.


Didn't Nate Silver say that was the most likely split? There was a post about it on HN for like the past 2 days.

Edit: The confusion is probably that Silver was reporting the average electoral split as his prediction, when the mode is more relevant to what you're talking about. His average was almost never a number that was actually possible, since he was quoting it to the nearest 1/10th of a vote, so it's kind of unfair to punish him for not getting it exactly right.


You're right, I didn't realize Nate Silver was using an average. Also, after reading the article, it looks like not just the 2012 presidential electoral vote predictions are being compared, but other contests such as the Senate races as well.


On his Electoral Vote histogram, he'd had 332 as a big spike for weeks. The day before the election it was over 20% for that one outcome.


It's worth noting that if you used the 2008 results as your 2012 prediction, you would have gotten 49/51 on this scale.


Yeah, but if you simply predicted a repeat of 2008, you'd get a mediocre Brier score on your state victory predictions (because it would punish you for getting 49 compared to everyone who got 50 or 51), and the RMSE is even worse: the margins were different, and the electoral vote and popular vote were very different from 2008.


Slate Magazine found two other pundits who were as accurate as Silver.

http://www.slate.com/articles/news_and_politics/politics/201...


Josh Putnam is considered in the article, which looks at a finer level of detail than just correctly predicting the states.


Great article, but I disagree with the colouring on the first graph: if reality was within the poll's margin of error, I don't think it should be coloured, because that implies a bias that (probably) isn't actually there.


Isn't YouGov the XBox pollster? They're in the top group.


It's also interesting that their margin of error is significantly lower than any of the other polling organizations'. From what I gather on Wikipedia, they are internet-centric, although I'm not sure if that includes XBox.


Is it possible to be more accurate than nailing all fifty states?


Well, maybe yes. One strategy for predicting the outcomes would be to paint a quarter red on one side and blue on the other, flip it for each state, and assign that state's predicted outcome accordingly. That probably wouldn't be a very effective strategy, but it could still nail all 50 states (actually, 51 in this case). Silver's method is much, much more accurate than the coin-toss method.
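Just how unlikely is the coin-toss strategy to go 51-for-51? Each contest is an independent fair flip, so:

```python
# Probability that 51 independent fair coin flips all land on the
# correct outcome.
p_all_correct = 0.5 ** 51
print(p_all_correct)  # ~4.4e-16, roughly 1 in 2.25 quadrillion
```

So "possible" is doing a lot of work in the sentence above: the coin would nail the map about once in 2.25 quadrillion tries.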

Silver takes all the polls, even the crappy ones, and includes them in his calculation. If you're a bayesian you'd find this comforting because all of the evidence is included in the belief. There might be some handwringing about how important each poll really is. If you're not a bayesian, then you have some other weird strategy that might or might not work.

So really, the predictions are nice, but what we're after is a system that produces good predictions. It's not clear that Mr. Silver's is the best. Perhaps there is some horrible flaw an evil agent could exploit that just didn't get tickled this election. It's tough to say.


> If you're a bayesian you'd find this comforting because all of the evidence is included in the belief.

Off-topic, but this is my beef with the "Bayesians" - it's not all of the evidence; it's completely absurd to believe that all evidence can ever be accounted for, when considering anything.


"More of the evidence" seems like pretty much the same thing...


YOU DIDN'T READ THE ARTICLE DID YOU



