After correctly predicting 49(!) states in the 2008 presidential election and 50(!!) in 2012, Nate Silver is the King of Stat Nerds. He sits on a throne made entirely of TI-83 graphing calculators and he wears a gold-plated Casio calculator wristwatch. His fivethirtyeight blog was recently pulling in a staggering 20% of the New York Times web traffic, and while the rest of the pundits were reporting a dead heat in the polls, Silver’s model ended up at more than 90% for Obama leading up to election night.
Silver cut his teeth in baseball sabermetrics before moving to politics, and the backlash he faced in recent months strongly parallels the anti-stats sentiments that are still going strong in the sports world. But now Silver has been completely legitimized, and his recently released book (currently number 17 on Amazon and rising) is going to propel this whole “math” thing into the mainstream conversation, the likes of which we haven’t seen since Freakonomics and Moneyball.
Furthermore, the concept of “big data” will gain steam as the stories are written in the coming weeks about how the Obama campaign operated more efficiently with less money and mopped the floor with the Romney camp. The zeitgeist is changing, and data mining, reliance on sophisticated computer models, and quantitative analysis in general are going to become much more accepted in all parts of society. Fancy stats are certainly nothing new in sports, but I genuinely think that with Silver’s domination of the political prediction game, sports stats can hang on for the ride. I wanted to share some of my thoughts on the recent goings-on because I really think we are at a crossroads with the way big data and stats are going to be received.
OBJECTIVITY: The Numbers Don’t Care Who Wins
Try to imagine for a minute what would have happened if the election had been flipped: if Romney had been leading in Silver’s model going into the election and then won. The reception in the media surely would have been different, but I wonder if the classic “I don’t like your numbers, therefore they’re untrue” argument would have been as obvious, or if the ad hominem attacks would have flown so freely.
Silver faced so much backlash because his forecast flew directly in the face of the pundits, and not just the Republicans. The narrative coming from talking heads on both sides of the aisle was that the election was “Razor Tight” (their words, not mine) and right up to November 6th, they were saying it was anybody’s election. Of course, it was in their interest to push that narrative, as they all have airtime to fill and quotas to meet for their magazines or newspapers or blogs. The fact that Silver was making an objective forecast was upsetting because it ran so contrary to the status quo. The fivethirtyeight model simply input poll results, weighted them appropriately, and output a prediction of what the electoral votes would be. Silver didn’t set out to create a model that would show Obama was winning…he set out to create a model that would reflect the truth.
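To make the “input polls, weight them, output electoral votes” idea concrete, here is a toy Python sketch. To be clear, this is NOT Silver’s actual model, which is far more involved. The poll numbers are invented, weighting each poll by the square root of its sample size is just one plausible assumption, and the function names are made up for illustration. The only real numbers are the 2012 electoral vote counts for Ohio and Florida.

```python
from math import sqrt

# Hypothetical polls: (state, Obama %, Romney %, sample size).
# These figures are invented for illustration, not real 2012 polls.
POLLS = [
    ("Ohio", 50.0, 47.0, 900),
    ("Ohio", 48.0, 48.0, 400),
    ("Florida", 48.0, 49.0, 1600),
    ("Florida", 49.0, 49.0, 400),
]

# Actual electoral vote counts for these two states in 2012.
ELECTORAL_VOTES = {"Ohio": 18, "Florida": 29}


def weighted_margins(polls):
    """Average the Obama-minus-Romney margin in each state.

    Each poll is weighted by sqrt(sample size), an assumed scheme
    that simply lets bigger polls count for more.
    """
    totals = {}
    for state, obama, romney, sample_size in polls:
        weight = sqrt(sample_size)
        margin_sum, weight_sum = totals.get(state, (0.0, 0.0))
        totals[state] = (
            margin_sum + weight * (obama - romney),
            weight_sum + weight,
        )
    return {state: m / w for state, (m, w) in totals.items()}


def electoral_tally(margins, electoral_votes):
    """Give each state's electoral votes to whoever leads the weighted average."""
    obama = sum(ev for s, ev in electoral_votes.items() if margins[s] > 0)
    romney = sum(ev for s, ev in electoral_votes.items() if margins[s] < 0)
    return obama, romney


margins = weighted_margins(POLLS)
tally = electoral_tally(margins, ELECTORAL_VOTES)
print(margins, tally)
```

The point of the toy version is that nothing in it knows or cares which candidate the author is rooting for. Change the polls and the tally changes with them.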
Everything in the above paragraph is directly applicable to sports writers and sports researchers. The Old School is mad at the New School because they have made their living on their experience and their “gut calls” and now those things are being invalidated by numerical models. It has been said that the difference between researchers and pundits is that one starts with a question and constructs an argument based around the answers to that question, and the other starts with an argument and searches for facts that support it. The sports and political worlds have both become so reliant on pundits, who until recently have been using just the most basic of stats, that this whole “objectivity” thing is still very new, and therefore, scary.
But over time, both political stats and advanced sports stats will gain legitimacy as they persist in the mainstream consciousness. And as long as they are good stats, they will persist (more on this in a bit). And obviously Silver is CRUSHING IT for his part, so there’s no reason to think he’s going anywhere.
TRANSPARENCY: Hey, come over here and look at what’s behind this curtain!
With the Old School way of doing things, the pundits don’t have to be concerned with transparency…they say what they think, they explain it, and boom, there’s your transparency right there. Here’s my opinion, and by definition it’s unfalsifiable so…we’ll just keep going round and round because there is no right or wrong, it’s all just conjecture. With the New School data-driven approach, there is now a need for transparency that didn’t exist before. Without transparency, it’s all too easy to try to discredit stats by calling them biased (see the section on Objectivity above).
Sports researchers (and Nate Silver) walk a fine line when it comes to the transparency of data and formulas (formulae?). They must reveal enough about their methodology so that others can understand what they did, other researchers can help develop the measures, and laypeople can get what the numbers mean. But let’s be really real right now, the way to make money off these kinds of things is to withhold enough of the nuts and bolts so that nobody else can figure them out. Silver was incorrectly slammed for tinkering too much with his machine, oversampling Democrats for example, or generally just using his “Magical Formula.” But he does in fact go into great detail about his methodology. I guess what it boils down to is that research should be as transparent as is necessary, but each person will have a different level of comfort when it comes to divulging all the secrets. For Silver’s part, I have been impressed with how open he has been and how much of the nitty gritty he gets into.
It Helps (A Lot) to be Right
It’s hard to imagine Silver’s model being wrong, given that his forecast reached over 90% for Obama. If it had in fact been a closer election, say 55/45, then there would have been a much different discussion surrounding his forecast. But over the last two presidential elections, he is batting .990 (99 of 100 states called correctly), which is obviously a stellar showing. The nature of Silver’s forecast is such that he puts himself out on quite a limb, predicting each state and the overall winner. Yes, he does in fact give probabilities for each state, so if he were wrong he would be able to defend the forecast. But he’s NOT wrong, which is very good for him.
In hockey research I keep coming back to the whole Minnesota Wild thing from the 2011-12 season. Long story short, when the Wild were number one in the NHL, the advanced stats community was vehement that they would fall back to earth, based mostly on the PDO statistic. And they were RIGHT, the Wild did regress in a big way. The stats crowd will always have that one in their pocket, and if we ever get a season again, that stat will have a very established place in the stats conversation. Not just on the blogs and on Twitter, either: with the legitimacy gained from last year, I think it will creep into the mainstream conversation. Just as Silver’s model has been shown to be profoundly correct, the PDO analysis has proven correct and useful.
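For anyone who hasn’t run into PDO before: it is just a team’s shooting percentage plus its save percentage, usually measured at even strength. Across the whole league the two add up to 100, so a team sitting well above 100 is generally expected to drift back toward it. Here is a quick Python sketch of the arithmetic. The goal and shot totals are hypothetical round numbers picked to make the math easy, not the Wild’s actual 2011-12 figures, and the function name is my own.

```python
def pdo(goals_for, shots_for, goals_against, shots_against):
    """Return PDO: shooting percentage plus save percentage, on a 0-100 scale."""
    shooting_pct = 100.0 * goals_for / shots_for
    save_pct = 100.0 * (shots_against - goals_against) / shots_against
    return shooting_pct + save_pct


# Hypothetical team on a hot streak: 10% shooting, 95% goaltending.
hot_start = pdo(goals_for=50, shots_for=500, goals_against=30, shots_against=600)

# Hypothetical league-average team: 9% shooting, 91% goaltending.
average = pdo(goals_for=45, shots_for=500, goals_against=45, shots_against=500)

print(hot_start, average)
```

The hot-start team comes out around 105 and the average team around 100. The regression argument is simply that numbers like the first one rarely hold up over a full season.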
The quote from Moneyball that stuck with me the most (I can’t remember if it’s in the book too) is: “The first guy through the wall always gets bloody.” Nate Silver took a lot of shots in the media this year, but he has come through the other side squeaky clean. His forecast has been amazingly accurate, and I think it’s going to go a long way toward the legitimization of Big Data and statistical analysis not just in politics, but in sports and all other areas of society. I have been very impressed with the way Silver handles his work and conducts himself in public and in the media. I hope these ramblings have made some kind of sense; I am still working out what I think the lessons to be learned from his unprecedented success are. Let me know what you think about whether and how the fivethirtyeight model’s success will help usher in the Big Data movement and what lessons the sports stats community can learn. Don’t forget to follow me on Twitter @Hashtag_Hockey, and thanks so much for reading!