Mining Truth From Data Babel
By LEONARD MLODINOW
Published: October 23, 2012
A friend who was a pioneer in the computer games business used to marvel at how her company handled its projections of costs and revenue. “We performed exhaustive calculations, analyses and revisions,” she would tell me. “And we somehow always ended with numbers that justified our hiring the people and producing the games we had wanted to all along.” Those forecasts rarely proved accurate, but as long as the games were reasonably profitable, she said, you’d keep your job and get to create more unfounded projections for the next endeavor.
THE SIGNAL AND THE NOISE
Why So Many Predictions Fail — but Some Don’t
By Nate Silver
Illustrated. 534 pages. The Penguin Press. $27.95.
This doesn’t seem like any way to run a business — or a country. Yet, as Nate Silver, a blogger for The New York Times, points out in his book, “The Signal and the Noise,” studies show that from the stock pickers on Wall Street to the political pundits on our news channels, predictions offered with great certainty and voluminous justification prove, when evaluated later, to have had no predictive power at all. They are the equivalent of monkeys tossing darts.
As one who has both taught and written about such phenomena, I have long felt like leaning out my window to shout, “Network”-style, “I’m as mad as hell and I’m not going to take this anymore!” Judging by Mr. Silver’s lively prose — from energetic to outraged — I think he feels the same way.
The book’s title comes from electrical engineering, where a signal is something that conveys information, while noise is an unwanted, unmeaningful or random addition to the signal. Problems arise when the noise is as strong as, or stronger than, the signal. How do you recognize which is which?
Today the data we have available to make predictions has grown almost unimaginably large: it amounts to 2.5 quintillion bytes each day, Mr. Silver tells us, enough zeros and ones to fill a billion books of 10 million pages each. Our ability to tease the signal from the noise has not grown nearly as fast. As a result, we have plenty of data but lack the ability to extract truth from it and to build models that accurately predict the future that data portends.
Mr. Silver, just 34, is an expert at finding signal in noise. He is modest about his accomplishments, but he achieved a high profile when he created a brilliant and innovative computer program for forecasting the performance of baseball players, and later a system for predicting the outcome of political races. His political work had such success in the 2008 presidential election that it brought him extensive media coverage as well as a home at The Times for his blog, FiveThirtyEight.com, though some conservatives have been critical of his methods during this election cycle.
His knack wasn’t lost on book publishers, who, as he puts it, approached him “to capitalize on the success of books such as ‘Moneyball’ and ‘Freakonomics.’ ” Publishers are notorious for pronouncing that Book A will sell just a thousand copies, while Book B will sell a million, and then proving to have gotten everything right except for which was A and which was B. In this case, to judge by early sales, they forecast Mr. Silver’s potential correctly, and to judge by the friendly tone of the book, it couldn’t have happened to a nicer guy.
Healthily peppered throughout the book are answers to its subtitle, “Why So Many Predictions Fail — but Some Don’t”: we are fooled into thinking that random patterns are meaningful; we build models that are far more sensitive to our initial assumptions than we realize; we make approximations that are cruder than we realize; we focus on what is easiest to measure rather than on what is important; we are overconfident; we build models that rely too heavily on statistics, without enough theoretical understanding; and we unconsciously let biases based on expectation or self-interest affect our analysis.
Regarding why models do succeed, Mr. Silver provides just bits of advice (other than to avoid the failings listed above). Mostly he stresses an approach to statistics named after the British mathematician Thomas Bayes, who created a theory of how to adjust a subjective degree of belief rationally when new evidence presents itself.
Suppose that after reading a review, you initially believe that there is a 75 percent chance that you will like a certain book. Then, in a bookstore, you read the book’s first 10 pages. What, then, are the chances that you will like the book, given the additional information that you liked (or did not like) what you read? Bayes’s theory tells you how to update your initial guess in light of that new data. This may sound like an exercise that only a character in “The Big Bang Theory” would engage in, but neuroscientists have found that, on an unconscious level, our brains do naturally use Bayesian prediction.
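To make the bookstore example concrete, here is a minimal sketch of the update in Python. Only the 75 percent starting belief comes from the example above. The two likelihoods (how often a book you would end up liking has an opening you enjoy, and how often a book you would end up disliking does) are invented for illustration, as is the function name bayes_update.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return the updated belief in a hypothesis after seeing one piece of evidence.

    prior               -- belief in the hypothesis before the evidence
    p_evidence_if_true  -- chance of seeing the evidence if the hypothesis is true
    p_evidence_if_false -- chance of seeing the evidence if the hypothesis is false
    """
    weight_true = prior * p_evidence_if_true
    weight_false = (1.0 - prior) * p_evidence_if_false
    return weight_true / (weight_true + weight_false)


# From the example: a 75 percent initial belief that you will like the book.
PRIOR = 0.75

# Assumed for illustration only (not from the review or the book):
# you enjoy the first 10 pages of 90 percent of the books you end up liking,
# and of 30 percent of the books you end up disliking.
P_ENJOY_PAGES_IF_LIKE_BOOK = 0.9
P_ENJOY_PAGES_IF_DISLIKE_BOOK = 0.3

# Case 1: you enjoyed the first 10 pages.
belief_after_enjoying = bayes_update(
    PRIOR, P_ENJOY_PAGES_IF_LIKE_BOOK, P_ENJOY_PAGES_IF_DISLIKE_BOOK
)

# Case 2: you did not enjoy them, so use the complementary likelihoods.
belief_after_not_enjoying = bayes_update(
    PRIOR, 1.0 - P_ENJOY_PAGES_IF_LIKE_BOOK, 1.0 - P_ENJOY_PAGES_IF_DISLIKE_BOOK
)

print(f"Enjoyed the pages: {belief_after_enjoying:.2f}")          # 0.90
print(f"Did not enjoy them: {belief_after_not_enjoying:.2f}")     # 0.30
```

Under these assumed numbers, a good opening lifts the 75 percent guess to 90 percent, while a bad one drops it to 30 percent. Different likelihoods would give different answers; the point is only that the new belief depends both on where you started and on how strongly the evidence separates the two possibilities.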
Mr. Silver illustrates his dos and don’ts through a series of interesting essays that examine how predictions are made in fields including chess, baseball, weather forecasting, earthquake analysis and politics. A chapter on poker reveals a strange world in which a small number of inept but big-spending “fish” feed a much larger community of highly skilled sharks competing to make their living off the fish; a chapter on global warming is one of the most objective and honest analyses I’ve seen. (Mr. Silver concludes that the greenhouse effect almost certainly exists and will be exacerbated by man-made CO2 emissions.)
So with all this going for the book, as my mother would say, what’s not to like?
The main problem emerges immediately, in the introduction, where I found my innately Bayesian brain wondering: Where is this going? The same question came to mind in later essays: I wondered how what I was reading related to the larger thesis. At times Mr. Silver reports in depth on a topic of lesser importance, or he skates over an important topic only to return to it in a later chapter, where it is again discussed only briefly.
As a result, I found myself losing the signal for the noise. Fortunately, you will not be tested on whether you have properly grasped the signal, and even the noise makes for a good read.
Leonard Mlodinow is the author of “Subliminal: How Your Unconscious Mind Rules Your Behavior” and “The Drunkard’s Walk: How Randomness Rules Our Lives.”