Here’s all the interesting stuff in Nate Silver’s The Signal and the Noise

I’ve been immersing myself in statistics textbooks and software recently, as part of a class and my general career interests. So over a weekend ski trip, I took on a lighter version of the work I’m doing by reading Nate Silver’s The Signal and the Noise: Why So Many Predictions Fail—but Some Don’t. Silver has been thoroughly and widely reviewed since his book was published shortly after the presidential election, so I won’t need to introduce him or the basics of what he does. My post will just highlight some of the more interesting, surprising, and difficult-to-articulate material in the book, particularly the parts related to topics in economics we’ve already discussed.

***

At the heart of the book is a powerful and important idea: The truth about the world, as best as we can understand it, is probabilistic and statistical, but we humans are unsuited to statistical and probabilistic thinking. What does this mean? Let me give a couple of examples. People say things like, ‘Uncle Lenny died of cancer because he was a smoker.’ The unjustified certainty with which we use the word ‘because’ here reveals a lot about how we think. After all, a large fraction of us—smokers and non-smokers alike—will die of cancer. We know with certainty that smoking statistically increases one’s risk of developing cancer, but we can’t say for sure that Uncle Lenny in particular wouldn’t have developed cancer if he weren’t a smoker. A more rational thing to say would be ‘Uncle Lenny died after being long weakened by cancer. As a smoker, he was more likely than the general population to contract cancer, and so it’s likely that his smoking was a significant contributor among other risk factors to his development of a cancer that was sufficient to contribute to his death.’ But that lacks a certain pith. See the problem? We all know that the underlying reality is that a wide variety of risk factors contribute to and explain cancer, but we humans like to trade in certain and definitive statements about linear causation, rather than think about a complex system of inputs that take on different values with different probabilities and interact with each other dynamically to produce distributed outputs with certain probabilities. In other words, we humans like to reason with narratives and essences, but the truth of the world has more to do with statistical distributions, probabilities, and complex systems.

Other examples of essentialist thinking are: When we have a hot summer, we often say that it was caused by global warming; on the other hand, global-warming deniers will say that we cannot make any such attribution because we had hot summers from time to time even before the industrial revolution. The most realistic thing to say would be, “global warming is increasing our chances of experiencing such a hot summer, and thus the frequency of them.” Another example: people will say that, “Kiplagat is an excellent long-distance runner because he is a member of the Kalenjin tribe of Kenya.” This ‘because’ is not entirely justified, but neither is the offense that sensitive people take to this claim, when they say things like, “Not every Kalenjin is a fast runner! And some Americans from Oregon are great runners, too!” The most precise way of putting the underlying truth would be, “Kiplagat is an excellent long-distance runner. He is a member of the Kalenjin tribe, which is well-known to produce a hugely disproportionate share of the world’s best long-distance runners, so this is one major factor that explains his ability.”

Why are we bad at probabilistic thinking, and locked into definite, essentialist, narrative styles of thinking? The axiomatic part of the explanation is that our brains have evolved to reason the way they do because these styles of reasoning were advantageous throughout our evolutionary history. We humans have been built as delivery mechanisms for our masters and tyrants—our genes. They encode instructions to make us and our brains work in the ways that helped them (the genes) get passed down through the generations, rather than working in ways that lead us to the strict truth. Probabilistic thinking takes a lot of information gathering and computational power—things that either weren’t around in our evolutionary history, or were costly luxuries. So our brains have evolved mental shortcuts or ‘heuristics’—ways of thinking that give us the major advantages of surviving and reproducing in the probabilistic world, without all of the costs. Our ancestors did not think, and we do not think, ‘Three out of the last five of our encounters with members of this other tribe have ended badly; so we can conclude with X% certainty that the members of this other tribe that we see here are between Y% and Z% likely to have a hostile attitude toward us.’ Rather, our brains tell us, ‘This enemy is evil and dangerous; either run away or fight—look, I’ve already elevated your heart-rate and tightened up your forearms for you!’ I.e., it gives us an essential claim and a definitive takeaway. In the modern age, public authorities say, ‘Smoking will give you cancer,’ which gets across the main takeaway point and influences behavior in important ways, more powerfully than ‘Lots of smoking generally contributes to a higher probability of developing cancer.’

Our brains are also wired to see a lot of patterns, causation, and agency when they aren’t there. As Silver notes, the popular evo-psych explanation for this is that it is less costly to erroneously attribute a rustling in the woods to a dangerous predator, and to take occasional unnecessary precautions against it, than it is to always assume that all rustling just comes from the wind, and get eaten alive when a predator actually does appear. Since missing a real pattern is more costly than falsely seeing an unreal one, we tend to see more patterns than there really are, and believe that we can discern a predictable pattern in the movement of stock prices, or get impressed by models that, e.g., predict presidential elections using only two simple metrics, finding them more compelling than predictions that rely on aggregates of on-the-ground polling. Our basic innate failures at thinking statistically are reinforced by the culture around us, which accommodates/manipulates us (in good ways and in bad) by appealing to our need for narrative approaches to understanding the world.

But now we live in the modern age. Our needs are different than they were in our evolutionary history, and our evolved psychology should not be destiny. We need to learn to reason more truly—which means probabilistically and statistically. Silver explores how we have succeeded and failed in doing this with examples drawn from baseball, online poker, politics, meteorology, climate science, finance, etc.

***

Why is it so important that we learn to reason probabilistically and statistically? There are two main reasons. The first is very practical, and the second is more theoretical but ultimately very important. First, we obviously base our plans for and investments in the future around our predictions of what the future will be like. But the future cannot be known with absolute certainty, so we need to make rational decisions around a probability distribution of outcomes. For example, in chapter 5, Silver recounts the example of a flood that the public authorities predicted would crest at 48 feet—since the levees protecting a neighboring area were 51 feet tall, locals assumed they were being told that they were safe. But the 48-foot prediction obviously had a margin of error and, this time, it was off by more than 3 feet, and the levees were overtopped. Given how dangerous it is to be in a flooded area, the local residents, had they understood the margin of error in the prediction and the probability of the levees being overtopped, would have decided it was worth evacuating as a precaution—but they weren’t made to understand that the authorities’ prediction in fact entailed a range of possibilities. This is a very concrete example of the ubiquitous problem of reasoning, planning, and acting around a single expected outcome, rather than weighting a range of possible outcomes.
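Here’s a minimal sketch of the arithmetic the residents never saw, assuming, purely for illustration, that the forecast error is roughly normal with a spread of a few feet (the actual historical margin of error isn’t given above):

```python
# A minimal sketch of the levee example: given a point forecast and an assumed
# spread of historical forecast errors, what is the chance the river tops the levees?
# The error standard deviation is an assumption for illustration, not the book's figure.
from scipy.stats import norm

forecast_crest_ft = 48.0   # the authorities' point prediction
levee_height_ft = 51.0     # height of the levees protecting the area
error_sd_ft = 4.0          # ASSUMED standard deviation of past forecast errors

p_overtop = 1 - norm.cdf(levee_height_ft, loc=forecast_crest_ft, scale=error_sd_ft)
print(f"Chance the crest exceeds the levees: {p_overtop:.0%}")  # about 23% under these assumptions
```

Even a modest error spread leaves a very real chance of the levees being overtopped, which is exactly the kind of number residents would have wanted before deciding whether to evacuate.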

Another example of this is how climate scientists don’t feel like they can give probabilistic statements to the public, like, ‘The most likely outcome is that, on our current path, global temperatures will rise on average 2 degrees Celsius over the next 100 years, and we have 90% certainty that this increase will range between .5 and 3 degrees. Additionally, we fear the possibility that there could be as-yet-imperfectly-understood feedback loops in the climate which could, with 5% probability, raise temperatures by as much as 8 degrees over the next century–while the chance of this is low, the potential costs are so high that we must consider it in our public-policy responses. Additionally, the coming decade is expected to be hotter than any in the last 100 years, but there is a 10% possibility that it will be a cool decade, from random chance.’ The public—you and I—are not good at dealing with these kinds of probabilistic statements. We demand stronger-sounding, definitive predictions—they resonate with us and persuade us, because they’re what our brains are comfortable dealing with. And a lot of the confusion in public debates surrounding scientific matters comes from our demand for definitive answers, where science can only offer a range of probabilities and influences. Climate scientist Michael Mann was quoted in the book as saying, “Where you have to draw the line is to be very clear about where the uncertainties are, but to not have our statements be so laden in uncertainty that no one even listens to what we’re saying.”

But the second, more fundamental reason why we need to get better at probabilistic prediction is that offering and then testing predictions is the basis of scientific progress. Good models, particularly those that model dynamic systems, should offer a range of probable predictions—if we can’t deal with those ranges, we can’t test which models are the best. That is, we as a society would be ill-advised to say to climate scientists, ‘You predicted that temperatures would rise this decade, but they didn’t—neener neener.’ Rather, we should be savvy enough to understand that there’s a margin of error in every prediction, and that the impact of some trends can be obscured by random noise in the short run, and so the climate scientists’ claim that temperatures are rising can still be true even if the rise did not show up in this particular decade.
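To see how easily noise can hide a trend over a single decade, here’s a toy simulation; the trend and noise sizes are numbers I made up for illustration, not real climate figures:

```python
# A toy simulation of how year-to-year noise can mask a real trend in the short run.
# The trend and noise magnitudes are illustrative assumptions, not real climate data.
import numpy as np

rng = np.random.default_rng(0)
trend_per_year = 0.02   # assumed underlying warming, degrees C per year
noise_sd = 0.15         # assumed year-to-year natural variability, degrees C

n_sims, n_years = 10_000, 10
cooling_decades = 0
for _ in range(n_sims):
    temps = trend_per_year * np.arange(n_years) + rng.normal(0, noise_sd, n_years)
    slope = np.polyfit(np.arange(n_years), temps, 1)[0]  # fitted trend over the decade
    if slope < 0:
        cooling_decades += 1

print(f"Decades that look like cooling despite a real warming trend: "
      f"{cooling_decades / n_sims:.0%}")
```

Under these made-up numbers, roughly one decade in ten shows apparent cooling even though the underlying trend is upward the whole time.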

The rest of this post consists of some of the more interesting of the book’s ideas about statistical reasoning, and some of the barriers thereto, after a brief digression on economics.

***

I’ve written in the past about the Efficient Market Hypothesis and about the value of short-selling, so it piqued my interest when Nate made some interesting points that related the two. One challenge Nate presents to the EMH is the two ‘obvious’-seeming bubbles that we have experienced in recent memory—the late 90s tech bubble, and the mid-2000s housing bubble. Now, it’s obviously very easy to call bubbles with the benefit of hindsight. But let’s accept for the sake of argument that we really could have seen these bubbles coming and popping—in the 90s, P/E ratios were hugely out of whack, and in the 2000s, housing prices had accelerated at rates that no underlying factor seemed to explain. The question is, why didn’t people short these markets and correct their exorbitant prices earlier?

Well, part of the problem is that in certain markets it can be difficult to accumulate a short position large enough to move prices to a more rational level without incurring huge transaction costs. But Silver’s more interesting argument is that institutional traders are too rational and too risk-averse relative to their own career incentives. Counterintuitive, right? What does Silver mean? Let’s imagine that we’re in a market that looks a little overheated. Suppose there’s a market index that currently stands at 200, and you’re an analyst at a mutual fund and you think that there’s a 1/3 chance that the market will crash to an index of 50 this year. That’s a big deal. But there’s still a 2/3 chance that the party won’t stop just this year, and the market index will rise to 220 (a 10% return—not bad). In this scenario, the bet with the highest expected return is to short the market, a bet with an expected return of about $.18 on the dollar ((1/3 * 150 – 2/3 * 20) / 200). Going long in the market has an expected loss of the same. So if your goal is to maximize your expected return, you go short, obviously.
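Spelled out, that arithmetic looks like this (it just reproduces the calculation in the parenthesis above):

```python
# The expected-value arithmetic from the example above, spelled out.
index_now = 200
p_crash, crash_level = 1/3, 50    # one-in-three chance of a crash to 50
p_rally, rally_level = 2/3, 220   # otherwise the index rises 10% to 220

# Payoff per dollar shorted: you gain what the index loses, and vice versa.
ev_short = (p_crash * (index_now - crash_level) +
            p_rally * (index_now - rally_level)) / index_now
ev_long = -ev_short  # going long is the mirror image in this two-outcome setup

print(f"Expected return of shorting:   {ev_short:+.2f} per dollar")  # about +0.18
print(f"Expected return of going long: {ev_long:+.2f} per dollar")   # about -0.18
```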

The problem is, institutional traders don’t have an incentive to maximize their expected return, because the money they trade is not their own. Their first incentive is to cover their asses, so they don’t get fired. And if, in this scenario, you prophesy doom and a market crash, and short the whole market two years in a row, while the market is still rising, you’ll have a lot of outraged clients and you will get fired. And that’s the most likely outcome: there’s a 2/3 probability that the bull market will continue, and return another 10% this year. If you go along with the crowd, and continue to buy into a bull market that becomes overpriced then, well, when the music stops and the bubble pops, you’ll lose your clients’ money, but you won’t look any worse than any of your competitors. So this may be why a lot of bubbles don’t get popped in a timely fashion. It’s not that institutional investors are irrational—it’s that they are being rational relative to their career incentives, which are not well-aligned with market efficiency as a whole.

What’s the solution to this problem? Well, part of it is to get more really good short-sellers. One interesting tradeoff here is that the market is most efficient when people are (1) smart and (2) putting their own money on the line. Right now, we’re seeing a transition in which mutual funds and such are becoming more and more common, and so a larger portion of trading that is done in financial markets comes from institutions rather than from individual retail investors. These institutional traders may be smarter than independent retail investors, but they’re not betting their own money, which means their incentives are not well-aligned with market efficiency—the mutual fund’s first incentive is to avoid losing clients, who will bail out if the fund misses out on a bull market in the short term. So institutional investors will face a lot of pressure to keep buying into bull markets even when they know better. In short: don’t expect bubbles to go away anytime soon.

***

Silver discusses some of the implications of the fact that predictions themselves can change the behavior they aim to predict. This is particularly pertinent in epidemiology and economics. For example, if the public authorities successfully inform the public that, this year, the flu is expected to be especially virulent and widespread in Boston, Bostonians will be especially inclined to get vaccinated, which will then, in turn, cancel the prediction. So was the prediction wrong? Maybe, but thank God it was! In economics, if the economics establishment sees that some developing country is implementing all of the ‘right’ policies, it will predict lots of economic growth from that country—this will cause a lot of investment and optimism and talent to flow into that country, which could ‘fulfill’ the prediction. On the most practical level, this means that in these scenarios it’s very difficult to issue and then assess the accuracy of predictions. On a philosophical level, it may mean that a perfect prediction involving human social behavior is impossible, because it would require a recursive model in which the prediction itself was one of the inputs.

This reasoning raises a moral quandary. Should forecasters issue their predictions strategically? We know that public-health authorities’ predictions about how bad a flu outbreak will be will influence how many people get immunizations. The Fed’s predictions about the future of the economy influence companies’ plans for the future, and those plans can then fulfill the Fed’s predictions (i.e., if a company is persuaded by the Fed that there will be an economic recovery, then it will ramp up its production and hiring right now, in order to meet that future demand, which will help fulfill that prophecy). Should these and similarly situated agencies therefore issue their predictions not descriptively, but strategically, i.e., with an eye to influencing our behavior in positive ways? In practice, I assume the agencies definitely do. The Fed has consistently and optimistically over-predicted the path of the U.S. economy since the financial crisis. This is embarrassing for it, but any cavalier expression of pessimism from the Fed very well could have tilted the U.S. into a double-dip recession. The obvious problem is that when public agencies make their predictions strategically rather than descriptively, they could, over the long run, dilute the power of their predictions in the eyes of the public—i.e., people might start to automatically discount the authorities’ claims, thinking “this year’s outbreak of avian flu, much like last year’s, will affect 10^3 fewer people than the authorities suggest, so I don’t actually need to get a vaccination.”

***

Silver offers a lot of helpful reminders that rationality requires us to go beyond ‘results-oriented thinking.’ On televised poker, for example, commentators praise the wisdom and perspicacity of players who bet big when their hands weren’t actually all that strong, statistically speaking, and who win either because (1) they caught a break on the last cards dealt (in Texas hold’em style) or (2) they were lucky enough that everyone else had even weaker hands. But while commentators may praise these players’ prescience, we should call these bets what they are—dumb luck. We shouldn’t evaluate people’s decisions after the fact using perfect information (what we know now). We should evaluate how rationally they acted given the information they had access to at the time. And betting big with a weak hand, without any information that other players’ hands are even weaker, is never the smart or rational thing to do—even though it will happen to pay off on occasion.

***

‘Big Data’ is a modish term right now. An essayist in Wired claimed a few years ago that as we gain more and more data, the need for theory will disappear. Silver argues that this is just the opposite of the truth. As the amount of information we have multiplies over and over again, the statistical ‘noise’ in it grows even faster than the signal. With all this data, there will be more spurious correlations, which the data alone will not be able to sort out. In the world of Big Data we’re going to need a lot of really sound theory to figure out what causal mechanisms (or lack thereof) lie behind the data we have, and which impressive-seeming correlations are spurious, explained by random chance. So theory will become more important, not less.
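A quick way to convince yourself of this: generate purely random ‘predictors’ and see how strongly the best of them happens to correlate with a target it has nothing to do with. A sketch, with made-up sizes:

```python
# Why more data series means more spurious correlations: purely random "predictors"
# will, by chance alone, include some that track an unrelated target quite closely.
import numpy as np

rng = np.random.default_rng(42)
n_obs = 30                          # e.g., 30 annual observations of some target
target = rng.normal(size=n_obs)     # the target series: pure noise

for n_predictors in (10, 1_000, 100_000):
    predictors = rng.normal(size=(n_predictors, n_obs))   # random, unrelated series
    best = max(abs(np.corrcoef(target, p)[0, 1]) for p in predictors)
    print(f"{n_predictors:>7} random predictors -> strongest |correlation| = {best:.2f}")
```

With only a handful of candidate predictors the best chance correlation is modest; with a hundred thousand of them, some will look impressively, and meaninglessly, correlated.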

***

One big takeaway for me, as I read Silver’s accounts of how statistical methods have been applied to improve a variety of fields, is that we are very easily impressed, sometimes intimidated, by mathematical renderings of ideas, but statistics really is not rocket science. The computations that statistical software can do at first seem complex, but they’re all ultimately built on relatively easy, intuitive, concrete logical steps. Same with models: the assumptions on which we build models, and the principles we use to tease out causation and such from within the wealth of data, are ultimately pretty intuitive and straightforward and based in basic logical inference. In reading Silver’s account of how the ratings agencies got mortgage-backed securities wrong in the run-up to the financial crisis, I was astonished by just how simple the agencies’ models were. That is, even those of us who like to bash the financial sector still tend to assume there’s some sophisticated stuff beyond our ken going on in there. But Silver reports, for example, that the ratings agencies had considered the possibility that the housing market might decline in the U.S., but continued to assume that defaults on mortgages would remain uncorrelated through such a period. The idea that mortgage defaults would always exhibit independence—and that the rate of default as a whole could not be changed by global economic conditions—is flatly ridiculous to anybody who takes a moment to think imaginatively about how a recession could affect a housing market. But because the ratings agencies’ ratings were dressed up in Models based around Numbers on Spreadsheets, Serious People concluded that they were Serious Ratings. A lesson for the future: Don’t let yourself be bullied into denying obvious truths or accepting obvious falsehoods just because they have been formulated in mathematical notation. A seemingly sophisticated mathematical model is in truth a very bad one if its basic assumptions are incorrect.
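To see how much work the independence assumption does, consider a toy pool of mortgages; the numbers here are illustrative, not the agencies’ actual inputs:

```python
# Why the independence assumption matters so much: the chance that an entire pool
# of mortgages goes bad is vanishingly small if defaults are independent, and simply
# equal to the single-mortgage default rate if they all move together.
# These numbers are illustrative assumptions, not the rating agencies' actual inputs.
p_default = 0.05   # assumed chance that any one mortgage defaults
n_mortgages = 5    # a small pool, for illustration

p_all_default_independent = p_default ** n_mortgages
p_all_default_fully_correlated = p_default

print(f"P(all {n_mortgages} default), independent:      1 in {1 / p_all_default_independent:,.0f}")
print(f"P(all {n_mortgages} default), fully correlated: 1 in {1 / p_all_default_fully_correlated:,.0f}")
```

The same pool of loans looks either nearly riskless or quite risky depending entirely on that one assumption, which is exactly the assumption a nationwide housing downturn breaks.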

The lesson here is not that we should eschew statistical methods—it’s that we should get in on the game and improve the models, instead of being cowed by the people who wield them. Indeed, another striking part of the book was Silver’s admission that his own famous political-prediction model on his FiveThirtyEight blog is not terribly sophisticated—it’s only been so successful because everyone else’s standards in the political world have been so low. And the statistical methods that revolutionized baseball drafting and trading, as recalled in Moneyball, weren’t that sophisticated either—they were just low-hanging fruit that hadn’t been picked yet.

***

The more polemical parts of the book center on Silver’s righteous demand that pundits be held to account for their predictions. Silver points out that political pundits, like those who appear on The McLaughlin Group, regularly get their forecasts wrong in very predictable ways, and never get called out on them or punished. As one who, like Silver, gets angry when people make plainly descriptively untrue statements about the world, I did enjoy his righteous outrage. But I think that in this, he (and I) get something basically wrong—namely, being a political pundit and appearing on The McLaughlin Group are fundamentally not truth-seeking activities, and so their failure to deliver truth should be completely unsurprising and probably doesn’t even qualify as a real indictment in the pundits’ minds. The goal of the people engaged in these activities is not to uncover the truth, but to root for their team. So of course the Republican pundits on The McLaughlin Group always predict Republican electoral victories, as the Democrats predict Democratic victories. That’s what they’re there for.

More fundamentally, I think Silver underestimates how uncommon it is for people to think about the world in a descriptive, truth-seeking manner. Most of us, most of the time, are not engaged in truth-seeking activity. Most of us typically choose the utterances we issue about the world on the basis of loyalties, emotional moral commitments, etc. Thinking about the world descriptively is just not the natural mode for most people. When a Red Sox fan, in the middle of a bad season, says something like, “The Red Sox are going to win this game against the Yankees,” we shouldn’t actually take him to mean, “The Red Sox are certain to win this game” or even necessarily “The Red Sox have a better than even chance of winning this game.” Rather, the real content of his statement is better translated as, “Rah, rah, goooo Red Sox!” For most people, statements that they phrase as predictions are not a matter of descriptive analysis of the world—they’re statements of affiliation, hope, moral self-expression, etc. The social scientific and descriptive mindsets are very rare and unnatural for humans, and if we’re going to get angry about people’s failures in this respect, we’re going to be angry pretty much all the time.

But I do agree with the basic takeaway from this polemic: Silver wants to make betting markets a more common, acceptable, and widely-expected thing. If we were forced to publicly put our money where our mouths are, we might be more serious and humble about the predictions we make about the future, which should improve their quality. I’ve long relied on Intrade to give me good, serious predictive insights into areas where I have no expertise, and do wish liquid betting markets like it, where I can gain credible insights into all kinds of areas, were more common and entirely legal.

***

A lot of expert reasoning goes into building a good model with which to make a prediction. But what about us general members of the public who don’t have the time to acquire expertise and build our own models? How should we figure out what to believe about the future? Silver provides some evidence that aggregations of respectable forecasters’ predictions (i.e., forecasters who have historically done very well) are almost always better than any individual’s forecasts. E.g., an index that averages the predictions of 70 economists consulted on their expectations for GDP growth over the next year does much better than the predictions of any one of those economists. So in general, when we’re outside of our expertise, our best bet is to rely on weighted averages of expert estimates.
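A toy simulation shows why averaging helps; the ‘true’ growth figure and the size of each forecaster’s error are made up:

```python
# Why averaging independent forecasts helps: each forecaster sees the truth plus
# independent noise, and the average of their guesses lands closer to the truth
# than a typical individual guess does. All numbers here are made up.
import numpy as np

rng = np.random.default_rng(1)
true_gdp_growth = 2.5            # the (unknown) truth, in percent -- assumed
n_forecasters, n_trials = 70, 10_000

individual_errors, average_errors = [], []
for _ in range(n_trials):
    forecasts = true_gdp_growth + rng.normal(0, 1.0, n_forecasters)  # 1-point idiosyncratic errors
    individual_errors.append(abs(forecasts[0] - true_gdp_growth))    # one arbitrary forecaster
    average_errors.append(abs(forecasts.mean() - true_gdp_growth))   # the consensus average

print(f"Typical individual error:           {np.mean(individual_errors):.2f} points")
print(f"Error of the 70-forecaster average: {np.mean(average_errors):.2f} points")
```

Note that the whole benefit here comes from the forecasters’ errors being independent of one another, which is exactly the catch discussed next.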

But there’s an interesting catch here: While aggregates of expert predictions generally do better than any individual expert, this fact depends upon the experts doing their work independently. For example, Intrade has done even better than Nate Silver in predicting the most recent election cycles, according to Justin Wolfers’ metrics. So does that mean that Nate Silver should throw away his blog, and just retweet Intrade’s numbers? No. And the reason is that Intrade’s prices are strongly affected by Silver’s predictions. So if Silver were, in turn, to base his model around Intrade, we would get a circular process that would amplify a lot of statistical noise. An aggregation ideally draws on the wisdom of crowds, the law of large numbers, and the cancelling-out of biases. This doesn’t work if the forecasts you’re aggregating are based on each other.

Aggregations of predictions are also usually better than achieving consensus. Locking experts together and forcing them to all agree may give outsized influence to the opinions of charismatic, forceful personalities, which undermines the advantages of aggregation.

***

Nate argues, persuasively, that we actually are getting much better at predicting the future in a variety of fields, a notable example of which is meteorology. But one interesting and telling Fun Fact is that while meteorologists’ actual predictions are getting very good, the forecasts they feel compelled to present to the public are deliberately less accurate. For example, the weather forecasts we see on T.V. have a ‘wet bias.’ When there is only a 5-10% chance of rain, the T.V. forecasters will say that there is a 30% chance, because when people hear a 5-10% chance they treat it as essentially impossible, and become outraged if they plan a picnic that subsequently gets rained on, etc. So to abate righteous outrage, weather forecasters have found it necessary to over-report the probability of rain.

Meteorologists’ models are getting better. We humans just aren’t keeping pace, in terms of learning to think in probabilities.

***

But outside of the physical sciences, whose systems are regulated by well-known laws, we tend to suck at forecasting. Few political scientists forecast the downfall of the Soviet Union. Nate attributes this failure to political biases—right-leaning people were unwilling to see that Gorbachev actually was a sincere reformer, while left-leaning people were unwilling to see how deeply flawed the USSR’s fundamental economic model was. Few economists ‘predicted’ the most recent recession even at points in time when, as later statistics would reveal, we were already in the midst of it. Etc., etc.

***

Silver points out that predictions based on models of phenomena with exponential or power-law properties seem hugely unreliable to us humans who evaluate these models’ predictions in linear terms. A slight change in a parameter can have huge implications for the prediction an exponential model makes. This can cause a funny dissonance: a researcher might think her model is pretty good if its predictions come within an order of magnitude of observations, because this indicates that her basic parameters are in the right ballpark. But to a person who thinks in linear terms, an order-of-magnitude error looks like a huge mistake.
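A small example of this sensitivity, using a generic power-law frequency relation in the spirit of the book’s earthquake discussion (the constants are made up):

```python
# How a small change in one exponent parameter turns into a huge change in the
# prediction. The functional form is a generic power-law frequency relation;
# the constants are illustrative, not fitted to real data.
def yearly_count(magnitude, a=8.0, b=1.0):
    """Expected number of events per year at or above `magnitude`: 10**(a - b*magnitude)."""
    return 10 ** (a - b * magnitude)

for b in (0.90, 1.00, 1.10):   # a ten-percent wiggle in one parameter...
    print(f"b = {b:.2f}: expected magnitude-8 events per year = {yearly_count(8, b=b):.3f}")
# ...moves the prediction by more than a factor of six in each direction.
```

To the researcher, all three parameter values are close; to anyone reading the predictions linearly, they differ by more than an order of magnitude end to end.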

***

Silver briefly gestures at a thing that the economist Deirdre McCloskey has often pointed out—that our use of ‘statistical significance’ in social science is arbitrary and philosophically unjustified. What is statistical significance? Let me back up and explain the basics: Suppose we are interested in establishing whether there is a relationship, among grown adults, between age and weight—i.e., are 50-year-olds generally heavier than 40- and 35-year-olds? Suppose we sampled, say, 200 people between 35 and 50, wrote down their ages and weights, and constructed a dataset. Suppose we did a linear regression analysis on the data, which revealed a positive ‘slope,’ representing the average impact that an extra year of life had on weight in the sample. Could we be confident that in general, for the population of people between 35 and 50 as a whole, this relationship holds? Not necessarily. Theoretically, there’s always a chance that our sample is different—by pure chance—from the general population, and so the relationship in our sample cannot be generalized. There’s a possibility that the relationship we observed between age and weight is not a true relationship at all, but just an artifact of the particular people who happened to end up in our sample. And (as long as our sample was truly randomly selected from the population) we can actually quantify this worry, using the data’s standard deviation and the size of our sample. This is what the p-value does: a p-value of .05 means that, if there were in fact no relationship between age and weight in the general population, there would be only a 5% chance of seeing a relationship at least as strong as the one in our sample, purely by chance. In contemporary academe, social scientists by convention will generally publish only results that are ‘statistically significant’ at the 5% level, i.e., where the p-value is below .05. But applying this rule mechanically actually doesn’t make much sense. It means that today, a statistical analysis that produces a result with a p-value of .050 will get published, while one with a p-value of .051 will not, even though the underlying realities are almost indistinguishable. There’s no fundamental philosophical reason for setting our general p-value cutoff at .05—indeed, the basic reason we do this is that we have 5 fingers. In practice, this contributes to the rejection of some true results and the acceptance of some false results. Clearing the bar of ‘statistical significance’ is no guarantee of truth: if researchers test 100 hypothesized effects that don’t actually exist, we should expect about 5 of them to clear the .05 threshold by chance alone, and since journals preferentially publish significant results, those false positives are exactly the ones that end up in print. (This is, by the way, before we even consider the possibility of the data being incorrectly collected.)
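To make the convention concrete, here’s a small simulation of what the .05 threshold does and doesn’t buy you, using the age-and-weight setup above (all of the numbers are invented):

```python
# Simulate many samples in which age and weight are truly UNRELATED, run the same
# regression on each, and count how often the result still comes out "significant".
# The sample size and weight distribution are invented for illustration.
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(7)
n_samples, n_people = 10_000, 200
false_positives = 0
for _ in range(n_samples):
    ages = rng.uniform(35, 50, n_people)
    weights = rng.normal(80, 15, n_people)   # weights drawn independently of age
    if linregress(ages, weights).pvalue < 0.05:
        false_positives += 1

print(f"'Significant' slopes found where no relationship exists: "
      f"{false_positives / n_samples:.1%}")   # about 5%, by construction
```

That 5% is exactly what the threshold promises, no more: it caps how often a nonexistent relationship sneaks through, and says nothing by itself about how many published ‘significant’ findings are true.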

***

On page 379, Silver has what is possibly the greatest typo in history: “ ‘At NASA, I finally realized that the definition of rocket science is using relatively simple psychics to solve complex problems,’ Rood told me.” (I am envisioning NASA scientists carefully scribbling down the pronouncements of glassy-eyed, slow-spoken Tarot-card readers.)

***

The final chapter in the book, on terrorism, was fascinating to me, because with terrorism, as with other phenomena, we can find statistical regularities in the data, with no obvious causal mechanism to explain those regularities. In the case of terrorism, there is a power-law distribution relating the frequencies and death tolls of terrorist attacks. One horrible feature of the power-law distribution of terrorist attacks is that we should predict that most deaths from terrorism will come from the very improbable, high-impact attacks. So over the long term, we’d be justified in putting more effort into preventing, e.g., a nuclear attack on a major city that may never happen, as opposed to a larger number of small-scale terrorist attacks. Silver even argues that Israel has effectively adopted a policy of ‘accepting’ a number of smaller-scale attacks, freeing the country to put substantial effort into stopping the very large-scale attacks—he shows data suggesting that Israel has been able to thus ‘bend the curve,’ reducing the total number of deaths from terrorism in the country below what we would otherwise expect.
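Here’s a sketch of how a heavy-tailed severity distribution concentrates the toll in a handful of extreme events; the exponent and scale are made up, chosen only to produce a heavy tail, and are not fitted to real attack data:

```python
# Simulate attack severities from a heavy-tailed (Pareto) distribution and measure
# how much of the total toll comes from the worst handful of events.
# The tail exponent and scale are illustrative assumptions, not estimates from real data.
import numpy as np

rng = np.random.default_rng(3)
tail_exponent = 1.2                                   # assumed: smaller means a heavier tail
severities = rng.pareto(tail_exponent, 100_000) + 1   # simulated death tolls, arbitrary units

severities.sort()
top_1_percent_share = severities[-1_000:].sum() / severities.sum()
print(f"Share of the total toll caused by the worst 1% of attacks: {top_1_percent_share:.0%}")
# Typically a large fraction of the total comes from that thin sliver of extreme events.
```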

***

But the big thing I was hoping to get from this book was a better understanding of the vaunted revolution in statistics in which Bayesian interpretations and ideas are supplanting the previously dominant ‘frequentist’ approach. Yet I didn’t come away with a sound understanding of Bayesian statistics beyond the triviality that it involves revising predictions as new information presents itself. Silver told us that the idea can be formulated as a simple mathematical identity: It requires us to give weights to the ‘prior’ probability of a thing being true; the probability that the new information would present itself if the thing were true; and the probability of the information presenting itself but the thing still being false. With these three quantities we can supposedly calculate a ‘posterior probability,’ or our new assessment of the probability that the phenomenon is true. While I will learn more about the Bayesian approach on my own, Silver really did not convey this identity on a mathematical level, or help the reader understand its force on a conceptual level.
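For what it’s worth, the identity itself is short enough to write out; the variable names follow the description above, and the numbers in the example are invented:

```python
# Bayes' rule, written out with names matching the description above: a prior
# probability, the chance of the new evidence appearing if the claim is true, and
# the chance of it appearing if the claim is false.
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Updated probability that the claim is true, given that the evidence appeared."""
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1 - prior) * p_evidence_if_false
    return numerator / denominator

# Invented example: a claim we initially give a 4% chance of being true, and
# evidence that is 75% likely to appear if it is true but only 10% likely if it is not.
print(f"{posterior(0.04, 0.75, 0.10):.2f}")  # about 0.24 -- the evidence lifts 4% to roughly 24%
```

Seeing the update computed this way also makes the conceptual point plain: a single piece of fairly strong evidence moves a low prior up substantially, but nowhere near to certainty.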

Overall, then, I found the book disappointing in its substantive, conceptual, and theoretical content. A lot of the big takeaways of the book are moral-virtue lessons, like, “Always keep an open mind and be open to revising your theories as new information presents itself”; “Consult a wide array of sources of information and expertise in forming your theories and predictions”; “We can never be perfect in our predictions—but we can get less wrong.” All great advice—but not what I wanted to get from the time I put into the book. The sections on chess and poker are interesting and good journalism, too, but they will do little to advance the reader’s understanding of statistics, model-building, or the oft-heralded “Bayesian revolution,” etc. But maybe I’m being a snob and wanting more of a challenge than a book could pose if it expected to sell.

–Matthew Shaffer