The elegant mathematical models of classical mechanics depict a world in which objects exhibit deterministic behaviors. Within the accuracy of human-scale measurements, these models make essentially perfect predictions.
But, once you start dealing with atoms, molecules and exotic subatomic particles, you find yourself in a very different world, one with somewhat counter-intuitive behaviors governed by the laws of quantum mechanics. The orderly, predictable models of classical physics have now given way to wave functions, uncertainty principles, quantum tunneling and wave-particle dualities.
But, the world of the very small is not the only one with non-deterministic behaviors. Highly complex systems behave non-deterministically too, especially those whose components and interrelationships are themselves quite complex. This is the case with social systems, which are based on individuals, groups, and institutions. It’s quite a challenge to make accurate predictions in such systems due to the dynamic nature of human behaviors. Terms like emergence, long tails, and butterfly effects - every bit as fanciful as quarks, charm, and strangeness - are part of the social systems lexicon.
Which brings us to the 2020 US election. “The polls were wrong again, and much of America wants to know why,” wrote NY Times journalist David Leonhardt in a recent article. “This is a disaster for the polling industry and for media outlets and analysts that package and interpret the polls for public consumption, such as FiveThirtyEight, The New York Times’ Upshot, and The Economist’s election unit,” said David Graham in The Atlantic.
What happened? Here is what the three well-regarded forecasting sites Graham mentioned had to say in the aftermath of the election.
“On the morning of election day, The Economist’s election-forecasting model gave Joe Biden a 19-in-20 chance of winning the presidency… But it will be by a much closer margin than we forecast… Our errors may reflect a general weakness of quantitative models: they try to predict the future by extrapolating from the past,” wrote The Economist.
“It’s not too early to say that the polls’ systematic understatement of President Trump’s support was very similar to the polling misfire of four years ago, and might have exceeded it. For now, there is no easy excuse,” noted NY Times Upshot correspondent Nate Cohn.
But, prominent data journalist Nate Silver had a somewhat different opinion. Silver is the founder of FiveThirtyEight and author of the 2012 bestseller The Signal and the Noise: Why Most Predictions Fail but Some Don’t. In The Polls Weren’t Great. But That’s Pretty Normal, he wrote that he doesn’t entirely understand the polls-were-wrong storyline.
“This year was definitely a little weird, given that the vote share margins were often fairly far off from the polls (including in some high-profile examples such as Wisconsin and Florida). But at the same time, a high percentage of states (likely 48 out of 50) were called correctly, as was the overall Electoral College and popular vote winner (Biden). And that’s usually how polls are judged: Did they identify the right winner?… the margins by which the polls missed - underestimating President Trump by what will likely end up being 3 to 4 percentage points in national and swing state polls - is actually pretty normal by historical standards.”
Silver added that “Voters and the media need to recalibrate their expectations around polls - not necessarily because anything’s changed, but because those expectations demanded an unrealistic level of precision - while simultaneously resisting the urge to ‘throw all the polls out.’”
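Silver’s point is easy to see with a little simulation. The sketch below - written in Python with made-up numbers, not real 2020 polling data - generates hypothetical state margins, adds a systematic 3.5-point polling miss plus sampling noise, and checks how often the polls still identify the right winner even though the margins are off.

```python
# A toy simulation (invented numbers, not real 2020 data) of Nate Silver's
# point: polls can systematically miss the margin by 3-4 points yet still
# "call" most states correctly, because most states aren't that close.
import numpy as np

rng = np.random.default_rng(0)
n_states = 50

# Hypothetical true margins between the two candidates, in percentage points.
true_margin = rng.normal(loc=0, scale=15, size=n_states)

# Polls understate one side by a systematic 3.5 points, plus sampling noise.
systematic_bias = 3.5
poll_margin = true_margin + systematic_bias + rng.normal(0, 2, n_states)

correct_calls = np.sign(poll_margin) == np.sign(true_margin)
print(f"States called correctly: {correct_calls.sum()} of {n_states}")
print(f"Mean absolute margin error: {np.abs(poll_margin - true_margin).mean():.1f} points")
```

Under these assumptions, a handful of close states flip while the large majority are still called correctly - which is roughly the pattern Silver describes as historically normal.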
Perhaps the problem has to do with intrinsic limits to the predictability of elections, and of social system outcomes in general. I wondered what computational social scientists had to say about the subject, so I read a few articles on predictions in the social sciences recommended by MIT professor Abdullah Almaatouq. Let me briefly discuss two of those articles.
In What failure to predict life outcomes can teach us, Cornell sociologist Filiz Garip wrote that social scientists are increasingly turning to supervised machine learning (SML) to offer predictions. But, some are also scrutinizing the suitability of SML for social science predictions, even asking a fundamental question: “are individual behaviors and outcomes even predictable?”
“Prediction is not a typical goal in the social sciences despite recent arguments that it should be,” said Garip. “Social scientists focus on inference: that is, understanding how an outcome is related to some input… In SML, by contrast, the researcher includes many inputs, considers flexible (often nonparametric) models linking inputs to the outcome, and picks the model that best predicts the outcome in new data.”
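The contrast Garip draws can be illustrated with a short sketch. The Python example below is my own illustration on synthetic data, using scikit-learn (my choice, not hers): inference fits one interpretable model to estimate how an input relates to the outcome, while SML tries flexible models and keeps whichever predicts best on held-out data.

```python
# Sketch of Garip's contrast between inference and supervised machine
# learning (SML), on synthetic data; the models and library are illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 10))                        # many candidate inputs
y = 2.0 * X[:, 0] + np.sin(X[:, 1]) + rng.normal(0, 1, 1000)

# Inference: fit one interpretable model and read off the estimated
# relationship between an input and the outcome.
ols = LinearRegression().fit(X, y)
print(f"Estimated effect of input 0: {ols.coef_[0]:.2f}")

# SML: try flexible models and keep the one that predicts best on new data.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
for model in (LinearRegression(), RandomForestRegressor(random_state=0)):
    score = model.fit(X_tr, y_tr).score(X_te, y_te)    # out-of-sample R^2
    print(f"{type(model).__name__}: R^2 = {score:.2f}")
```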
Her article describes a recent mass collaboration experiment involving hundreds of researchers, led by Princeton professor Matthew Salganik, which aimed to evaluate the promise and limits of SML for predictions in the social sciences. The experiment asked 160 different teams to build predictive models for six life outcomes - such as a child’s grade point average and whether a family would be evicted from their home - by analyzing data on 4,000 families from the Fragile Families and Child Wellbeing Study, which social scientists have carefully collected over the past 15 years. The research teams competed to see which could build the best predictive model using any method of their choice. They were judged on the accuracy of their predictions against actual outcome values that were available only to the challenge organizers.
Despite this rich dataset, none of the 160 research teams produced very accurate predictions, regardless of the methods they used. Even the best SML predictions were only slightly better than those of standard methods for analyzing social science data, such as regression models. However, while the models did poorly at predicting individual life outcomes, they were able to identify aggregate properties - e.g., the effect of education on earnings, or racial differences in school performance. Let me point out that this is also the case in physics. While we cannot predict the properties and behaviors of individual atoms, we can do so for a gas, liquid, or solid consisting of huge aggregations of such atoms.
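Here is a toy illustration of that pattern, on synthetic data rather than the Fragile Families data: when individual outcomes are dominated by idiosyncratic noise, out-of-sample predictive accuracy stays low even though the aggregate relationship - here, a hypothetical effect of schooling on earnings - is estimated quite precisely.

```python
# Toy illustration (synthetic data, not the Fragile Families study) of the
# pattern described above: individual outcomes dominated by idiosyncratic
# noise are hard to predict, yet aggregate relationships are recoverable.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
n = 5000
education = rng.integers(10, 21, n)                 # years of schooling
earnings = 2.0 * education + rng.normal(0, 20, n)   # mostly noise per person

X_tr, X_te, y_tr, y_te = train_test_split(
    education.reshape(-1, 1), earnings, random_state=0)
model = LinearRegression().fit(X_tr, y_tr)

print(f"Individual-level out-of-sample R^2: {model.score(X_te, y_te):.2f}")  # low
print(f"Estimated effect of a year of schooling: {model.coef_[0]:.2f}")      # near 2.0
```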
“How predictable is human behavior?” ask Jake Hofman, Amit Sharma, and Duncan Watts of Microsoft Research in their 2017 Science essay Prediction and explanation in social systems. “There is no single answer to this question because human behavior spans the gamut from highly regular to wildly unpredictable.” At one extreme they cite a study of 50,000 mobile phone users whose likely location could be predicted with 70% accuracy from their usual most visited locations, as reflected in phone data. At the other extreme are highly improbable black swan events, which are intrinsically unpredictable.
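The mobile-phone result suggests a very simple baseline: predict that each user will be at their single most visited location. The sketch below, with invented location histories rather than the study’s data, shows why such a baseline works well for regular users and poorly for irregular ones.

```python
# Toy version (invented data) of the mobile-phone baseline described above:
# predict that a user will be at their most visited location, and measure
# how often that guess would have been right over their history.
from collections import Counter

# Hypothetical location histories for three users.
histories = {
    "user_a": ["home", "home", "work", "home", "gym", "home"],
    "user_b": ["work", "cafe", "work", "work", "home", "work"],
    "user_c": ["home", "park", "cafe", "work", "gym", "shop"],  # irregular
}

for user, visits in histories.items():
    top_location, top_count = Counter(visits).most_common(1)[0]
    accuracy = top_count / len(visits)
    print(f"{user}: predict '{top_location}', accuracy {accuracy:.0%}")
```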
The predictability of social systems like presidential elections, stock market movements, and feature film revenues falls somewhere in between these extremes, with the difficulty of prediction varying considerably with the details of the specific system. “To evaluate the accuracy of any particular predictive model, therefore, we require not only the relevant baseline comparison - that is, the best known performance - but also an understanding of the best possible performance. The latter is important because when predictions are imperfect, the reason could be insufficient data and/or modeling sophistication, but it could also be that the phenomenon itself is unpredictable, and hence that predictive accuracy is subject to some fundamental limit.”
The authors add that coming up with a theoretical limit to predictive accuracy for a complex social system is an important research question which should interest both social scientists and computer scientists. “If the best-known performance is well below what is theoretically possible, efforts to find better model classes, construct more informative features, or collect more or better data might be justified. If, however, the best-known model is already close to the theoretical limit, scientific effort might be better allocated to other tasks, such as devising interventions that do not rely on accurate predictions.”
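One way to see what such a fundamental limit means in practice: if an outcome is generated partly by signal and partly by irreducible noise, then even the true model cannot exceed an accuracy ceiling set by the noise. The simulation below is my own construction, not from the Science essay.

```python
# Toy illustration of a fundamental limit on predictive accuracy: when the
# outcome is part signal, part irreducible noise, even the *true* model
# cannot exceed the ceiling set by the noise share.
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
signal = rng.normal(size=n)
noise = rng.normal(size=n)        # intrinsically unpredictable component

for noise_share in (0.1, 0.5, 0.9):
    y = np.sqrt(1 - noise_share) * signal + np.sqrt(noise_share) * noise
    # The best possible predictor knows the signal exactly; its R^2 is still capped.
    y_hat = np.sqrt(1 - noise_share) * signal
    r2 = 1 - np.mean((y - y_hat) ** 2) / np.var(y)
    print(f"noise share {noise_share:.0%}: best possible R^2 = {r2:.2f}")
```

No amount of extra data or modeling sophistication can push accuracy past that ceiling, which is exactly why the authors argue it matters whether the best-known model is already near the theoretical limit.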
The essay argues that outcomes in complex social systems depend both on the intrinsic attributes of the entities involved and on sheer chance. “Depending on the balance between these two sets of factors, any explanation for why a particular person, product, or idea succeeded when other similar entities did not will be limited, not because we lack the appropriate model of success, but rather because success itself is in part random… In other words, to the extent that outcomes in complex social systems resemble the outcome of a die roll more than the return of Halley’s Comet, the potential for accurate predictions will be correspondingly constrained.”