1 Cognitive biases

1.1 People look for patterns where there are none


1.2 Are you smarter than a rat?

T maze (figure)

2 On Forecasting

2.1 Motivation

  • Conspicuous failures of existing methods
  • Success of forecasting models in other behavioral domains
  • Increased processing power

Notes

  • Conspicuous failures of existing methods:
    • end of the Cold War
    • post-invasion Iraq
    • the Arab Spring
  • Forecasting in other areas:
    • Macroeconomic forecasting (hmm…)
    • Elections: the Nate Silver effect
    • Demographic and epidemiological forecasting
    • Famine forecasting: the USAID FEWS model
    • Statistical models for mortgage repayment were quite accurate
    • Moneyball and Soccernomics

2.2 Predicting vs forecasting

  • Sound theory, but we do not know whether the antecedent conditions have been satisfied.
  • Even with info + theory, randomness can play a role
  • Prediction is possible without explanation

Notes

  • Sound theory, but we do not know whether the antecedent conditions have been satisfied. E.g., we know that revolts/revolutions/wars tend to happen when there is a famine, but we were unable to predict the famine.

  • Even with info + theory, randomness can play a role. E.g., geophysicists understand the theory of plate tectonics and can monitor seismological antecedents, but still cannot predict earthquakes. See “Italian Scientists to Stand Trial for Manslaughter in Quake Case”: “Enzo Boschi, the president of Italy’s National Institute of Geophysics and Volcanology (INGV), will face trial on charges of manslaughter with six other scientists and technicians for failing to alert the residents of L’Aquila ahead of the devastating earthquake that struck the central Italian town on 6 April 2009, killing 308 people.”
  • Prediction is possible without explanation
    • e.g., ancient astronomers
    • Even good predictions won’t be believed without a sound theory, especially if they undercut core scientific beliefs. Scientists would look for other mechanisms underlying these successes.

2.3 A problem is that these excuses are often used to justify poor forecasts

  • Explanation is possible without prediction:
    • Pacifists do not abandon Gandhi’s worldview just because he said in 1940 that Hitler was not as bad as “frequently depicted” and that “he seems to be gaining his victories without much bloodshed”
    • Martin Feldstein predicted that the legacy of the Clinton 1993 budget would lead to stagnation for a decade.
  • Prediction is possible without explanation: claimed whenever people happen to have forecasting successes

3 Judging judgement

3.1 What is a good judge?

Two criteria:

  • Getting it right
  • Thinking the right way

3.2 Getting it right

How do we measure it?

  • Accuracy
  • True positives at the cost of false alarms?
  • Risks of overpredicting vs underpredicting: should false alarms and hits be weighed equally? E.g., what was riskier
    • in the 1980s:
      • underestimating the Soviet Union, tempting it to test the US’s resolve?
      • overestimating it and paying high military costs? I.e., the risk here is to treat as ‘wrong’ those forecasters who have made value-driven decisions to exaggerate certain possibilities.
  • How early?

Notes

  • Accuracy: the problem is that accuracy is easy to achieve for rare events. E.g., it is easy to forecast nuclear armageddon: every time, predict it won’t happen, and you’ll be right 99.9% of the time. (Optional technical note: accuracy = \(P(\hat Y = Y)\).)
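A minimal sketch of this point (the 0.001 event rate and the sample size are illustrative assumptions): a do-nothing forecaster that always predicts “no event” scores near-perfect accuracy on a rare outcome while never anticipating a single occurrence.

```python
# Accuracy = P(Y_hat = Y). For a rare event, always predicting "it won't happen"
# is almost always right, yet useless as a warning system. Numbers are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
p_event = 0.001                       # hypothetical probability of the rare event

y = rng.random(n) < p_event           # simulated outcomes (True = event occurred)
y_hat = np.zeros(n, dtype=bool)       # naive forecaster: always predict "no event"

accuracy = (y_hat == y).mean()
print(f"accuracy of the do-nothing forecaster: {accuracy:.4f}")   # ~0.999
```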

3.3 Thinking the right way

  • Do not violate basic probability theory. I.e., probabilities over mutually exclusive, exhaustive outcomes should sum to 1
  • Adjust your probability estimates in the face of evidence

Notes

From Tversky and Koehler, p. 553: “Stanford undergraduates (N = 196) estimated the percentage of U.S. married couples with a given number of children. Subjects were asked to write down the last digit of their telephone numbers and then to evaluate the percentage of couples having exactly that many children. They were promised that the 3 most accurate respondents would be awarded $10 each. As predicted, the total percentage attributed to the numbers 0 through 9 (when added across different groups of subjects) greatly exceeded 1. The total of the means assigned by each group was 1.99, and the total of the medians was 1.80. Thus, subadditivity was very much in evidence, even when the selection of focal hypothesis was hardly informative. Subjects overestimated the percentage of couples in all categories, except for childless couples, and the discrepancy between the estimated and the actual percentages was greatest for the modal couple with 2 children. Furthermore, the sum of the probabilities for 0, 1, 2, and 3 children, each of which exceeded .25, was 1.45. The observed subadditivity, therefore, cannot be explained merely by a tendency to overestimate very small probabilities.”
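A minimal sketch of the two requirements above, with made-up numbers: a coherence check (do the judged probabilities of an exhaustive, mutually exclusive set of outcomes sum to 1?) and a Bayes-rule update of an estimate in the light of new evidence.

```python
# Made-up numbers for illustration; not Tversky and Koehler's data.

# (1) Coherence check: probabilities over an exhaustive, mutually exclusive set of
#     outcomes should sum to 1. Judgments like those quoted above typically do not.
elicited = [0.20, 0.35, 0.45, 0.40, 0.25]           # hypothetical judged probabilities
total = sum(elicited)
print(f"sum of judged probabilities: {total:.2f}")  # 1.65 > 1: incoherent

# (2) Updating in the face of evidence via Bayes' rule: P(H | E) = P(E | H) P(H) / P(E)
prior = 0.10                                        # hypothetical prior on hypothesis H
p_e_given_h, p_e_given_not_h = 0.70, 0.20           # hypothetical likelihoods of evidence E
p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
posterior = p_e_given_h * prior / p_e
print(f"posterior after observing E: {posterior:.2f}")  # 0.28: the estimate moves with evidence
```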

4 Political Forecasting: Is it blind luck?

4.1 Ontological Skeptics

Indeterminacy is due to the properties of the external world: a world that would be just as unpredictable even if we were smarter.

Notes

  • Path dependency, aka increasing returns
    • QWERTY
    • Polya’s urn: small initial advantages accumulate (see the simulation sketch after this list)
    • Rise of the West
      • Tiny advantages that Europe had: property rights, rule of law, market competition
    • Hard to know whether we face an increasing- or decreasing-returns world. I.e., does history have a diverging branching structure that leads to a variety of possible worlds, or a converging structure that channels us into destinations predetermined long ago?
    • Cleopatra’s nose.
  • Complexity theorists: aka the butterfly effect
    • Gavrilo Princip
    • Great oaks from little acorns. Problem: impossible to pick the influential little acorn before the fact.
  • Game theorists: multiple or mixed-strategy equilibria
    • Players will second-guess each other to the point where political outcomes, like financial markets, resemble random walks.
    • Financial geniuses are statistical flukes
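As referenced above, a minimal simulation sketch of Polya’s urn (initial ball counts, number of draws, and seeds are arbitrary choices): each draw adds a ball of the drawn color, so small, random early advantages get locked in, and different runs converge to very different long-run shares.

```python
# A sketch of a Polya urn: start with one red and one blue ball; each draw adds another
# ball of the drawn color, so the probability of drawing that color rises (increasing returns).
import random

def polya_urn(n_draws: int, seed: int) -> float:
    """Return the final share of red balls after n_draws self-reinforcing draws."""
    rng = random.Random(seed)
    red, blue = 1, 1
    for _ in range(n_draws):
        if rng.random() < red / (red + blue):
            red += 1        # drew red: add another red ball
        else:
            blue += 1       # drew blue: add another blue ball
    return red / (red + blue)

# Each run converges to a different long-run share, determined early and largely by chance.
print([round(polya_urn(10_000, seed), 3) for seed in range(5)])
```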

4.2 Psychological Skeptics

We mispredict because of the way our (limited) minds work

Notes

  • Preference for simplicity: “Bashar al-Assad is like Hitler”
  • Aversion to ambiguity and dissonance
    • People are overconfident in their counterfactual beliefs
    • People dislike dissonance. They like to couple good causes with good effects. But detested policies can sometimes have positive effects. E.g., valued allies can have a frightful human rights record.
    • People hate randomness.
      • e.g., rat experiment
      • When we know the base rate and not much else, we’d be better off predicting the most common outcome
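A minimal worked example of that last point, assuming an illustrative base rate of 0.8: always predicting the more common outcome yields higher expected accuracy than “probability matching” (guessing each outcome in proportion to its frequency).

```python
# Expected accuracy of two strategies when one outcome occurs with probability p = 0.8
# and we know nothing else. The base rate is an illustrative assumption.
p = 0.8

always_majority = p                                  # always predict the more common outcome
probability_matching = p * p + (1 - p) * (1 - p)     # guess outcomes in proportion to their frequency

print(f"always predict the majority: {always_majority:.2f}")       # 0.80
print(f"probability matching:        {probability_matching:.2f}")  # 0.68
```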

4.3 Tetlock’s study

  • Most existing research makes no effort to test its theories on future data
    • “isms”
    • statistical models
  • Tetlock: let’s see how well experts perform. 284 participants:
    • most with doctorates, almost all with postgraduate training in polsci, econ, international law, diplomacy, journalism
    • avg of 12 years of work experience
    • academia, think tanks, governments, IOs
    • Very thoughtful and articulate
    • Broad cross-section of political, econ and national security outcomes

4.4 Results

Figures (source: Tetlock, p. 51)

Notes

  • Humans overpredict rare events
  • Experts no better than dilettantes
  • All humans far worse than algorithms, even simple ones

4.5 The experts fight back

  • Perhaps we didn’t select the right experts
  • Perhaps our dilettantes are really experts
  • Maybe experts are very cautious.

Notes

  • Perhaps we didn’t select the right experts? But there is little evidence of that: performance was equally poor regardless of seniority or domain (academia, government, etc.)
    • No better at short term vs long term, domestic vs international, economic vs political.
  • Perhaps our dilettantes are really experts. I.e., slightly less specialized, but still well read.
    • So let’s look at briefly briefed undergraduate students. They do worse, so expertise does matter to an extent.
  • Maybe experts are very cautious. I.e., better safe than sorry. So we can correct for various such mistakes. In short, we take out the difference between their average forecast and the base rate for the outcome.
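A minimal sketch of that caution correction, with made-up forecasts and base rate (not Tetlock’s data): shift every forecast by the gap between the forecaster’s average and the observed base rate, then re-score.

```python
# Remove the systematic "better safe than sorry" component from a set of forecasts by
# taking out the difference between the average forecast and the base rate.
# The forecasts and base rate below are hypothetical.
import numpy as np

forecasts = np.array([0.30, 0.25, 0.40, 0.35, 0.20])   # one expert's probability forecasts
base_rate = 0.15                                        # observed frequency of the outcome

shift = base_rate - forecasts.mean()                    # average over-/under-prediction
corrected = np.clip(forecasts + shift, 0.0, 1.0)        # keep probabilities in [0, 1]

print("original mean: ", round(float(forecasts.mean()), 3))   # 0.30
print("corrected mean:", round(float(corrected.mean()), 3))   # 0.15 (matches the base rate)
```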

4.6 Foxes vs hedgehogs

Figures: see Tetlock’s Expert Political Judgment

Notes

But hedgehogs are here to stay, mainly because of media attention and our strong desire for “expert” advice

5 What can and cannot be predicted?

5.1 Where algorithms do well

  • Nate Silver
  • Routine elections

Notes

  • Nate Silver performed very well in the 2008 election (not so well in 2016…)
  • Routine elections in rich countries like the United States are some of the softest targets in political forecasting.
    • Rules are transparent
    • high-quality data, including surveys of would-be voters, are often available
    • the connection between those data and the outcome of interest is fairly straightforward.

5.2 Where algorithms do less well

  • Nate Silver fails too, even for elections
  • For international events, we often lack data
  • Even simple indicators are tricky
  • Events are rare
  • Heterogeneous environments

Notes

  • Nate Silver fails too, even for elections
  • For international events, we often lack data. We might know the predictors, but be unable to get the data
  • Even simple indicators are tricky
    • GDP figures are produced by government agencies
    • Some don’t even report national economic statistics
  • Events are rare
    • Most states are “safe”
    • Many states are obviously at risk
    • a small set is uncertain
    • Note: rare events \(\neq\) Black swans:
      • Black Swan: an event that has a low probability even conditional on other variables
      • Rare event: an event that occurs infrequently, but conditional on an appropriate set of variables, does not have a low probability
  • Heterogeneous environments
    • is the system changing significantly while we are trying to model it? How far back are data still relevant?
    • Changing nature of conflict
      • In 1910:
        • “Gunboat diplomacy” was an accepted norm, as were elements of bellicosity and social Darwinism
        • Some competition occurred between approximate equals
        • Mediation was ad hoc with no established international institutions
        • Territorial change was credible
      • Threats in 2010
        • Highly asymmetric distribution of military power
        • Threats get almost immediate attention from potential mediators, including the UN
        • Non-military sanctions are credible (Iraq, Iran)
        • Territorial changes are rare and highly problematic
    • Will changes in the technological environment—internet, UAVs, various monitoring technologies—change probabilities?

See also “Why the World Can’t Have a Nate Silver”: “This problem bears some resemblance to forecasting U.S. presidential elections, in which most of the 50 states dependably vote Democrat or Republican; the hard part is predicting the dozen or so swing states. In international politics, there are many cases that seem reliably ‘immune’ to certain crises, and there’s often also a small but self-evident set of usual suspects. It’s the small but critical set of cases in between those two extremes that make us work to earn our paychecks.”

6 Going further

6.1 Challenge yourself!

Good Judgement Project website

6.2 Further readings

  • Philip Tetlock. Expert Political Judgment (remarkable book)
  • Nassim Nicholas Taleb. The Black Swan (very self-absorbed but entertaining)
  • Daniel Kahneman. Thinking, Fast and Slow (decades of research condensed)
  • Nate Silver. The Signal and the Noise (a good overview)

6.3 What Data to use?

  • Structural indicators are too slow
  • Social media too fast
  • Event data

6.4 Existing projects