“I’m so bored. I hate my life.” - Britney Spears

The boring has become interesting, because the interesting has begun to be boring. – Thomas Mann

"Never for money/always for love" - The Talking Heads

Thursday, January 19, 2006

plausible vs. miracle counterfactuals

Richard Lebow has written with Philip Tetlock, whose new book on what is wrong with experts we have referred to in LI before, and he wrote this wonderfully clarifying piece – a real Drano of a scholarly article – for World Politics. “What’s So Different about a Counterfactual?” is a review of the use of counterfactuals in political science and history, with Lebow’s target being Niall Ferguson. Frankly, we aren’t convinced that all of Lebow’s objections to The Pity of War are valid. But we are convinced that Lebow does everybody a service by clearly laying out the protocols of counterfactual use.

What does this mean? For one thing, it demystifies the prediction business. It also helps us understand the blind use of analogies and patterns to explain historical instances – one remembers the nutty use of the occupation of Japan as a template for the occupation of Iraq, which treated occupation as if all occupations were alike. Perhaps, to paraphrase Tolstoy, all happy occupations are alike, and all unhappy ones are different.

Lebow summarizes his goals in his paper like this:

“I begin my essay with the proposition that the difference between so-called factual and counterfactual arguments is greatly exaggerated; it is one of degree, not of kind. I go on to discuss three generic uses of counterfactual arguments and thought experiments. In the process, I distinguish between “miracle” and “plausible” world counterfactuals and identify the uses to which each is suited. I critique two recent historical works that make extensive use of counterfactuals and contend that they are seriously deficient in method and argument. I then review the criteria for counterfactual experimentation proposed by social scientists who have addressed this problem and find many of their criteria unrealistic and overly restrictive. The methods of counterfactual experimentation need to be commensurate with the purposes for which they are used, and I conclude by proposing eight criteria I believe appropriate to plausible-world counterfactuals.

Counterfactuals are “what if” statements, usually about the past. Counterfactual experiments vary attributes of context or the presence or value of variables and analyze how these changes would have affected outcomes. In history and political science these outcomes are always uncertain because we can neither predict the future nor rerun the tape of history.”

As an example of counterfactual use in policy-making, he uses an example beloved by conservatives: the notion that the appeasement of Hitler taught us all a lesson about the proper use of force in foreign policy:

“The controversy surrounding the strategy of deterrence provides an example of the use of counterfactuals in international relations. One of the principal policy lessons of the 1930s was that appeasement whets the appetites of dictators while military capability and resolve restrains them. The failure of Anglo-French efforts to appease Hitler is well established, but the putative efficacy of deterrence rests on the counterfactual that Hitler could have been restrained if France and Britain had demonstrated willingness to go to war in defense of the European territorial status quo. German documents make this an eminently researchable question, and historians have used these documents to try to determine at what point Hitler could no longer be deterred.”

As Lebow points out later, the appeasement model played a large role in Kennedy’s decision-making process concerning Soviet missiles in Cuba. Kennedy felt that Khrushchev had been encouraged to try that gamble because Kennedy had been too weak at the Bay of Pigs and in Berlin. But, as Lebow points out, “Evidence from Soviet and American archives and interviews with former officials make it possible to explore the validity of most of these counterfactuals and thus to evaluate the choices of Soviet and American leaders and the subsequent scholarly analyses of the crisis.” The evidence points to the fact that Khrushchev’s play was motivated not by the perception of Kennedy’s weakness, but by fear of American aggression:

“After Cuba, former Kennedy administration officials and many scholars maintained that Khrushchev would not have deployed missiles in Cuba if Kennedy had been more decisive at the Bay of Pigs, at the Vienna summit, and in Berlin. There was no evidence to support this interpretation, but it became the conventional wisdom and helped to shape a host of subsequent policy decisions, including the disastrous intervention in Vietnam. The evidence that came to light in the Gorbachev era suggested, to the contrary, that Khrushchev decided to send missiles secretly to Cuba because he overestimated Kennedy’s resolve. He feared that Kennedy, preparing to invade Cuba, would send the American navy to stop any ships carrying missiles to Cuba to deter that invasion.”

If there is a lesson in counterfactuals, here, for the current situation with Iran, it might be that the U.S. should make public its lack of interest in acting aggressively against Iran. After all, the nuclear power program has been going on for thirty years in Iran, but it became a priority only after Bush, favoring the usual clueless adolescent phrase that thrills his followers and endangers the rest of us, labeled Iran part of the axis of evil. So much for Iran’s help during the Afghanistan war.

But to return to affairs sub specie aeternitatis. Lebow’s examination of counterfactuals covers not only the logic of their use by would-be policy-makers and historians, but the beliefs about counterfactuals that animate policymakers. One should always remember that the average brilliant D.C. thinktanker is usually as ignorant as a drunk on a moonless night when it comes to having any feeling about the nations he advises on, and worse than ignorant about counterfactuals. In other words, you could get better advice from a machine into which you slotted quarters than you can from your Wolfowitz types, who are harmless when put in well paying cages in, say, Johns Hopkins, but are armed and should be considered dangerous when appointed to any position of responsibility. Keeping that simple rule in mind makes parsing what comes out of the Wizard of Oz voices in the media much easier – it will mostly be bullshit leavened with a heaping helping of psychosis.

Here is Lebow, with his much less aggressive summary of the case:

“International relations theorists seek to understand the driving forces behind events; they usually do so after the fact, when the outcome is known. The process of backward reasoning tends to privilege theories that rely on a few key variables to account for the forces allegedly responsible for the outcomes in question. For the sake of theoretical parsimony, the discipline generally favors independent variables that are structural in nature (for example, balance of power, state structure, size and nature of a coalition). The theory-building endeavor has a strong bias toward deterministic explanations and on the whole downplays understandings of outcomes as the products of complex, conjunctional causality.
A recent survey of international relations specialists revealed that those scholars who were most inclined to accept the validity of theories (for example, power transition, nuclear deterrence) and theory building as a scholarly goal were the most emphatically dismissive of plausible-world counterfactuals. They were also most likely to invoke second-order counterfactuals to get developments diverted by counterfactuals back on the track.
In retrospect, almost any outcome can be squared with any theory unless the theory is rigorously specified. The latter requirement is rarely met in the field of international relations, and its deleterious effect is readily observed in the ongoing debate over the end of the cold war. Various scholars, none of whose theories predicted a peaceful end to that conflict, now assert that this was a nearly inevitable corollary of their respective theories.”

The smooth workings of vanity fingered in the last graf can be seen among the liberal hawks today, who are re-engineering their support so that, of course, they had simply not factored in George Bush. My, if Al or Hillary had been in power, how wonderfully our occupation would have gone! Just like occupations and wars supervised by Democrats in the past…

And so the examples pile up. But this is the part I really wanted to quote: Lebow’s theory of counterfactual benchmarks.

Lebow divides his theory of counterfactual use between plausible and miracle counterfactuals, each of which has its uses. A plausible counterfactual must be embedded in what is possible in the circumstances into which it is inserted. So, for instance, one can’t simply make the analogy from the occupation of Spain by Napoleon to the occupation of Iraq by the Americans (a favorite LI example) without taking into account the lesser level of military technology in 1809, for example, or the information flow that goes both ways, immediately, in the Iraq war. What Lebow calls “miracle” counterfactuals posit something that would require a miracle – for instance, Napoleon returning from St. Helena in 1850, when he was long dead, or Hitler returning from Brazil, to which he never fled in the first place. “Miracle counterfactuals are particularly useful in evaluating existing interpretations,” as Lebow points out. They aren’t so much planning or prediction devices as ways to shake out hidden assumptions in a narrative.

Then there is this second benchmark: “Plausible counterfactuals must meet a second test: they must have a real probability of leading to the outcome the researcher intends to bring about.” This, again, allows us to sort through the steps that lead from one thing to another in the counterfactual. Which brings us to the common problems of counterfactuals, the awareness of which could constitute another benchmark.

3. The second benchmark naturally leads to criteria for real possibilities. “There is no consensus about what constitutes a good counterfactual, but there is a common recognition that it is extraordinarily difficult to construct a robust counterfactual – one whose antecedent we can assert with confidence could have led to the hypothesized consequent. There are three reasons for this well-warranted pessimism: the statistical improbability of multistep counterfactuals, the interconnectedness of events, and the unpredictable effects of second-order counterfactuals.”

It is extraordinary how often these problems are simply ignored. The pisspoor planning for the war in Iraq – a war the end of which the planners, apparently, falsely conceptualized – came about partly from ignoring these three reasons. Lebow provides an exemplary explanation of this:

“The probability of a consequent is a multiple of the probability of each counterfactual linking the hypothesized antecedent to it. Suppose I contend that neither world war nor the Holocaust would have occurred if Mozart had lived to the age of sixty-five.

Having pushed classical form as far as it could go in the Jupiter Symphony, his last three operas, and the requiem, Mozart’s next dramatic works would have been the precursors of a new, “postclassicist” style. He would have created a viable alternative to romanticism that would have been widely imitated by composers, writers, and artists. Postclassicism would have kept the political ideas of the Enlightenment alive and held romanticism in check. Nationalism would have been more restrained, and thus Austria-Hungary and Germany would have undergone very different political evolution. This alternative and vastly preferable world has at least five counterfactual steps linking antecedent to consequent: Mozart must survive to old age and develop a new style of artistic expression; subsequent composers, artists, and writers must imitate and elaborate it; romanticism must become to some degree marginalized; and artistic developments must have important political ramifications. This last counterfactual presupposes numerous other enabling counterfactuals about the nature of the political changes that will lead to the hypothesized consequent (for example, internal reforms that resolve or reduce the threat of internal dissolution of Austria-Hungary, German unification under different terms, or at least a Germany satisfied with the status quo, no First World War, no Hitler and no Holocaust without Germany’s defeat in World War I). Even if every one of this long string of counterfactuals had a probability of at least 50 percent, the overall probability of the consequent would be a mere .03 for five steps and a frighteningly low .003 for eight steps. This particular counterfactual may appear far-fetched, but most interesting counterfactuals are no less improbable statistically. They may start with a tiny and plausible alteration of the real world but then infer numerous follow-on developments to end up with a major change in reality.”
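For readers who want to check Lebow's arithmetic, here is a minimal sketch in Python. It assumes, as his example implicitly does, that the steps are independent, so the probabilities simply multiply; the function name chain_probability is ours, not his. At 50 percent per step, five steps come to 0.03125 and eight steps to about 0.0039, so his ".003" looks truncated rather than rounded.

```python
from math import prod


def chain_probability(step_probabilities):
    """Probability that every step in a counterfactual chain holds.

    Assumes the steps are independent of one another, which is a
    simplification: real historical steps are rarely independent.
    """
    return prod(step_probabilities)


# Lebow's example: every step gets a generous 50 percent chance.
five_steps = chain_probability([0.5] * 5)
eight_steps = chain_probability([0.5] * 8)

print(five_steps)   # 0.03125
print(eight_steps)  # 0.00390625
```

Even a much kinder guess does not rescue a long chain: at 90 percent per step, five steps still leave only about a 59 percent chance that the whole story holds together.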

As Lebow points out, scholars often cheat by simply assuming one change in a historical scenario and preserving the facts as we know them in the rest of the scenario. To solve the problem – that is, to keep from looking like total idiots – scholars have proposed various ways of restricting counterfactual use:

Recognition that counterfactual arguments often have indeterminate consequences has prompted scholars to impose restrictive criteria on their use. Fearon proposes a proximity criterion. We should consider only those counterfactuals in which the antecedent appears likely to bring about the intended consequent and little else. Counterfactuals, he suggests, must be limited to cases where “the proposed causes are temporally and, in some sense, spatially quite close to the consequents.” However, this seems to me mere wishful thinking, the wish being that events line up in a linear way in history. But it is easy to see how small changes could cause big ones. If a particular rifle company had manufactured rifles with a defect in them in 1960 and if one gunman had had his gun blow up in his hand in 1963, history would certainly have been different in many and unpredictable ways.

Another problem is, of course, that theories which test themselves against counterfactuals often only consider confirming counterfactuals.

“Based on an examination of the American literature on Iran, Herrmann and Fischerkeller conclude that “too often in world politics what is taken as a base rate for a generalization about the motives of another country is too much an ideological conviction and too little a product of deductive and empirical behavioral science.”

Lebow and Stein have documented the same phenomenon with respect to deterrence; data sets used to test the strategy of deterrence were patently ideological in the cases they recognized as deterrence encounters and coded as successes for the West.”

A good example of this is the curious American idea that bombing or other forms of violence that are inflicted against Americans will justly cause a hostile response – but inflicted by Americans against other peoples, will cause the other peoples to fall in love with Americans, just as Krazy Kat would heart Ignatz after he bopped her with a brick. The mistake of thinking that the love foreigners hold for Americans only deepens when we kill their children derives, perhaps, from the post-war occupations of Japan and Germany. It rather ignores the circumstance that made those occupations possible: fear of the Soviet Union.

This brief survey of Lebow’s remarkable article doesn’t do it justice, but I hope it puts in perspective any “predictions” LI makes. We are going to be wrong about quite a bit, but some things we will be right about just because we are aware of the counterfactual traps.
