Willem Buiter – FT’s Maverecon Blog
September 10, 2009 3:00pm
Science with very few (if any) data
Doing statistical analysis on a sample of size 1 is either a very frustrating or a very liberating exercise. The academic discipline known as history falls into that category. Most applied social science, including applied economics, does too. Applied economists use means fair and foul to try to escape from the reality that economics is not a discipline where controlled experiments are possible. The situation that an economically relevant problem can be studied by means of a control group and a treatment group that are identical as regards all but one external or exogenous driver, whose influence can as a result be isolated, identified and measured, does not arise in practice.
Time series analysis (which establishes associations or correlations between measurable variables over time), cross-section analysis (which studies associations or correlations between variables observed during a common time period but differing by some other criterion (say location, individual or family identity or whatnot)), and panel data analysis, which combines time-series and cross-section data and methodology, all struggle (mostly in vain) with the problem that the economic analyst cannot control for all relevant influences on the behaviour of the phenomenon he is investigating, GDP growth or unemployment, say. The reason for this is first and foremost that the investigating economist doesn’t have a clue as to what should be on an exhaustive list of possible relevant drivers of GDP growth or unemployment. Second, many of the key variables he is aware of may not be measurable, or may only be measurable with serious errors. Expectations are an example. Finally, the long list of possible explanatory variables that are omitted from consideration is extremely unlikely to be statistically independent of the errors or residuals from a statistical or econometric analysis/estimation that uses just a truncated set of explanatory variables. The result: biased and inconsistent estimates of parameters and other features of the relationship between the explanandum and the explanans.
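The omitted-variables problem can be shown with a small simulation (a hypothetical sketch: the variable names and coefficients below are invented for illustration). When an unmeasured driver is correlated with the included regressor, the regression that omits it reports a badly biased slope, while a regression that could include it would recover the true effect.

```python
import random
import statistics

random.seed(0)
n = 50_000

# Hypothetical data-generating process (all names and numbers invented):
# y (say, GDP growth) depends on an observed driver x and an omitted one z
# (say, expectations), and x is correlated with z.
z = [random.gauss(0, 1) for _ in range(n)]             # omitted driver
x = [0.8 * zi + random.gauss(0, 1) for zi in z]        # observed driver
y = [1.0 * xi + 2.0 * zi + random.gauss(0, 1)          # true effect of x is 1.0
     for xi, zi in zip(x, z)]

def cov(a, b):
    """Sample covariance."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

# Short regression of y on x alone: the slope absorbs z's influence.
slope_short = cov(x, y) / cov(x, x)

# Long regression via a Frisch-Waugh step: partial z out of both x and y,
# then regress residual on residual; this recovers the true coefficient.
bxz, byz = cov(x, z) / cov(z, z), cov(y, z) / cov(z, z)
x_res = [xi - bxz * zi for xi, zi in zip(x, z)]
y_res = [yi - byz * zi for yi, zi in zip(y, z)]
slope_long = cov(x_res, y_res) / cov(x_res, x_res)

print(f"omitting z:  {slope_short:.2f}")  # close to 1 + 2*0.8/1.64, about 1.98
print(f"including z: {slope_long:.2f}")   # close to the true 1.0
```

The catch, of course, is that the economist never observes z, so the second regression is unavailable in practice; the sketch only shows how far astray the first one goes.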
Economists have made a growth industry of seeking out or concocting quasi-natural experiments that might look like controlled experiments in the natural sciences. Freakonomics: A Rogue Economist Explores the Hidden Side of Everything by Steven Levitt and Stephen J. Dubner contains a sampling of this kind of work.
I have not read a single one of these quasi-natural experiment studies where one could not easily come up with a long list of potentially relevant omitted explanatory variables. Such ‘unobserved heterogeneity’ means that other things were definitely not equal, and the attribution of cause and effect is bound to be flawed. In addition, the determined chase for yet another quasi-controlled or natural experiment has led many economists to look under the lamppost for their missing knowledge, not because that is where they lost it, but because that’s where the light is. A flood of cute but irrelevant studies of issues of no conceivable economic significance has been undertaken simply because a cute but irrelevant natural experiment had been conducted.
Experimental economics was the last refuge of the empirically desperate. It mostly involves paying a bunch of undergraduates (or, if you have a very small budget, graduates) to play a variety of simple games with monetary and occasionally non-monetary pay-offs for the participants. The actions of the players in these highly artificial settings – artificial if only because the players are aware they are the guinea pigs in some academic experiment – are meant to cast light on the behaviour of people in real-world situations subject to similar constraints and facing similar incentives. Halfway between proper experimental or laboratory economics and natural experiments are ‘constructed experiments’ (aka randomised experiments) in which a (team of) economists conducts an experiment in a real-world setting and uses randomised evaluation methods to make inferences and test hypotheses. Typically, the guinea pigs in such randomised experiments are a selection of Indian villagers or other poor population groups – a little research grant goes a long way in a very poor environment. Again, the intentions are good, but the ceteris paribus assumption – that all other material influences on behaviour have either been controlled for or don’t distort the results of the study because they are independent of the unexplained component of behaviour in the constructed experiment – is (a) untestable and (b) generally implausible.
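A minimal sketch of the randomised-evaluation logic (the numbers and variable names are invented for illustration): random assignment makes treatment status statistically independent of unobserved traits, so a simple difference in means recovers the true effect. It is exactly this independence that non-randomised studies must assume rather than engineer, which is why the ceteris paribus assumption is doing so much unexamined work there.

```python
import random
import statistics

random.seed(1)
n = 20_000

# Hypothetical outcome for each participant: unobserved baseline "ability"
# plus a true programme effect of 2.0 for those randomly assigned to it.
ability = [random.gauss(0, 1) for _ in range(n)]        # never observed
treated = [random.random() < 0.5 for _ in range(n)]     # coin-flip assignment
outcome = [a + (2.0 if t else 0.0) + random.gauss(0, 1)
           for a, t in zip(ability, treated)]

# Because assignment is random, ability is (on average) balanced across the
# two groups, so the raw difference in means estimates the treatment effect.
treat_mean = statistics.fmean(o for o, t in zip(outcome, treated) if t)
ctrl_mean = statistics.fmean(o for o, t in zip(outcome, treated) if not t)
effect = treat_mean - ctrl_mean
print(f"estimated effect: {effect:.2f}")  # close to the true 2.0
```

The sketch also makes the limits plain: randomisation balances confounders only on average and only within the experimental population; it says nothing about whether the same effect holds elsewhere.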
Of course, economics and the other social sciences are not alone in being bereft of meaningful controlled experiments. Two of the jewels in the ‘hard’ or natural science crown, cosmology and the theory of evolution, provide generous companionship for the social science waifs. Scientists at CERN may be able (wittingly or unwittingly) to create little bangs and mini black holes a few miles below ground in Switzerland and France, but this does not amount to a test of the big bang theory. They have no more than one dodgy observation on that. Evolutionary biologists may be able to observe evolution at work in real time ‘in the small’, that is, in microorganisms, butterflies etc. but they don’t have replications of the 4.5 billion year history of the earth, through a collection of parallel universes each of which differs from the others in one observable and measurable aspect only. This does not mean that anything goes. Finding fossils that can confidently be dated to be around 150 million years old makes rather a hash of strict young earth creationist accounts that date the creation of the universe to somewhere between 5,700 and 10,000 years ago. Other propositions, like intelligent design, cannot be proven or disproved and are therefore not scientific in nature.
So what is the poor economist to do when confronted with the need to make predictions or counterfactual analyses? Fundamentally, you pray and you punt.
How do we evaluate the reliability or quality of such forecasts and analyses? Ex post, by matching them up against outcomes, in the case of forecasts. This is not straightforward – very few economists make completely unconditional forecasts – but it is in principle feasible. In the case of a counterfactual analysis where the counterfactual policy action or event did not take place – who knows? A necessary test of the acceptability of a counterfactual argument is its logical coherence – its internal consistency. That probably gets rid of about 90 percent of what is released into the public domain, but still leaves us with a lot of counterfactual propositions (and forecasts that cannot yet be tested against outcomes).
For the counterfactual propositions and the forecasts beyond today that survive the test of logical coherence, all we have is the one data point of history to lend some plausibility.
What will be the likely shape of the recovery in the overdeveloped world?
What does history teach us about the likely shape of the recovery in the overdeveloped world following the bottoming out of global GDP? Is the financial collapse phase of the Great Depression a good guide? Probably not, because the banking sector and the financial system of the late 1920s and early 1930s were so different from those that went down the chute starting August 2007. A financial sector that is one layer deep, with banks funding themselves mainly from household deposits and investing mainly in loans to the non-financial business sector, is a very different animal from the multi-layered financial sector in the second half of the first decade of the 21st century.
The modern banking system is part of a multi-layered financial sector. It funds itself to a large degree in financial wholesale markets and has on the asset side of its balance sheet many financial instruments issued by other banks and by non-bank financial institutions, including off-balance-sheet vehicles of the banks. Rapid deleveraging of a 1930s-style single-layer financial system is bound to be massively disruptive for the real economy. Rapid deleveraging of a modern multi-layered financial system need not be massively disruptive for the real economy, although it will be if it is done in an uncoordinated, voluntary manner. Since most liabilities (assets) of banks are assets (liabilities) of other banks and non-bank financial institutions, an intelligent, coordinated netting or write-down of intra-financial sector assets and liabilities is technically possible without significant impact on the exposure (on both sides of the balance sheet) of the financial system as a whole to the non-financial sectors.
There are many legal and political obstacles to such a de-layering of financial intermediation – it amounts to the temporary imposition of a form of central planning on the key banks and their financial sector counterparties – but it could be done if the political will were there.
This important difference between the situation of the 1930s and that facing us today makes one more optimistic about the pace of the recovery to come. Against that, much of the tentative insight I have gained about the financial crisis has not come from the lessons of the 1930s but from emerging markets crises since the 1970s. Except for the important qualifier that the US dollar is a global reserve currency, and that the US government (and private sector) has most of its domestic and external liabilities denominated in US dollars, the pathologies of financial boom, bubble and bust in the US, the UK, Iceland, Ireland and Spain (and many of the Central and East European emerging market economies) track those of classical emerging market crises in South America, Asia and CEE in the 1990s rather well.
The emerging market analogy makes one less optimistic about a robust recovery, as typically, emerging markets whose financial sector was destroyed by a serious financial crisis took many years to recover their pre-crisis growth rates and often never recovered their pre-crisis GDP paths.
But clearly, there are many differences in economic structure, policy regimes and external environment between the overdeveloped world of today and either the industrialised world of the 1930s or the emerging markets of the 1980s and 1990s. For starters, we are now aware of what happened in the 1930s and in the emerging markets (the arrow of history flies only in one direction). Another key difference is that today’s emerging markets and developing countries, whose domestic financial sectors have not been destroyed by the financial crisis, add up to 50 percent of global GDP. Even if China itself cannot be a global locomotive (not even the little engine that could), a sufficient number of emerging markets jointly could lead the global economy out of recession. Controlling for all the things that make ceteris non paribus is a hopeless task. But just because it is hopeless does not mean that we can avoid it. Decisions have to be taken and choices have to be made on the basis of the best available information and analysis, even if the best is pretty crummy. It is, however, key that if it is indeed the case that the best is no good, we admit to this and don’t pretend otherwise. False certainty can be deadly.
Earlier this week, Joseph Stiglitz told Bloomberg that the U.S. economy faces a significant chance of contracting again after emerging from the worst recession since the Great Depression of the 1930s.
“There’s a significant chance of a W, but I don’t think it’s inevitable,” … . The economy “could just bounce along the bottom.” Sure, but how does he know whether it will be a V, a W or double dip, a rotated and reflected L, a triple jump or a quadruple bypass? With a sample of at most size one to underpin one’s forecast, the quality of the argument/analysis that produces and supports the forecast is essential. The forecast itself is indeed likely to be much less important than the reasoning behind it. If the conclusion is open-ended, the story had better be good.