Friday, March 6, 2009

The Pitfalls of Econometrics

de minimis has an interesting post advocating using more econometric models to guide policy making in Malaysia. While I tend to lean that way myself (there's too much unsubstantiated rhetoric flung around the news and blogosphere for my taste), I don't want to be blind to the potential pitfalls and shortcomings of an applied econometric approach to policy. So this post is both to clarify some of the issues and to serve as a reminder to myself not to be too "assertive", as my wife puts it.

First, econometric modeling (as etheorist remarked the other day) is really an art, not a science. There are many, many ways of looking at an economy and generating forecasts, from simple time series techniques to hideously complex dynamic general equilibrium models. So model choice and specification (as well as the accompanying assumptions), not to mention the ideological bent of the modelers, can lead to very different conclusions about the state of the economy at any given time. The issue is compounded by Malaysia being such an open economy, which means that ideally, you'd have to incorporate all the major trade partners into your model as well.

Secondly, the evolving economic structure within a developing country means that even if you do come up with a model close to reality at some point in time, it might go out of date very quickly and you won’t know it until something goes wrong. This is one point where I would be critical of DOS: the Malaysian input-output tables haven’t been updated in years, and you need these to model inter-industry dynamics.

Thirdly, any econometric model necessarily uses historical data, which means there will always be an unobservable error component in any forecast in the presence of a current shock. A corollary of this is that, almost by definition, a trade shock such as we just suffered cannot be predicted on the basis of concurrent data. Models are more useful as a predictive guide to inventory driven recessions and business cycle downturns. You can of course use models to predict what happens when a shock occurs, but not when or if a shock will occur.
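A minimal sketch of this point, assuming a hypothetical AR(1) process y(t+1) = phi * y(t) + shock with made-up values for phi, the last observation and the shock (none of these come from real Malaysian data). A forecast built only on history misses the shock entirely, but once the shock size is assumed, the same model traces out how it would fade:

```python
def ar1_forecast(last_value, phi, horizon):
    """Point forecasts for 1..horizon steps ahead, assuming no future shocks."""
    return [last_value * phi ** h for h in range(1, horizon + 1)]

# Hypothetical illustration values, not estimates of any real series.
phi = 0.8            # assumed persistence of the series
last_observed = 1.0  # last data point available to the forecaster
shock = -2.0         # a shock that hits next period, unknown in advance

# The model can only project forward from what it has seen.
forecast_next = ar1_forecast(last_observed, phi, 1)[0]

# What actually happens includes the shock.
actual_next = phi * last_observed + shock

# The forecast error is exactly the shock the model could not see.
forecast_error = actual_next - forecast_next

# Conditional on a shock of this size, the model does say how it decays.
response_path = [shock * phi ** h for h in range(5)]

print("forecast:", forecast_next)
print("actual:", actual_next)
print("forecast error:", forecast_error)
print("response to the shock over time:", response_path)
```

The timing and size of the shock have to be supplied from outside the model; only its propagation comes from the model itself.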

Fourth, data accuracy is inversely proportional to the speed at which data is published. In other words, the faster you publish it, the larger the error rate. Where I think DOS can improve on that score is to follow the general practice in the EU, US and yes, Singapore, i.e. issue advance, preliminary, and final estimates of major statistical series. The current practice of publishing with a 6-week to 8-week lag and then quietly revising the historical series isn't transparent (the loose hair around my workplace is testament to that). Data revisions should always be made clear, especially for national accounts data, which has to be revised even 2-3 years down the road.
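One way to make revisions explicit, sketched under assumptions: keep every vintage of an estimate side by side instead of overwriting it, so the size of revisions can be measured. The `Vintages` class, the `mean_absolute_revision` helper and the figures below are hypothetical placeholders for illustration, not DOS data or any agency's actual format.

```python
from dataclasses import dataclass

@dataclass
class Vintages:
    """The three published estimates of one statistic for one period."""
    period: str
    advance: float      # first, fastest estimate
    preliminary: float  # second estimate
    final: float        # settled figure

    def total_revision(self):
        """How far the final figure moved from the advance estimate."""
        return self.final - self.advance

def mean_absolute_revision(records):
    """Average size of advance-to-final revisions, ignoring direction."""
    return sum(abs(r.total_revision()) for r in records) / len(records)

# Made-up placeholder numbers purely to exercise the functions.
records = [
    Vintages("period 1", advance=4.0, preliminary=4.5, final=5.0),
    Vintages("period 2", advance=3.0, preliminary=2.5, final=2.0),
]

for r in records:
    print(r.period, "revised by", r.total_revision())
print("mean absolute revision:", mean_absolute_revision(records))
```

Keeping the vintages separate is what lets a user judge how much weight an early number deserves.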

One exception to this observation is financial sector data, which is available very quickly. (Side note: I've visited BNM to study their data gathering process, and I was a member of one of the teams responsible for implementing CCRIS in one of our banks - I am very impressed with BNM's operation in this instance. The disaggregated trial balance of the entire banking system is available at about t+4 after every month end – in other words, don’t be fooled by the monthly publishing schedule). (Side note to the side note: this is one reason why monetary policy is generally the first recourse in any crisis – you have better data much faster than real economy data).

However, I should point out that a 2-month to 3-month lag hits the sweet spot between accuracy and timeliness, and is fairly typical worldwide. China for instance tends to issue data on a one-month lag, but subsequent revisions tend to be very large. Some Canadian series have no revisions at all, but you have to wait 6 months(!) to get them. I cannot fault DOS on that score, though they have made some absolute boo-boos before (pay attention to 2004-2005 trade data before and after revision, for instance).

Fifth, some of the most critical variables required for a predictive model are unobservable. For example, consumer and investor expectations have a big impact on private consumption and investment, but can’t be quantified directly. It’s possible to use proxies, such as consumer confidence or business expectations surveys, but these are subject to error as well.

Take all the factors above, and you shouldn’t be surprised that most whole economy econometric models have very little predictive power more than a quarter or two ahead, out of sample. I’d note that I’d be very surprised if the government doesn’t have on hand some whole economy econometric models, especially for trade and tax policies. Could more use be made of modeling? Absolutely! Just don’t fall for the promise that they’ll be a panacea and perfect guide to policy.


  1. bro

    I'm glad that my diatribe finds some resonance with you. As you are no doubt a master of the "dark art" of fortune-telling via numbers (a more colourful way to describe econometrics) I fully agree with your caveats.

    That said, I still think the country's economic planning could do with more forecasting so that the economics-type ministers won't look so miserably out of touch with clear signals on economic trends. Surely there are better uses for 2-month old data i.e. use it in a qualified manner to project ahead, rather than to make statements in March about December data.

    That's why I issued the diatribe. The people in charge need the blogging equivalent of defibrillation from time to time and, sometimes, all the time :D

  2. "As you are no doubt a master of the "dark art" of fortune-telling via numbers (a more colourful way to describe econometrics) I fully agree with your caveats."

    I find that the more I learn, the more I realise what I don't know ;)

  3. Ah, wait till you find the wisdom in abandoning the art of forecasting the future using outdated numbers.

  4. Dear friends,

    Great. Learn a lot from HishamH's posts and discussions among yourselves.

    I read that George E. Lucas has a classic paper on "Econometric policy evaluation: A critique" and it is something to do with rational expectation. But econometric policies are still commonplace. US has lots of economic indicators. Wonder why.

    I'm interested in reading developments on econs. Is there any website where public can access economics papers ?

    I have blogged on "Financial Modelers' Manifesto" by Paul Wilmott and Emanuel Derman. Both are quants. I guess econometrists should have something like that too. :)

  5. WY,

    The Lucas critique (and it's Robert E. Lucas Jr, not George "Star Wars" Lucas ;) ) was aimed at the Keynesian-Neo-Classical synthesis hegemony of the 1950s and 60s, which tended to use system of equations models of economies (many variables were treated as exogenous i.e. plugged in from outside the model). These relied too much on historical data, and were thus vulnerable to structural economic changes.

    Lucas' innovation was to add micro-foundations to macro-modelling, thus providing some degree of robustness, specifically by assuming that economic agents were rational in a utilitarian sense e.g. maximising profit, leisure, wages etc. He won the Nobel prize for his ideas on rational expectations.


    Rational expectations have fallen seriously out of favour in the last two decades, as empirical evidence suggests that neither corporations nor individuals are rational in the Lucas sense of the term.

    Second, modern econometric models treat economic variables as endogenous, rather than exogenous, which overcomes most of Lucas' objections (see the seminal work by Sims, C.A. 1980, “Macroeconomics and Reality”, Econometrica, Vol 48 No 1 (Jan 1980) 1-48; also the work of Robert Engle, Clive Granger and Søren Johansen on cointegration).

    For access to economics journal papers, join a library that has access to JSTOR. Most universities in Malaysia and Singapore ought to have a subscription. For American developments, NBER is the main source - access is free if your IP address identifies you as being from a qualifying country. Both RePEc and SSRN can be used to track down papers.

    Most multilateral institutions also publish working papers on economics with online access, e.g. the IMF, World Bank, OECD, Asian Development Bank etc, as do many central banks such as the Federal Reserve, Bank of England, and the European Central Bank.

    Why so many statistics? - you cannot make good policy without information. That's true even if you don't use econometric modelling.

  6. Thx for your many info especially on the literature and access.

    haha, my mistake.. I mix up between Robert & George Lucas.

    Didn't know that ARCH, cointegration and the time series stuff were seminal works. Came across them in CFA level 2 texts but couldn't understand their significance. Maybe it's time for me to read again. :)

  7. In real world terms, the Lucas Critique basically said that policy affects economic variables, which in turn affect policy - there's a feedback loop. You cannot therefore make substantive claims on policy prescriptions, since the application of policy would change underlying economic relationships - in other words, people ain't stupid and they have eyes.

    Sims solved the Lucas critique by making all variables endogenous through a Vector Autoregressive framework. The attendant problem with the Sims approach is that you need to have enormous amounts of data for it to work. An ARDL framework has some promise because it uses less data, but is also more labour intensive.

    The concept of cointegration is extremely powerful, not just in economics, but also in the physical sciences. It is not easy to establish, can be sample sensitive, and can get you into all kinds of arguments regarding the presence of unit roots. It essentially solves the problem of weeding out potentially spurious regressions, which are common in time series - in other words, is your regression showing a 'true' relationship, or are your variables correlated simply because they are both increasing through time?

    ARCH - my experience with this (and with ARIMA) is that these models are great in-sample, but not terribly good out-of-sample - good only if all you're interested in is making inferences, and not forecasting. ARCH and ARIMA model specifications are very sensitive to the sample chosen, so you can't say with any certainty that a particular model specification is valid at any point in time. They do however solve problems that hinder statistical inference, specifically heteroscedasticity (unstable variance) and serial correlation (intertemporal correlation). If not handled, both of these conditions make the usual regression standard errors unreliable, and the coefficient estimates themselves inefficient, so test results can't be trusted.
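The spurious-regression point raised in the comment on cointegration can be sketched with a small simulation. Everything here is an assumption made for illustration: two series are generated independently as random walks with the same upward drift (the `drift`, sample length, number of trials and the 0.5 correlation cutoff are arbitrary choices), so any correlation between them in levels reflects shared trending, not a relationship. First-differencing is used only as a crude comparison; it is not a cointegration test such as Engle-Granger or Johansen.

```python
import random

def correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def trending_walk(n, drift, rng):
    """Random walk with drift: each step adds the drift plus N(0, 1) noise."""
    level, path = 0.0, []
    for _ in range(n):
        level += drift + rng.gauss(0.0, 1.0)
        path.append(level)
    return path

def first_difference(series):
    """Period-to-period changes of a series."""
    return [b - a for a, b in zip(series, series[1:])]

def share_strongly_correlated(trials, n, drift, difference, rng, cutoff=0.5):
    """Fraction of trials in which two unrelated series look strongly correlated."""
    hits = 0
    for _ in range(trials):
        x = trending_walk(n, drift, rng)
        y = trending_walk(n, drift, rng)  # generated independently of x
        if difference:
            x, y = first_difference(x), first_difference(y)
        if abs(correlation(x, y)) > cutoff:
            hits += 1
    return hits / trials

rng = random.Random(0)  # fixed seed so the run is repeatable
levels_share = share_strongly_correlated(500, 100, 0.2, False, rng)
diff_share = share_strongly_correlated(500, 100, 0.2, True, rng)

print("share with |corr| > 0.5 in levels:", levels_share)
print("share with |corr| > 0.5 in first differences:", diff_share)
```

In levels, a large share of these unrelated pairs should clear the cutoff, while in first differences almost none should, which is the sense in which trending series can produce relationships that are not there.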