This is going to be a technical post – feel free to skip to the end if you’re not comfortable with the math.
There’s apparently some question of why the ETP has a target of USD15,000 in per capita GNI by 2020, and how it was derived.
Scepticism over the target has always struck me as a little strange, given the uncertainty of forecasting over such a long horizon and the pseudo-precision involved in assuming such targets are meant to be operative. I mean – if we manage to get to USD14,999, does that mean we’ve failed? How about USD15,001? Would USD15,002 be significantly better?
Given the imprecision in aggregate economic statistics, particularly national account statistics, the desire for such accuracy is an exercise in futility. Nevertheless, doing a forecast projection is a more or less trivial exercise, which I’m going to try to demonstrate.
First the data – the ETP target is actually derived from the World Bank’s high income classification, which can be downloaded here, and the sample runs from 1988 to 2012. I had to fudge the data a bit, because the World Bank’s financial year starts in July, rather than January. That matters because, at an operational level, the income thresholds that the Bank uses are actually meant for determining loan eligibility.
The plot of the data is as below (GNI per capita in USD):
Note that while it is mostly increasing over time, the high income threshold isn’t exactly a linear progression. That presents a bit of a problem which I’ll get into later.
The model I’m using is an AR(1) model with a time trend:
Ln(GNI) = α + β*T + γ*AR(1) + ε
…where T is time. The auto-regressive term helps manage a serial correlation problem that’s present in the standard regression estimation.
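To make the specification concrete, here is a minimal Python sketch of a trend regression with AR(1) errors, estimated by the Cochrane-Orcutt iteration. That is one classical way of handling serial correlation, and not necessarily the estimator behind the numbers in this post. The data are synthetic, generated with a fixed seed from made-up parameters, and are not the World Bank series. The function names `ols` and `cochrane_orcutt` are illustrative only.

```python
import random


def ols(x, y):
    """OLS of y on a constant and x; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def cochrane_orcutt(t, y, iterations=20):
    """Estimate y = a + b*t + u, where u_t = rho*u_{t-1} + e_t."""
    a, b = ols(t, y)  # starting values from plain OLS
    rho = 0.0
    for _ in range(iterations):
        # Residuals from the current trend line
        u = [yi - a - b * ti for ti, yi in zip(t, y)]
        # AR(1) coefficient: regress u_t on u_{t-1} without a constant
        rho = (sum(u[i] * u[i - 1] for i in range(1, len(u)))
               / sum(v * v for v in u[:-1]))
        # Quasi-difference both sides to strip out the serial correlation
        y_star = [y[i] - rho * y[i - 1] for i in range(1, len(y))]
        t_star = [t[i] - rho * t[i - 1] for i in range(1, len(t))]
        a_star, b = ols(t_star, y_star)
        a = a_star / (1.0 - rho)  # undo the (1 - rho) scaling of the constant
    return a, b, rho


# SYNTHETIC data (assumption): parameters chosen for illustration only.
rng = random.Random(0)
true_a, true_b, true_rho = 8.15, 0.02, 0.76
t = list(range(200))
u, y = 0.0, []
for ti in t:
    u = true_rho * u + rng.gauss(0.0, 0.01)
    y.append(true_a + true_b * ti + u)

a_hat, b_hat, rho_hat = cochrane_orcutt(t, y)
print(f"alpha={a_hat:.3f}  beta={b_hat:.4f}  rho={rho_hat:.2f}")
```

The synthetic sample is deliberately longer than the 25 annual observations available here, so the recovered coefficients sit close to the values used to generate it. With a sample as short as the real one, the estimates would be considerably noisier.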
Here’s the result of the preliminary run-through:
Ln(GNI) = 8.15 + 0.02*T + [AR(1)=0.76] R2 = 0.94
All the coefficients are statistically significant at the 0.1% level. There’s significant heteroscedasticity in the residuals, so the estimation uses White’s heteroscedasticity-consistent errors.
So what do all those numbers mean? Essentially, for every unit increase in T, Ln(GNI) increases by 0.02. In EconoEnglish: the high income threshold increases by about 2% every year. To be more precise, it increases by 2.04448753251947% every year, more or less. Yes, that was sarcasm.
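A small aside on reading a log-linear coefficient, sketched in Python. Assuming the long figure quoted above is the coefficient on T multiplied by 100, it is the usual small-number approximation to the growth rate. The exact compounded annual rate is exp(β) − 1, which comes out a touch higher. The variable names are illustrative.

```python
import math

# Coefficient on T as quoted in the post (assumed to be in log points per year)
coef = 0.0204448753251947

pct_approx = 100 * coef                  # log-point approximation
pct_exact = 100 * (math.exp(coef) - 1)   # exact compounded annual growth

print(f"approximate: {pct_approx:.4f}%  exact: {pct_exact:.4f}%")
```

The gap between the two is a couple of hundredths of a percentage point, which is well inside the noise of the underlying data.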
Projecting that forward, we obtain a point estimate of USD14,492.89 by 2020 and a 95% confidence range forecast of USD16,470 to USD12,515 (standard error of USD1,149.7). The USD15,000 target sits just under the 66% boundary of the forecast distribution, i.e. there’s about a one-third probability of the high income threshold actually exceeding that level based on this projection.
In other words, the USD15,000 target is a “stretch” target that’s still achievable, and gives a better than 50% chance of Malaysia being classed as a high income country if we actually get there.
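The one-third figure can be checked with a few lines of Python. This is a sketch that assumes the forecast error is roughly normal around the point estimate, which the post does not state explicitly. The point forecasts and standard errors are the ones quoted in this post for the two samples, and `prob_exceeds` is an illustrative helper.

```python
import math


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_exceeds(target, point, se):
    """P(threshold > target), assuming a normal forecast distribution."""
    return 1.0 - normal_cdf((target - point) / se)


target = 15000.0
p_full = prob_exceeds(target, 14492.89, 1149.7)    # 1988-2012 sample
p_2009 = prob_exceeds(target, 14434.73, 1336.06)   # 1988-2009 sample

print(f"P(threshold > 15,000): full sample {p_full:.2f}, to-2009 sample {p_2009:.2f}")
```

Under that normality assumption, both samples put the chance of the threshold ending up above USD15,000 at roughly one in three.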
Now, to return to that problem I mentioned – the heteroscedasticity problem means that the estimated coefficients and the forecast projections can be sensitive to the sample used in the estimation. Robustness checks via changing the sample size resulted in significant changes in the estimated coefficients and the final forecast.
However, going back to just 2009, which would be the latest known data point before the publication of the New Economic Model which originally established all these targets, doesn’t alter the results all that much:
Sample range: 1988-2009
Ln(GNI) = 8.16 + 0.02*T + [AR(1)=0.76] R2 = 0.92
Coefficient of T: 2.026672335140296%
Std error: USD1,336.06
Point Forecast: USD14,434.73
Range Forecast: USD16,751-USD12,118
What it all boils down to is that the 2.0% adjustment factor PEMANDU has been using to project the existing World Bank high income threshold forward to 2020, and so set a target of USD15,000 for the ETP, has some statistical validation.
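The mechanics of that kind of projection are just compound growth, sketched below in Python. The helpers `project` and `required_base` are illustrative. The 11-year horizon assumes a 2009 starting point, and no actual threshold value is plugged in because none is quoted in this post.

```python
def project(value, rate, years):
    """Compound a starting value forward at a constant annual rate."""
    return value * (1.0 + rate) ** years


def required_base(target, rate, years):
    """Starting value that reaches `target` after `years` at `rate`."""
    return target / (1.0 + rate) ** years


rate = 0.02          # the 2.0% adjustment factor
years = 2020 - 2009  # assumption: projecting from a 2009 base

base = required_base(15000.0, rate, years)
print(f"Base-year threshold implied by a USD15,000 target: USD{base:,.0f}")
print(f"Round trip back to 2020: USD{project(base, rate, years):,.0f}")
```

Comparing that implied base with the published threshold for the chosen starting year shows how much rounding went into the USD15,000 figure.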
You could of course try the long way around and make a projection based on assumptions for the variables embedded in the World Bank’s calculation – exchange rates and GDP deflators for both Malaysia and the G5. But since all that information is incorporated in the high income time series, that’s probably more trouble than it’s worth. The univariate approach used here is more parsimonious, and likely to be less error-prone.