Sunday, February 14, 2010

The Adventures of Scattershot Billy: Malaysia's Demographics Part III Continued

[Continued from last post]

Model II

This second attempt at finding a relationship between population characteristics and income uses both a broader and a narrower approach than the first post in the series. Rather than taking one country and tracking the data over time, a cross-sectional approach uses more countries (to ensure more universal applicability) but at only one point in time.

I've cross-referenced the IMF's World Economic Outlook database (for income) with the US Census Bureau's International Data Base to see if a relationship exists between my two population structure variables of choice (median age and dependency ratio) and income (in this case, GDP per capita in international dollars). Of the approximately 210 countries available from both databases, 181 have all the data I'm looking for, and those make up my sample. All data is from 2007, as 2008 data for some countries still consists of estimates.
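For anyone wanting to replicate the cross-referencing step, here is a minimal sketch of the idea: keep only the countries that appear in both sources. The dictionary names, the function name and the values are hypothetical placeholders, not figures from either database, and matching on country names is an assumption (in practice the two sources may spell names differently).

```python
# Placeholder values for illustration only; NOT taken from the IMF or Census data.
gdp_per_capita = {"Country A": 12000.0, "Country B": 3500.0, "Country C": 41000.0}
median_age = {"Country A": 27.1, "Country B": 19.4, "Country D": 38.9}


def cross_reference(income, population):
    """Return (name, income, population value) for countries found in both sources."""
    common = sorted(income.keys() & population.keys())
    return [(name, income[name], population[name]) for name in common]


sample = cross_reference(gdp_per_capita, median_age)
print(len(sample), "countries have both variables")
```

Countries missing from either source simply drop out, which is how a pool of roughly 210 can shrink to a usable sample of 181.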

The best way to show the relationship between the data is through scatter diagrams, with each data point representing an individual country:


Some observations:

  1. It's clear that a low population median age is associated with low income, but the converse does not hold - a high population median age does not guarantee a high income economy. This is partly due to some countries in the sample which have generally done badly, or are only just starting to take off after policy/strategy mistakes, e.g. the ex-Soviet bloc countries, which had relatively mature, stable populations but had not benefited from following a market system until recently. However, this also means that any regression estimate must be qualified by a relatively wide confidence interval.
  2. The dependency ratios substantiate the stylized facts - "young" countries tend to have lower incomes, while "older" countries tend to have higher incomes. But as with the median age, the experience between countries varies widely, depending on their particular circumstances. It's interesting to note that there's a "floor" to the youth ratio and a "ceiling" for the old age ratio, which countries rarely cross.

To minimize problems with this wide variance in experience, I trimmed the sample by cutting out the 10 countries with the highest GDP per capita. Hopefully, this will give me a more "normal" relationship between the variables and income, one more typical of the experience most countries would undergo. Eyeballing the scatter charts, there's little difference to see except between age and income:

...which looks better behaved, and will hopefully yield results more representative of the "true" relationship.
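The trimming itself is simple to reproduce. The sketch below drops the n highest-income rows from a sample; the function name, the tuple layout and the sample values are hypothetical and only there to show the mechanics (with the real data, n would be 10).

```python
def trim_top(rows, n=10):
    """Drop the n rows with the highest GDP per capita.

    Each row is a (country, gdp_per_capita, median_age) tuple; the original
    ordering of the remaining rows is preserved.
    """
    ranked = sorted(rows, key=lambda row: row[1], reverse=True)
    dropped = {row[0] for row in ranked[:n]}
    return [row for row in rows if row[0] not in dropped]


# Placeholder rows for illustration only, not actual country data.
rows = [
    ("A", 1500.0, 18.0),
    ("B", 60000.0, 39.0),
    ("C", 9000.0, 27.0),
    ("D", 75000.0, 30.0),
    ("E", 22000.0, 36.0),
]
trimmed = trim_top(rows, n=2)
print([row[0] for row in trimmed])
```

Note that this trims on income only, so outliers in age or the dependency ratios stay in the sample.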

Running the regressions (based on the trimmed sample), we get:
  1. LOG(GDP_TRIM) = 3.40*LOG(AGE_TRIM) - 2.36
  2. LOG(GDP_TRIM) = -3.32*LOG(RATIO_T_TRIM) + 6.92
  3. LOG(GDP_TRIM) = -2.03*LOG(RATIO_Y_TRIM) + 7.05
  4. LOG(GDP_TRIM) = 1.35*LOG(RATIO_O_TRIM) + 11.84
  5. LOG(GDP_TRIM) = 2.73*LOG(AGE_TRIM) - 0.81*LOG(RATIO_T_TRIM) - 0.61
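Regressions 1 to 4 are all single-variable log-log fits, which can be sketched with ordinary least squares in plain Python. This is an illustration rather than the software actually used for the estimates above. The function name is hypothetical, and the natural log is an assumption, since the post doesn't say which base LOG refers to (the slope is the same in any base; only the intercept changes). The data is synthetic, generated from an exact power law so the fit can be checked by eye.

```python
import math


def fit_loglog(x, y):
    """OLS fit of log(y) = slope * log(x) + intercept, using natural logs."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    sxx = sum((a - mean_x) ** 2 for a in lx)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic data following y = 2 * x**3 exactly; not real country data.
ages = [15.0, 20.0, 30.0, 40.0]
incomes = [2.0 * a ** 3 for a in ages]
slope, intercept = fit_loglog(ages, incomes)
print(round(slope, 6), round(intercept, 6))
```

Because the synthetic data has no noise, the fit recovers the exponent 3 as the slope and log(2) as the intercept. Regression 5 has two explanatory variables and would need a multiple regression routine instead.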

...which gives the same interpretation that we found in the single country example in the first post in this series, i.e. a higher median age and a higher old age dependency ratio are associated with higher income, while the opposite relationship holds for the total dependency and youth dependency ratios. Comparing the two sets of estimates, the biggest differences in the coefficients are with median age (3.40 here against 6.86) and, to a lesser extent, with the old age dependency ratio (1.35 against 2.80).
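Since both sides of each regression are in logs, the coefficients can be read as elasticities: a coefficient b means that multiplying the explanatory variable by (1 + p) multiplies predicted income by (1 + p)^b, whatever the log base. A small sketch using the two median age coefficients quoted above (the function name is hypothetical):

```python
def income_multiplier(elasticity, pct_change):
    """Factor by which predicted income changes when x rises by pct_change percent."""
    return (1.0 + pct_change / 100.0) ** elasticity


# Median age coefficients: 3.40 (cross-section, this post) and 6.86 (first post).
cross_section = income_multiplier(3.40, 10.0)
single_country = income_multiplier(6.86, 10.0)
print(round(cross_section, 2), round(single_country, 2))
```

So a 10% higher median age goes with roughly 38% higher income under the cross-section estimate, and a considerably larger jump under the single country estimate, which gives a feel for how far apart the two sets of results are.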

The two best fits are regression 1 and regression 4, but having said that, none of these regressions yields plausible estimates of future income. So while this approach has produced some nice visual representations, it's a failure as a conduit for forecasting Malaysia's future income.

From a certain perspective that's understandable - this approach best describes the "global population", and ignores any country specific differences that may influence income. Given that Malaysia is a little unusual in having a "younger"-than-normal population for its income level, we're not likely to get usable forecasts. On the other hand, we should at least allow for this more "global" experience to influence any other forecasts we make, as there is always the possibility that Malaysia would in future "revert" to the mean of the global experience.

That leads to the last approach, where we combine time series (from the first post) with cross-section data (this post). Coming up next.

Technical Notes
  1. GDP data from the IMF World Economic Outlook Database (April 2009)
  2. Population estimates from the US Census Bureau International Data Base
