This is a follow-up to last month’s post, in which I considered the use of panel data to answer questions about causal ordering: does x cause y or does y cause x? In the interim, I’ve done many more simulations to compare the two competing methods, Arellano-Bond and ML-SEM, and I’m going to report some key results here. If you want all the details, read my recent paper by clicking here. If you’d like to learn how to use these methods, check out my seminar titled Longitudinal Data Analysis Using SEM.
Quick review: The basic approach is to assume a cross-lagged linear model, with y at time t affected by both x and y at time t-1, and x at time t also affected by both lagged variables. The equations are
$$y_{it} = b_1 x_{i(t-1)} + b_2 y_{i(t-1)} + c_i + e_{it}$$
$$x_{it} = a_1 x_{i(t-1)} + a_2 y_{i(t-1)} + f_i + d_{it}$$
for i = 1,…, n, and t = 1,…, T.
The terms ci and fi represent individual-specific unobserved heterogeneity in both x and y. They are treated as “fixed effects”, thereby allowing one to control for all unchanging characteristics of the individuals, a key factor in arguing for a causal interpretation of the coefficients. Finally, eit and dit are assumed to represent pure random noise, independent of any variables measured at earlier time points. Additional exogenous variables could also be added to these equations.
Conventional estimation methods are biased because of the lagged dependent variable and because of the reciprocal relationship between the two variables. The most popular solution is the Arellano-Bond (A-B) method (or one of its cousins), but I have previously argued for the use of maximum likelihood (ML) as implemented in structural equation modeling (SEM) software.
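To make the bias of conventional methods concrete, here is a minimal sketch in Python (not the SAS or Stata code used for the simulations in this post). It generates data from the cross-lagged model and then fits the y equation by pooled OLS and by the usual within (demeaned) fixed-effects regression. The parameter values, the independent standard-normal fixed effects and errors, the burn-in period, and the function names are all assumptions chosen for illustration.

```python
import numpy as np

def simulate_panel(n, T, b1, b2, a1, a2, rng, burn_in=50):
    """Return x, y arrays of shape (n, T + 1) from the cross-lagged model.

    Assumptions for this sketch: c, f, e, d are independent standard normals,
    and the process is run for `burn_in` periods before any data are kept.
    """
    c = rng.normal(size=n)  # fixed effect in the y equation
    f = rng.normal(size=n)  # fixed effect in the x equation
    x_prev = np.zeros(n)
    y_prev = np.zeros(n)
    xs, ys = [], []
    for t in range(burn_in + T + 1):
        y_new = b1 * x_prev + b2 * y_prev + c + rng.normal(size=n)
        x_new = a1 * x_prev + a2 * y_prev + f + rng.normal(size=n)
        x_prev, y_prev = x_new, y_new
        if t >= burn_in:
            xs.append(x_new)
            ys.append(y_new)
    return np.column_stack(xs), np.column_stack(ys)

def conventional_estimates(x, y):
    """Estimate (b1, b2) by pooled OLS and by the within estimator."""
    y_out, x_lag, y_lag = y[:, 1:], x[:, :-1], y[:, :-1]

    # Pooled OLS: ignores the fixed effect c_i entirely.
    X_pooled = np.column_stack(
        [np.ones(y_out.size), x_lag.ravel(), y_lag.ravel()]
    )
    pooled = np.linalg.lstsq(X_pooled, y_out.ravel(), rcond=None)[0][1:]

    # Within estimator: subtract each person's mean over time, then run OLS.
    def demean(m):
        return m - m.mean(axis=1, keepdims=True)

    X_within = np.column_stack([demean(x_lag).ravel(), demean(y_lag).ravel()])
    within = np.linalg.lstsq(X_within, demean(y_out).ravel(), rcond=None)[0]
    return pooled, within

rng = np.random.default_rng(2024)
x, y = simulate_panel(n=2000, T=4, b1=0.25, b2=0.5, a1=0.5, a2=0.25, rng=rng)
pooled, within = conventional_estimates(x, y)
print("true   b1, b2:", 0.25, 0.5)
print("pooled b1, b2:", pooled.round(3))
print("within b1, b2:", within.round(3))
```

With a setup like this one, pooled OLS tends to overstate b2 because the lagged y carries the omitted fixed effect, while the within estimator tends to understate it because demeaning over a short panel correlates the lagged y with the error.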
Last month I presented very preliminary simulation results showing that ML-SEM had substantially lower mean-squared error (MSE) than A-B under a few conditions. Since then I’ve done simulations for 31 different sets of parameter values and data configurations. For each condition, I generated 1,000 samples, ran the two methods on each sample, and then calculated bias, mean squared error, and coverage for confidence intervals. Since the two equations are symmetrical, the focus is on the coefficients in the first equation, b1 for the effect of x on y, and b2 for the effect of y on itself.
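For readers who want to see how those three summaries are computed, here is a small Python sketch. It assumes you already have, for one coefficient, an array of point estimates and an array of standard errors across the simulated samples; the function name and the normal-theory 1.96 multiplier are my choices for illustration.

```python
import numpy as np

def summarize_estimates(estimates, std_errors, true_value, z=1.96):
    """Bias, sampling variance, MSE and CI coverage across simulated samples."""
    est = np.asarray(estimates, dtype=float)
    se = np.asarray(std_errors, dtype=float)

    bias = est.mean() - true_value
    variance = est.var()  # ddof=0, so that mse == variance + bias**2 exactly
    mse = np.mean((est - true_value) ** 2)

    # Share of nominal 95% intervals that contain the true value.
    lower, upper = est - z * se, est + z * se
    coverage = np.mean((lower <= true_value) & (true_value <= upper))

    return {"bias": bias, "variance": variance, "mse": mse, "coverage": coverage}

# Hypothetical estimates of b1 from four samples, with a true value of 0.25.
summary = summarize_estimates(
    estimates=[0.22, 0.27, 0.25, 0.30],
    std_errors=[0.03, 0.03, 0.03, 0.03],
    true_value=0.25,
)
print(summary)
```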
The simulations for ML were done with PROC CALIS in SAS. I originally started with the sem command in Stata, but it had a lot of convergence problems for the smaller sample sizes. The A-B simulations were done in Stata with the xtabond command. I tried PROC PANEL in SAS, but couldn’t find any combination of options that produced approximately unbiased estimates.
Here are some of the things I’ve learned:
Under every condition, ML showed little bias and quite accurate confidence interval coverage. That means that about 95% of the nominal 95% confidence intervals included the true value.
Except under “extreme” conditions, A-B also had little bias and reasonably accurate confidence interval coverage.
However, compared with A-B, ML-SEM always showed less bias and smaller sampling variance. My standard of comparison is relative efficiency, which is the ratio of MSE for ML to MSE for A-B, so values below 1 favor ML. (MSE is the sum of the sampling variance and the squared bias.) Across 31 different conditions, relative efficiency ranged from .02 to .96, with a median of .50. To translate, if the relative efficiency is .50, you’d need twice as large a sample to get the same accuracy with A-B as with ML.
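The arithmetic behind that translation can be sketched in a few lines of Python. The function names are mine, and the sample-size conversion rests on an assumption: that MSE is dominated by sampling variance, which shrinks in proportion to 1/n.

```python
def mse(variance, bias):
    """Mean squared error as sampling variance plus squared bias."""
    return variance + bias ** 2

def relative_efficiency(mse_ml, mse_ab):
    """Ratio of MSE for ML to MSE for A-B; values below 1 favor ML."""
    if mse_ab <= 0:
        raise ValueError("mse_ab must be positive")
    return mse_ml / mse_ab

def sample_size_multiplier(rel_eff):
    """Factor by which the A-B sample must grow to match ML's accuracy.

    Assumes MSE is roughly proportional to 1/n (variance-dominated).
    """
    if rel_eff <= 0:
        raise ValueError("rel_eff must be positive")
    return 1.0 / rel_eff

# The median relative efficiency reported above.
print(sample_size_multiplier(0.50))
```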
Relative efficiency of the two estimators is strongly affected by the value of the parameter b2, the effect of y at time t-1 on y at time t. As b2 gets close to 1, the A-B estimators for both b1 and b2 become badly biased (toward 0), and their sampling variance increases, which is consistent with previous literature on the A-B estimator. For ML, on the other hand, bias and variance are rather insensitive to the value of b2. Here are the numbers:
[Table: Rel Eff b1 and Rel Eff b2, by value of b2]
Relative efficiency is strongly affected by the number of time points, but in the opposite direction for the two coefficients. Thus, relative efficiency for b1 increases almost linearly as the number of time points goes from 3 to 10. But for b2, relative efficiency is highest at T=3, declines markedly for T=4 and T=5, and then remains stable.
[Table: Rel Eff b1 and Rel Eff b2, by number of time points]
Relative efficiency is also strongly affected by the ratio of the variance of ci (the fixed effect) to the variance of eit (the pure random error). In the next table, I hold constant the variance of c and vary the standard deviation of e.
[Table: Rel Eff b1 and Rel Eff b2, by standard deviation of e]
Relative efficiency is not strongly affected by:
- Sample size
- The value of b1
- The correlation between ci and fi, the two fixed-effects variables.
Because ML is based on the assumption of multivariate normality, one might suspect that A-B would do better than ML if the distributions were not normal. To check that out, I generated all the variables using a 2-df chi-square variable, which is highly skewed to the right. ML still did great in this situation, and was still about twice as efficient as A-B.
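To show how skewed a 2-df chi-square variable is, here is a short Python sketch that draws such values and rescales them to mean 0 and variance 1. The standardization step and the function name are my assumptions for illustration; the description above says only that the variables were generated from a 2-df chi-square.

```python
import numpy as np

def standardized_chi2(size, df=2, rng=None):
    """Chi-square draws shifted and scaled to mean 0 and variance 1.

    A chi-square variable with `df` degrees of freedom has mean df and
    variance 2 * df, so those are the constants used to standardize.
    """
    rng = np.random.default_rng() if rng is None else rng
    raw = rng.chisquare(df, size=size)
    return (raw - df) / np.sqrt(2 * df)

def sample_skewness(values):
    """Third standardized moment of a sample."""
    v = np.asarray(values, dtype=float)
    centered = v - v.mean()
    return np.mean(centered ** 3) / v.std() ** 3

draws = standardized_chi2(200_000, rng=np.random.default_rng(0))
print("mean:", draws.mean(), "sd:", draws.std(), "skewness:", sample_skewness(draws))
```

The theoretical skewness of a 2-df chi-square is 2, compared with 0 for a normal variable, so errors generated this way depart sharply from the normality assumption.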
In sum, ML-SEM outperforms A-B in every situation studied, by a very substantial margin.