This is a follow-up to last month’s post, in which I considered the use of panel data to answer questions about causal ordering: does x cause y or does y cause x? In the interim, I’ve done many more simulations to compare the two competing methods, Arellano-Bond and ML-SEM, and I’m going to report some key results here. If you want all the details, read my recent paper by clicking here. If you’d like to learn how to use these methods, check out my new seminar titled Longitudinal Data Analysis Using SEM.
Quick review: The basic approach is to assume a cross-lagged linear model, with y at time t affected by both x and y at time t-1, and with x at time t also affected by both lagged variables. The equations are
y_{it} = b_{1}x_{i(t-1)} + b_{2}y_{i(t-1)} + c_{i} + e_{it}
x_{it} = a_{1}x_{i(t-1)} + a_{2}y_{i(t-1)} + f_{i} + d_{it}
for i = 1,…, n, and t = 1,…, T.
The terms c_{i} and f_{i} represent individual-specific unobserved heterogeneity in both x and y. They are treated as “fixed effects”, thereby allowing one to control for all unchanging characteristics of the individuals, a key factor in arguing for a causal interpretation of the coefficients. Finally, e_{it} and d_{it} are assumed to represent pure random noise, independent of any variables measured at earlier time points. Additional exogenous variables could also be added to these equations.
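To make the data-generating process concrete, here is a minimal NumPy sketch of this cross-lagged model. The function name, the coefficient values, the start-up scheme, and the correlation structure for c and f are illustrative choices of mine, not the exact design used in the simulations:

```python
import numpy as np

def simulate_panel(n=400, T=5, b1=0.5, b2=0.5, a1=0.5, a2=-0.5, seed=42):
    """Generate panel data from the cross-lagged model with fixed effects."""
    rng = np.random.default_rng(seed)
    # Correlated individual-specific effects c_i and f_i (illustrative values)
    c, f = rng.multivariate_normal([0, 0], [[1.0, 0.5], [0.5, 1.0]], size=n).T
    # Simple start-up values correlated with the fixed effects
    y = rng.normal(size=n) + c
    x = rng.normal(size=n) + f
    Y = np.empty((n, T))
    X = np.empty((n, T))
    for t in range(T):
        e = rng.normal(size=n)   # pure random noise in the y equation
        d = rng.normal(size=n)   # pure random noise in the x equation
        y_new = b1 * x + b2 * y + c + e
        x_new = a1 * x + a2 * y + f + d
        y, x = y_new, x_new      # update both series from the same lagged values
        Y[:, t] = y
        X[:, t] = x
    return Y, X

Y, X = simulate_panel()
print(Y.shape, X.shape)  # (400, 5) (400, 5)
```

With these coefficient values the process is stable, so repeated simulation from this function is one way to reproduce the kind of sampling experiment described below.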
Conventional estimation methods are biased because of the lagged dependent variable and because of the reciprocal relationship between the two variables. The most popular solution is the Arellano-Bond (AB) method (or one of its cousins), but I have previously argued for the use of maximum likelihood (ML) as implemented in structural equation modeling (SEM) software.
Last month I presented very preliminary simulation results showing that ML-SEM had substantially lower mean squared error (MSE) than AB under a few conditions. Since then I’ve done simulations for 31 different sets of parameter values and data configurations. For each condition, I generated 1,000 samples, ran the two methods on each sample, and then calculated bias, mean squared error, and coverage of the confidence intervals. Since the two equations are symmetrical, the focus is on the coefficients in the first equation: b_{1} for the effect of x on y, and b_{2} for the effect of y on itself.
The simulations for ML were done with PROC CALIS in SAS. I originally started with the sem command in Stata, but it had a lot of convergence problems for the smaller sample sizes. The AB simulations were done in Stata with the xtabond command. I tried PROC PANEL in SAS, but couldn’t find any combination of options that produced approximately unbiased estimates.
Here are some of the things I’ve learned:
Under every condition, ML showed little bias and quite accurate confidence interval coverage. That means that about 95% of the nominal 95% confidence intervals included the true value.
Except under “extreme” conditions, AB also had little bias and reasonably accurate confidence interval coverage.
However, compared with AB, ML-SEM always showed less bias and smaller sampling variance. My standard of comparison is relative efficiency: the ratio of the MSE for ML to the MSE for AB. (MSE is the sampling variance plus the squared bias.) Across the 31 conditions, the relative efficiency of the two estimators ranged from .02 to .96, with a median of .50. To translate: if the relative efficiency is .50, you’d need twice as large a sample to get the same accuracy with AB as with ML.
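These quantities are easy to compute from a set of replicate estimates. A minimal sketch (the function names and the toy numbers are mine, not from the actual simulations):

```python
import numpy as np

def mse(estimates, true_value):
    """Mean squared error = sampling variance + squared bias."""
    estimates = np.asarray(estimates)
    bias = estimates.mean() - true_value
    return estimates.var() + bias ** 2

def relative_efficiency(ml_estimates, ab_estimates, true_value):
    """Ratio of MSE for ML to MSE for AB; values below 1 favor ML."""
    return mse(ml_estimates, true_value) / mse(ab_estimates, true_value)

# Toy example: ML estimates tighter around the truth than AB estimates
rng = np.random.default_rng(0)
ml = rng.normal(loc=0.50, scale=0.02, size=1000)  # nearly unbiased, small spread
ab = rng.normal(loc=0.47, scale=0.04, size=1000)  # some bias, larger spread
print(relative_efficiency(ml, ab, true_value=0.50))  # well below 1
```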
Relative efficiency of the two estimators is strongly affected by the value of the parameter b_{2}, the effect of y_{i(t-1)} on y_{it}. As b_{2} gets close to 1, the AB estimators for both b_{1} and b_{2} become badly biased (toward 0), and the sampling variance increases, which is consistent with previous literature on the AB estimator. For ML, on the other hand, bias and variance are rather insensitive to the value of b_{2}. Here are the numbers:

           Rel Eff b1   Rel Eff b2
b2=0       0.546207     0.8542228
b2=.25     0.509384     0.6652079
b2=.50     0.462959     0.5163349
b2=.75     0.202681     0.2357591
b2=.90     0.022177     0.0269079
b2=1.0     0.058521     0.0820448
b2=1.25    0.248683     0.4038526

Relative efficiency is strongly affected by the number of time points, but in the opposite direction for the two coefficients. Thus, relative efficiency for b_{1} increases almost linearly as the number of time points goes from 3 to 10. But for b_{2}, relative efficiency is highest at T=3, declines markedly for T=4 and T=5, and then remains stable.

        Rel Eff b1   Rel Eff b2
T=3     0.243653     0.9607868
T=4     0.398391     0.8189295
T=5     0.509384     0.6652079
T=7     0.696802     0.6444535
T=10    0.821288     0.6459828

Relative efficiency is also strongly affected by the ratio of the variance of c_{i} (the fixed effect) to the variance of e_{it} (the pure random error). In the next table, I hold constant the variance of c and vary the standard deviation of e.

             Rel Eff b1   Rel Eff b2
SD(e)=.25    0.234526     0.3879175
SD(e)=1.0    0.509384     0.6652079
SD(e)=1.5    0.551913     0.7790358
SD(e)=2      0.613148     0.7737681

Relative efficiency is not strongly affected by:
- Sample size
- The value of b_{1}
- The correlation between c_{i} and f_{i}, the two fixed-effects variables.
Because ML is based on the assumption of multivariate normality, one might suspect that AB would do better than ML if the distributions were not normal. To check that out, I generated all the variables using a 2-df chi-square variable, which is highly skewed to the right. ML still did great in this situation, and was still about twice as efficient as AB.
In sum, ML-SEM outperforms AB in every situation studied, by a very substantial margin.
Does x cause y or does y cause x? Virtually everyone agrees that cross-sectional data are of no use in answering this question. The ideal, of course, would be to do two randomized experiments, one examining the effect of x on y, and the other focused on the reverse effect. Absent this, most social scientists would say that some kind of longitudinal data ought to do the trick. But what kinds of data are needed and how should they be analyzed?
In this post, I review some earlier work I’ve done on these questions, and I report new simulation results comparing the Arellano-Bond method with maximum likelihood (ML) using structural equation modeling (SEM) software. Arellano-Bond is hugely popular among economists, but not widely known in other disciplines. ML with SEM is a method that I’ve been advocating for almost 15 years (Allison 2000, 2005a, 2005b, 2009). Long story short: ML rules.
I focus on panel data in which we observe y_{it} and x_{it} for i =1,…, n and t =1,…, T. The proposed linear model allows for reciprocal, lagged effects of these two variables on each other:
y_{it} = b_{1}x_{i(t-1)} + b_{2}y_{i(t-1)} + c_{i} + e_{it}
x_{it} = a_{1}x_{i(t-1)} + a_{2}y_{i(t-1)} + f_{i} + d_{it}
The terms c_{i} and f_{i} represent individual-specific unobserved heterogeneity in both x and y. They are treated as “fixed effects”, thereby allowing one to control for all unchanging characteristics of the individuals, a key factor in arguing for a causal interpretation of the coefficients. Finally, e_{it} and d_{it} are assumed to represent pure random noise, independent of any variables measured at earlier time points.
If all the assumptions are met, b_{1} can be interpreted as the causal effect of x on y, and a_{2} can be interpreted as the causal effect of y on x. This model can be elaborated in various ways to include, for example, other predictor variables, different lags, and coefficients that change over time.
Estimation of the model is not straightforward, for reasons that are well known in the econometric literature. First, the presence of a lagged dependent variable as a predictor in each equation means that conventional fixed effects methods yield biased estimates of the coefficients under almost any condition. But even if the lagged dependent variables were excluded from the equations, the error term in each equation would still be correlated with all future values of both x and y. For example, e_{i2} affects y_{i2}, which in turn affects x_{i3}. So, again, conventional fixed effects will produce biased coefficients.
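The first problem is easy to see by simulation. Here is a minimal sketch (my own illustration, not from the paper) of the well-known downward bias of the conventional within estimator when a lagged dependent variable is present, using a simplified model with no x:

```python
import numpy as np

# Simulate y_it = 0.5 * y_i(t-1) + c_i + e_it, then estimate the lag
# coefficient by the conventional within (fixed effects) estimator.
rng = np.random.default_rng(1)
n, T, b2 = 2000, 5, 0.5
c = rng.normal(size=n)                 # individual fixed effects
y = np.empty((n, T + 1))
y[:, 0] = rng.normal(size=n) + c       # arbitrary start-up values
for t in range(1, T + 1):
    y[:, t] = b2 * y[:, t - 1] + c + rng.normal(size=n)

# Within transformation: subtract each individual's mean, then OLS
ylag = y[:, :-1] - y[:, :-1].mean(axis=1, keepdims=True)
ycur = y[:, 1:] - y[:, 1:].mean(axis=1, keepdims=True)
b2_hat = (ylag * ycur).sum() / (ylag ** 2).sum()
print(b2_hat)  # substantially below the true value of 0.5
```

With only 5 time points the within estimate falls well short of .5, even with 2,000 individuals, because demeaning induces a correlation between the transformed lagged y and the transformed error.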
Arellano and Bond (1991) solved these problems by using earlier lagged values of x and y as instrumental variables and by applying a generalized method of moments (GMM) estimator. Several software packages now implement this method, including SAS, Stata, LIMDEP, and the plm package for R.
My solution to the problems has been to estimate each equation separately by ML using any SEM package (e.g., LISREL, Mplus, PROC CALIS in SAS, or sem in Stata). Two “tricks” are necessary. First, focusing on the first equation, fixed effects are accommodated by allowing c to be correlated with all measurements of x (as well as the initial measurement of y). Second, the error term e is allowed to be correlated with all future measurements of x. Analogous methods are used to estimate the second equation. For details, see the SEM chapters in my 2005 and 2009 books.
In my 2005 paper, I presented simulation evidence that the ML-SEM method produces approximately unbiased estimates of the coefficients under a variety of conditions. For years, I’ve been promising to do a head-to-head comparison of ML with Arellano-Bond, but I’ve just now gotten around to doing it.
What I’m going to report here are some very preliminary but dramatic results. The model used to generate the data was one in which x has a positive effect on y, but y has a negative effect on x:
y_{it} = .5x_{i(t-1)} + .5y_{i(t-1)} + c_{i} + e_{it}
x_{it} = .5x_{i(t-1)} - .5y_{i(t-1)} + f_{i} + d_{it}
All variables have normal distributions; c has a positive correlation with x, f has a positive correlation with y, and c and f are positively correlated with each other. The baseline model had 5 time points (T=5), with sample sizes of 50, 100, 400, and 1600. Then, keeping the sample size at 400, I examined T=4 and T=10. For each condition I did 1,000 replications.
I focus here on the coefficient for the effect of x on y in the first equation. For each condition, I calculated the mean squared error (MSE), which is the variance of the estimator plus its squared bias. There was little bias in either estimator, so the MSE primarily reflects sampling variance.
Here are the preliminary results:
Mean Squared Error for Two Estimators

Condition       ML-SEM      Arellano-Bond   Relative efficiency
N=50, T=5       .0057128    .0110352        .5176833
N=100, T=5      .0027484    .0058433        .4703557
N=400, T=5      .0006348    .0014961        .4242679
N=1600, T=5     .0001556    .0003682        .4226466
N=400, T=4      .0011632    .0039785        .2923685
N=400, T=10     .0001978    .0002503        .7902897

The last column, relative efficiency, is the ratio of the MSE for ML to the MSE for AB. With 5 time points, AB is only about half as efficient as ML-SEM, for any sample size. But the number of time points has a dramatic effect: AB is only 29% efficient for T=4 but 79% efficient for T=10.
The next steps are to vary such things as the magnitudes of the coefficients, the variances of the error terms, and the correlations between c and f with each other and with the predictor variables.
Besides its efficiency advantage, the ML-SEM framework makes it easier than AB to accomplish several things:
- Handle missing data by full information maximum likelihood (FIML).
- Relax various constraints, such as constant error variance or constant coefficients.
- Construct a likelihood ratio test comparing fixed vs. random effects, the equivalent of the Hausman test, which not infrequently breaks down.
- Add an autoregressive structure to the time-specific error components.
Before concluding, I must mention that Hsiao et al. (2002) also did a simulation study comparing ML with a variety of other estimators for the panel model, including AB. However, their approach to ML was very different from mine, and it has not been implemented in any commercial software package. Hsiao et al. found that ML did better with respect to both bias and efficiency than any of the other estimators, under almost all conditions. Nevertheless, the differences between ML and AB were much smaller than those reported here.
If you’re reading this post, you should definitely read next month’s follow-up by clicking here.
To learn more about these and other methods for panel data, check out my seminars, Longitudinal Data Analysis Using SAS and Longitudinal Data Analysis Using Stata. Both will be offered in the spring of 2015. Plus, I am offering a new, more advanced seminar titled Longitudinal Data Analysis Using SEM in Fort Myers, Florida, January 2324.
References
Allison, Paul D. (2000) “Inferring Causal Order from Panel Data.” Paper presented at the Ninth International Conference on Panel Data, June 22, Geneva, Switzerland.
Allison, Paul D. (2005a) “Causal Inference with Panel Data.” Paper presented at the Annual Meeting of the American Sociological Association, August, Philadelphia, PA.
Allison, Paul D. (2005b) Fixed Effects Regression Methods for Longitudinal Data Using SAS. Cary, NC: SAS Institute.
Allison, Paul D. (2009) Fixed Effects Regression Models. Thousand Oaks, CA: Sage Publications.
Arellano, M. and S. Bond (1991) “Some tests of specification for panel data: Monte Carlo evidence and an application to employment equations.” The Review of Economic Studies 58: 277-297.
Hsiao, Cheng, M. Hashem Pesaran, and A. Kamil Tahmiscioglu (2002) “Maximum likelihood estimation of fixed effects dynamic panel data models covering short time periods.” Journal of Econometrics 109: 107-150.