In my November and December posts, I extolled the virtues of structural equation modeling (SEM) for estimating dynamic panel models. I argued that combining fixed effects with lagged values of the predictor variables offers the best option for making causal inferences with non-experimental panel data. This approach controls for all time-invariant variables, whether observed or not, and it allows for the possibility of reciprocal causation. Simulation evidence strongly favors SEM over the more popular Arellano-Bond method for estimating these kinds of models.

Despite this method's potential, I recently learned that it’s vulnerable to a very troubling kind of bias when the lag structure is misspecified. In the latest issue of *Sociological Methods and Research*, Stephen Vaisey and Andrew Miles showed by both simulation and formal proof that a positive contemporaneous effect will often show up as a *negative* effect when estimating a fixed effects model with a predictor that is lagged by one time unit. They concluded that, for most social science applications, “artifactual negative ‘effects’ will likely be the rule rather than the exception.”

Vaisey and Miles investigated this problem only for the case of three periods of data, no lagged effect of the dependent variable *y* on itself, and no true effect of *y* on *x*. In that case, maximum likelihood reduces to OLS regression using difference scores: *y*_{3} – *y*_{2} on *x*_{2} – *x*_{1}. They showed that the coefficient for *x*_{2} – *x*_{1} has an expected value that is exactly –0.5 times the true coefficient.
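To see the artifact in action, here is a minimal simulation sketch of the three-period case. It is not Vaisey and Miles's code. The function name `lagged_difference_slope`, the parameter values, and the data-generating process are all assumptions made for illustration. In particular, it assumes that *x* is a person-specific effect plus serially uncorrelated noise, and that *x* affects *y* only in the same period.

```python
import random

def lagged_difference_slope(beta=1.0, n=50_000, seed=1):
    """OLS slope of (y3 - y2) on (x2 - x1) when x affects y only contemporaneously."""
    rng = random.Random(seed)
    dx_lag, dy = [], []
    for _ in range(n):
        alpha = rng.gauss(0, 1)               # fixed effect in y
        eta = 0.5 * alpha + rng.gauss(0, 1)   # fixed effect in x, correlated with alpha
        # Assumption: the time-varying part of x is serially uncorrelated.
        x1, x2, x3 = (eta + rng.gauss(0, 1) for _ in range(3))
        # Assumed true model: y_t = beta * x_t + alpha + e_t (no lagged effect).
        y2 = beta * x2 + alpha + rng.gauss(0, 1)
        y3 = beta * x3 + alpha + rng.gauss(0, 1)
        dx_lag.append(x2 - x1)   # misspecified (lagged) predictor
        dy.append(y3 - y2)       # differenced outcome
    # Simple-regression slope: cov(dx_lag, dy) / var(dx_lag)
    mx, my = sum(dx_lag) / n, sum(dy) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(dx_lag, dy))
    sxx = sum((a - mx) ** 2 for a in dx_lag)
    return sxy / sxx

print(lagged_difference_slope(beta=1.0))  # should land near -0.5, not +1.0
```

Under these assumptions the intuition is simple: *x*_{2} enters the outcome difference with a negative sign (through *y*_{2}) and the predictor difference with a positive sign, so the two differences are negatively correlated even though the true effect is positive.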

My own simulations suggest that a sign reversal can also happen with four or more periods and a lagged dependent variable. And the effect of one variable on the other doesn’t have to be exactly contemporaneous. The reversal of sign can also occur if the correct lag is one week, but the estimated model specifies a lag of one year. Note that this artifact does not arise with random effects models. It’s specific to fixed effects models with lagged predictors. That should not be interpreted as an endorsement of random effects models, however, because they are much more prone to bias from omitted variables.

As noted by Vaisey and Miles, a 2011 article in the *Journal of Quantitative Criminology* may exemplify the problem of misspecified lags. Following *my* advice, Ousey, Wilcox and Fisher used the fixed effects SEM method to examine the relationship between victimization and offending. Numerous studies have found a positive, cross-sectional relationship between these variables: people who report being victims of crimes are also more likely to commit crimes. But Ousey et al. found *negative* effects of each variable on the other. Respondents who reported higher levels of offending in year *t* had *lower* levels of victimization in year *t*+1, after adjusting for fixed effects. And respondents with higher levels of victimization in year *t* had lower levels of offending in year *t*+1.

This surprising result could be real. But it could also occur if there is a positive effect of victimization on offending that is almost instantaneous rather than lagged by one year. And, finally, it could also occur if there is a positive, instantaneous effect of offending on victimization.

What can be done about this problem? Well, one implication is that more thought should go into the design of panel surveys. If you expect that changes in *x* will produce changes in *y* a month later, then collecting monthly data would be much better than collecting annual data. This could have the added advantage of reducing the total time for data collection, although it might also increase certain kinds of response bias.

What if your data have already been collected? Here’s a tentative recommendation that worked well in a few simulations. As a robustness check, estimate models that include *both* contemporaneous and lagged predictors. If a one-year lag is the correct specification, then the contemporaneous effect should be small and not statistically significant. If, on the other hand, the contemporaneous effect is large and significant, it should raise serious doubts about the validity of the method and the kinds of conclusions that can be drawn. It may be that the data are simply not suitable for separating the effect of *x* on *y* from the effect of *y* on *x*.
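The logic of this check can be sketched with a small simulation. This is a deliberately simplified stand-in, not the SEM estimator: it uses OLS on difference scores with three periods, and it assumes no effect of *y* on *x* and serially uncorrelated time-varying noise in *x*. The function name `both_lags_check` and its parameters are hypothetical.

```python
import random

def both_lags_check(beta_now, beta_lag, n=50_000, seed=1):
    """Regress (y3 - y2) on both (x3 - x2) and (x2 - x1); return (b_now, b_lag)."""
    rng = random.Random(seed)
    d_now, d_lag, dy = [], [], []
    for _ in range(n):
        alpha = rng.gauss(0, 1)               # fixed effect in y
        eta = 0.5 * alpha + rng.gauss(0, 1)   # fixed effect in x
        x1, x2, x3 = (eta + rng.gauss(0, 1) for _ in range(3))
        # Assumed true model: y_t = beta_now*x_t + beta_lag*x_{t-1} + alpha + e_t
        e2, e3 = rng.gauss(0, 1), rng.gauss(0, 1)
        d_now.append(x3 - x2)
        d_lag.append(x2 - x1)
        dy.append(beta_now * (x3 - x2) + beta_lag * (x2 - x1) + e3 - e2)

    def centered(v):
        m = sum(v) / len(v)
        return [a - m for a in v]

    a, b, y = centered(d_now), centered(d_lag), centered(dy)
    saa = sum(p * p for p in a)
    sbb = sum(q * q for q in b)
    sab = sum(p * q for p, q in zip(a, b))
    say = sum(p * r for p, r in zip(a, y))
    sby = sum(q * r for q, r in zip(b, y))
    # Solve the 2x2 normal equations for the two slopes.
    det = saa * sbb - sab ** 2
    return (sbb * say - sab * sby) / det, (saa * sby - sab * say) / det

# True effect purely contemporaneous: the lagged coefficient should be near zero.
print(both_lags_check(beta_now=1.0, beta_lag=0.0))
# True effect purely lagged: the contemporaneous coefficient should be near zero.
print(both_lags_check(beta_now=0.0, beta_lag=1.0))
```

In this stripped-down setting, including both predictors recovers whichever effect is actually present. With reciprocal causation or a lagged dependent variable, the picture is likely to be less clean, which is why a large contemporaneous coefficient is best read as a warning sign rather than as an estimate to be trusted.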

I tried this strategy on a subset of the data used by Ousey et al. to study victimization and offending. When both contemporaneous and lagged predictors were included, I found a strong positive effect of victimization on offending in the same year. The one-year lagged effect was negative but small and non-significant. The same thing happened in the reverse direction. Offending had a strong positive effect on victimization in the same year, but the lagged effect was negative and not significant. My take: these data don’t allow one to draw any firm conclusions about whether victimization affects offending or offending affects victimization. They certainly don’t provide a basis for claiming negative effects of each variable on the other.

Clearly this is a problem that needs a great deal more study. There is a substantial econometric literature on determining the number of lags needed for autoregressive models but, as far as I know, Vaisey and Miles are the first to identify this particular phenomenon.

By the way, Steve Vaisey teaches a highly rated course for Statistical Horizons called Treatment Effects Analysis.