Standard methods for the analysis of panel data depend on an assumption of directional symmetry that most researchers don’t even think about. Specifically, these methods assume that if a one-unit increase in variable X produces a change of B units in variable Y, then a one-unit decrease in X will result in a change of –B units in Y.

Does that make sense? Probably not for most applications. For example, is it plausible that the increase in happiness when a person gets married is exactly matched by the decrease in happiness when a person gets divorced? Or that a \$10K increase in income has the same effect on savings (in the opposite direction) as a \$10K decrease?

In this post, I’m going to show you how to relax that assumption. I’ll do it for the simplest situation where there are only two time points. But I’ve also written a more detailed paper covering the multi-period situation.

Here’s the example with two-period data. The data set has 581 children who were studied in 1990 and 1992 as part of the National Longitudinal Survey of Youth.  I’ll use three variables that were measured at each of the two time points:

anti    antisocial behavior, measured on a scale from 0 to 6.
self    self-esteem, measured on a scale ranging from 6 to 24.
pov     poverty status of the family, coded 1 for a family in poverty, otherwise 0.

The goal is to estimate the causal effects of self and pov on anti. I’ll focus on fixed effects methods (Allison 2005, 2009) because they are ideal for studying the effects of increases or decreases over time. They also have the remarkable ability to control for all time-invariant confounders.

For two-period data, there are several equivalent ways to estimate a fixed effects model. The difference score method is the one that’s the most straightforward for allowing directional asymmetry. It works like this: for each variable, subtract the time 1 value from the time 2 value to create a difference score. Then, just estimate an ordinary linear regression with the difference scores.

Here is Stata code for a standard symmetrical model:

```use nlsy.dta, clear
generate antidiff=anti92-anti90
generate selfdiff=self92-self90
generate povdiff=pov92-pov90
regress antidiff selfdiff povdiff```
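For readers outside Stata, the same two steps can be sketched in pure Python. Everything here is hypothetical and for illustration only: the data values are made up (not the NLSY sample), and the `ols` helper is just a minimal normal-equations solver standing in for `regress`.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations X'X b = X'y,
    solved by Gaussian elimination with partial pivoting.
    Each row of X already includes a leading 1 for the intercept."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]  # X'X
    b = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]            # X'y
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(A[r][i]))   # pivot row
        A[i], A[p] = A[p], A[i]
        b[i], b[p] = b[p], b[i]
        for r in range(i + 1, k):
            f = A[r][i] / A[i][i]
            for c in range(i, k):
                A[r][c] -= f * A[i][c]
            b[r] -= f * b[i]
    coefs = [0.0] * k
    for i in reversed(range(k)):
        coefs[i] = (b[i] - sum(A[i][j] * coefs[j] for j in range(i + 1, k))) / A[i][i]
    return coefs

# Made-up difference scores for four hypothetical children:
selfdiff = [-3, 8, 0, -2]
povdiff  = [1, -1, 0, 0]
antidiff = [1, -1, 0, 1]

X = [[1, s, p] for s, p in zip(selfdiff, povdiff)]
const, b_self, b_pov = ols(X, antidiff)
```

Nothing here depends on the two-period setup except that each child contributes exactly one difference-score record.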

You’ll find equivalent SAS code at the end of this post.

And here are the results:

```------------------------------------------------------------------------------
antidiff |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
selfdiff |  -.0391292   .0136396    -2.87   0.004    -.0659185     -.01234
povdiff |   .1969039   .1326352     1.48   0.138    -.0636018    .4574096
_cons |   .0403031   .0533833     0.75   0.451    -.0645458    .1451521
------------------------------------------------------------------------------```

Self-esteem has a highly significant negative effect on antisocial behavior. Specifically, for each 1-unit increase in self-esteem, antisocial behavior goes down by .039 units. But that also means that for each 1-unit decrease in self-esteem, antisocial behavior goes up by .039 units. Poverty has a positive (but non-significant) effect on antisocial behavior. Children who move into poverty have an estimated increase in antisocial behavior of .197. But children who move out of poverty have an estimated decrease in antisocial behavior of .197.

How can we relax the constraint that these effects have to be the same in both directions?  York and Light (2017) showed the way. What’s needed is to decompose the difference score for each predictor variable into its positive and negative components. Specifically, if D is a difference score for variable X, create a new variable D+ which equals D if D is greater than 0, otherwise 0. And create a second variable D– which equals –D if D is less than 0, otherwise 0.

Here’s how to create these variables in Stata:

```generate selfpos=selfdiff*(selfdiff>0)
generate selfneg=-selfdiff*(selfdiff<0)
generate povpos=povdiff*(povdiff>0)
generate povneg=-povdiff*(povdiff<0)```

The inequalities in parentheses are logical expressions that have a value of 1 if the inequality is true and 0 if the inequality is false.
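The same trick carries over to Python (a sketch; `decompose` is a hypothetical helper name), since Python booleans also behave as 1 and 0 under multiplication:

```python
def decompose(d):
    """Split a difference score D into (D+, D-):
    D+ = D if D > 0 else 0, and D- = -D if D < 0 else 0."""
    d_pos = d * (d > 0)    # keeps only positive changes
    d_neg = -d * (d < 0)   # magnitude of negative changes
    return d_pos, d_neg

# The two components always recover the original difference: D = D+ - D-.
for d in [3, -2, 0]:
    d_pos, d_neg = decompose(d)
    assert d == d_pos - d_neg
```

Note that D+ and D– are both non-negative, and at most one of them is nonzero for any given child.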

Now just regress antidiff on all four of these variables,

`regress antidiff selfpos selfneg povpos povneg`

which produces the following table:

```--------------------------------------------------------------------------
antidiff |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------+----------------------------------------------------------------
selfpos |  -.0048386   .0251504    -0.19   0.848    -.0542362    .0445589
selfneg |   .0743077    .025658     2.90   0.004      .023913    .1247024
povpos |   .2502064   .2003789     1.25   0.212     -.143356    .6437688
povneg |   -.126328   .1923669    -0.66   0.512    -.5041541    .2514981
_cons |  -.0749517    .086383    -0.87   0.386    -.2446157    .0947123
--------------------------------------------------------------------------```

What does this tell us?  A 1-unit increase in self-esteem lowers antisocial behavior by .005 units (an effect that is far from statistically significant). A 1-unit decrease in self-esteem increases antisocial behavior by .074 units (highly significant). So it looks like decreases in self-esteem have a big effect, while increases have little impact. Note that the original estimate of -.039 was about midway between these two estimates.
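The "about midway" remark is easy to check numerically. It is not an exact identity (the symmetric estimate is a precision-weighted compromise between the two asymmetric effects, not their simple average), but for these data the simple average comes close:

```python
# Coefficients copied from the two tables above.
b_sym = -0.0391292   # selfdiff, symmetric model
b_pos = -0.0048386   # selfpos: effect of a 1-unit increase
b_neg =  0.0743077   # selfneg: effect of a 1-unit decrease

# Put the decrease effect on the same scale as the increase effect (-b_neg),
# then average the two asymmetric effects.
midpoint = (b_pos + (-b_neg)) / 2   # about -.0396, close to b_sym
```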

Are the two effects significantly different? We can test that with the command

`test selfpos=-selfneg`

which yields a p-value of .10, not quite statistically significant.

Neither of the two poverty coefficients is statistically significant, although they are in the expected direction: moving into poverty increases antisocial behavior while moving out of poverty reduces it, but by about half the magnitude. These two effects are definitely not significantly different.

So that’s basically it for two-period data. When there are three or more periods, you have to create multiple records for each individual. Each record contains difference scores for adjacent periods. When estimating the regression model, you need to allow for negative correlations between adjacent records by using generalized least squares.
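The record-construction step can be sketched in Python (toy 3-period data with made-up values; the GLS fit itself would be done separately with whatever routine you prefer):

```python
def adjacent_diff_records(person_id, waves):
    """Turn one person's waves (a list of dicts, in time order) into
    one record per adjacent pair of periods, holding difference scores."""
    records = []
    for t in range(1, len(waves)):
        rec = {"id": person_id, "period": t}
        for var in waves[t]:
            rec[var + "diff"] = waves[t][var] - waves[t - 1][var]
        records.append(rec)
    return records

# Three hypothetical waves for one child:
waves = [{"anti": 2, "self": 18}, {"anti": 3, "self": 15}, {"anti": 1, "self": 17}]
recs = adjacent_diff_records(1, waves)
# recs[0] holds the period 1->2 differences, recs[1] the period 2->3 differences
```

Each person with T periods contributes T – 1 such records; because consecutive differences share a common time point, their errors are negatively correlated, hence the GLS step mentioned above.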

I discuss all these options in an article that can be found here. That paper also presents a data generating model that justifies the asymmetric first difference method. The data generating model can be extended to allow for the estimation of asymmetric logistic regression models, which can’t be estimated with difference scores.

If you want to learn more about fixed effects methods, see my two books on this topic: Fixed Effects Regression Models and Fixed Effects Regression Methods for Longitudinal Data Using SAS. Or take one of my seminars on longitudinal data analysis.

References:

Allison, Paul D. Fixed Effects Regression Models. Vol. 160. Sage Publications, 2009.
Allison, Paul D. Fixed Effects Regression Methods for Longitudinal Data Using SAS. SAS Institute, 2005.
York, Richard, and Ryan Light. “Directional Asymmetry in Sociological Analyses.” Socius 3 (2017): 1–13.

SAS Program:

```
data nlsydiff;
set my.nlsy;
antidiff=anti2-anti1;
selfdiff=self2-self1;
povdiff=pov2-pov1;
run;
proc reg data=nlsydiff;
model antidiff=selfdiff povdiff;
run;
data nlsydiff;
set nlsydiff;
selfpos=selfdiff*(selfdiff>0);
selfneg=-selfdiff*(selfdiff<0);
povpos=povdiff*(povdiff>0);
povneg=-povdiff*(povdiff<0);
run;
proc reg data=nlsydiff;
model antidiff=selfpos selfneg povpos povneg;
test selfpos=-selfneg;
test povpos=-povneg;
run;```