Statistical Horizons Blog


Instrumental Variables in Structural Equation Models

When I teach courses on structural equation modeling (SEM), I tell my students that any model with instrumental variables can be estimated in the SEM framework. Then I present a classic example of simultaneous causation in which X affects Y, and Y also affects X. Models like this can be estimated if each of the […]

Read More »

For Causal Analysis of Competing Risks, Don’t Use Fine & Gray’s Subdistribution Method

Competing risks are common in the analysis of event time data. The classic example is death, with distinctions among different kinds of death: if you die of a heart attack, you can’t then die of cancer or suicide. But examples also abound in other fields. A marriage can end either by divorce or by the […]

Read More »

Using “Between-Within” Models to Estimate Contextual Effects

In my courses and books on longitudinal data analysis, I spend a lot of time talking about the between-within model for fixed effects. I used to call it the hybrid model, but others have convinced me that “between-within” provides a more meaningful description. Last week my long-time collaborator, Paula England, asked me a question about […]

Read More »

The Peculiarities of Missing at Random

I thought I knew what it meant for data to be missing at random. After all, I’ve written a book titled Missing Data, and I’ve been teaching courses on missing data for more than 15 years. I really ought to know what missing at random means. But now that I’m in the process of revising […]

Read More »

In Defense of Logit – Part 2

In my last post, I explained several reasons why I prefer logistic regression over a linear probability model estimated by ordinary least squares, despite the fact that linear regression is often an excellent approximation and is more easily interpreted by many researchers. I addressed the issue of interpretability by arguing that odds ratios are, in […]

Read More »

In Defense of Logit – Part 1

In a recent guest blog, Paul von Hippel extended his earlier argument that there are many situations in which a linear probability model (estimated via ordinary least squares) is preferable to a logistic regression model. In his two posts, von Hippel makes three major points: Within the range of .20 to .80 for the predicted […]

Read More »

When Can You Fit a Linear Probability Model? More Often Than You Think

In July 2015 I pointed out some advantages of the linear probability model over the logistic model. The linear model is much easier to interpret, and the linear model runs much faster, which can be important if the data set is large or the model is complicated. In addition, the linear probability model often fits […]

Read More »

Causal Mediation Analysis

On April 21-22, 2017, I will be offering a seminar on causal mediation analysis in Philadelphia with Statistical Horizons. The course will cover very recent developments in this area. Mediation is about the mechanisms or pathways by which some treatment or exposure affects an outcome. Questions about mediation arise with considerable frequency in the biomedical […]

Read More »

Teaching Stata in Bangladesh

When Paul Allison asked me if I wanted to teach a course in Bangladesh, my first reaction was confusion.  Other than knowing a few basic facts about the place, I had spent little time thinking about the country and none at all imagining myself going there. And here, suddenly, was an opportunity to spend a […]

Read More »

Linear vs. Logistic Probability Models: Which is Better, and When?

In his April 1 post, Paul Allison pointed out several attractive properties of the logistic regression model.  But he neglected to consider the merits of an older and simpler approach: just doing linear regression with a 1-0 dependent variable.  In both the social and health sciences, students are almost universally taught that when the outcome variable in […]

Read More »
Older Entries