## Statistical Horizons Blog

#### In Defense of Logit – Part 1

In a recent guest blog, Paul von Hippel extended his earlier argument that there are many situations in which a linear probability model (estimated via ordinary least squares) is preferable to a logistic regression model. In his two posts, von Hippel makes three major points: Within the range of .20 to .80, the linear probability […]

#### When Can You Fit a Linear Probability Model? More Often Than You Think

In July 2015 I pointed out some advantages of the linear probability model over the logistic model. The linear model is much easier to interpret, and the linear model runs much faster, which can be important if the data set is large or the model is complicated. In addition, the linear probability model often fits […]

#### Causal Mediation Analysis

On April 21-22, 2017, I will be offering a seminar on causal mediation analysis in Philadelphia with Statistical Horizons. The course will cover very recent developments in this area. Mediation is about the mechanisms or pathways by which some treatment or exposure affects an outcome. Questions about mediation arise with considerable frequency in the biomedical […]

When Paul Allison asked me if I wanted to teach a course in Bangladesh, my first reaction was confusion. Other than knowing a few basic facts about the place, I had spent little time thinking about the country and none at all imagining myself going there. And here, suddenly, was an opportunity to spend a […]

#### Linear vs. Logistic Probability Models: Which is Better, and When?

In his April 1 post, Paul Allison pointed out several attractive properties of the logistic regression model. But he neglected to consider the merits of an older and simpler approach: just doing linear regression with a 1-0 dependent variable. In both the social and health sciences, students are almost universally taught that when the outcome variable in […]
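To make the comparison concrete, here is a minimal sketch that fits both models to simulated data with a 1-0 outcome. Everything in it is an illustrative assumption rather than something taken from the post: the simulated data (true probabilities kept roughly between .27 and .73), the sample size, and the helper names `fit_lpm` and `fit_logit`. It uses only the Python standard library, so the logistic fit is a hand-rolled Newton-Raphson loop, not a production estimator.

```python
import math
import random


def fit_lpm(x, y):
    """Linear probability model: OLS of a 1-0 outcome on one predictor.

    Returns (intercept, slope).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_logit(x, y, iterations=25):
    """Logistic regression with one predictor, fit by Newton-Raphson.

    Returns (intercept, slope) on the log-odds scale.
    """
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = p * (1.0 - p)
            g0 += yi - p              # score for the intercept
            g1 += (yi - p) * xi       # score for the slope
            h00 += w                  # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information) * delta = score
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


# Simulated data (assumed for illustration): true log-odds = x, x in [-1, 1]
random.seed(0)
n = 2000
x = [random.uniform(-1.0, 1.0) for _ in range(n)]
y = [1 if random.random() < 1.0 / (1.0 + math.exp(-xi)) else 0 for xi in x]

lpm_intercept, lpm_slope = fit_lpm(x, y)
logit_a, logit_b = fit_logit(x, y)

# Average marginal effect of x implied by the logit fit, for comparison
# with the LPM slope (both are on the probability scale).
fitted = [1.0 / (1.0 + math.exp(-(logit_a + logit_b * xi))) for xi in x]
ame = logit_b * sum(p * (1.0 - p) for p in fitted) / n

print("LPM slope (change in probability per unit x):", round(lpm_slope, 3))
print("Logit slope (change in log-odds per unit x): ", round(logit_b, 3))
print("Logit average marginal effect:               ", round(ame, 3))
```

The LPM slope reads directly as a change in probability, while the logit slope has to be converted (here through an average marginal effect) before the two can be compared on the same scale.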

#### Don’t Put Lagged Dependent Variables in Mixed Models

When estimating regression models for longitudinal panel data, many researchers include a lagged value of the dependent variable as a predictor. It’s easy to understand why. In most situations, one of the best predictors of what happens at time t is what happened at time t-1. This can work well for some kinds of models, […]

#### Maximum Likelihood is Better than Multiple Imputation: Part II

In my July 2012 post, I argued that maximum likelihood (ML) has several advantages over multiple imputation (MI) for handling missing data: ML is simpler to implement (if you have the right software). Unlike multiple imputation, ML has no potential incompatibility between an imputation model and an analysis model. ML produces a deterministic result rather than […]

#### What’s So Special About Logit?

For the analysis of binary data, logistic regression dominates all other methods in both the social and biomedical sciences. It wasn’t always this way. In a 1934 article in Science, Charles Bliss proposed the probit function for analyzing binary data, and that method was later popularized in David Finney’s 1947 book Probit Analysis. For many […]