Missing Data

A 2-Day Seminar Taught by Paul Allison, Ph.D.

If you’re using conventional methods for handling missing data, you may be missing out. Conventional methods for missing data, like listwise deletion or regression imputation, are prone to three serious problems:

  • Inefficient use of the available information, leading to low power and Type II errors.
  • Biased estimates of standard errors, leading to incorrect p-values.
  • Biased parameter estimates, due to failure to adjust for selectivity in missing data.
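The first and third problems can be seen in a small simulation. The following is a minimal Python sketch, not material from the seminar: the data-generating model, the missingness probabilities, and all function names are assumptions chosen purely for illustration. Values of y are more likely to be missing when the observed variable x is large, and listwise deletion both discards a large share of the cases and gives a badly biased estimate of the mean of y.

```python
import random
import statistics


def simulate(n=20000, seed=1):
    """Generate (x, y, y_observed) rows; all settings are illustrative assumptions."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = 1.0 + 2.0 * x + rng.gauss(0.0, 1.0)  # true mean of y is 1.0
        # Missingness in y depends only on the always-observed x.
        p_missing = 0.8 if x > 0 else 0.1
        y_obs = None if rng.random() < p_missing else y
        rows.append((x, y, y_obs))
    return rows


def listwise_delete(rows):
    """Keep only the cases with no missing values."""
    return [(x, y_obs) for x, _, y_obs in rows if y_obs is not None]


rows = simulate()
complete = listwise_delete(rows)

full_mean_y = statistics.fmean(y for _, y, _ in rows)   # unavailable in practice
cc_mean_y = statistics.fmean(y for _, y in complete)    # complete-case estimate
retained = len(complete) / len(rows)

print(f"share of cases retained: {retained:.2f}")
print(f"mean of y, full data:    {full_mean_y:.2f}")
print(f"mean of y, listwise:     {cc_mean_y:.2f}")
```

In this setup the complete cases over-represent low values of x, so the listwise estimate of the mean of y falls well below the value that would be obtained with no missing data.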

More accurate and reliable results can be obtained with maximum likelihood or multiple imputation.

These new methods for handling missing data have been around for at least a decade, but have only become practical in the last few years with the introduction of widely available and user-friendly software. Maximum likelihood and multiple imputation have very similar statistical properties. If the assumptions are met, they are approximately unbiased and efficient; that is, they have minimum sampling variance.

What’s remarkable is that these newer methods depend on less demanding assumptions than those required for conventional methods for handling missing data. Maximum likelihood is available for linear models, logistic regression and Cox regression. Multiple imputation can be used for virtually any statistical problem.
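One reason multiple imputation applies so broadly is that the results from the separately analyzed imputed data sets are combined in the same way whatever the model. The sketch below shows the standard combining rules (commonly called Rubin's rules) in Python. The function name `pool_rubin` and the five coefficient estimates and standard errors are hypothetical, made up for illustration; in practice, software such as SAS or Stata performs this step.

```python
import math
import statistics


def pool_rubin(estimates, std_errors):
    """Combine one parameter's estimates from m imputed data sets."""
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching estimates and standard errors, m >= 2")
    pooled = statistics.fmean(estimates)                    # average the estimates
    within = statistics.fmean(se ** 2 for se in std_errors)  # average squared SE
    between = statistics.variance(estimates)                # variance across imputations (divisor m - 1)
    total = within + (1 + 1 / m) * between                  # total sampling variance
    return pooled, math.sqrt(total)


# Hypothetical results from m = 5 imputed data sets (not real data).
estimates = [0.52, 0.48, 0.55, 0.50, 0.45]
std_errors = [0.10, 0.11, 0.10, 0.12, 0.11]

pooled_est, pooled_se = pool_rubin(estimates, std_errors)
print(f"pooled estimate: {pooled_est:.3f}")
print(f"pooled standard error: {pooled_se:.3f}")
```

The pooled standard error exceeds the average of the individual standard errors because it also reflects the variation in estimates across imputations, which is how multiple imputation accounts for the uncertainty due to the missing data.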

This course will cover the theory and practice of both maximum likelihood and multiple imputation. Maximum likelihood for linear models will be demonstrated with SAS, Stata, and Mplus. Mplus will also be used for maximum likelihood with logistic regression. Multiple imputation will be demonstrated with both SAS and Stata.


Virtually anyone who does statistical analysis can benefit from new methods for handling missing data. To take this course, you should have a good working knowledge of the principles and practice of multiple regression, as well as elementary statistical inference. But you do not need to know matrix algebra, calculus, or likelihood theory. 


The class will meet from 9 a.m. to 4 p.m. each day with a 1-hour lunch break.

Participants receive a bound manual containing detailed lecture notes (with equations and graphics), examples of computer printout, and many other useful features. This book frees participants from the distracting task of note-taking.

Registration and Lodging

The fee of $895 includes all course materials. 

Lodging Reservation Instructions

A block of rooms has been reserved at the Club Quarters Hotel, 1628 Chestnut St., Philadelphia, PA at a rate of $142 per night for a Standard room. This hotel is about a 5-minute walk from the course location. To reserve a room, you must call 203-905-2100 during business hours and identify yourself by mentioning the group code STA410. For guaranteed rate and availability, you must make your reservation before March 10, 2014.


  1. Assumptions for missing data methods
  2. Problems with conventional methods
  3. Maximum likelihood (ML)
  4. ML with EM algorithm
  5. Direct ML with Mplus, Stata and SAS
  6. ML for contingency tables
  7. Multiple Imputation (MI)
  8. MI under multivariate normal model
  9. MI with SAS and Stata
  10. MI with categorical and nonnormal data
  11. Interactions and nonlinearities
  12. Using auxiliary variables
  13. Other parametric approaches to MI
  14. Linear hypotheses and likelihood ratio tests
  15. Nonparametric and partially parametric methods
  16. Fully conditional models
  17. MI and ML for nonignorable missing data

Recent Comments from Participants

“Professor Allison’s short courses have always been very practical. The math was discussed at the right level and the time on application was very well spent. His notes on when one should and shouldn’t use certain methods are also very important. I would recommend his short courses to others.”
  Yihua Gu, AbbVie

“Paul is very knowledgeable. The workshop has a good balance of theory and application, including instruction in various software programs. If you want to improve your understanding of missing data treatments and/or receive the latest information on such methods, I highly recommend this workshop.”
  Keenan Pituch, University of Texas at Austin

“This is an excellent course that provides participants with a comprehensive review of all important methods about missing data. It is also an amazing course covering many statistical models (though the linear model or regression serves as the key model), and almost all available software packages (SAS, Stata, Mplus, S-plus, R & SPSS). I highly recommend this course to every researcher, from beginners to sophisticated analysts!”
  Shenyang Guo, University of North Carolina at Chapel Hill

“Excellent course – great opportunity to learn many aspects of data development and model development using many different software and statistical methods.”
  Paul Holness, Statistics Canada

“While I always knew missing data issues were a problem, they were only mentioned in passing in other statistics courses. This course was a great broad and also in-depth tour of the issues and how best to handle them in different situations. I now feel equipped to apply these methods in both basic and complex analyses and with some confidence.”
  Alison Papadakis, Loyola University Maryland