Missing Data

A 2-Day Seminar Taught by Paul Allison, Ph.D.

If you’re using conventional methods for handling missing data, you may be missing out. Conventional methods for missing data, like listwise deletion or regression imputation, are prone to three serious problems:

  • Inefficient use of the available information, leading to low power and Type II errors.
  • Biased estimates of standard errors, leading to incorrect p-values.
  • Biased parameter estimates, due to failure to adjust for selectivity in missing data.
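As a rough illustration of these three problems, here is a minimal simulation sketch in Python with NumPy. It is not part of the course materials: the data, the missingness rule, and all variable names are assumptions made up for this example. The outcome y is more likely to be missing when the fully observed predictor x is large.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data: the true mean of y is 1.0 and its true variance is 1.0.
x = rng.normal(size=n)
y = 1.0 + 0.8 * x + rng.normal(scale=0.6, size=n)

# Assumed missingness rule: y is more likely to be missing when x is large.
p_missing = 1.0 / (1.0 + np.exp(-2.0 * x))
observed = rng.random(n) >= p_missing

# Listwise deletion: keep only the cases with y observed.
share_used = observed.mean()        # fraction of cases actually analyzed
mean_listwise = y[observed].mean()  # biased, because cases with large x drop out

# Regression imputation: fill each missing y with its predicted value.
slope, intercept = np.polyfit(x[observed], y[observed], 1)
y_filled = np.where(observed, y, intercept + slope * x)
mean_filled = y_filled.mean()       # close to the true mean in this setup
var_filled = y_filled.var(ddof=1)   # too small: imputed values carry no residual noise

print(f"share of cases used by listwise deletion: {share_used:.2f}")
print(f"listwise mean of y: {mean_listwise:.3f} (true value 1.0)")
print(f"regression-imputed mean of y: {mean_filled:.3f}")
print(f"variance after regression imputation: {var_filled:.3f} (true value 1.0)")
```

In this setup listwise deletion discards roughly half the cases and underestimates the mean of y, while regression imputation recovers the mean but understates the variance, which would make any standard errors based on the filled-in data too small.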

More accurate and reliable results can be obtained with maximum likelihood or multiple imputation.

These new methods for handling missing data have been around for at least a decade, but have only become practical in the last few years with the introduction of widely available and user-friendly software. Maximum likelihood and multiple imputation have very similar statistical properties. If the assumptions are met, they are approximately unbiased and efficient; that is, they have minimum sampling variance.

What’s remarkable is that these newer methods depend on less demanding assumptions than those required for conventional methods for handling missing data. Maximum likelihood is available for linear models, logistic regression and Cox regression. Multiple imputation can be used for virtually any statistical problem.
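To give a feel for how multiple imputation works, the following is a minimal Python sketch, not the SAS or Stata procedures used in the course. It assumes a simple linear imputation model, uses a bootstrap of the observed cases as a stand-in for drawing the regression parameters, and combines the results with Rubin's rules. The function names `pool_rubin` and `impute_and_estimate_mean` and the simulated data are hypothetical.

```python
import numpy as np

def pool_rubin(estimates, variances):
    """Combine m estimates and their squared standard errors with Rubin's rules."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = estimates.size
    pooled = estimates.mean()          # average of the m point estimates
    within = variances.mean()          # average within-imputation variance
    between = estimates.var(ddof=1)    # variance of the estimates across imputations
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total

def impute_and_estimate_mean(x, y, observed, m, rng):
    """Impute missing y from x m times and pool the estimated mean of y."""
    n = y.size
    idx_obs = np.flatnonzero(observed)
    estimates, variances = [], []
    for _ in range(m):
        # Bootstrap the observed cases so each imputation uses different
        # regression parameters (an assumption standing in for a posterior draw).
        boot = rng.choice(idx_obs, size=idx_obs.size, replace=True)
        slope, intercept = np.polyfit(x[boot], y[boot], 1)
        resid = y[boot] - (intercept + slope * x[boot])
        sigma = resid.std(ddof=2)
        # Add random residual noise so imputed values vary like real data.
        noise = rng.normal(scale=sigma, size=n)
        y_imp = np.where(observed, y, intercept + slope * x + noise)
        estimates.append(y_imp.mean())
        variances.append(y_imp.var(ddof=1) / n)
    return pool_rubin(estimates, variances)

# Hypothetical data: true mean of y is 1.0; y tends to be missing when x is large.
rng = np.random.default_rng(1)
n = 20_000
x = rng.normal(size=n)
y = 1.0 + 0.8 * x + rng.normal(scale=0.6, size=n)
observed = rng.random(n) >= 1.0 / (1.0 + np.exp(-2.0 * x))

pooled_mean, total_var = impute_and_estimate_mean(x, y, observed, m=20, rng=rng)
print(f"pooled mean of y: {pooled_mean:.3f}, standard error: {np.sqrt(total_var):.4f}")
```

The between-imputation term in the pooled variance is what allows the standard error to reflect uncertainty about the missing values, something a single filled-in data set cannot do.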

This course will cover the theory and practice of both maximum likelihood and multiple imputation. Maximum likelihood for linear models will be demonstrated with SAS, Stata, and Mplus. Mplus will also be used for maximum likelihood with logistic regression. Multiple imputation will be demonstrated with both SAS and Stata.


Who Should Attend

Virtually anyone who does statistical analysis can benefit from new methods for handling missing data. To take this course, you should have a good working knowledge of the principles and practice of multiple regression, as well as elementary statistical inference. But you do not need to know matrix algebra, calculus, or likelihood theory.

Location, Format and Materials

The seminar meets on Friday, October 30 and Saturday, October 31 at Temple University Center City, 1515 Market Street, Philadelphia, PA 19103. Sessions run from 9 a.m. to 4 p.m. each day, with a 1-hour lunch break.

Participants receive a bound manual containing detailed lecture notes (with equations and graphics), examples of computer printout, and many other useful features. This book frees participants from the distracting task of note taking. 

Registration and Lodging

The fee of $995 includes all course materials. 

Lodging Reservation Instructions

A block of rooms has been reserved at the Club Quarters Hotel, 1628 Chestnut St., Philadelphia, PA at a nightly rate of $147 for a Standard room. This hotel is about a 5-minute walk from the seminar location. To make a reservation, you must call 203-905-2100 during business hours and identify yourself by giving the group code SH1029. For guaranteed rate and availability, you must make your reservation by September 29, 2015. 


Course Outline

  1. Assumptions for missing data methods
  2. Problems with conventional methods
  3. Maximum likelihood (ML)
  4. ML with EM algorithm
  5. Direct ML with Mplus, Stata and SAS
  6. ML for contingency tables
  7. Multiple Imputation (MI)
  8. MI under multivariate normal model
  9. MI with SAS and Stata
  10. MI with categorical and nonnormal data
  11. Interactions and nonlinearities
  12. Using auxiliary variables
  13. Other parametric approaches to MI
  14. Linear hypotheses and likelihood ratio tests
  15. Nonparametric and partially parametric methods
  16. Fully conditional models
  17. MI and ML for nonignorable missing data

Recent Comments from Participants

“I read about a dozen articles on missing data techniques before taking this course – including two excellent articles by the course instructor Paul Allison. That reading was helpful but not necessary to understand the course lectures. Most importantly, the course helped me understand what I had read and introduced numerous ideas/facts not contained in the readings. This course was definitely worth the time and money – I walk away with it much more knowledgeable about missing data techniques and more confident in my ability to implement them properly.”
  Wm. Michael Lynn, Cornell University

“The class is well organized, the lectures are well paced, and the material is well thought out. Clear, concise, well organized information for those beginner to intermediate users of Missing Value procedures. Dr. Allison is receptive to questions and to individual problem discussion. Highly recommended.”
  Carl Peiper, Duke University Medical Center

“This course really gave me a great theoretical foundation for different missing data approaches and the tools to apply this new knowledge confidently in my work. Paul Allison provides examples, informed by tested as well as emerging approaches, that have given me working knowledge of approaches to handling missing data.”
  Amanda Latimore, Johns Hopkins Bloomberg School of Public Health 

“This is a very useful class on missing data. It informs us how to use the popular software to address missing data issues.”
  Pengxiang Alex Li, University of Pennsylvania

“Learning with this self-selected and motivated group was very effective. The instructor is willing to answer any and all questions at all times so that keeps the flow of discussion going. A good accessible coverage of topic that will help me apply the methods quickly.”
  Rakesh Niraj, Case Western Reserve University