Missing Data

A 2-Day Seminar Taught by Paul Allison, Ph.D.

If you’re using conventional methods for handling missing data, you may be missing out. Conventional methods for missing data, like listwise deletion or regression imputation, are prone to three serious problems:

  • Inefficient use of the available information, leading to low power and Type II errors.
  • Biased estimates of standard errors, leading to incorrect p-values.
  • Biased parameter estimates, due to failure to adjust for selectivity in missing data.
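The first of these problems can be seen in a toy example. The sketch below uses Python rather than the SAS, Stata, or Mplus featured in the course, and the data set and function names are invented purely for illustration. It shows that listwise deletion throws away observed values along with the missing ones.

```python
from typing import List, Optional, Sequence

Row = Sequence[Optional[float]]


def listwise_delete(rows: Sequence[Row]) -> List[Row]:
    """Keep only the rows in which every variable is observed."""
    return [row for row in rows if all(value is not None for value in row)]


def count_observed(rows: Sequence[Row]) -> int:
    """Count the non-missing values across all rows."""
    return sum(value is not None for row in rows for value in row)


# Hypothetical data: 5 cases, 3 variables, None marks a missing value.
rows = [
    (1.0, 2.0, 3.0),
    (None, 1.5, 2.5),
    (2.0, None, 3.5),
    (1.2, 2.2, None),
    (0.8, 1.9, 3.1),
]

complete = listwise_delete(rows)
observed_before = count_observed(rows)      # values actually collected
observed_after = count_observed(complete)   # values that survive deletion

print(f"cases kept: {len(complete)} of {len(rows)}")
print(f"observed values kept: {observed_after} of {observed_before}")
```

Only one value is missing in each incomplete case here, yet deleting the whole case also discards the two values that were observed. That lost information is what reduces power.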

More accurate and reliable results can be obtained with maximum likelihood or multiple imputation.

These new methods for handling missing data have been around for at least a decade but have become practical only in the last few years, with the introduction of widely available and user-friendly software. Maximum likelihood and multiple imputation have very similar statistical properties. If their assumptions are met, both are approximately unbiased and efficient; that is, they have minimum sampling variance.

What’s remarkable is that these newer methods depend on less demanding assumptions than those required for conventional methods for handling missing data. Maximum likelihood is available for linear models, logistic regression and Cox regression. Multiple imputation can be used for virtually any statistical problem.
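One reason multiple imputation is so general is that the final step needs only a point estimate and a standard error from each imputed data set, whatever analysis produced them. The sketch below illustrates the standard combining rules (Rubin's rules) in Python. This page does not spell the rules out, so the function name and the input numbers are assumptions made for illustration, not course material.

```python
import math
import statistics
from typing import Sequence, Tuple


def pool_rubin(estimates: Sequence[float],
               std_errors: Sequence[float]) -> Tuple[float, float]:
    """Combine results from m imputed data sets using Rubin's rules.

    Returns the pooled point estimate and its standard error.
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching estimates and standard errors, m >= 2")

    pooled = statistics.mean(estimates)
    # Within-imputation variance: average of the squared standard errors.
    within = statistics.mean([se ** 2 for se in std_errors])
    # Between-imputation variance: sample variance of the point estimates.
    between = statistics.variance(estimates)
    # Total variance adds a correction for using a finite number of imputations.
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical regression coefficients from five imputed data sets.
estimates = [0.52, 0.48, 0.55, 0.50, 0.45]
std_errors = [0.10, 0.11, 0.10, 0.12, 0.11]

pooled_estimate, pooled_se = pool_rubin(estimates, std_errors)
print(f"pooled estimate = {pooled_estimate:.3f}, pooled SE = {pooled_se:.3f}")
```

The between-imputation term is what carries the uncertainty due to the missing data into the final standard error. Single imputation methods have no such term.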

This course will cover the theory and practice of both maximum likelihood and multiple imputation. Maximum likelihood for linear models will be demonstrated with SAS, Stata, and Mplus. Mplus will also be used for maximum likelihood with logistic regression. Multiple imputation will be demonstrated with both SAS and Stata.


This is a hands-on course with at least one hour each day devoted to carefully structured and supervised assignments. To get the most benefit, you are strongly encouraged to bring your own laptop with a recent version of SAS or Stata installed.

There is now a free version of SAS, called the SAS University Edition, that is available to anyone. It has everything needed to run the exercises in this course, and it will run on Windows, Mac or Linux computers. However, you do need a 64-bit machine with at least 1 GB of RAM. You also have to download and install virtualization software that is available free from third-party vendors. The SAS Studio interface runs in your browser, but you do not have to be connected to the Internet. The download and installation are a bit complicated, but well worth the time and effort.  

Seminar participants who are not yet ready to purchase Stata can take advantage of StataCorp’s free 30-day evaluation offer or its 30-day software return policy.

Who should attend?

Virtually anyone who does statistical analysis can benefit from new methods for handling missing data. To take this course, you should have a good working knowledge of the principles and practice of multiple regression, as well as elementary statistical inference. But you do not need to know matrix algebra, calculus, or likelihood theory. 

Location, Format, and Materials

The class will meet from 9 am to 5 pm each day, with a 1-hour lunch break, at Temple University Center City, 1515 Market Street, Philadelphia, PA 19103.

Participants receive a bound manual containing detailed lecture notes (with equations and graphics), examples of computer printout, and many other useful features. The manual frees participants from the distracting task of note-taking.

Registration and Lodging

The fee of $995 includes all course materials.

Refund Policy

If you cancel your registration at least two weeks before the course is scheduled to begin, you are entitled to a full refund (minus a processing fee of $50). 

Lodging Reservation Instructions

A block of guest rooms has been reserved at the Club Quarters Hotel, 1628 Chestnut Street, Philadelphia, PA at a special rate of $159 per night. The hotel is about a 5-minute walk from the seminar location. To make reservations, call 203-905-2100 during business hours and identify yourself by using group code SH0405. For a guaranteed rate and availability, you must reserve your room no later than Monday, March 5, 2018.

If you make reservations after the cut-off date, ask for the Statistical Horizons room rate (do not use the group code) and the hotel will try to accommodate your request.


Course Outline

  1. Assumptions for missing data methods
  2. Problems with conventional methods
  3. Maximum likelihood (ML)
  4. ML with EM algorithm
  5. Direct ML with Mplus, Stata and SAS
  6. ML for contingency tables
  7. Multiple Imputation (MI)
  8. MI under multivariate normal model
  9. MI with SAS and Stata
  10. MI with categorical and nonnormal data
  11. Interactions and nonlinearities
  12. Using auxiliary variables
  13. Other parametric approaches to MI
  14. Linear hypotheses and likelihood ratio tests
  15. Nonparametric and partially parametric methods
  16. Fully conditional models
  17. MI and ML for nonignorable missing data

Comments by recent participants

“It is my third course with Paul Allison from Statistical Horizons and it will definitely not be the last. Every time I go back home inspired, re-energized and with a desire to attend another course that will definitely expand on the knowledge I acquired in my last one.”
  Grettel Castro, Florida International University

“This class provides a very clear picture of the theory and use of multiple imputation. It covers almost all possible kinds of ‘missingness’ one could face in data analysis. It is a very informative and practical class to take.”
  Chiping Nieh, Henry M. Jackson Foundation

“The course was very well-organized. Everything ran on schedule. The course materials provided were helpful and comprehensive.”
  Kelly Doran, New York University School of Medicine

“This course balanced sufficient background information as to why procedures are done as they are, with practical information about how to do the procedures. The content and pacing was well-considered, neither dragging nor overwhelming.”
  Elizabeth Van Voorhees, Duke University Medical Center

“This course was very practical and easy to understand. I was able to implement what was discussed almost immediately. It had just the right mix of theory and application.”
  Bilal Karriem, Keystats Incorporated  

“This seminar opens a window to a very complex and common issue: missing data. The professor made it very clear and I would recommend this course 100%. Excellent course!”
  Pura Rodriguez de la Vega, Florida International University

“I am really glad that I had the opportunity to take this course. The materials and examples are very clear and cover the problem of missing data from practical and theoretical viewpoints. What I’ve learned has dispelled many misconceptions that I had about the best approaches to missing data. I feel a lot more confident about trying to use these approaches on my own rather than always defaulting to a complete case analysis.”
  Sarah Nyante, University of North Carolina at Chapel Hill

“As always Dr. Allison made a lot of complex issues much simpler. Although difficult to cover all the aspects of ‘missing data’ in 2 days, touched upon many. Enjoyed the course very much. Thank you.”
  Ashutosh Tamhane, University of Alabama-Birmingham

“I appreciated the logical progression of materials presented in the course and the appropriate pacing. Additionally Dr. Allison’s presentation style and ability to respond to questions is excellent.”
  Sam Shirazi, SUNY, Stony Brook

“This is a great overview of methods to deal with missing data. I found Dr. Allison’s explanations clear, and his experience and expertise really helpful in handling difficult situations we all experience when analyzing data.”
  Jonathan Jarvis, Brigham Young University

“I took this course as a refresher to help me better model my data and supervise my graduate students. It was most helpful in that it confirmed my understanding of a dynamic topic. I also learned of more advanced techniques in the field, in particular FCS estimation. Prof. Allison is very ‘user friendly’, presenting in a way that was informative and accessible to a very diverse audience.”
  Carson Mencken, Baylor University