Missing Data - Online Course
A 4-Week On-Demand Seminar Taught by Paul Allison
Each Monday you will receive an email with instructions for the following week.
All course materials are available 24 hours a day. Materials will be accessible for an additional 2 weeks after the official close on October 7.
If you’re using conventional methods for handling missing data, you may be missing out. Conventional methods for missing data, like listwise deletion or regression imputation, are prone to three serious problems:
- Inefficient use of the available information, leading to low power and Type II errors.
- Biased estimates of standard errors, leading to incorrect p-values.
- Biased parameter estimates, due to failure to adjust for selectivity in missing data.
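As a rough illustration of the first and third problems, the following minimal sketch simulates data in which y is missing whenever an observed covariate x is positive, then applies listwise deletion. The data-generating model, the sample size, and the function name simulate_listwise_bias are assumptions made for this example only; they are not taken from the course materials.

```python
import random
import statistics


def simulate_listwise_bias(n=20000, seed=0):
    """Compare the full-data mean of y with the mean after listwise deletion.

    Hypothetical setup: x ~ N(0, 1), y = x + noise, and y is missing
    whenever x > 0, so missingness depends only on the observed x.
    """
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [x + rng.gauss(0.0, 1.0) for x in xs]

    # Listwise deletion keeps only the cases where y is observed.
    observed_ys = [y for x, y in zip(xs, ys) if x <= 0]

    full_mean = statistics.mean(ys)
    listwise_mean = statistics.mean(observed_ys)
    return full_mean, listwise_mean, len(observed_ys)


full_mean, listwise_mean, n_kept = simulate_listwise_bias()
print(f"mean of y, all cases:       {full_mean:.3f}")
print(f"mean of y, complete cases:  {listwise_mean:.3f}")
print(f"cases kept after deletion:  {n_kept} of 20000")
```

In this setup roughly half of the cases are discarded, and the complete-case mean of y falls well below the full-data mean, because the retained cases are exactly those with low x.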
More accurate and reliable results can be obtained with maximum likelihood or multiple imputation.
Maximum likelihood and multiple imputation have very similar statistical properties. If the assumptions are met, they are approximately unbiased and efficient. What’s remarkable is that these newer methods depend on less demanding assumptions than those required for older methods for handling missing data.
Maximum likelihood is available for linear regression, logistic regression, Cox regression, and regression for count data. Multiple imputation can be used for virtually any statistical problem.
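To give a sense of how multiple imputation produces a single result, here is a minimal sketch of the standard combining rules (Rubin's rules) for a single parameter: the point estimates from the imputed data sets are averaged, and the total variance adds the average within-imputation variance to an inflated between-imputation variance. The function name pool_rubin and the example numbers are hypothetical; in practice, packages such as SAS and Stata carry out this step for you.

```python
import statistics


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed data sets using Rubin's rules.

    estimates: point estimates, one per imputed data set
    variances: squared standard errors, one per imputed data set
    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    pooled_estimate = statistics.mean(estimates)
    within = statistics.mean(variances)        # average sampling variance
    between = statistics.variance(estimates)   # variance across imputations (divisor m - 1)
    total = within + (1 + 1 / m) * between     # total variance of the pooled estimate
    return pooled_estimate, total ** 0.5


# Hypothetical results from three imputed data sets.
estimate, std_error = pool_rubin([1.0, 1.2, 0.8], [0.04, 0.05, 0.03])
print(f"pooled estimate = {estimate:.3f}, pooled SE = {std_error:.3f}")
```

The between-imputation term is what lets the pooled standard error reflect uncertainty about the missing values, which a single imputation ignores.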
Although these newer methods for handling missing data have been around for more than two decades, they have only become practical with the introduction of widely available and user-friendly software.
Based on Paul Allison’s book Missing Data, this seminar covers both the theory and practice of multiple imputation and maximum likelihood.
The course takes place in a series of four weekly installments of videos, quizzes, readings, and assignments, and requires about 10 hours/week. You can participate at your own convenience; there are no set times when you are required to be online. The course can be accessed with any recent web browser on almost any platform, including iPhone, iPad, and Android devices. It consists of 12 modules:
- Basic principles and assumptions.
- Conventional methods for missing data.
- Maximum likelihood (ML) for categorical variables.
- ML and the EM algorithm.
- Direct ML with SEM software and with mixed models.
- Basic principles of multiple imputation (MI).
- MI for non-monotone data using MCMC.
- MCMC options and complications.
- Fully conditional specification.
- Multivariate inference, interactions, and nonlinearities.
- Other methods, panel data, clustered data.
- Non-ignorable missing data.
Each module begins with an introductory video, followed by a narrated PowerPoint presentation. The modules contain all the slides in the livestream version of the course. But there are also many additional slides that wouldn’t fit into the live course, including several slides on imputation with clustered data.
Each module is followed by a short multiple-choice quiz to test your knowledge. There are also weekly exercises that ask you to apply what you’ve learned to a real data set.
There is also an online discussion forum where you can post questions or comments about any aspect of the course. All questions will be promptly answered by Dr. Allison.
Computing
In the videos, SAS will be the main software package used to demonstrate the methods. Mplus and LEM will also be used for portions of the modules on maximum likelihood estimation. For those who prefer to use Stata or R, slides and exercises using these packages can be downloaded from the course site.
You should be at least moderately proficient at using one of these packages: SAS, Stata or R.
There is now a free version of SAS, called SAS OnDemand for Academics, that works in your web browser.
If you’d like to familiarize yourself with Mplus basics before the seminar begins, we recommend reading through UCLA’s short guide here.
Who should register?
Virtually anyone who does statistical analysis can benefit from new methods for handling missing data. To get the most out of this course, you should have a good working knowledge of the principles and practice of linear regression, as well as elementary statistical inference. But you do not need to know matrix algebra, calculus, or likelihood theory.
Seminar outline
- Assumptions for missing data methods
- Problems with conventional methods
- Maximum likelihood (ML)
- ML with EM algorithm
- Direct ML with Mplus, Stata and SAS
- ML for contingency tables
- Multiple Imputation (MI)
- MI under multivariate normal model
- MI with SAS and Stata
- MI with categorical and nonnormal data
- Interactions and nonlinearities
- Using auxiliary variables
- Other parametric approaches to MI
- Linear hypotheses and likelihood ratio tests
- Nonparametric and partially parametric methods
- Fully conditional models
- MI and ML for nonignorable missing data
Registration instructions
The fee of $695 (USD) includes all course materials. All major credit cards are accepted.
This course is hosted on a platform called DigitalChalk. To register, go to statisticalhorizons.digitalchalk.com and click on Create Account. Then enter your name and email address, and create a password. Be sure to save your password because you will need it to log on to the course itself.
When you have created your account, you’ll be taken to your new home page. Click on the Register Now button (or click the Catalog icon on the left-hand column), and you’ll see “Missing Data” as one of the available courses. At the bottom of the box for that course, click the green button Add to Cart. Next click the green button at the top that says Checkout. You will then be prompted for your credit card information.
When you have finished the payment process, you will be taken back to your home page. Click on Dashboard to see Missing Data. When the course begins on September 9, you can click the play button to get started.