Linear Regression

An On-Demand Seminar Taught by
Paul D. Allison, Ph.D.

For several years, Dr. Paul Allison has been presenting a 2-day, in-person seminar on Linear Regression at various locations around the US. Based on his book Multiple Regression, the course provides a very practical, intuitive, and non-mathematical introduction to the topic of linear regression.

This course takes place in a series of four weekly installments of videos, quizzes, readings, and assignments, and requires about 6-8 hours/week. You can participate at your own convenience; there are no set times when you are required to be online. The course can be accessed with any recent web browser on almost any platform, including iPhone, iPad, and Android devices. It consists of 10 modules:

  1. Introduction to Linear Regression
  2. Trivariate Regression
  3. Statistical Inference in Regression
  4. Dummy Variables and Standardized Coefficients
  5. Non-linearity
  6. Interaction
  7. Heteroscedasticity and Multicollinearity
  8. Missing Data
  9. Maximum Likelihood and Multiple Imputation
  10. Model Building and Variable Selection

The modules contain videos of the live, 2-day version of the course in its entirety. Each module is followed by a short multiple-choice quiz to test your knowledge. There are also weekly exercises that ask you to apply what you’ve learned to a real data set. You may submit your work for review by Dr. Allison.  

Each week, there are 2-3 assigned articles to read. There is also an online discussion forum where you can post questions or comments about any aspect of the course. All questions will be promptly answered by Dr. Allison. 

Downloadable course materials include the following PDF files:

  • All slides displayed in the videos.
  • Exercises for each week. 
  • Readings for each week.
  • Computer code for all exercises (in SAS, Stata, and R formats).
  • A certificate of completion.


Linear regression is the most widely used method for the statistical analysis of non-experimental (observational) data. It’s also the essential foundation for understanding more advanced methods like logistic regression, survival analysis, multilevel modeling, and structural equation modeling. Without a thorough mastery of linear regression, there’s little point in trying to learn more complex regression methods.

If you’ve never had a course on linear regression, or if you took one so long ago that you have forgotten most of it, this seminar will get you up to speed. Over four weeks, we’ll cover almost a semester’s worth of material. When it’s over, you’ll be a knowledgeable and effective user of regression methods. And you will have the necessary preparation to take most of Statistical Horizons’ more advanced seminars.

The seminar will begin by focusing on the two major goals of linear regression: prediction and hypothesis testing. We’ll look at several examples from published articles to see how linear regression is used in practice and how to interpret regression tables.

Next, we’ll consider all the things that can go wrong when using linear regression, and we’ll see how to critique the analyses done by others.

We’ll delve into the mathematical theory behind linear regression, focusing on the essential assumptions, and on the implied properties of the least squares method. We’ll also spend considerable time on techniques for building non-linearity into linear regression by way of transformations, interactions, and dummy (indicator) variables.

There will be lots of hands-on exercises using SAS, Stata, or R.


Because this is a hands-on course, you should have your own computer loaded with a recent version of SAS (release 9.2 or later), Stata (release 13 or later), or R.

Seminar participants who are not yet ready to purchase Stata can take advantage of StataCorp’s free 30-day evaluation offer or its 30-day software return policy.

There is now a free version of SAS, called the SAS University Edition, that is available to anyone. It has everything needed to run the exercises in this course, and it will run on Windows, Mac or Linux computers. However, you do need a 64-bit machine with at least 1 GB of RAM. You also have to download and install virtualization software that is available free from third-party vendors. The SAS Studio interface runs in your browser, but you do not have to be connected to the Internet. The download and installation are a bit complicated, but well worth the time and effort.

If you’d like to use R for the course but are concerned that your R skills aren’t sufficient, there are excellent online resources for learning the basics.

Who Should Register?

This seminar is designed for people who have a basic background in statistics, and who want to learn more about the theory and practice of linear regression. You’ll need to have taken an introductory course in statistics, and be comfortable with such concepts as random sampling, measures of center and variability, correlation, sampling distributions, standard errors, confidence intervals, and hypothesis testing. You should also have at least some experience using SAS, Stata, or R. Neither matrix algebra nor calculus will be used.

Although the course is relatively non-mathematical, considerable emphasis will be placed on the underlying assumptions and their implications. Upon completion of this seminar, you should be able to run your own linear regressions, build and evaluate regression models, and interpret and critique regression results.


“Thank you SO MUCH for this course – I have learned an incredible amount. The balance you struck between concepts/theory and practical application was perfect. As an applied public health researcher, I learned most of my statistical methods ‘on the job’ and this has been a great refresher course for me. As a journal editor, I am frequently faced with debate between author and reviewer re: some of the issues you raised that are commonly misunderstood (e.g. which variables need normal distribution, whether to include original and transformed variables, etc.). With the course slides and resources, I now have a cheat sheet for many of these common statistical issues.”
  Kenda Cunningham, Helen Keller International