### 2018 Stata Summer School:

# Linear Regression Using Stata

Taught by Paul Allison, Ph.D.

August 14-15, Hotel Birger Jarl Conference

Stockholm, Sweden

Linear regression is the most widely used method for the statistical analysis of non-experimental (observational) data. It’s also the essential foundation for understanding more advanced methods like logistic regression, survival analysis, multilevel modeling, and structural equation modeling. Without a thorough mastery of linear regression, there’s little point in trying to learn more complex regression methods.

If you’ve never had a course on linear regression, or if you took one so long ago that you have forgotten most of it, this seminar will get you up to speed. In two days, we’ll cover almost a semester’s worth of material. When it’s over, you’ll be a knowledgeable and effective user of regression methods. And you will have the necessary preparation to take more advanced seminars.

Paul Allison has been teaching courses on linear regression for more than 30 years. He is the author of the popular text *Multiple Regression*, which provides a very practical, intuitive, and non-mathematical introduction to the topic of linear regression.

The seminar will begin by focusing on the two major goals of linear regression: prediction and hypothesis testing. We’ll look at several examples from published articles to see how linear regression is used in practice and how to interpret regression tables.

Next we’ll consider all the things that can go wrong when using linear regression, and we’ll see how to critique the analyses done by others.

We’ll delve into the mathematical theory behind linear regression, focusing on the essential assumptions, and on the implied properties of the least squares method. We’ll also spend considerable time on techniques for building non-linearity into linear regression by way of transformations, interactions, and dummy (indicator) variables.

There will be lots of hands-on exercises using Stata. (Slides using SAS are also available on request.)

### COMPUTING

This seminar will use Stata for the many empirical examples and the exercises. At least one hour each day will be devoted to hands-on exercises. Power outlets will be provided at each seat.

### WHO SHOULD ATTEND?

This seminar is designed for people who have a basic background in statistics, and who want to learn more about the theory and practice of linear regression. You’ll need to have taken an introductory course in statistics, and be comfortable with such concepts as random sampling, measures of center and variability, correlation, sampling distributions, standard errors, confidence intervals, and hypothesis testing. You should also have at least *some* experience using Stata. Neither matrix algebra nor calculus will be used.

Although the course is relatively non-mathematical, considerable emphasis will be placed on the underlying assumptions and their implications. Upon completion of this seminar, you should be able to run your own linear regressions, build and evaluate regression models, and interpret and critique regression results.

### FORMAT AND MATERIALS

Participants receive a bound manual containing detailed lecture notes (with equations and graphics), examples of computer printout, and many other useful features. This book frees participants from the distracting task of note taking.

### REGISTRATION AND LODGING

Please go to the Metrika website for information on registration and discounted hotel accommodations.

### COURSE OUTLINE

- What is linear regression and what is it good for?
- Examples of published regression analyses and interpretation of results.
- The mechanics of regression in Stata.
- Bivariate and trivariate regression.
- Assumptions of linear regression and properties of least squares estimation.
- Evaluation of regression models.
- What can go wrong in linear regression.
- Regression, correlation, and standardized coefficients.
- Nonlinearity and interaction.
- Dummy (indicator) variables.
- Multicollinearity.
- Model building strategies.
- Missing data.
- Heteroscedasticity.

### COMMENTS BY RECENT PARTICIPANTS

“Great overview of Linear Regression – its assumptions, the implications of relaxing them and interpreting the results. Very well organized, thorough and conceptual, with detailed examples illustrating how to think about actual results. Very useful, and I learned a lot even though I’ve been running regression for 20 years.”

*Michael Cook, Credit Suisse*

“The seminar is at the same time thorough and rigorous, yet full of direct applications that can immediately be implemented. I wholeheartedly recommend this course.”

*Monica Ionescu, The Wharton School, University of Pennsylvania*

“A thorough and comprehensive overview of regression. Fabulous refresher course for anyone and a first step for those interested in taking more advanced courses. Highly recommended!”

*Gigliana Melzi, New York University*

“This course is a great review or introduction to linear regression. It is very practical and will definitely prove to be helpful in my work environment. I felt it was a great benefit that I learned as much about how to use regression in the software I use as I did about the assumptions and premises of the models.”

*Jessica Rast, Drexel Autism Institute*

“An excellent overview of linear models. Covered concepts that I had not had since graduate school in a clear, concise, and understandable way.”

*Thomas Cohen, US Courts*

“This was an excellent course – clear and accessible. Thanks!”

*Clara Wagner, U.S. Department of Veterans Affairs*

“Great class. Professor Allison does an excellent job of introducing this topic. Even though I have used linear regression for a long time, I learned a few tricks on missing data, what assumptions I should be concerned about, and the Mean Variance Model to correct heteroscedasticity.”

*Vijay Raghavan, Actavis*