Longitudinal Data Analysis Using SAS

A 2-Day Seminar Taught by Paul D. Allison, Ph.D.


The most common type of longitudinal data is panel data, consisting of measurements of predictor and response variables at two or more points in time for many individuals. Such data have two major attractions: the ability to control for unobservables, and the determination of causal ordering.

However, there is also a major difficulty with panel data: repeated observations on the same individual are typically correlated, and this violates the usual assumption that observations are independent. As a result, confidence intervals and p-values can be severely biased. In some cases, coefficients may also be biased downward.
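To make the dependence problem concrete, here is a minimal Python sketch (not SAS, and not taken from the course materials). It simulates a balanced panel in which each subject carries a persistent effect, then compares the standard error of the overall mean computed two ways: treating every row as independent, and using one mean per subject. The function names, sample sizes, and variance settings are all illustrative assumptions.

```python
import random
import statistics


def simulate_panel(n_subjects=200, n_waves=5, seed=1):
    """Return {subject_id: [y_1, ..., y_T]} with a shared subject effect."""
    rng = random.Random(seed)
    panel = {}
    for i in range(n_subjects):
        subject_effect = rng.gauss(0.0, 1.0)  # persists across all waves
        panel[i] = [subject_effect + rng.gauss(0.0, 0.5) for _ in range(n_waves)]
    return panel


def naive_se(panel):
    """SE of the grand mean, wrongly treating every row as independent."""
    rows = [y for ys in panel.values() for y in ys]
    return statistics.stdev(rows) / len(rows) ** 0.5


def subject_se(panel):
    """SE of the grand mean from subject means (valid for a balanced panel)."""
    means = [statistics.mean(ys) for ys in panel.values()]
    return statistics.stdev(means) / len(means) ** 0.5


panel = simulate_panel()
print(f"naive SE:         {naive_se(panel):.4f}")
print(f"subject-level SE: {subject_se(panel):.4f}")
```

Under these assumed settings the intraclass correlation is 0.8, so with five waves the variance of the mean is inflated by roughly 1 + (5 - 1)(0.8) = 4.2, and the naive standard error should come out at about half the subject-level one.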

This course covers four methods for solving the problem of dependent observations: robust standard errors, generalized estimating equations, random effects models and fixed effects models. You’ll learn how to use these methods for quantitative outcomes, categorical outcomes, and count data outcomes. You’ll also learn which methods are best suited for different kinds of applications.
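As one illustration of how these methods differ, the following Python sketch (again not SAS, and only a toy version of the fixed effects idea) contrasts a pooled least-squares slope with a "within" slope obtained by subtracting each subject's mean from x and y before fitting. The data are hypothetical and noise-free: y = 2x plus a subject-specific term that is correlated with x and stands in for an unobserved, time-invariant trait.

```python
from statistics import mean


def ols_slope(xs, ys):
    """Slope from a one-predictor least-squares fit."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def within_slope(rows):
    """Fixed effects slope: demean x and y within each subject, then fit."""
    by_subject = {}
    for subject, x, y in rows:
        by_subject.setdefault(subject, []).append((x, y))
    xs, ys = [], []
    for obs in by_subject.values():
        x_bar = mean(x for x, _ in obs)
        y_bar = mean(y for _, y in obs)
        xs.extend(x - x_bar for x, _ in obs)
        ys.extend(y - y_bar for _, y in obs)
    return ols_slope(xs, ys)


# Hypothetical panel: 4 subjects, 3 waves, rows of (subject, x, y).
# The term 10 * i is the unobserved subject effect; it rises with x.
rows = [(i, i + t, 2 * (i + t) + 10 * i) for i in range(4) for t in range(3)]

pooled = ols_slope([x for _, x, _ in rows], [y for _, _, y in rows])
within = within_slope(rows)
print(f"pooled slope: {pooled:.2f}")
print(f"within slope: {within:.2f}")
```

Because the subject term is constant over time, demeaning removes it entirely and the within slope recovers the true value of 2, while the pooled slope absorbs the subject effect and lands far above it. Real data add noise, and the trade-offs against random effects are what the seminar itself addresses.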

This is a hands-on seminar with ample opportunities to practice these new methods.

Here are a few of the topics you won’t want to miss:

  • How to use panel data to control for unobserved variables.
  • Why fixed effects methods often give very different results from random effects methods.
  • How to reshape data from long form to wide form and back again.
  • Why the default correlation structure for GEE is usually not the best.
  • The difference between maximum likelihood and restricted maximum likelihood.
  • How to estimate and interpret random coefficient models.
  • Why first-order autoregressive structures are usually unsatisfactory.
  • The difference between subject-specific coefficients and population-averaged coefficients, and why it matters.
  • How to do longitudinal analysis using ordered logit or multinomial logit.
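
For readers unfamiliar with the long and wide layouts mentioned in the list above, this short Python sketch shows the idea with plain dictionaries. In long form there is one row per subject per wave; in wide form there is one record per subject with a value for each wave. The helper names and the sample values are made up for illustration, and the seminar itself does this in SAS.

```python
def long_to_wide(long_rows):
    """[(subject, wave, value), ...] -> {subject: {wave: value}}."""
    wide = {}
    for subject, wave, value in long_rows:
        wide.setdefault(subject, {})[wave] = value
    return wide


def wide_to_long(wide):
    """{subject: {wave: value}} -> sorted [(subject, wave, value), ...]."""
    return [
        (subject, wave, value)
        for subject, waves in sorted(wide.items())
        for wave, value in sorted(waves.items())
    ]


# Hypothetical long-form data: two subjects measured at two waves.
long_rows = [(1, 1, 3.2), (1, 2, 3.9), (2, 1, 5.0), (2, 2, 4.4)]

wide = long_to_wide(long_rows)
print(wide)
print(wide_to_long(wide) == long_rows)
```

Methods that model each wave as a separate variable generally expect the wide layout, while those that treat each measurement as an observation expect the long one, which is why being able to move between them matters.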

In this seminar, we will use the following SAS procedures: GLM, SURVEYREG, GENMOD, MIXED, LOGISTIC, SURVEYLOGISTIC, GLIMMIX, and CALIS. Lecture notes using Stata are available to registered participants on request.


This seminar will use SAS for the many empirical examples and the exercises. However, lecture notes and exercises using Stata are also available on request. At least one hour each day will be devoted to exercises. To get the most out of the seminar, you should bring your own laptop with a recent version of SAS (or Stata) installed.

There is now a free version of SAS, called the SAS University Edition, that is available to anyone. It has everything needed to run the exercises in this course, and it will run on Windows, Mac, or Linux computers. However, you do need a 64-bit machine with at least 1 GB of RAM. You also have to download and install virtualization software, which is available free from third-party vendors. The SAS Studio interface runs in your browser, but you do not need to be connected to the Internet. The download and installation are a bit complicated, but well worth the time and effort.


If you need to analyze longitudinal data and have a basic statistical background, this seminar is for you. You should have a good working knowledge of the principles and practice of multiple regression, as well as elementary statistical inference. Some familiarity with logistic regression is also helpful. You do not need to know matrix algebra, calculus, or likelihood theory.


The class will meet at the Courtyard by Marriott Chicago Downtown Magnificent Mile, 165 E Ontario St, Chicago, IL 60611, from 9 am to 5 pm each day, with a 1-hour lunch break.

Participants receive a bound manual containing detailed lecture notes (with equations and graphics), examples of computer printout, and many other useful features. This manual frees participants from the distracting task of note taking.

Registration and lodging

The fee of $995 includes all seminar materials. The early registration fee of $895 is available until August 28.

Refund Policy

If you cancel your registration at least two weeks before the course is scheduled to begin, you are entitled to a full refund (minus a processing fee of $50). 

Lodging Reservation Instructions

A block of guest rooms has been reserved at the Courtyard by Marriott Chicago Downtown Magnificent Mile, 165 E Ontario St, Chicago, IL 60611, where the seminar takes place, at a special rate of $199. To make a reservation, click here. To guarantee the rate and availability, you must reserve your room no later than Tuesday, August 28, 2018.


  1. Opportunities and challenges of panel data
      a. Data requirements
      b. Benefits of panel data
      c. Problem of dependence
      d. Software considerations
  2. Linear models
      a. Robust standard errors
      b. Generalized least squares
      c. Random effects models
      d. Fixed effects models
      e. Between-within models
  3. Logistic regression models
      a. Robust standard errors
      b. Generalized estimating equations
      c. Subject-specific vs. population-averaged methods
      d. Random effects models
      e. Fixed effects models
      f. Between-within models
  4. Models for count data
      a. Poisson vs. negative binomial models
      b. GEE and random effects
      c. Fixed effects and between-within models
  5. Linear structural equation models
      a. Fixed and random effects in the SEM context
      b. Models for reciprocal causation with lagged effects


“This is by far the best course on longitudinal data analysis I’ve taken with an excellent balance of theoretical background and plenty of real-life examples and applications. Would highly recommend to anyone doing longitudinal data analysis. An extremely worthwhile use of time!”
  Jenny Lin, Icahn School of Medicine at Mount Sinai

“This is a very well-organized course that was extremely useful for the advancement of my knowledge and skills in the area of longitudinal data analysis. While I am familiar with this topic and have used some of the SAS procedures before, this course increased my confidence level in being able to quickly select the best analytic strategy and apply it to solving problems in the field of epidemiology and public health.”
  Hind Baydoun, Food and Drug Administration

“Really efficient course. I was in need of knowledge to perform several analyses for current research. This allowed me to learn what I needed in two intensive days. Applicable to all types of researchers, no matter what field you’re in.”

“This was one of the most useful workshops I have taken. The course materials were clear and will serve as excellent reference information going forward. The instructor was patient and extremely knowledgeable. I would highly recommend this course to anyone interested in longitudinal data analysis using SAS.”
  Emily Goldmann, New York University

“This course is well-organized and addresses the exact needs of the research community. Examples are very good and useful in handling real life data analysis problems, especially which analysis type to use in what situation.”
  Wasantha Jayawardene, Indiana Prevention Resource Center

“Longitudinal statistics are complicated and there are many options to analyze a given problem. Two prior courses had given me background about the statistical theory, but no confidence about the practical issues of choosing and performing models. This very practical class has changed all that, with good examples and lots of practice. I highly recommend this class to anyone who wishes to analyze repeating or clustered data.”
  Rita McGill, University of Chicago

“This course covered the subject matter in a clear and comprehensive manner. The integration of theory, practical examples, coding, and interpretation of results was excellent. It especially helped demystify when to use and how to interpret many of the multitude of model options available in SAS. Dr. Allison brings a wealth of knowledge and experience to the course.”
  James McMahon, University of Rochester

“The professor has explained the concepts clearly and he has always used examples for each model. I like that he has also explained how to solve many of the problems that arise in panel data estimates.”
  Maria Ángeles Rodríguez-Serrano, University of Seville

“I was faced with a new type of data – repeated measures. Traditional methods were inappropriate. I needed a course that would explain both the theory behind repeated measures analysis and how to code it correctly and optimally in SAS. This course was just that! After this, you can hit the ground running – it is designed for maximum applicability.”
  Moneeza Siddiqui, University of Dundee

“I learned a lot – thanks.”
  Ananda Manage, Sam Houston State University