Longitudinal Data Analysis Using Stata

A 2-Day Seminar Taught by Paul Allison, Ph.D.



The most common type of longitudinal data is panel data, consisting of measurements of predictor and response variables at two or more points in time for many individuals. Such data have two major attractions: the ability to control for unobserved variables and the ability to investigate causal ordering.

However, there is also a major difficulty with panel data: repeated observations are typically correlated, and this invalidates the usual assumption that observations are independent. As a result, confidence intervals and p-values can be severely biased. In some cases, coefficients may also be biased downward.
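To give a rough sense of how much dependence can distort standard errors, here is a minimal Python sketch. It assumes the simplest textbook case: every individual has the same number of observations, every pair of observations on the same individual has the same correlation `rho`, and the quantity of interest is a mean (or the coefficient of a predictor that does not vary within individuals). Under those assumptions the variance is inflated by the factor 1 + (T - 1) * rho. The function names and the numbers are illustrative and are not taken from the course materials.

```python
import math


def design_effect(obs_per_person: int, rho: float) -> float:
    """Variance inflation 1 + (T - 1) * rho under equal within-person correlation."""
    return 1.0 + (obs_per_person - 1) * rho


def corrected_se(naive_se: float, obs_per_person: int, rho: float) -> float:
    """Scale a standard error that wrongly assumed independent observations."""
    return naive_se * math.sqrt(design_effect(obs_per_person, rho))


if __name__ == "__main__":
    # Hypothetical values: 5 waves per person, within-person correlation 0.5.
    print(design_effect(5, 0.5))                 # 3.0
    print(round(corrected_se(0.10, 5, 0.5), 4))  # 0.1732
```

In this hypothetical case the naive standard error of 0.10 should be about 0.17, so confidence intervals computed under independence would be far too narrow. For predictors that vary within individuals the distortion can go in the other direction, which is one reason a single correction factor is not a general solution.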

This course covers four methods for solving the problem of dependent observations: robust standard errors, generalized estimating equations, random effects models, and fixed effects models. You’ll learn how to use these methods for quantitative outcomes, categorical outcomes, and count data outcomes. You’ll also learn which methods are best suited for different kinds of applications.
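As an illustration of why the choice of method matters, the following Python sketch contrasts a pooled regression slope with a fixed effects ("within") slope for a single predictor. It is a simplified sketch, not the course's implementation: the data are made up, and the within estimator is computed by subtracting each individual's mean from `x` and `y` before fitting. In Stata, the corresponding fits would typically come from `reg` and from `xtreg` with the `fe` option.

```python
from collections import defaultdict


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def within_slope(ids, x, y):
    """Fixed effects slope: demean x and y within each individual, then fit."""
    groups = defaultdict(list)
    for person, a, b in zip(ids, x, y):
        groups[person].append((a, b))
    x_dev, y_dev = [], []
    for rows in groups.values():
        mean_x = sum(a for a, _ in rows) / len(rows)
        mean_y = sum(b for _, b in rows) / len(rows)
        x_dev.extend(a - mean_x for a, _ in rows)
        y_dev.extend(b - mean_y for _, b in rows)
    return ols_slope(x_dev, y_dev)


# Made-up panel: y = a_i + 2 * x, where the person-specific intercept a_i
# (10, 0, -10) is negatively related to that person's average x.
ids = [1, 1, 1, 2, 2, 2, 3, 3, 3]
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [12, 14, 16, 8, 10, 12, 4, 6, 8]

if __name__ == "__main__":
    print(ols_slope(x, y))          # -1.0 (pooled, confounded by a_i)
    print(within_slope(ids, x, y))  # 2.0 (within-person effect)
```

Here the unobserved person-level intercept is correlated with the predictor, so the pooled slope even has the wrong sign, while the within-person slope recovers the value of 2 used to build the data.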

This is a hands-on seminar with ample opportunities to practice these new methods.

Here are a few of the topics you won’t want to miss:

  • How to use panel data to control for unobserved variables.
  • Why fixed effects methods often give very different results from random effects methods.
  • How to reshape data from long form to wide form and back again.
  • Why the default correlation structure for GEE is usually not the best.
  • The difference between maximum likelihood and restricted maximum likelihood.
  • How to estimate and interpret random coefficient models.
  • Why first-order autoregressive structures are usually unsatisfactory.
  • The difference between subject-specific coefficients and population-averaged coefficients, and why it matters.
  • How to do longitudinal analysis using ordered logit or multinomial logit.
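To make the long-form versus wide-form distinction in the list above concrete, here is a small Python sketch that converts a toy data set in both directions. Long form has one row per individual per time point; wide form has one row per individual, with a separate column for each time point. The variable names (`id`, `year`, `wage`) and values are hypothetical, and the helper functions are an illustration of the idea rather than a substitute for Stata's `reshape` command.

```python
def long_to_wide(rows, id_key, time_key, value_key):
    """One output row per individual, with columns like wage1, wage2, ..."""
    wide = {}
    for row in rows:
        record = wide.setdefault(row[id_key], {id_key: row[id_key]})
        record[f"{value_key}{row[time_key]}"] = row[value_key]
    return list(wide.values())


def wide_to_long(rows, id_key, time_key, value_key):
    """One output row per individual per time point."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column != id_key and column.startswith(value_key):
                time = int(column[len(value_key):])
                long_rows.append({id_key: row[id_key], time_key: time, value_key: value})
    return long_rows


# Hypothetical long-form data: two people observed in two years.
long_data = [
    {"id": 1, "year": 1, "wage": 10},
    {"id": 1, "year": 2, "wage": 12},
    {"id": 2, "year": 1, "wage": 20},
    {"id": 2, "year": 2, "wage": 21},
]

if __name__ == "__main__":
    wide_data = long_to_wide(long_data, "id", "year", "wage")
    print(wide_data)
    # [{'id': 1, 'wage1': 10, 'wage2': 12}, {'id': 2, 'wage1': 20, 'wage2': 21}]
    print(wide_to_long(wide_data, "id", "year", "wage") == long_data)  # True
```

In Stata, the equivalent conversion for data like these would be along the lines of `reshape wide wage, i(id) j(year)`, with `reshape long` reversing it.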

In this seminar, we will use the following Stata commands: reg, reshape, xtreg, areg, mixed, xtset, xtgee, logit, xtlogit, clogit, melogit, meologit, nbreg, menbreg, lrtest, margins, marginsplot, hausman, xthybrid, and xtdpdml. Lecture notes using SAS and R are available on request to registered participants.


This seminar will use Stata for the many empirical examples and exercises. However, no previous experience with Stata is assumed. Lecture notes and exercises using SAS and R are also available on request. To participate in the hands-on exercises, you are strongly encouraged to bring a laptop computer with Stata installed (release 13 or higher; IC, SE, or MP versions are all acceptable). A power outlet and wireless access will be available at each seat.

Seminar participants who are not yet ready to purchase Stata can take advantage of StataCorp’s free 30-day evaluation offer or their 30-day software return policy.


If you need to analyze longitudinal data and have a basic statistical background, this course is for you. You should have a good working knowledge of the principles and practice of multiple regression, as well as elementary statistical inference. It is also helpful to have some familiarity with logistic regression. But you do not need to know matrix algebra, calculus, or likelihood theory. 

Location, Format, and Materials

The class will meet at the SpringHill Suites San Diego Downtown/Bayfront, 900 Bayfront Court, San Diego, CA 92101, from 9 am to 5 pm each day, with a 1-hour lunch break.

Participants receive a bound manual containing detailed lecture notes (with equations and graphics), examples of computer output, and many other useful features. This manual frees participants from the distracting task of note-taking.

Registration and Lodging

The fee of $995.00 includes all seminar materials.

Refund Policy

If you cancel your registration at least two weeks before the course is scheduled to begin, you are entitled to a full refund (minus a processing fee of $50). 

Lodging Reservation Instructions 

A block of guest rooms has been reserved at the SpringHill Suites San Diego Downtown/Bayfront, 900 Bayfront Court, San Diego, CA 92101, where the seminar takes place, at a special rate of $219 per night. To make a reservation, call 888-287-9400 during business hours and identify yourself as part of the Statistical Horizons LLC group staying at the SpringHill Suites San Diego Downtown/Bayfront, or click here. To guarantee the rate and availability, you must reserve your room no later than Friday, January 17, 2020.

We also recommend going directly to the hotel’s website or checking other online hotel sites. Pricing varies and you may be able to secure a better rate.


1. Opportunities and challenges of panel data
        a. Basic data structure and notation
        b. Why do we want panel data?
        c. Problem of dependence
        d. Software considerations

2. Linear models
        a. Robust standard errors
        b. Generalized least squares
        c. Random effects models
        d. Fixed effects models
        e. Between-within (hybrid) models

3. Logistic regression models
        a. Robust standard errors
        b. Generalized estimating equations
        c. Subject-specific vs. population-averaged methods
        d. Random effects models
        e. Fixed effects models
        f. Between-within (hybrid) models

4. Methods for count data
        a. Poisson and negative binomial models
        b. Robust standard errors
        c. GEE
        d. Random effects
        e. Fixed effects
        f. Between-within (hybrid) models

5. Linear structural equation models
        a. Fixed and random effects in the SEM framework
        b. xtdpdml command
        c. Models for reciprocal causation with lagged effects


“This course (and the materials) has clearly been crafted very thoughtfully and meticulously. The organization and flow of topics is coherent and consistent, and the exercises and examples are fitting and not too challenging. As someone with zero knowledge of Stata before this week, I was concerned that the programming might be additionally difficult to understand, but this was not at all the case! Paul is very knowledgeable, patient, and funny! Overall, this course was an excellent experience that I would recommend highly to anyone.”
  Erin Baker, The State University of New York at Albany

“The course is very informative and helpful in improving my statistical analysis skills. The course slides are very clear and organized with detailed explanations on both theories (formulas) and outputs from programs. I am new to longitudinal data analysis. This course is very helpful from data manipulation to modeling.”
  Shu Cao, Stanford University

“This course provided in-depth information on a variety of complex methods, with helpful real-world examples. I look forward to applying some of the techniques immediately in my research and sharing the knowledge in other methods to consider new applications or approaches for future research.”
  Maddy Oritt

“The short course was a straight-forward and well-paced overview of options for handling various types of panel data. The atmosphere allowed others to benefit from questions posed by other participants during class.”
  Brian Kelly, Purdue University

“This course has provided me with the confidence and skills to be able to move forward with longitudinal analysis of my dissertation data. It was a lot of material, packed into two days, but presented in a way that was easy to follow and great resources to return to when back in the field. I would take this class again!”
  Kirsten Marchand, University of British Columbia

“I really appreciated the detailed, step-by-step walkthroughs about what each part of our output meant and how to interpret the numbers conceptually.”
  Emily Kan, University of California, Irvine

“I highly recommend Longitudinal Data Analysis Using Stata! This course provided an excellent overview and provided the tools needed to run these models using my own data. Paul Allison was a fantastic instructor and made the content accessible to students with differing levels of Stata experience.”
  Alissa Knowles, University of California, Irvine

“Dr. Allison is a very didactic instructor. I especially like all the opportunities given in class to practice the materials covered using actual time on the computer. The handouts are a great way of having all the material organized and allow course-takers to be able to pay attention without the stress of taking notes.”
  Grettel Castro, Florida International University