Longitudinal Data Analysis Using Stata

A 2-Day Seminar Taught by Paul Allison



The most common type of longitudinal data is panel data, consisting of measurements of predictor and response variables at two or more points in time for many individuals. Such data have two major attractions: the ability to control for unobservables, and the determination of causal ordering.

However, panel data also pose a major difficulty: repeated observations on the same individual are typically correlated, which violates the usual assumption that observations are independent. There are four widely available methods for dealing with this dependence: robust standard errors, generalized estimating equations, random effects models, and fixed effects models. This course examines each of these methods in some detail, with an eye to discerning their relative advantages and disadvantages. Different methods are considered for quantitative outcomes and categorical outcomes.
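
As a rough illustration of what these four approaches look like in Stata for a quantitative outcome, here is a minimal sketch. It assumes the nlswork example dataset that StataCorp distributes through webuse; the dataset and the choice of variables are illustrative assumptions and are not taken from the course materials.

```stata
* Load an example panel dataset (assumed here for illustration only)
webuse nlswork, clear

* Declare the panel structure: idcode identifies persons, year identifies time
xtset idcode year

* 1. Ordinary regression with standard errors robust to within-person correlation
regress ln_wage age ttl_exp tenure, vce(cluster idcode)

* 2. Generalized estimating equations with an exchangeable working correlation
xtgee ln_wage age ttl_exp tenure, corr(exchangeable) vce(robust)

* 3. Random effects model
xtreg ln_wage age ttl_exp tenure, re

* 4. Fixed effects model (uses only within-person variation)
xtreg ln_wage age ttl_exp tenure, fe
```

All four commands fit the same linear specification. They differ in how they handle the dependence among repeated observations, and only the fixed effects version controls for unobserved, time-invariant characteristics of each person.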

This is a hands-on course with ample opportunity for participants to practice the different methods. 


If you need to analyze longitudinal data and have a basic statistical background, this course is for you. You should have a good working knowledge of the principles and practice of multiple regression, as well as elementary statistical inference. Some familiarity with logistic regression is also helpful, but you do not need to know matrix algebra, calculus, or likelihood theory.


This seminar will use Stata for the many empirical examples and exercises. However, no previous experience with Stata is assumed. Lecture notes and exercises using SAS are also available on request. To participate in the hands-on exercises, you are strongly encouraged to bring a laptop computer with Stata installed (release 14; IC, SE, or MP versions are all acceptable). Stata 12 or 13 is OK, but earlier versions of Stata will lack some of the functionality demonstrated in the seminar.  A power outlet and wireless access will be available at each seat.

Seminar participants who are not yet ready to purchase Stata can take advantage of StataCorp’s 30-day software return policy to obtain Stata 14 on a trial basis.


The class will meet from 9 am to 5 pm each day with a 1-hour lunch break at Courtyard Fort Myers at Gulf Coast Town Center, 10050 Gulf Center Drive, Fort Myers, Florida 33913.

This hotel is part of a large shopping center with numerous stores, restaurants, and a movie theater. It’s 3 miles from the Fort Myers International Airport, and there is a complimentary hotel shuttle to and from the airport. Although you can expect the weather to be comfortably warm (the average high in January is 75°F), this is definitely not a resort-type location. However, it’s about a half-hour drive to several attractive vacation areas, including Naples, Sanibel Island, and Fort Myers Beach.

The Fort Myers International Airport (RSW) is served by numerous airlines with direct flights to and from most major cities in the U.S. However, demand for seats in January is quite high, so be sure to make reservations at your earliest opportunity.

Participants receive a bound manual containing detailed lecture notes (with equations and graphics), examples of computer printout, and many other useful features. This book frees participants from the distracting task of note taking. 

Registration and lodging

The fee of $995.00 includes all seminar materials.

Lodging Reservation Instructions

Three hotels are recommended: the Courtyard, the Residence Inn, and the Hilton Garden.

Please make your own reservations at the Courtyard or the Residence Inn. For the Hilton Garden, we have arranged a special rate of $149 per night. To get this rate, call 239-210-7200 during business hours and use group code “STAT.” The room block will expire when it is full or on Tuesday, December 22, 2015.


1. Opportunities and challenges of panel data
        a. Data requirements
        b. Control for unobservables
        c. Determining causal order
        d. Problem of dependence
        e. Software considerations

2. Linear models
        a. Robust standard errors
        b. Generalized estimating equations
        c. Random effects models
        d. Fixed effects models
        e. Hybrid models

3. Logistic regression models
        a. Robust standard errors
        b. Generalized estimating equations
        c. Subject-specific vs. population-averaged methods
        d. Random effects models
        e. Fixed effects models
        f. Hybrid models

4. Linear structural equation models
        a. Fixed and random effects in the SEM context
        b. Models for reciprocal causation with lagged effects


“The course provided a fantastic overview to understanding modeling panel data. I have no doubt it will be invaluable to my future research. I would highly recommend the course to anyone interested in pursuing research using panel data.”
  Karina Salazar, The University of Arizona

“I was grateful for the clear, organized way Dr. Allison presented complex ideas and techniques – once I felt confident in the first material presented, more complex applications were easy to understand and implement. I feel confident about applying those techniques to my own projects after completing this course. It was a large-scale payoff for a small investment in time.”
  Mikaela Dufur, Brigham Young University 

“I appreciated the clarity of presentation and the take-home material. This will make it easy to translate what I’ve learned to my own questions and data.”
  Vivia McCutcheon, Washington University School of Medicine 

“Great overview of several methods to analyze longitudinal data. This was a great refresher and review for someone who has already had training in LDA methods and for someone new to the topic.”
  Michelle Hughes, Sinai Urban Health Institute