Difference in Differences

A 4-Day Remote Seminar Taught by
Pedro H. C. Sant’Anna, Ph.D.

Difference-in-Differences (DiD) methods are widely used in the estimation of causal effects of policy interventions in the social and medical sciences. At their core, DiD methods leverage the fact that units are exposed to treatment at different points in time (or never exposed). Consequently, researchers can recover an average treatment effect by comparing outcomes from different treatment cohorts, before and after they have been exposed to treatment. A major advantage of using the DiD framework is that we can account for time trends and (time-invariant) unobserved heterogeneity when recovering causal effects.
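
To make the before-and-after comparison concrete, here is a minimal sketch of the simplest case: two groups observed in two periods. It is written in Python purely for illustration (the seminar itself uses R). The data are made up, and the names `rows`, `cell_mean`, and `did_estimate` are hypothetical.

```python
from statistics import mean

# Hypothetical toy data: (group, period, outcome). Values are invented.
rows = [
    ("treated", "pre", 10.0), ("treated", "pre", 12.0),
    ("treated", "post", 17.0), ("treated", "post", 19.0),
    ("control", "pre", 8.0), ("control", "pre", 10.0),
    ("control", "post", 11.0), ("control", "post", 13.0),
]


def cell_mean(group, period):
    """Average outcome for one group in one period."""
    return mean(y for g, p, y in rows if g == group and p == period)


def did_estimate():
    """Change in the treated group minus change in the comparison group."""
    treated_change = cell_mean("treated", "post") - cell_mean("treated", "pre")
    control_change = cell_mean("control", "post") - cell_mean("control", "pre")
    return treated_change - control_change


print(did_estimate())  # 4.0
```

In this toy example the treated group's average rises by 7 and the comparison group's by 3, so the estimate is 4. Reading that number as an average treatment effect on the treated relies on the no-anticipation and parallel trends assumptions covered in the seminar.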

This seminar offers a thorough introduction to classical and modern Difference-in-Differences methods. The main goal of this seminar is to enable researchers to get closer to the DiD research frontier.

Starting June 28, we are offering this seminar as a 4-day synchronous*, remote workshop for the first time. Each day will consist of a 3-hour live lecture held via the free video-conferencing software Zoom. You are encouraged to join the lecture live, but will have the opportunity to view the recorded session later in the day if you are unable to attend at the scheduled time.

Each lecture session will conclude with a hands-on exercise reviewing the content covered, to be completed on your own. An additional lab session will be held Monday and Wednesday afternoons, where you can review the exercise results with the instructor and ask any questions.

*We understand that scheduling is difficult during this unpredictable time. If you prefer, you may take all or part of the course asynchronously. The video recordings will be made available within 24 hours of each session and will be accessible for two weeks after the seminar, meaning that you will get all of the class content and discussions even if you cannot participate synchronously.


We will cover the theory and practice of DiD methods in detail, including topics such as:

  • The canonical two-period, two-group DiD
  • The role of covariates in DiD setups
  • DiD with variation in treatment timing
  • Design-based inference in DiD settings
  • DiD with violations of the parallel trends assumption

By the end of the seminar, you should be comfortable and confident using DiD methods to tackle your own research questions.


This seminar will use R for the empirical examples and exercises. To participate in the hands-on exercises, you are strongly encouraged to use a computer with the most recent versions of R and RStudio installed. RStudio is a front-end that makes R easier to work with. Both are free and available for Windows, Mac, and Linux platforms.

If you’d like to take this course but are concerned that you don’t know enough R, there are excellent online resources for learning the basics, including Statistics with R. Here are our recommendations.

In case you are more comfortable with Stata, the course will also reference Stata packages that implement the same DiD tools that will be demonstrated with R, whenever such packages exist.

Who Should Register?

If you want to learn how to conduct causal inference using difference-in-differences methods, and have a basic statistical background, this course is for you. You should have a good working knowledge of the principles and practice of multiple regression, as well as elementary statistical inference. It is also helpful to have some basic familiarity with the R programming language.

Seminar Outline

1. The potential outcome framework

2. Review of basic statistical inference

3. The canonical two-period, two-group DiD

               a. Role of identifying assumptions: no-anticipation and parallel trends

               b. Implementation via simple comparison of means

               c. Implementation via regressions

4. Role of covariates in DiD setups

               a. Allowing for covariate-specific trends

               b. Pitfalls of two-way fixed effects linear regression specifications

               c. Estimating treatment effects using outcome regression

               d. Estimating treatment effects using inverse-probability weighting

               e. Estimating treatment effects using a doubly-robust approach

5. DiD with variation in treatment timing

               a. What are the causal parameters of interest?

               b. What type of parallel trends are we willing to impose?

               c. Pitfalls of two-way fixed effects linear regression specifications

               d. Recovering meaningful causal parameters

               e. Highlighting treatment effect dynamics via event-studies

               f. Highlighting other sources of treatment effect heterogeneity

6. Other DiD topics:

               a. Design-based inference in DiD settings

               b. DiD with violations of the parallel trends assumption

               c. When is parallel trends sensitive to functional form?

               d. Distributional DiD methods

               e. Fuzzy DiD