Machine Learning for Estimating Causal Effects - Online Course
A 4-Day Livestream Seminar Taught by Ashley Naimi
10:30am-12:30pm (convert to your local time)
1:30pm-3:00pm
Machine learning is increasingly being used to evaluate cause-effect relations with social, economic, health, and business data. When used properly, these tools have tremendous potential to yield robust effect estimates with minimal assumptions. However, both machine learning and causal inference techniques add considerable complexity to an analysis, making proper use a challenge.
In this seminar, you will learn how to minimize biases that result from improper use of machine learning methods to answer practical questions about cause-effect relations in non-experimental data.
Starting August 6, we are offering this seminar as a 4-day synchronous*, livestream workshop held via the free video-conferencing software Zoom. Each day will consist of two lecture sessions which include hands-on exercises, separated by a 1-hour break. You are encouraged to join the lecture live, but will have the opportunity to view the recorded session later in the day if you are unable to attend at the scheduled time.
*We understand that finding time to participate in livestream courses can be difficult. If you prefer, you may take all or part of the course asynchronously. The video recordings will be made available within 24 hours of each session and will be accessible for four weeks after the seminar, meaning that you will get all of the class content and discussions even if you cannot participate synchronously.
Closed captioning is available for all live and recorded sessions. Captions can be translated to a variety of languages including Spanish, Korean, and Italian. For more information, click here.
More details about the course content
We will discuss how machine learning can be used to relax modeling assumptions, while avoiding problems with machine learning methods that result from the “curse of dimensionality.”
Through practical data and coding examples, you will learn to use cutting-edge “double-robust” machine learning methods (targeted minimum loss-based estimation, augmented inverse probability weighting) to estimate different treatment effects in real and simulated data.
The course will focus on building intuition, with numerous coding examples to give you practical experience.
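To give a flavor of what a double-robust estimator does, here is a minimal sketch of augmented inverse probability weighting (AIPW) for the average treatment effect in simulated data. It is not taken from the course materials: the seminar itself works in R, and plain Python is used here only to keep the example dependency-free. The data-generating process, the variable names (`W`, `A`, `Y`), and the use of stratum-specific means as stand-ins for machine learning fits are all assumptions made for illustration.

```python
import random

random.seed(2024)
n = 20000
true_ate = 2.0  # assumed effect, fixed by the simulation below

# Simulate one binary confounder W, a binary exposure A, and a continuous outcome Y.
data = []
for _ in range(n):
    w = 1 if random.random() < 0.5 else 0
    p_a = 0.7 if w == 1 else 0.3  # W affects the exposure ...
    a = 1 if random.random() < p_a else 0
    y = 1.0 + true_ate * a + 1.5 * w + random.gauss(0.0, 1.0)  # ... and the outcome
    data.append((w, a, y))


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Nuisance models. With a single binary W, stratum-specific means are
# nonparametric estimators; in practice these would be flexible learners.
propensity = {w: mean(a for (wi, a, _) in data if wi == w) for w in (0, 1)}
outcome = {
    (w, a): mean(y for (wi, ai, y) in data if wi == w and ai == a)
    for w in (0, 1)
    for a in (0, 1)
}


def aipw_term(w, a, y):
    """Per-observation AIPW contribution: plug-in contrast plus weighted residuals."""
    g = propensity[w]
    m1, m0 = outcome[(w, 1)], outcome[(w, 0)]
    treated = a * (y - m1) / g
    untreated = (1 - a) * (y - m0) / (1 - g)
    return (m1 - m0) + treated - untreated


aipw_ate = mean(aipw_term(w, a, y) for (w, a, y) in data)

# Unadjusted comparison, confounded by W.
naive = mean(y for (_, a, y) in data if a == 1) - mean(y for (_, a, y) in data if a == 0)

print(f"naive difference in means: {naive:.3f}")
print(f"AIPW estimate of the ATE:  {aipw_ate:.3f} (truth: {true_ate})")
```

Because both nuisance models are saturated in this toy setting, the weighted-residual terms sum to essentially zero and AIPW coincides with the plug-in (g computation) estimate. The augmentation matters once the exposure and outcome models are fitted with flexible learners that may each be imperfect.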
Computing
This seminar will use R for the empirical examples and exercises. To participate in the hands-on exercises, you are strongly encouraged to use a computer with the most recent versions of R and RStudio installed. RStudio is a front-end for R that makes R easier to work with. Both are free and available for Windows, Mac, and Linux platforms. Basic familiarity with R is highly desirable, but even novice R coders should be able to follow the presentation and do the exercises.
If you’d like to take this course but are concerned that you don’t know enough R, there are excellent online resources for learning the basics. Here are our recommendations.
Who should register?
Participants should have a sound working knowledge of applied statistical analysis and interpretation, and the use and interpretation of linear and generalized linear regression modeling. Prior experience with machine learning and the counterfactual approach to causal inference will be helpful, but is not required.
Seminar outline
Day 1
- Introduction to the Datasets
- Potential Outcomes, Estimands, Identifiability
- Parametric Regression for Effect Estimation
- G Computation
- Inverse Probability Weighting
- Machine Learning for Effect Estimation: The Curse of Dimensionality
- Double Robust Methods: Some Intuition
- Augmented Inverse Probability Weighting (AIPW)
- Targeted Minimum Loss-Based Estimation (TMLE)
Day 2
- Modeling the Exposure and the Outcome
- Machine Learning Algorithms 1:
  - Neural Networks via nnet package
  - Gradient boosting via xgboost
- Machine Learning Algorithms 2:
  - CARTs and Random Forests via ranger
  - Support Vector Machines via e1071
Day 3
- Meta Learners for the Exposure and Outcome Models: Stacking
  - SuperLearner and sl3
  - Tuning Parameter Grids
  - Selection Algorithms
- Estimating Effects in Example Datasets 1
  - TMLE3 + sl3 for the ATE, ATT, and ATU
  - AIPW + sl3 for the ATE, ATT, and ATU
Day 4
- Estimating Effects in Example Datasets 2
  - TMLE3 + sl3 for the ATE, ATT, and ATU
  - AIPW + sl3 for the ATE, ATT, and ATU
- Machine Learning for Causal Effect Estimation: Wrapping Up
- Alternative Estimands
- Time-Dependent Exposure and Confounder Modeling
- Mediation Analysis
- Further Reading/Learning Materials
Payment information
The fee of $995 includes all course materials.
PayPal and all major credit cards are accepted.
Our Tax ID number is 26-4576270.