Machine Learning and Mediation - Online Course
A 4-Day Livestream Seminar Taught by David Benkeser
Tuesday, August 1–Friday, August 4, 2023
10:30am-12:30pm (convert to your local time)
1:30pm-3:00pm
A fundamental goal in many areas of research is to describe the pathways whereby a treatment or intervention has an impact on downstream outcomes. Many methods have been developed over the years across many different literatures to tackle this problem, providing researchers with a set of tools for assessing mediation questions and the formal causal assumptions that they require.
At the same time, interest in machine learning and artificial intelligence has blossomed. This naturally raises the question of whether and how these tools can be appropriately combined with mediation methods.
In this course, we will guide you through the latest methods in mediation analysis, with a special emphasis on the integration of cutting-edge machine learning and artificial intelligence techniques. You will learn how to use regression stacking and super learning to build regression models that can unlock new insights and pathways for your research.
The course will also cover multiply robust approaches, which provide a natural and powerful means of incorporating machine learning into mediation analysis while preserving the validity of confidence intervals and hypothesis tests. All methods will be illustrated with hands-on data analysis using R.
Starting August 1, we are offering this seminar as a 4-day synchronous*, livestream workshop held via the free video-conferencing software Zoom. Each day will consist of two lecture sessions which include hands-on exercises, separated by a 1-hour break. You are encouraged to join the lecture live, but will have the opportunity to view the recorded session later in the day if you are unable to attend at the scheduled time.
*We understand that finding time to participate in livestream courses can be difficult. If you prefer, you may take all or part of the course asynchronously. The video recordings will be made available within 24 hours of each session and will be accessible for four weeks after the seminar, meaning that you will get all of the class content and discussions even if you cannot participate synchronously.
Closed captioning is available for all live and recorded sessions. Live captions can be translated to a variety of languages including Spanish, Korean, and Italian. For more information, click here.
Computing
Demonstrations of methods will use R and RStudio. All code will be provided.
The code can be run in a virtual computing environment that can be accessed in a web browser. Alternatively, if students wish to run code on their local computers, detailed instructions will be provided for proper installation of R and R packages.
If you’d like to take this course but are concerned that you don’t know enough R, there are excellent online resources for learning the basics. Here are our recommendations.
Who should register?
This course is geared towards researchers with experience in data analysis and statistics. Some understanding of the following topics is necessary for this course:
- Probability (e.g., what is meant by the distribution of a random variable, its mean, and its variance)
- Statistical inference (what is meant by confidence intervals, hypothesis tests)
- Basic regression (linear and logistic).
Additional understanding of the following topics is useful but not essential: directed acyclic graphs, basic machine learning concepts (e.g., penalization, cross-validation), and the use of an integral to represent the expectation of a random variable. Prior experience with R is highly desirable.
Seminar outline
- Types of causal mediation estimands
  - Controlled direct effect
  - Natural direct effect
  - Natural indirect effect
- Impact of exposure-induced confounding
- Classic estimation of mediation estimands using linear or logistic regression
- Motivation for machine learning
- Introduction to super learning
- Implementation in R
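For orientation, the three estimands named in the outline are commonly defined with potential-outcome notation, as sketched below. The notation is a standard convention assumed here for illustration and is not taken from the course materials: A is a binary treatment, M the mediator, Y the outcome, M(a) the mediator value under treatment level a, and Y(a, m) the outcome under treatment level a with the mediator set to m.

```latex
\begin{align*}
\text{Controlled direct effect:} \quad & \mathrm{CDE}(m) = E\big[\,Y(1, m) - Y(0, m)\,\big] \\
\text{Natural direct effect:}    \quad & \mathrm{NDE} = E\big[\,Y(1, M(0)) - Y(0, M(0))\,\big] \\
\text{Natural indirect effect:}  \quad & \mathrm{NIE} = E\big[\,Y(1, M(1)) - Y(1, M(0))\,\big]
\end{align*}
```

Under these definitions, the natural direct and indirect effects sum to the total effect, E[Y(1, M(1)) - Y(0, M(0))].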
Payment information
The fee of $995 includes all course materials.
PayPal and all major credit cards are accepted.
Our Tax ID number is 26-4576270.