Categorical Data Analysis
A 4-Day Livestream Seminar Taught by
Trenton Mize
Tuesday, June 14, 2022 – Friday, June 17, 2022
10:30am-12:30pm ET (New York time): Live session via Zoom
1:30pm-3:00pm ET: Live session via Zoom
Many, perhaps even most, behavioral, health, and social science questions involve outcome variables that are categorical. For example: Which political candidate will win the next election? How does a parent’s social class influence children’s educational attainment? How many publications does it take to receive tenure? Do men or women drink more alcoholic drinks? Is a vaccine effective at preventing disease? Answering these and countless other questions cannot be adequately accomplished with the linear regression model; it instead requires the more advanced techniques covered extensively in this seminar.
Categorical Data Analysis is a seminar in applied statistics that primarily deals with regression models in which the dependent variable is binary, nominal, ordinal, or count. Many common statistical tasks, including interpretation of coefficients, calculation of predictions, testing of interaction effects, testing for mediation or other cross-model comparisons, and assessing model fit, require a different approach for models with categorical dependent variables. The focus of the course is on interpretation and on learning to deal with the complications introduced by the nonlinearity of the models.
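To make the point about nonlinearity concrete, here is a minimal sketch in Python. It is not taken from the seminar materials, which use Stata and R. It assumes a binary logit model with a single continuous predictor, and the coefficient value and the function names are hypothetical, chosen only for illustration. It shows that one fixed logit coefficient implies a different change in predicted probability depending on where along the curve it is evaluated.

```python
import math


def logistic(z):
    """Inverse logit: maps a linear predictor to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def marginal_effect(beta, xb):
    """Instantaneous change in Pr(y = 1) per unit change in x for a logit
    model, evaluated at linear predictor xb: beta * p * (1 - p)."""
    p = logistic(xb)
    return beta * p * (1.0 - p)


# Hypothetical coefficient, chosen only for illustration.
beta = 0.5

# The same coefficient yields different marginal effects at different
# values of the linear predictor.
for xb in (-3.0, 0.0, 3.0):
    p = logistic(xb)
    me = marginal_effect(beta, xb)
    print(f"xb = {xb:+.1f}  Pr(y=1) = {p:.3f}  marginal effect = {me:.4f}")
```

In this sketch the marginal effect is largest where the predicted probability is 0.5 (here 0.125) and shrinks toward zero as the probability approaches 0 or 1. This is why a single coefficient cannot summarize the effect on the probability scale the way it can in a linear model.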
Specific models considered include: probit and logit for binary outcomes; ordered logit/probit and the generalized ordered logit model for ordinal outcomes; multinomial logit for nominal outcomes; and Poisson, negative binomial, and zero-inflated models for counts.
Starting June 14, we are offering this seminar as a 4-day synchronous*, livestream workshop held via the free video-conferencing software Zoom. Each day will consist of two lecture sessions which include hands-on exercises, separated by a 1-hour break. Participants are encouraged to join the lecture live, but will have the opportunity to view the recorded session later in the day if they are unable to attend at the scheduled time.
*We understand that scheduling is difficult during this unpredictable time. If you prefer, you may take all or part of the course asynchronously. The video recordings will be made available within 24 hours of each session and will be accessible for four weeks after the seminar, meaning that you will get all of the class content and discussions even if you cannot participate synchronously.
Closed captioning is available for all live and recorded sessions.
Computing
The vast majority of what you will learn in this seminar can be applied in any software package. Resources (e.g. template code and examples) will be provided to all participants for conducting analyses in Stata and R. Because the learning curve is steeper for R than for Stata, those unfamiliar with both are encouraged to use Stata for the seminar.
Stata
The lecture slides are accompanied by a full set of replication files; the replication files use Stata. To replicate the instructor’s examples, you should have Stata already installed on your computer when the course begins. No previous experience with Stata is needed, however, because all necessary code will be provided.
For Stata users, version 17 will be used for the examples, but the exercises can also be done with versions 14-16.
If you’d like to familiarize yourself with Stata basics before the seminar begins, we recommend following along with a “getting started” video like the one here.
Seminar participants who are not yet ready to purchase Stata could take advantage of StataCorp’s free 30-day evaluation offer or their 30-day software return policy.
R
All methods taught in the seminar can also be done using R.
The seminar exercises can be done using R; template code will be provided. Additional resources to replicate the primary methods covered in the class using R will also be provided.
To participate in the hands-on exercises using R, you are strongly encouraged to use a computer with the most recent version of R installed. Participants are also encouraged to download and install RStudio, a front-end for R that makes it easier to work with. This software is free and available for Windows, Mac, and Linux platforms.
If you’d like to take this course but are concerned that you don’t know enough R, there are excellent online resources for learning the basics. Here are our recommendations.
Who should register?
If you need to analyze categorical outcome data (i.e. binary, ordinal, nominal, or count dependent variables) and have a basic statistical background, this seminar is for you. The seminar is helpful for graduate students, applied researchers, faculty, and others who want to learn these methods for the first time, as well as for researchers who have some familiarity with the methods but want to learn the contemporary techniques now widely available for analyzing categorical data.
If you have a good working knowledge of linear regression, you are well-prepared for this seminar.
Seminar outline
Day 1
- Why can’t I use OLS for all dependent variables?
- Nonlinear effects, interaction effects, and nonlinear interaction effects
- Binary dependent variables: logit and probit models
- Take-home data analysis assignment #1 (optional): binary DV models
Day 2
- Interpreting categorical dependent variable models: coefficients, multiplicative effects, predictions, marginal effects, and visualizations
- Count dependent variables: Poisson and negative binomial models
- Zero-inflated count models
Day 3
- Nominal dependent variables: multinomial logit models
- Ordinal models: ordinal logit and probit, generalized ordered logit models
- Take-home data analysis assignment #2 (optional): nominal and ordinal DV models
Day 4
- Interaction / moderation for categorical models
- Comparing predictions and effects across categorical models (e.g. mediation)
- Absolute and comparative model fit for categorical models
- Model diagnostics for categorical models
Payment instructions
The fee of $895 includes all course materials.
PayPal and all major credit cards are accepted.
Our Tax ID number is 26-4576270.