Categorical Data Analysis - Online Course
A 4-Day Livestream Seminar Taught by
Trenton Mize
10:30am-12:30pm (convert to your local time)
1:30pm-3:00pm
Check out an exclusive 1-hour clip of Categorical Data Analysis, now on YouTube!
Many, perhaps even most, behavioral, health, and social science questions involve outcome variables that are categorical. For example: Which political candidate will win the next election? How does a parent’s social class influence children’s educational attainment? How many publications does it take to receive tenure? Do men or women drink more alcoholic drinks? Is a vaccine effective at preventing disease? Answering these and countless other questions cannot be adequately accomplished with the linear regression model; it instead requires the more advanced techniques covered extensively in this seminar.
Categorical Data Analysis is a seminar in applied statistics that primarily deals with regression models in which the dependent variable is binary, nominal, ordinal, or count. Many common statistical tasks, including interpretation of coefficients, calculation of predictions, testing of interaction effects, testing for mediation or other cross-model comparisons, and assessing model fit, require a different approach for models with categorical dependent variables. The focus of the course is on interpretation and on learning to deal with the complications introduced by the nonlinearity of the models.
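As a rough illustration of why nonlinearity complicates interpretation, here is a minimal sketch in Python (chosen only for illustration; the seminar materials themselves use Stata and R). The intercept and slope are hypothetical values, not estimates from any course dataset. In a binary logit model, the same one-unit increase in a predictor changes the predicted probability by different amounts depending on where the observation starts.

```python
import math

def logit_prob(x, intercept=-2.0, slope=1.0):
    """Predicted probability from a binary logit model.

    The default intercept and slope are hypothetical, for illustration only.
    """
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))

def unit_change_effect(x, intercept=-2.0, slope=1.0):
    """Change in predicted probability when x increases by one unit."""
    return logit_prob(x + 1, intercept, slope) - logit_prob(x, intercept, slope)

# The coefficient is the same at every x, but the effect on the
# probability scale is not: it is largest near P(y=1) = 0.5.
effects = {x: unit_change_effect(x) for x in (-2, 1, 5)}
for x, effect in effects.items():
    print(f"x = {x:>2}: P(y=1) = {logit_prob(x):.3f}, effect of +1 in x = {effect:.3f}")
```

Because the effect on the probability scale varies across observations, interpretation of such models typically relies on predictions and marginal effects rather than on the raw coefficient alone.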
Specific models considered include: probit and logit for binary outcomes; ordered logit/probit and the generalized ordered logit model for ordinal outcomes; multinomial logit for nominal outcomes; and Poisson, negative binomial, and zero-inflated models for counts.
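To give a flavor of one of these count models, the sketch below compares a standard Poisson distribution with a zero-inflated Poisson, written in Python purely for illustration (the seminar uses Stata and R). The rate and the share of "always zero" cases are hypothetical values. The zero-inflated version treats the data as a mixture: some cases are always zero, and the rest follow an ordinary Poisson.

```python
import math

def poisson_pmf(k, rate):
    """Probability of observing count k under a Poisson with the given rate."""
    return math.exp(-rate) * rate**k / math.factorial(k)

def zip_pmf(k, rate, p_always_zero):
    """Zero-inflated Poisson: a point mass at zero mixed with a Poisson.

    p_always_zero is the (hypothetical) share of cases that can only be zero.
    """
    structural_zero = p_always_zero if k == 0 else 0.0
    return structural_zero + (1.0 - p_always_zero) * poisson_pmf(k, rate)

rate, p_always_zero = 2.0, 0.3  # hypothetical values for illustration
for k in range(4):
    print(f"k = {k}: Poisson = {poisson_pmf(k, rate):.3f}, "
          f"zero-inflated = {zip_pmf(k, rate, p_always_zero):.3f}")
```

With these values the zero-inflated model assigns far more probability to a count of zero than the plain Poisson does, which is the pattern such models are designed to capture.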
Starting June 11, we are offering this seminar as a 4-day synchronous*, livestream workshop held via the free video-conferencing software Zoom. Each day will consist of two lecture sessions which include hands-on exercises, separated by a 1-hour break. You are encouraged to join the lecture live, but will have the opportunity to view the recorded session later in the day if you are unable to attend at the scheduled time.
*We understand that finding time to participate in livestream courses can be difficult. If you prefer, you may take all or part of the course asynchronously. The video recordings will be made available within 24 hours of each session and will be accessible for four weeks after the seminar, meaning that you will get all of the class content and discussions even if you cannot participate synchronously.
Closed captioning is available for all live and recorded sessions. Live captions can be translated to a variety of languages including Spanish, Korean, and Italian. For more information, click here.
Computing
The vast majority of what you will learn in this seminar can be applied in any software package. Resources (e.g. template code and examples) will be provided to all participants for conducting analyses in Stata and R. As the learning curve is steeper for R than for Stata, those unfamiliar with either are encouraged to use Stata for the seminar.
Stata
The lecture slides are accompanied by a full set of replication files; the replication files use Stata. To replicate the instructor’s examples, you should have Stata already installed on your computer when the course begins. For Stata users, version 18 will be used for the examples, but the exercises can also be done with versions 14-17.
To follow along with the course exercises, you should be able to perform basic data manipulation and analyses in Stata. For users new to Stata, an “Introduction to Stata” guide covering the basics of getting started will be provided before the seminar begins.
If you’d like to familiarize yourself with Stata basics before the seminar begins, we recommend following along with a “getting started” video like the one here.
Seminar participants who are not yet ready to purchase Stata could take advantage of StataCorp’s 30-day software return policy.
R
All methods taught in the seminar can also be done using R.
The seminar exercises can be done using R; template code will be provided. Additional resources to replicate the primary methods covered in the class using R will also be provided.
To participate in the hands-on exercises using R, you are strongly encouraged to use a computer with the most recent version of R installed. Participants are also encouraged to download and install RStudio, a front-end for R that makes it easier to work with. This software is free and available for Windows, Mac, and Linux platforms.
If you’d like to use R for this course but don’t yet have much experience with that package, here are some excellent on-line resources for building your R skills. You may want to consider taking a short introductory seminar on R.
Who should register?
If you need to analyze categorical outcome data (i.e. binary, ordinal, nominal, or count dependent variables) and have a basic statistical background and familiarity with regression, this seminar is for you. The seminar is helpful for graduate students, applied researchers, faculty, and others who want to learn these methods for the first time—but also for researchers who have some familiarity with the methods but want to learn the contemporary techniques now widely available for analyzing categorical data.
If you have a good working knowledge of linear regression, you are well-prepared for this seminar. The seminar assumes knowledge of linear regression at the level of Lewis-Beck’s Applied Regression. Those wanting a refresher in regression before beginning the seminar are encouraged to read Applied Regression, a short (120-page) and accessible overview of regression modeling.
If you are interested in this topic, check out our Distinguished Speaker seminar “Ordinal Regression” taught by Frank Harrell on May 29.
Seminar outline
Day 1
- Why can’t I use OLS for all dependent variables?
- Nonlinear effects, interaction effects, and nonlinear interaction effects
- Binary dependent variables: logit and probit models
- Take-home data analysis assignment #1 (optional): binary DV models
Day 2
- Interpreting categorical dependent variable models: coefficients, multiplicative effects, predictions, marginal effects, and visualizations
- Count dependent variables: Poisson and negative binomial models
- Zero-inflated count models
Day 3
- Nominal dependent variables: multinomial logit models
- Ordinal models: ordinal logit and probit, generalized ordered logit models
- Take-home data analysis assignment #2 (optional): nominal and ordinal DV models
Day 4
- Interaction / moderation for categorical models
- Comparing predictions and effects across categorical models (e.g. mediation)
- Absolute and comparative model fit for categorical models
- Model diagnostics for categorical models
Payment instructions
The fee of $995 includes all course materials.
PayPal and all major credit cards are accepted.
Our Tax ID number is 26-4576270.