On November 1, 2023, Professor William H. Greene will present a Distinguished Speaker seminar on “Econometric Analysis of Discrete Choices.” In this post, Professor Trenton Mize, who teaches several courses for Statistical Horizons, explains what discrete choice analysis is all about and why many researchers have found it to be such a useful tool.
What affects the choices we make? In most situations, our choices are a product of both our individual preferences and the attributes of the options we must choose from. Our eventual choices reveal useful details about the importance of various factors. For example, to understand the last election we might examine how individual characteristics of voters (e.g., their gender, age, or party affiliation) correlate with vote choice. Contextual-level factors (e.g., where someone lives or how the economy is doing) certainly matter too. But we should also consider how aspects of the candidates themselves (e.g., their experience, policy positions, or party affiliation) affect the votes they did or did not receive. Finally, who else was on the ballot? That is, what were the choices available?
My goal in this post is to introduce discrete choice analysis to researchers who may be unfamiliar with these models and/or have never considered them for their work, and to provide several examples of applications in the health and social sciences. Discrete choice analysis allows us to expand our models to include characteristics of the choosers (and their contexts), characteristics of the choice options, interactions between chooser characteristics and option characteristics and, finally, the makeup of the choice set itself. This expansion can provide a much richer and more realistic understanding of the processes by which decisions are made.
Potential Applications of Discrete Choice Analysis
Discrete choice analysis originated in economics and marketing to model consumer behavior. However, it also has great potential for modeling many phenomena in the health and the social sciences. Here are a few examples:
- Applications to health science. When designing a diet, discrete choice analysis could be used to determine which aspects of a diet people are not willing to give up and which they are willing to change in return for some wanted outcome (e.g., weight loss). Further, we can see what individual-level (e.g., age or current weight) and contextual-level (e.g., where someone lives) factors influence these choices. On top of all of this, we can—and should—account for the differential choices available to different people (a point key to medical sociology’s health lifestyle theory).
- Applications to labor market studies. Audit and hiring studies are popular methods for studying which characteristics of job applicants are most influential for who gets hired, and to determine whether certain groups face discrimination. A discrete choice experiment could be used in the hiring context to determine which characteristics of workers are deemed essential, vs. those seen as optional, vs. those seen as nonstarters. Additionally, one can examine how the importance of those characteristics varies across different types of employers.
- Applications to environmental science. Making choices that are good for the environment can involve tradeoffs. For example, you may really like a car that gets poor gas mileage, but also be willing to consider an alternative like an electric car. Contextual factors, like the widespread availability of electric car chargers where you live, as well as option-specific factors like the range of the electric car, will affect your eventual choice.
- Applications to education. Many students who apply to college apply to multiple places. How do they decide on where to apply given so many options? How do they decide where to attend if they are accepted to multiple colleges? How important is location, class size, national ranking, tuition cost, etc.?
- Applications to political science. What characteristics of a politician predict success? What limitations on a candidate’s resume are people willing to overlook? And what characteristics do they think are necessary to earn their vote? Finally, how is all of this dependent on the context in which a candidate is running—and who the alternatives on the ballot are?
- Applications to urban planning. Space planning always involves tradeoffs. Discrete choice analysis could help reveal whether residents (or potential residents) of a city prioritize things like proximity to schools or proximity to restaurants and nightlife. Further, how do individual residents prioritize tradeoffs differently, such as differing preferences of young vs. old residents?
The types of decisions described above are all strong candidates for discrete choice analysis. While many of these models will look quite similar to other categorical data analysis models (e.g., binary logit/probit or multinomial logit), the discrete choice framework leads to useful and important extensions that more fully reflect the way choices are made.
Most of the statistical underpinnings of discrete choice models are shared with models for categorical dependent variables, such as those you would learn in a class on categorical data analysis or on generalized linear models. One important difference is the theoretical underpinning: discrete choice models are usually motivated by microeconomic theory, which assumes that individual choices help reveal preferences and acceptable alternatives or tradeoffs. Sometimes, this leads to a model and analysis that will look familiar to those outside of economics, such as a binary logit model of how different factors affect consumers’ choices between two options. However, discrete choice analysis includes many extensions beyond these situations that build choice architecture into the models in important ways.
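One common way to formalize this motivation is the random utility framework. The sketch below is a standard textbook formulation rather than anything specific to this post. The symbols (U, V, x, z, β, γ, and the choice set C_i) are notation introduced here for illustration, and the closed-form probability in the last line assumes independent type I extreme value errors.

```latex
\begin{align}
U_{ij} &= V_{ij} + \varepsilon_{ij},
  \qquad V_{ij} = x_i^\top \beta_j + z_{ij}^\top \gamma \\
y_i = j &\iff U_{ij} \ge U_{ik} \quad \text{for all } k \in C_i \\
\Pr(y_i = j) &= \frac{\exp(V_{ij})}{\sum_{k \in C_i} \exp(V_{ik})}
\end{align}
```

Here x_i holds characteristics of chooser i, z_ij holds attributes of option j as faced by chooser i, and C_i is the set of options available to that chooser. With only two options, the last line reduces to the familiar binary logit.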
Explaining Nominal Choices
Nominal outcomes offer a useful example of some unique aspects of discrete choice models. Consider a situation where you would like to know what predicts the choices someone made among a set of three or more unordered options. Most health and social scientists, myself included, immediately think of the multinomial logit model, which is the most popular option for modeling nominal dependent variables. And indeed, this is a good option for many of the types of data health and social scientists use. For example, almost all my work analyzes survey responses, trying to understand how individual and/or contextual-level factors influence the choices people make in response to a survey item (e.g., their opinion about parental leave). In these cases, all participants in the study are given the same choices; for instance, they are asked whether the father, mother, or both parents should take parental leave after the birth of a child. A multinomial logit model can be used to examine how factors such as the participant’s age or gender affect attitudes about who should take parental leave.
If you have learned about the multinomial logit model before, you have probably heard of the independence of irrelevant alternatives (IIA) assumption. The assumption is that the odds of choosing one category over another do not depend on which other categories are available, so the ability of the independent variables to distinguish between categories of the dependent variable would be the same even if participants had been given different choices. In many cases, this is implausible. For example, adding a category of “neither should take leave” could affect the dynamics among the other categories of the parental leave question mentioned above. However, such a hypothetical is rarely of direct interest in social science studies: we are usually fine accepting that results may differ slightly across survey instruments with different response options. For the work I do, I agree with Tutz (2021) that “In sociology, the objective is often to analyze the response behavior in questionnaires without assuming that relationships would be identical if response options differed. Then no alternative settings are needed, and the problem of irrelevant alternatives is of no relevance.”
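To make the IIA property concrete, here is a minimal sketch that computes multinomial logit choice probabilities for the parental leave item, with and without a fourth “neither” option. The utility values are made-up numbers chosen only for illustration, and the function name is hypothetical.

```python
import math


def choice_probabilities(utilities):
    """Multinomial logit probabilities from a dict of option -> systematic utility."""
    max_u = max(utilities.values())  # subtract the max for numerical stability
    exp_u = {option: math.exp(u - max_u) for option, u in utilities.items()}
    total = sum(exp_u.values())
    return {option: e / total for option, e in exp_u.items()}


# Hypothetical utilities for one respondent (illustrative values, not estimates).
three_options = {"father": 0.2, "mother": 1.0, "both": 1.5}
four_options = dict(three_options, neither=0.8)

p3 = choice_probabilities(three_options)
p4 = choice_probabilities(four_options)

# Under the model, the odds of "mother" vs. "father" depend only on those two utilities.
odds_three = p3["mother"] / p3["father"]
odds_four = p4["mother"] / p4["father"]

print("P(mother), 3 options:", round(p3["mother"], 3))
print("P(mother), 4 options:", round(p4["mother"], 3))
print("Odds unchanged:", math.isclose(odds_three, odds_four))
```

Adding the fourth option lowers every individual probability, but the model forces the ratio between any two of the original options to stay fixed. That fixed ratio is exactly what IIA asserts, and it is the part that can be implausible in real choice settings.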
However, there are many cases where IIA really does matter, and we need models that consider individual-level choice architecture. I don’t think of this as a nuisance, but instead as an opportunity, and a place where discrete choice analysis shines. For example, if we want to know which grocery store or supermarket someone chooses to shop at, we have to recognize that people live in different places, each with its own options for groceries. I used to live in Georgia, where I would always choose to shop at Publix, but unfortunately there are no Publix stores where I now live in Indiana (this is a point of sadness). To understand how individual (e.g., gender, income, age), contextual (e.g., county size, urbanicity, public transport), and option-specific (e.g., aspects of the stores themselves) factors affect the choices people make about where to buy groceries, we need models that can reflect the different choices each person has available to them. Discrete choice models can incorporate this important complication.
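The idea of person-specific choice sets can be sketched with a conditional logit calculation in which each shopper's probabilities are computed only over the stores available to them. Everything numeric below is an assumption for illustration: the store attributes, the coefficients, and the distances are invented, “Store A” through “Store C” are placeholder names, and in a real analysis the coefficients would be estimated from data.

```python
import math

# Hypothetical option-specific attributes (made-up numbers, not real data).
STORE_ATTRIBUTES = {
    "Publix": {"price_index": 1.10, "quality": 0.90},
    "Store A": {"price_index": 1.00, "quality": 0.70},
    "Store B": {"price_index": 0.90, "quality": 0.50},
    "Store C": {"price_index": 0.95, "quality": 0.65},
}

# Hypothetical coefficients; a real analysis would estimate these.
COEFFICIENTS = {"price_index": -2.0, "quality": 1.5, "distance_km": -0.3}


def store_choice_probabilities(distances_km):
    """Conditional logit probabilities over one person's own choice set.

    `distances_km` maps each store available to this person to its distance
    from home. Stores that are not listed are not in the choice set.
    """
    utilities = {}
    for store, distance in distances_km.items():
        attrs = STORE_ATTRIBUTES[store]
        utilities[store] = (
            COEFFICIENTS["price_index"] * attrs["price_index"]
            + COEFFICIENTS["quality"] * attrs["quality"]
            + COEFFICIENTS["distance_km"] * distance
        )
    max_u = max(utilities.values())  # subtract the max for numerical stability
    exp_u = {store: math.exp(u - max_u) for store, u in utilities.items()}
    total = sum(exp_u.values())
    return {store: e / total for store, e in exp_u.items()}


# Two shoppers with different available stores and different distances.
georgia_shopper = store_choice_probabilities(
    {"Publix": 2.0, "Store A": 1.0, "Store B": 4.0}
)
indiana_shopper = store_choice_probabilities(
    {"Store A": 3.0, "Store B": 1.5, "Store C": 2.5}
)

for label, probs in [("Georgia", georgia_shopper), ("Indiana", indiana_shopper)]:
    print(label, {store: round(p, 3) for store, p in probs.items()})
```

Because the denominator sums only over the stores each person can actually reach, the same coefficients produce a proper probability distribution for every shopper, even though no two shoppers need to face the same options.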
Additional Applications of Discrete Choice
Most of my examples have focused on observational data. However, these ideas are also instrumental for a class of experiments called discrete choice experiments. These are similar in many ways to the types of experiments that are booming in popularity in the social sciences, such as factorial and conjoint experiments. With a factorial experimental design, we can include multiple factors and examine how they impact choices both individually and in combination with other factors. With a conjoint design, we can force individuals to choose between multiple options (often two), which helps reveal even subtle preferences. Both types of designs can be used individually or in combination to help understand discrete choices. Further, we can incorporate aspects of choice architecture into the experimental design by providing different choices to different participants, which better reflects real-world situations where we rarely have every possible choice as a realistic option.
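As a rough sketch of how such a design might be set up, the code below builds a full factorial set of hypothetical candidate profiles and then draws a forced-choice task for each participant, varying the number of options shown. The attribute names, levels, and function names are invented for illustration and do not come from any particular study.

```python
import itertools
import random

# Hypothetical attributes and levels for candidate profiles (illustrative only).
ATTRIBUTES = {
    "experience": ["none", "local office", "national office"],
    "party": ["same as respondent", "different from respondent"],
    "policy_position": ["moderate", "strongly partisan"],
}


def all_profiles():
    """Full factorial design: every combination of attribute levels."""
    names = list(ATTRIBUTES)
    return [
        dict(zip(names, levels))
        for levels in itertools.product(*ATTRIBUTES.values())
    ]


def make_choice_task(rng, n_options=2):
    """Draw n_options distinct profiles for one forced-choice task."""
    return rng.sample(all_profiles(), n_options)


rng = random.Random(2023)  # fixed seed so the design can be reproduced

# Different participants can be shown choice sets of different sizes.
set_sizes = [2, 3, 2]
tasks = [make_choice_task(rng, n_options=size) for size in set_sizes]

for participant, task in enumerate(tasks, start=1):
    print(f"Participant {participant} chooses among {len(task)} profiles:")
    for profile in task:
        print("  ", profile)
```

Sampling without replacement keeps the profiles within a task distinct, so every choice involves a real tradeoff on at least one attribute. An applied design would usually add constraints, such as excluding implausible combinations or balancing how often each level appears.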
Want to Learn More?
If you want to learn more about discrete choice analysis, consider attending the upcoming seminar by William H. Greene as part of Statistical Horizons’ Distinguished Speaker Series. For an overview of discrete choice models, I highly recommend Greene’s 2009 chapter. And finally, if you are interested in using these types of models in your work, they are now widely available in popular statistical software programs such as Stata and R.
References
Cockerham, W. C. (2005). Health lifestyle theory and the convergence of agency and structure. Journal of Health and Social Behavior, 46(1), 51-67.
de Bekker‐Grob, E. W., Ryan, M., & Gerard, K. (2012). Discrete choice experiments in health economics: a review of the literature. Health Economics, 21(2), 145-172.
Greene, W. (2009). Discrete choice modeling. In Palgrave Handbook of Econometrics: Volume 2: Applied Econometrics (pp. 473-556). London: Palgrave Macmillan UK.
Mize, T. D., Kaufman, G., & Petts, R. J. (2021). Visualizing shifts in gendered parenting attitudes during COVID-19. Socius, 7, 23780231211013128.
Mize, T. D., & Manago, B. (2022). The past, present, and future of experimental methods in the social sciences. Social Science Research, 108, 102799.
Tutz, G. (2021). Uncertain choices: the heterogeneous multinomial logit model. Sociological Methodology, 51(1), 86-111.