Analysis of Cost Data
A 3-Day Livestream Seminar Taught by
Henry Glick
11:00am-3:00pm ET (New York time): Live lecture via Zoom
5:00pm-6:00pm ET: Live lab session via Zoom (Thursday and Friday only)
Data on costs typically have distributions that differ dramatically from the normal distribution. They are usually highly skewed to the right, with long heavy tails and high kurtosis, often with a preponderance of zeros. These characteristics often lead to violations of assumptions underlying typical univariate and multivariable tests of means such as t-tests and multiple regression analysis (OLS). Both appropriate and inappropriate methods have been proposed to overcome these violations.
This seminar assesses a number of these methods for analyzing costs and enables researchers to evaluate which methods may be more or less appropriate for the analysis of cost data. We will cover:
- Univariate statistics
- OLS/Log OLS
- Generalized linear models
- Generalized estimating equations and extended estimating equations
- Models and methods for addressing missing data
- Constructing the cost outcome
- Addressing cost over time
- Sample size and power
The style of instruction is designed for participants coming from a variety of different subject-matter backgrounds. Examples will be presented using the Stata software package.
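As a rough illustration of the distributional features described above, the following sketch simulates cost data with a share of zeros and a long right tail, then summarizes it. It is written in Python using only the standard library and is not part of the seminar materials, which use Stata. The zero share, the lognormal parameters, and the function names are all hypothetical choices made for the example.

```python
import math
import random
import statistics


def simulate_costs(n, p_zero=0.3, mu=8.0, sigma=1.2, seed=12345):
    """Simulate hypothetical costs: zeros with probability p_zero,
    otherwise a lognormal draw (all parameters are illustrative assumptions)."""
    rng = random.Random(seed)
    return [
        0.0 if rng.random() < p_zero else rng.lognormvariate(mu, sigma)
        for _ in range(n)
    ]


def standardized_moment(xs, power):
    """Average of ((x - mean) / sd) ** power, using the population sd."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** power for x in xs) / len(xs)


costs = simulate_costs(10_000)
positive = [c for c in costs if c > 0]

summary = {
    "share_zero": 1 - len(positive) / len(costs),
    "mean": statistics.fmean(costs),
    "median": statistics.median(costs),
    "skewness": standardized_moment(costs, 3),   # 0 for a normal distribution
    "kurtosis": standardized_moment(costs, 4),   # 3 for a normal distribution
    # Among positive costs, exponentiating the mean of the logs gives the
    # geometric mean, which is not the arithmetic mean.
    "mean_positive": statistics.fmean(positive),
    "geometric_mean_positive": math.exp(
        statistics.fmean(math.log(c) for c in positive)
    ),
}

for name, value in summary.items():
    print(f"{name}: {value:,.2f}")
```

In output of this kind the mean sits well above the median, and skewness and kurtosis are far from their values under normality. The gap between the arithmetic and geometric means of the positive costs hints at why results from log-transformed costs cannot simply be exponentiated to recover mean cost.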
Starting February 24, we are offering this seminar as a 3-day synchronous*, livestream workshop. Each day will consist of a 4-hour live lecture held via the free video-conferencing software Zoom. You are encouraged to join the lecture live, but will have the opportunity to view the recorded session later in the day if you are unable to attend at the scheduled time.
Each day will include a hands-on exercise to be completed on your own after the lecture session is over. An additional lab session will be held Thursday and Friday afternoons, where you can review the exercise results with the instructor and ask any questions.
*We understand that scheduling is difficult during this unpredictable time. If you prefer, you may take all or part of the course asynchronously. The video recordings will be made available within 24 hours of each session and will be accessible for four weeks after the seminar, meaning that you will get all of the class content and discussions even if you cannot participate synchronously.
Closed captioning is available for all live and recorded sessions.
Computing
Stata will be used for all worked examples, and Dr. Glick’s Stata programs will be distributed for a number of the topics discussed. Data used in the exercises will be made available in other formats (e.g., SAS), but no support will be available for programming in other software packages. You will still benefit greatly from the instruction, the comprehensive set of slides, and software syntax that you can apply later. If you wish to try the exercises, you should use a computer with the basic Stata package installed. For all but one or two methods (e.g., extended estimating equations), add-ons are not needed.
If you’d like to familiarize yourself with Stata basics before the seminar begins, we recommend following along with a “getting started” video like the one here.
Who should register?
The course will benefit applied researchers, analysts, and students interested in enhancing their understanding of cost analysis and developing their application skills. Participants are assumed to have been exposed to introductory parametric statistics, such as that offered through an in-depth workshop or a typical university course.
Seminar outline
Introduction to cost analysis
What cost statistic should we estimate?
- Welfare economic principles
Basic principles and univariate analysis
- Role of:
- Parametric tests of difference in cost
- Nonparametric tests of other characteristics of cost distribution
- Trade-offs between bias and skewness
- Transformation of data so that parametric tests’ assumptions are met
- Problems with analysis of log (and other) transformations
- Tests of sample mean that avoid parametric assumptions
Basics of cost data
- Generating the cost outcome
- Addressing costs incurred at different times
- Inflation
- Discounting
Multivariable models for cost analysis
OLS/log OLS
Generalized linear models
- Role of link function
- Difference between log OLS and log link
- Role of the family
- Diagnosing appropriate links and families
- Pregibon link test, Pearson correlation test, Modified Hosmer and Lemeshow test, Modified Park test, AIC, BIC
- Observed vs predicted mean costs
- Inconvenient truths
Other multivariable approaches
- GEE and EEE
Analysis in the face of missing cost data
Missing data methods
- Naïve methods
- GLM with inverse probability weights
- Lin ’97, Carides regression method, multiple imputation
- Population average maximum likelihood longitudinal panel data analyses
Sample size and power for cost and cost-effectiveness analysis
Payment information
The fee of $895 includes all course materials.
PayPal and all major credit cards are accepted.
Our Tax ID number is 26-4576270.