A 2-day seminar taught by Kieran Healy, Ph.D.
To see a sample of the course materials, click here.
Check out the draft version of Kieran Healy’s upcoming book, “Data Visualization for Social Science,” here.
The effective use of graphs and charts is an important way to explore data for yourself and to communicate your ideas and results to others. Being able to produce effective plots from data is also the best way to develop an eye for reading and understanding visualizations made by others, whether presented in academia, business, policy, or the media.
This two-day seminar provides an intensive, hands-on introduction to the principles and practice of data visualization. We will begin with an overview of some basic principles, focusing not just on the aesthetic aspects of good plots but on how their effectiveness is rooted in the way we perceive properties like length, absolute and relative size, orientation, shape, and color. Students will learn how to produce and refine plots using ggplot2, a powerful, versatile, and widely used visualization library for R. It implements a “grammar of graphics” that gives us a coherent way to produce visualizations by expressing relationships between the attributes of data and their graphical representation.
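As a rough illustration of what "expressing relationships between the attributes of data and their graphical representation" can look like, here is a minimal sketch. It assumes the `mpg` sample dataset that ships with ggplot2; the choice of dataset and variables is ours for illustration and is not taken from the seminar materials.

```r
library(ggplot2)

# mpg is a sample dataset bundled with ggplot2 (an assumption for this
# sketch, not necessarily a dataset used in the seminar).
# aes() declares the mapping from data attributes to graphical properties:
#   engine displacement -> x position
#   highway mileage     -> y position
#   vehicle class       -> point color
p <- ggplot(data = mpg,
            mapping = aes(x = displ, y = hwy, color = class)) +
  geom_point()  # draw the mapped data as points

print(p)
```

Changing only the mapping (for example, `color = drv`) would change the plot without touching the data or the geom, which is the kind of separation the grammar is meant to provide.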
Through a series of worked examples, students will learn how to build plots piece by piece, beginning with summaries of single variables and moving on to more complex graphics. Topics covered include plotting continuous and categorical variables; layering information on graphics; faceting grouped data to produce effective “small multiple” plots; transforming data to easily produce visual summaries on the graph such as trend lines, linear fits, error ranges, and boxplots; and creating maps, together with simpler alternatives to maps for country- or state-level data.
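Building a plot piece by piece might look something like the sketch below, which layers a linear fit over raw points and then facets the result into small multiples. It again assumes ggplot2's bundled `mpg` dataset; the variables and labels are illustrative choices, not the seminar's own worked examples.

```r
library(ggplot2)

# Start from a data-to-aesthetic mapping, then add one piece at a time.
p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(alpha = 0.5) +                      # layer 1: the raw observations
  geom_smooth(method = "lm", formula = y ~ x) +  # layer 2: linear fit with its default confidence band
  facet_wrap(~ drv) +                            # small multiples: one panel per drive train
  labs(x = "Engine displacement (litres)",
       y = "Highway miles per gallon")

print(p)
```

Each `+` adds a component, so any single line can be removed or swapped to see what that component contributes.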
We will also cover cases where we are not working directly with a dataset but rather with estimates from a statistical model. Using these tools, we will then explore the practical process of refining plots to accomplish common tasks such as highlighting key features of the data, labeling particular points, annotating plots, and changing their overall appearance. Finally, we will examine some strategies for presenting graphical results in different formats (such as in print, online, or in slides) and to different sorts of audiences.
At the end of the course, participants will:
– Understand the basic principles behind effective data visualization
– Have a practical sense of why some graphs and figures work well while others may fail to inform or actively mislead
– Know how to create a wide range of plots in R using ggplot2
– Know how to refine plots for effective presentation
To participate in the hands-on exercises, you are strongly encouraged to bring a laptop computer with the most recent version of R installed, together with the tidyverse package. Participants are also encouraged to download and install RStudio, a front end for R that makes it easier to work with. This software is free and available for Windows, Mac, and Linux platforms.
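One common way to set up the tidyverse from within R is sketched below. This is a general-purpose setup snippet rather than the seminar's official installation instructions; it assumes a working internet connection and a default CRAN mirror.

```r
# Install the tidyverse collection from CRAN only if it is not already present.
if (!requireNamespace("tidyverse", quietly = TRUE)) {
  install.packages("tidyverse")
}

# Attach the core tidyverse packages, which include ggplot2.
library(tidyverse)

# Report the installed ggplot2 version as a quick check that setup worked.
packageVersion("ggplot2")
```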
WHO SHOULD ATTEND?
This course is for anyone who wants to learn how to produce, refine, and present effective visualizations generated from datasets, summary tables, or the output of statistical models.
LOCATION, FORMAT, AND MATERIALS
The class will meet from 9 am to 5 pm each day, with a 1-hour lunch break from 12 pm to 1 pm, at Temple University Center City, 1515 Market Street, Philadelphia, PA 19103.
Participants receive a bound manual containing detailed lecture notes (with equations and graphics), examples of computer printout, and many other useful features. This book frees participants from the distracting task of note taking.
REGISTRATION AND LODGING
The fee of $995 includes all course materials.
If you cancel your registration at least two weeks before the course is scheduled to begin, you are entitled to a full refund (minus a processing fee of $50).
Lodging Reservation Instructions
A block of guest rooms has been reserved at the Club Quarters Hotel, 1628 Chestnut Street, Philadelphia, PA at a special rate of $134. In order to make reservations, call 203-905-2100 during business hours and identify yourself by using group code STAT30 or click here. For guaranteed rate and availability, you must reserve your room no later than Monday, October 30.
If you make reservations after the cut-off date, ask for the Statistical Horizons room rate (do not use the code) and they will try to accommodate your request.
1. Basic principles of data visualization
– Why look at data?
– Beyond “good taste” in graphics
– Object perception and misperception
– Encoding data graphically
2. Using ggplot2 and R
– How to think about R
– How to think about ggplot
3. Understanding the grammar of graphics
– Mapping data values to plot aesthetics
– Building plots layer by layer
4. Plots of one, two, or more continuous or categorical variables
5. Grouped data, faceting, and small multiples
6. Plots of estimates and effects
– Plotting tables of results
– Plotting directly from models
7. Plots in space and time
– Drawing maps, and their alternatives
– Animating plots
8. Refining plots for presentation
– Adding annotations or highlighting features
– Keys and labels
– Controlling overall appearance with themes
– Redrawing bad plots
“Great introduction to the conceptual framework needed to produce many standard graphics in my field. I brought my own data to class and created very sophisticated plots in a matter of hours.”
Chris Delcher, University of Florida
“This course was extremely beneficial in introducing many of the most common graphs produced in R. I now feel able to produce high quality graphics for articles and presentations.”