Data Visualization

A 4-Day Remote Seminar Taught by Kieran Healy, Ph.D.


To see a sample of the course materials, click here.


The effective use of graphs and charts is an important way to explore data for yourself and to communicate your ideas and results to others. Being able to produce effective plots from data is also the best way to develop an eye for reading and understanding visualizations made by others, whether presented in academia, business, policy, or the media.

This seminar provides an intensive, hands-on introduction to the principles and practice of data visualization. We will begin with an overview of some basic principles. We will focus not just on the aesthetic aspects of good plots, but on how their effectiveness is rooted in the way we perceive properties like length, absolute and relative size, orientation, shape, and color. Students will learn how to produce and refine plots using ggplot2, a powerful, versatile, and widely used visualization library for R. It implements a “grammar of graphics” that gives us a coherent way to produce visualizations by expressing relationships between the attributes of data and their graphical representation.
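As a rough illustration of that idea (a minimal sketch, not taken from the seminar materials), the R code below maps columns of a dataset to visual properties of a plot and then adds layers one at a time. It assumes the ggplot2 package is installed and uses the mpg dataset that ships with it. The variables and labels were chosen purely for illustration.

```r
library(ggplot2)

# Assumption: the built-in `mpg` dataset stands in for "your data".
# aes() declares the mapping from data columns to plot aesthetics:
# engine displacement goes to the x position, highway mileage to the y position.
p <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  # First layer: one point per car, with vehicle class mapped to color.
  geom_point(mapping = aes(color = class)) +
  # Second layer: a single linear fit with its error range, drawn over all points.
  geom_smooth(method = "lm", formula = y ~ x) +
  # Human-readable labels for the axes and the color legend.
  labs(
    x = "Engine displacement (litres)",
    y = "Highway miles per gallon",
    color = "Vehicle class"
  )

# Printing the object draws the plot.
print(p)
```

Each `+` adds a component to the plot specification, so a layer can be changed or removed without touching the rest of the plot.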

Starting August 4, we are offering this seminar as a 4-day synchronous*, remote workshop for the first time. Each day will consist of a 3-hour, live morning lecture held via the free video-conferencing software Zoom. Participants are encouraged to join the lecture live, but will have the opportunity to view the recorded session later in the day if they are unable to attend at the scheduled time. Each lecture session will conclude with a hands-on exercise reviewing the content covered, to be completed on one’s own that afternoon. A final session will be held each evening as an “office hour”, where participants can review the exercise results with the instructor and ask any questions.

*We understand that scheduling is difficult during this unpredictable time. If you prefer, you may take all or part of the course asynchronously. The video recordings will be made available within 24 hours of each session, meaning that you will get all of the class discussion and exercise solutions even if you cannot participate synchronously.

MORE DETAILS ABOUT COURSE CONTENT


Through a series of worked examples, students will learn how to build plots piece by piece, beginning with summaries of single variables and moving on to more complex graphics. Topics covered include plotting continuous and categorical variables; layering information on graphics; faceting grouped data to produce effective “small multiple” plots; transforming data to easily produce visual summaries on the graph such as trend lines, linear fits, error ranges, and boxplots; and creating maps, together with simpler alternatives to maps for country- or state-level data.

We will also cover cases where we are not working directly with a dataset but rather with estimates from a statistical model. Using these tools, we will then explore the practical process of refining plots to accomplish common tasks such as highlighting key features of the data, labeling particular points, annotating plots, and changing their overall appearance. Finally, we will examine some strategies for presenting graphical results in different formats (such as in print, online, or in slides) and to different sorts of audiences.

At the end of the course, participants will:
– Understand the basic principles behind effective data visualization
– Have a practical sense of why some graphs and figures work well while others may fail to inform or actively mislead
– Know how to create a wide range of plots in R using ggplot2
– Know how to refine plots for effective presentation


Computing

This remote seminar is held via Zoom, a free video conferencing application. Instructions for joining a session via Zoom are available here. Before the seminar begins, participants will receive an email with the meeting code and password needed to join.

To participate in the hands-on exercises, you are strongly encouraged to use a laptop computer with the most recent version of R installed, together with the tidyverse package, which includes ggplot2. Participants are also encouraged to download and install RStudio, a front-end for R that makes it easier to work with. This software is free and available for Windows, Mac, and Linux platforms.

If you’d like to take this course but are concerned that you don’t know enough R, there are excellent online resources for learning the basics. Here are our recommendations.


Who Should Register?

This course is for anyone who wants to learn how to produce, refine, and present effective visualizations generated from datasets, summary tables, or the output of statistical models.

Some familiarity with the R programming language is helpful.


SEMINAR OUTLINE

1. Basic principles of data visualization
     – Why look at data?
     – Beyond “good taste” in graphics
     – Object perception and misperception
     – Encoding data graphically
2. Using ggplot2 and R
     – How to think about R
     – How to think about ggplot
3. Understanding the grammar of graphics
     – Mapping data values to plot aesthetics
     – Building plots layer by layer
4. Plots of one, two, or more continuous or categorical variables
5. Grouped data, faceting, and small multiples
6. Plots of estimates and effects
     – Plotting tables of results
     – Plotting directly from models
7. Plots in space and time
     – Drawing maps, and their alternatives
     – Animating plots
8. Refining plots for presentation
     – Adding annotations or highlighting features
     – Keys and labels
     – Controlling overall appearance with themes
     – Redrawing bad plots


Reviews of Data Visualization

“This was an excellent, practical course for learning state of the art visualization methods in R. Dr. Healy also explained the newest data manipulation techniques, which will drastically reduce the amount of time I spend organizing and cleaning complex datasets.”
  Ella Foster-Molina, Swarthmore College

“So many outstanding examples! Dr. Healy did a terrific job. The coverage was comprehensive, from conceptual knowledge to the ‘nitty gritty’ of data viz.”
  Anonymous

“Dr. Healy is a very good instructor. He operates at the correct pace with an understanding that many of us are less experienced with R. He has been helpful with each of my questions.”
  Anonymous

“This is an excellent course. Dr. Healy is an excellent teacher, knowledgeable about different types of data visualization. He is very helpful in explaining smaller details. I generally take 5-7 workshops per year. This is the best workshop I’ve had so far. Learning is not incremental but substantial.”
  Towhidul Islam, University of Guelph

“Great introduction to the conceptual framework needed to produce many standard graphics in my field. I brought my own data to class and created very sophisticated plots in a matter of hours.” 
  Chris Delcher, University of Florida