Text as Data* - Online Course

A 3-Day Livestream Seminar Taught by

Amber Boydstun and Cory Struthers
Course Dates: Ask about upcoming dates
Schedule:

10:00am-12:30pm (convert to your local time)
1:30pm-3:30pm

Text is all around us: from archived court documents to this morning’s social media posts, from transcripts of political ads to terrorist manifestos. Text-as-data methods allow us to use this text to measure and discover phenomena that may be otherwise hard or impossible to represent quantitatively, such as ideological positions of court documents and emotional sentiment in manifestos.

There has never been a more exciting time to learn text-as-data methods. Digital advances have made available text content that even a few years ago would have been difficult to collect, and computational text-as-data methods have advanced just as fast. However, because there is now so much text data to explore and a dizzying array of accessible text-as-data tools to apply, understanding which methods are appropriate for which contexts is critically important.

This course will provide an introduction to text-as-data methods, including how they work, how they can be applied, and common pitfalls to avoid. We will focus on linking concepts to measurement through textual data. Topics covered include: manual content analysis; text collection and pre-processing; advanced keyword queries and frequencies; dictionary analysis (including sentiment analysis); text similarity and reuse; topic modeling; and supervised machine learning.

Starting April 27, we are offering this seminar as a 3-day synchronous*, livestream workshop held via the free video-conferencing software Zoom. Each day will consist of two lecture sessions that include hands-on exercises, separated by a 1-hour break. You are encouraged to join the lectures live, but you will have the opportunity to view the recorded sessions later in the day if you are unable to attend at the scheduled time.

*We understand that finding time to participate in livestream courses can be difficult. If you prefer, you may take all or part of the course asynchronously. The video recordings will be made available within 24 hours of each session and will be accessible for four weeks after the seminar, meaning that you will get all of the class content and discussions even if you cannot participate synchronously. 

Closed captioning is available for all live and recorded sessions.

More details about the course content

Computing

Who should register?

Seminar outline

Payment information