I am currently a fellow at Harvard's Kennedy School and Northeastern University. I am also an Assistant Professor of Social Science Informatics at the University of Iowa. My research uses novel quantitative, automated, and machine learning methods to analyze non-traditional data sources such as audio (or speech) data and video data. I use these to understand the causes and consequences of elite emotional expressions in a variety of institutional settings, with a particular emphasis on non-verbal cues, such as vocal pitch. More recently, I have also used audio and video analysis to explore issues related to descriptive representation and implicit gender/racial bias. Underlying each of these research agendas is a love for high performance computing and a genuine desire to make "big data" more accessible.
Words are spoken, not written. Most of my work using political texts emphasizes this point. For example, the word cloud you see above is from Democratic speeches delivered in the 111th and 112th U.S. House of Representatives. Unlike most word clouds, this one is weighted by the speaker's vocal inflections (e.g., amplitude and pitch) rather than by word frequency alone. Such an approach emphasizes the intersection between text and audio data. Although they are not shown here, I have made similar calculations using video-based measures.
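As a rough illustration of what weighting words by vocal inflection could look like, here is a minimal Python sketch. It is not the procedure behind the word cloud above. The function name `vocal_weights`, the input format (one word, pitch in Hz, and amplitude in dB per spoken token), and the scoring rule (pitch and amplitude each divided by the speech's mean, then multiplied) are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def vocal_weights(tokens):
    """Return a per-word weight summed over every time the word is spoken.

    `tokens` is a list of (word, pitch_hz, amplitude_db) tuples. This input
    format and the scoring rule below are hypothetical, chosen only to
    illustrate the idea of inflection-weighted word sizes.
    """
    mean_pitch = mean(pitch for _, pitch, _ in tokens)
    mean_amp = mean(amp for _, _, amp in tokens)

    weights = defaultdict(float)
    for word, pitch, amp in tokens:
        # A token spoken at average pitch and amplitude contributes 1.0;
        # louder or higher-pitched tokens contribute more.
        weights[word.lower()] += (pitch / mean_pitch) * (amp / mean_amp)
    return dict(weights)


# Toy input: "jobs" is spoken twice, once with raised pitch and amplitude.
example_tokens = [
    ("jobs", 220.0, 70.0),
    ("jobs", 180.0, 60.0),
    ("tax", 200.0, 65.0),
]
example_weights = vocal_weights(example_tokens)
```

In this toy input, "jobs" ends up with a larger weight than "tax" both because it is spoken more often and because one utterance is more emphatic. A plotting library could then scale each word's font size by its weight.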
Like you and me, political elites are emotional beings. Whether it is Joe Wilson (R-SC) yelling “You Lie!” at President Obama or the late Antonin Scalia raising his tone during oral arguments, many elites seem to wear their hearts on their sleeves. To study the causes and consequences of these emotional expressions, this research program introduces large-n audio analysis to political science. Using R, Python, and Perl, I scraped text and audio from over 7,000 floor speeches found in the C-SPAN Video Library and the House Video Archives. This project won the University of Illinois’ Kathleen L. Burkholder Prize for best dissertation in Political Science in 2013 or 2014.
My work with Maya Sen and Ryan Enos is a continuation of this project. Using audio data from over 240,000 questions and comments delivered during oral arguments between 1981 and 2014, we explore emotional expression on the Supreme Court. The figures to the left (not emotionally activated) and right (emotionally activated) are from this work. Both are samples of audio from Justice Scalia. Yellow and red areas indicate higher amplitude. The vocal pitch is shown at the bottom. We currently have one paper under review from this project.
We are surrounded by videos. Every minute, 500 hours of video are uploaded to YouTube. There are an estimated 30 million surveillance cameras in the United States shooting 4 billion hours of video a week. The C-SPAN Video Library has 221,898 hours of video and counting. Using computer vision techniques, I use these videos to understand behavior inside and outside Capitol Hill.
In my paper, "Using Motion Detection to Measure Social Polarization in the U.S. House of Representatives," I use motion detection to understand the degree to which Democrats and Republicans talk to one another on the House floor. Using 6,526 C-SPAN videos and OpenCV, I find that members of Congress are increasingly less likely to literally cross the aisle. This project has received grant support from C-SPAN, Amazon, and the University of Missouri.
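For readers unfamiliar with motion detection, the simplest version is frame differencing: compare two consecutive grayscale frames and count the pixels whose brightness changed by more than some threshold. The Python sketch below shows that general idea only; it is not the pipeline used in the paper. Plain nested lists stand in for video frames so that the example runs without any libraries, and the function name `motion_fraction` and the threshold of 25 are assumptions made for illustration.

```python
def motion_fraction(prev_frame, next_frame, threshold=25):
    """Return the share of pixels that changed between two frames.

    Each frame is a list of rows of grayscale values (0-255). The name,
    the input format, and the default threshold are hypothetical choices
    for this sketch, not details taken from the paper.
    """
    changed = 0
    total = 0
    for prev_row, next_row in zip(prev_frame, next_frame):
        for prev_pixel, next_pixel in zip(prev_row, next_row):
            total += 1
            # A large brightness change at a pixel is treated as motion.
            if abs(prev_pixel - next_pixel) > threshold:
                changed += 1
    return changed / total if total else 0.0


# Toy 2x4 frames: one column brightens, as if someone walked through it.
frame_a = [[10, 10, 10, 10],
           [10, 10, 10, 10]]
frame_b = [[10, 200, 10, 10],
           [10, 200, 10, 10]]
share_moving = motion_fraction(frame_a, frame_b)
```

On real video, the same comparison would normally be done on decoded frames with a computer vision library such as OpenCV, and restricted to a region of interest (for example, the area around the center aisle) rather than the whole frame.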