This week we look at the topics associated with what Americans report having read, seen or heard about Hillary Clinton and Donald Trump since July 11th. In past weeks we focused on the specific words that respondents use in recalling news about the candidates. We now shift the focus to a set of broader topics and themes reflected in respondents’ comments. Doing so offers a more comprehensive account of the content of respondents’ recollections about information related to Clinton and Trump.
The results highlight a central aspect of the 2016 campaign: information about Trump has varied in theme almost weekly over the campaign, from Russia to taxes to women’s issues and more. Information about Clinton has, in contrast, focused almost entirely on a single theme: email.
Trump and the Issue of the Day
Every week a new storyline about Trump emerges. During the summer, the topics that Americans remembered mirrored Trump’s campaign activities. They then shifted to the debates, issues concerning his tax returns, and issues related to his view of women.
This figure highlights the changing topics that Americans remember about Trump since July. The x-axis shows the date and the y-axis the fraction of responses that fall into a particular topic. The graphic is interactive; scrolling over the topic shows the topic name in the upper left corner.
Clinton and Email
What Americans most often remember seeing, hearing, or reading about Clinton is email. Her email problems have stuck with her throughout the campaign. Only twice since the Democratic convention has email not been the main topic mentioned by Americans. The first time was when she had pneumonia and fell ill at a 9/11 memorial service; the predominant topic shifted to health. The second came with the debates: only once they began did the dominant topic finally shift away from negative storylines related to email, the Clinton Foundation, and her health. Not surprisingly, the topic has shifted back to email given FBI Director James Comey’s recent announcement.
This figure highlights the changing topics that Americans remember about Clinton since July. The x-axis shows the date and the y-axis the fraction of responses that fall into a particular topic. The graphic is interactive; scrolling over the topic shows the topic name in the upper left corner.
Because of Clinton’s extended public life and the continual criticism she has received in several areas, the disclosure of her use of a private email server became a symbol of public concerns about her. Coupled with repeated events in which the server issue was highlighted, including congressional hearings, a stream of State Department disclosures of retrieved emails, and the FBI investigation, this topic has dominated what people remember about Clinton for just about the entire campaign.
Donald Trump is a political novice about whom relatively little was known before the campaign. As a result, any topic receiving news coverage or entering the public sphere was interesting and memorable. This also gave Trump more control of the issue space to which the public was exposed. The variety of such subjects is reflected in the variation of recalled topics throughout the campaign.
For more on the “Read, Seen, Heard” data, watch Stuart Soroka’s presentation “Read, Seen, Heard: A Text-Analytic Approach to Campaign Dynamics.”
Topic Generation Methods
A team at Gallup reviewed the daily data and from that created a set of broad topics that best summarized the words Americans were using when describing what they had read, seen or heard about the candidates. For each topic, the team identified a set of relevant words. Then, for each of these topic words, frequent pattern mining was used to identify other terms that frequently co-occurred with it. The Gallup team manually evaluated these candidate terms to determine whether they should be added to the relevant topic. Other frequent words were also generated and served as the basis for new topics when appropriate.
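The co-occurrence step can be illustrated with a small sketch. This is not the team's actual pipeline: it is a simplified stand-in for frequent pattern mining that only counts pairs (a seed word plus one other term), and the function name, the `min_support` threshold, and the sample responses are all hypothetical.

```python
from collections import Counter

def cooccurring_terms(responses, seed, min_support=2):
    """Return terms that appear alongside `seed` in at least `min_support` responses.

    Hypothetical helper: a pair-counting simplification of frequent pattern mining.
    """
    counts = Counter()
    for text in responses:
        words = set(text.lower().split())  # count each term once per response
        if seed in words:
            counts.update(words - {seed})
    return {word: n for word, n in counts.items() if n >= min_support}

# Made-up responses, for illustration only.
responses = [
    "email server investigation",
    "fbi email investigation",
    "email server",
    "pneumonia health",
]
candidates = cooccurring_terms(responses, "email")
```

In this toy example the candidates are "server" and "investigation"; in the process described above, a human reviewer would then decide whether such terms belong in the topic.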
Each word in a topic is given a weight based on how important the word is to the topic. This is computed as the word frequency-inverse topic frequency, a variant of term frequency-inverse document frequency (tf-idf). A high weight indicates a word that appears more frequently with words within the topic than outside the topic. To determine the topic of a response, a weighted sum of matching words is calculated for each topic. The topic with the highest score is the topic assigned to the response.
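A minimal sketch of this weighting and assignment step follows. The exact formula is not given here, so the weight below (a word's share of its topic's word counts times a smoothed log of inverse topic frequency) is an assumption modeled on standard tf-idf, and the topic names, word lists, and function names are hypothetical.

```python
import math
from collections import Counter

def topic_weights(topic_words):
    """Weight each word by (frequency within topic) * log(1 + topics / topics containing word).

    Assumed formula, modeled on tf-idf; not necessarily the one used for the figures.
    """
    n_topics = len(topic_words)
    counts = {topic: Counter(words) for topic, words in topic_words.items()}
    topics_with = Counter(word for c in counts.values() for word in c)
    return {
        topic: {
            word: (n / sum(c.values())) * math.log(1 + n_topics / topics_with[word])
            for word, n in c.items()
        }
        for topic, c in counts.items()
    }

def assign_topic(response, weights):
    """Score each topic by the weighted sum of matching words; return the top topic."""
    words = response.lower().split()
    scores = {t: sum(w.get(word, 0.0) for word in words) for t, w in weights.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None  # None when no topic word matches

# Made-up topic vocabularies, for illustration only.
weights = topic_weights({
    "email": ["email", "server", "fbi", "email"],
    "health": ["pneumonia", "health", "sick"],
    "debate": ["debate", "fbi"],
})
topic = assign_topic("FBI reopened the email server case", weights)
```

Here "fbi" appears in two topics, so it carries less weight than "email" or "server", and the response is assigned to the email topic.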
Results are based on telephone interviews conducted beginning on July 11, 2016, on the Gallup U.S. Daily survey, with a daily random sample of 500 adults, aged 18 and older, living in all 50 U.S. states and the District of Columbia. For results based on each weekly sample of approximately 3,500 national adults, the margin of sampling error is ±2 percentage points at the 95% confidence level. All reported margins of sampling error include computed design effects for weighting.
Each sample of national adults includes a minimum quota of 60% cellphone respondents and 40% landline respondents, with additional minimum quotas by time zone within region. Landline and cellular telephone numbers are selected using random-digit-dial methods.
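As a rough check on the reported margin of sampling error, the standard formula for a proportion can be applied to a weekly sample of 3,500. The design effect is not reported here, so the value of 1.5 below is purely a hypothetical illustration of how weighting widens the margin; the function name is also an assumption.

```python
import math

def margin_of_error(n, p=0.5, z=1.96, design_effect=1.0):
    """Half-width of a 95% confidence interval for a proportion (z = 1.96)."""
    return z * math.sqrt(design_effect * p * (1 - p) / n)

# Simple random sample of 3,500, worst case p = 0.5.
srs = margin_of_error(3500)

# Same sample with a hypothetical design effect of 1.5 (not a reported figure).
adjusted = margin_of_error(3500, design_effect=1.5)

srs_points = round(srs * 100, 1)
adjusted_points = round(adjusted * 100, 1)
```

Under simple random sampling the margin is about 1.7 percentage points; with the assumed design effect it rounds to 2.0, in line with the ±2 points stated above.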
*Figures created by Yanan Zhu and Lisa Singh.