The GDELT Project

Machines Watching Television News: A Visual Entity Graph Over Evening News Broadcasts 2010-2019

 

UPDATE (2/3/2020): This dataset has been superseded by the Visual Global Entity Graph 2.0.

 

What would it look like to have machines watch a decade of ABC, CBS and NBC evening television news broadcasts and describe what they see, creating a minute-by-minute chronology of the visual themes and narratives that have defined global events from 2010 to 2019? Today we are excited to unveil the results: a powerful new non-consumptive visual chronology built from the Internet Archive’s Television News Archive using Google’s Cloud Video API. Building on our pilot one-week analysis from earlier this year, we watched more than 11,000 broadcasts and cataloged the dominant visual themes and activities depicted over almost a decade of evening television news broadcasts.

Despite its rich visual-first nature, television news today is primarily explored through the modality of text. The Internet Archive’s Television News Archive has been a leader in this space, helping to popularize timecoded keyword search of closed captioning transcripts in the library context and exploring new research interfaces to television. Yet while these keyword search systems provide incredible opportunities for exploring coverage trends, and their transcripts can be read by textual deep learning systems to catalog the things and themes they mention, at the end of the day the very visual world that separates television from radio and the online world is absent from such analyses.

At the same time, the last few years have brought profound advances in machine vision, with algorithmic visual understanding moving from the research lab to everyday production use. What would it look like to have a state-of-the-art machine learning system watch a decade of television news broadcasts across “the big three” networks and catalog their visual themes?

Earlier this year we explored this vision using a single week of television news, demonstrating the wealth of insights that become visible. What happens when this same approach is applied over almost a decade?

To explore these questions further, the half-hour evening television news broadcasts of ABC, CBS and NBC for July 2010 – November 2019 from the Internet Archive’s Television News Archive were analyzed within a restricted-access non-consumptive computational digital reference library using Google’s Cloud Video API, including its labeling feature, in which the API visually analyzes each second of footage and assigns a list of predefined labels describing the objects and activities depicted.
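For readers curious about the mechanics, the sketch below shows roughly how frame-level label detection works in the Cloud Video Intelligence API. It is a minimal illustration rather than our production pipeline: the input URI is hypothetical, and exact client signatures vary across releases of the google-cloud-videointelligence library.

```python
# Minimal sketch of frame-level label detection with the Cloud Video
# Intelligence API (v1p3beta1, matching the model version noted below).
# The input URI is hypothetical; signatures vary by library version.
from google.cloud import videointelligence_v1p3beta1 as videointelligence

client = videointelligence.VideoIntelligenceServiceClient()

# FRAME_MODE asks the API to label individual frames (roughly one per
# second) rather than whole shots or segments.
config = videointelligence.LabelDetectionConfig(
    label_detection_mode=videointelligence.LabelDetectionMode.FRAME_MODE
)
context = videointelligence.VideoContext(label_detection_config=config)

operation = client.annotate_video(
    request={
        "input_uri": "gs://my-bucket/broadcast.mp4",  # hypothetical path
        "features": [videointelligence.Feature.LABEL_DETECTION],
        "video_context": context,
    }
)
result = operation.result(timeout=600)

# Each frame-level label carries an entity name, a timestamp and a
# confidence score.
for label in result.annotation_results[0].frame_label_annotations:
    for frame in label.frames:
        seconds = frame.time_offset.total_seconds()
        print(f"{seconds:8.1f}s  {label.entity.description}  ({frame.confidence:.2f})")
```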

The end result is a list of the visual themes (primarily objects and activities) found in each broadcast by minute, charting its visual narratives. Such a chronology offers a powerful and unique counterpart to the textual chronology of its closed captioning, allowing researchers for the first time to consider the visual dimension of television news.

What visual themes are associated with the key narratives and events of the past decade? By combining this chronology with the captioning chronology, it becomes possible to examine what kind of imagery is typically depicted onscreen when a given topic is discussed and how that imagery has changed over the years. Longitudinal questions, such as how much coverage each month over the past decade was devoted to civil mobilizations like protests, can also be explored for the first time. For example, using this new dataset it takes only a simple search to find that the two one-minute periods in the past decade with the most protest imagery were on NBC at 5:39PM PST on February 11, 2017 and ABC at 5:34PM PST on March 27, 2016, with 48 seconds and 46 seconds of protest imagery, respectively. This enables an entirely new approach to understanding visual narratives by using machines to sift through vast archives of video.

Perhaps most importantly, this new dataset will allow researchers to explore for the first time how a better understanding of the visual dimensions of news might help combat the spread of falsehoods and better assess the diffusion of contested narratives and inorganic campaigns. What might we learn from the rich visual processes of broadcast journalism that could be applied to help increase trust in digital journalism and contextualize the news in ways that combat misunderstanding and falsehood?

The dataset consists of one file per day, running from July 15, 2010 through November 30, 2019, similar to the transcript-based entity graph. Each file is in newline delimited JSON format, with each row representing one minute of a broadcast and containing an array of all of the visual entities identified by the Video API from that minute, ordered by the number of seconds they were depicted during that minute.
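As a concrete illustration, the sketch below parses one daily file and ranks broadcast minutes by seconds of protest imagery, in the spirit of the search described above. The filename and the field names (entities, name, numSeconds) are illustrative assumptions rather than a documented schema, so check a downloaded file before relying on them.

```python
# Hedged sketch: scan one daily newline-delimited JSON file and rank
# broadcast minutes by seconds of "Protest" imagery. The filename and
# field names are illustrative assumptions, not a documented schema.
import gzip
import json

minutes = []
with gzip.open("20170211.vgeg.v1.json.gz", "rt") as f:  # hypothetical filename
    for line in f:
        record = json.loads(line)  # one record = one minute of one broadcast
        for entity in record.get("entities", []):
            if entity.get("name") == "Protest":
                minutes.append((entity.get("numSeconds", 0), record))

# Show the minutes with the most protest imagery first.
for seconds, record in sorted(minutes, key=lambda x: x[0], reverse=True)[:5]:
    print(seconds, "seconds of protest imagery in", record)
```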

All annotations were performed using the “v1p3beta1” model of the Video API.

The complete set of daily files can be downloaded below (currently through November 30, 2019):

The dataset is also available in Google’s BigQuery:
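For those who prefer SQL, a monthly protest-imagery tally might look something like the sketch below. The table name is a placeholder, the column names mirror the assumed record layout above, and the date column is assumed to be a TIMESTAMP.

```python
# Hedged sketch: tally monthly protest imagery in BigQuery. The table
# and column names are placeholders for illustration only.
from google.cloud import bigquery

client = bigquery.Client()
query = """
SELECT FORMAT_DATE('%Y-%m', DATE(date)) AS month,   -- assumes a TIMESTAMP column
       SUM(entity.numSeconds) AS protest_seconds
FROM `gdelt-bq.gdeltv2.vgeg_v1`,                    -- placeholder table name
     UNNEST(entities) AS entity
WHERE entity.name = 'Protest'
GROUP BY month
ORDER BY month
"""
for row in client.query(query).result():
    print(row.month, row.protest_seconds)
```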

Note that this is a preliminary dataset that may be missing some shows or have null or incomplete values for some fields. Some design decisions, such as using a fixed camera definition, may also have unintended consequences depending on the balance of studio, fixed and mobile reporting in each broadcast. We are excited to explore this dataset alongside you to learn how we can use machines to peer for the first time into the visual world of the news. Remember that this dataset was constructed entirely by machine, so you will undoubtedly encounter errors, and all labels are the result of algorithmic decisions, not human editorialization.

Stay tuned for a series of analyses and a human-friendly research interface to this data coming shortly!

We’re tremendously excited to see the kinds of advanced multimodal visual analyses you’re able to do with this powerful new non-consumptive dataset!