The GDELT Project

Visual Explorer: Master List Of ZIP Files For All 1.5 Million Broadcasts: Enabling At-Scale Non-Consumptive Visual Analysis Spanning 50 Countries Over 20 Years

The TV News Visual Explorer now encompasses selections from 98 channels spanning 50 countries and territories in 35 languages and dialects over 20 years on 5 continents. In all, 1.5 million broadcasts are now available in the Visual Explorer, with more coming online continually as we process the Internet Archive's Television News Archive historical backfile and its contemporary channels. Today we are releasing a master inventory file of all 1.5 million preview ZIP files to enable at-scale non-consumptive analysis.

The Visual Explorer makes each broadcast "skimmable" by extracting one frame every 4 seconds, at a fixed interval, to represent it. These images are arranged into a thumbnail grid in the Visual Explorer web interface. To enable at-scale non-consumptive visual analysis, each broadcast also has a downloadable ZIP file containing the full-resolution versions of the images that make up the thumbnail grid.
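
As a rough illustration of what the 4-second sampling means in practice, the sketch below estimates how many frames a broadcast yields and how far into the broadcast a given frame falls. It assumes the first frame is taken at the very start of the broadcast and that frames are counted from 1; neither detail is confirmed here, so treat the results as approximate. The function names are hypothetical.

```python
import math

# One frame every 4 seconds, as described above.
FRAME_INTERVAL_SECONDS = 4


def estimated_frame_count(duration_seconds, interval=FRAME_INTERVAL_SECONDS):
    """Approximate number of frames for a broadcast of the given length.

    Assumption: the first frame is sampled at offset 0.
    """
    return math.ceil(duration_seconds / interval)


def estimated_offset_seconds(frame_index, interval=FRAME_INTERVAL_SECONDS):
    """Approximate offset (in seconds) of a 1-based frame index.

    Assumption: frame 1 corresponds to offset 0.
    """
    if frame_index < 1:
        raise ValueError("frame_index is 1-based")
    return (frame_index - 1) * interval


if __name__ == "__main__":
    print(estimated_frame_count(60 * 60))   # frames in a one-hour broadcast
    print(estimated_offset_seconds(16))     # offset of the 16th frame
```

Under these assumptions a one-hour broadcast yields roughly 900 frames, which is a useful figure when budgeting for per-image analysis costs.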

You can download these ZIP files and analyze them with any off-the-shelf image analysis tool. Earlier this month we demonstrated running the ZIP file for a Russian television news broadcast through Google's Cloud Vision API. We then used the resulting annotations to identify all of the Fox News clips shown during the broadcast, to examine how Russian state media is using Fox News coverage to advance its narratives about the invasion.
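
As a minimal sketch of the first step in such a workflow, the following Python reads the images out of a preview ZIP file so that each one can be handed to an analysis tool of your choice. It uses only the standard library. The image extensions and the helper name iter_frames are assumptions made for illustration, and the demo builds a tiny in-memory ZIP with placeholder bytes rather than using a real broadcast.

```python
import io
import zipfile

# Assumed image extensions; adjust to match the files actually in the ZIP.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def iter_frames(zip_source):
    """Yield (member_name, image_bytes) for each image in a ZIP, sorted by name.

    zip_source may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        for name in sorted(archive.namelist()):
            if name.lower().endswith(IMAGE_EXTENSIONS):
                yield name, archive.read(name)


if __name__ == "__main__":
    # Build a small stand-in ZIP in memory (placeholder bytes, not real images).
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as demo:
        demo.writestr("frame-000002.jpg", b"second")
        demo.writestr("frame-000001.jpg", b"first")
    buffer.seek(0)

    for name, data in iter_frames(buffer):
        # Replace this print with a call to your image analysis tool.
        print(name, len(data))
```

Sorting by name only matches broadcast order if the file names are zero-padded or otherwise sort chronologically, so it is worth checking the names in a real ZIP before relying on that ordering.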

What if you want to scale up such an analysis, to look at all broadcasts from a given channel during a set of days? For some channels we have EPG program data that includes the name of each show, meaning you could filter to look just at all Tucker Carlson broadcasts, for example.

To help you with this, we've compiled a master inventory of the downloadable preview image ZIP files for all 1.5 million broadcasts as of yesterday:

You can download this file and filter by channel, date or show name (for channels that provide it) to compile a list of the matching ZIP files to download, making it trivial to curate collections to answer specific research questions.
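
The filtering step might look something like the sketch below. The layout of the inventory file is not described here, so the code assumes, purely for illustration, a CSV file with a header row and hypothetical columns named channel, date (as YYYY-MM-DD), show and zip_url. The sample rows are invented placeholders, not real inventory entries. Adapt the column names and parsing to the file's actual format.

```python
import csv
import io


def matching_zip_urls(inventory_file, channel=None, start_date=None,
                      end_date=None, show_contains=None):
    """Return the ZIP URLs of inventory rows that match every given filter.

    Assumes hypothetical CSV columns: channel, date (YYYY-MM-DD), show, zip_url.
    ISO-formatted dates compare correctly as plain strings.
    """
    urls = []
    for row in csv.DictReader(inventory_file):
        if channel and row["channel"] != channel:
            continue
        if start_date and row["date"] < start_date:
            continue
        if end_date and row["date"] > end_date:
            continue
        # Show names exist only for some channels, so tolerate empty values.
        show = (row.get("show") or "").lower()
        if show_contains and show_contains.lower() not in show:
            continue
        urls.append(row["zip_url"])
    return urls


if __name__ == "__main__":
    # Invented placeholder rows standing in for the real inventory.
    sample = io.StringIO(
        "channel,date,show,zip_url\n"
        "CHANNEL_A,2022-03-01,Evening News,https://example.com/a1.zip\n"
        "CHANNEL_A,2022-03-05,Morning Show,https://example.com/a2.zip\n"
        "CHANNEL_B,2022-03-01,,https://example.com/b1.zip\n"
    )
    for url in matching_zip_urls(sample, channel="CHANNEL_A",
                                 start_date="2022-03-01",
                                 end_date="2022-03-03"):
        print(url)
```

The resulting list of URLs can be written to a text file and passed to whatever download tool you prefer, keeping the selection step separate from the download step.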

Here are some tips for working with the collection at scale:

We are tremendously excited to see the kinds of research that this immense and unique new collection enables!