Four Massive Datasets Charting The Global Climate Change News Narrative 2009-2020

Over the past two days we have released a set of four massive new datasets designed to fundamentally advance the study of the global climate change news narrative. The first catalogs the 95,000 television news mentions of climate change on CNN, MSNBC and Fox News 2009-2020 and BBC News London 2017-2020. Each mention includes all of the details of the broadcast, a 15-second snippet of the words spoken around the mention to lend it context and a URL to view the clip in the Internet Archive's Television News Archive. The second repurposes our 101-billion-word Web Part of Speech dataset to find every word whose randomly selected example snippets, drawn from global online news coverage 2016-2020, include climate change usages, totaling more than 6 million entries. Each example includes a rich array of linguistic indicators about the specific use case. The third charts the global perspective on climate change, tracing its coverage across 63 languages over the past half decade in more than 4.1 million articles. Finally, the fourth catalogs the more than 6.3 million worldwide English-language climate change articles monitored by GDELT 2015-2020, along with a brief 200-character snippet capturing how climate change is referenced in each.

Together, these four datasets span television, language, the global perspective and context, reaching back between half a decade and a decade. We're enormously excited to see the transformative new kinds of inquiry these datasets enable.

The GDELT Project
