GDELT Global Knowledge Graph

We are tremendously excited to announce the debut of the GDELT Global Knowledge Graph (GKG), which expands GDELT’s ability to quantify global human society beyond cataloging physical occurrences, toward representing the latent dimensions, geography, and network structure of the global news. To sum up the Global Knowledge Graph in a single sentence, it attempts to connect every person, organization, location, count, theme, news source, and event across the planet into a single massive network that captures what’s happening around the world, what its context is, who’s involved, and how the world is feeling about it, every single day.

The Global Knowledge Graph consists of two parallel data streams. The first is the daily Counts File, which records mentions of counts with respect to a set of predefined categories, such as the number of protesters, the number killed, or the number displaced or sickened. Such counts may occur independently of the CAMEO events in the primary GDELT event stream, such as mentions of those killed in industrial accidents (which are not captured in CAMEO) or those displaced by a natural disaster or sickened by a disease epidemic. In this way, the GKG Counts File can be used to produce a daily “Death Tracker” to map all mentions of death across the world each day, or an “Affected Tracker” to indicate how many persons were sickened/displaced/stranded each day (at least as recorded in the global news media). The second is the GKG Graph File, which contains the actual graph connecting all persons, organizations, locations, emotions, themes, counts, events, and sources together each day into a single network structure and captures the cultural narratives that envelop the global information stream.
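To make the “Death Tracker” idea concrete, here is a minimal sketch of how a day of Counts File records might be rolled up by country. The column names (DATE, COUNTTYPE, NUMBER, GEO_COUNTRYCODE), the tab-delimited layout, the category labels, and the sample rows are all assumptions made for illustration; the codebook is the authority on the real format.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows. The header names and values are illustrative
# assumptions, not the documented Counts File layout.
SAMPLE = (
    "DATE\tCOUNTTYPE\tNUMBER\tGEO_COUNTRYCODE\n"
    "20131001\tKILL\t12\tSY\n"
    "20131001\tKILL\t3\tIZ\n"
    "20131001\tAFFECT\t500\tIN\n"
    "20131001\tKILL\t5\tSY\n"
)


def tally_counts(handle, count_type):
    """Sum NUMBER per country for rows whose COUNTTYPE equals count_type."""
    totals = defaultdict(int)
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        if row["COUNTTYPE"] == count_type:
            totals[row["GEO_COUNTRYCODE"]] += int(row["NUMBER"])
    return dict(totals)


deaths = tally_counts(io.StringIO(SAMPLE), "KILL")
print(deaths)  # {'SY': 17, 'IZ': 3}
```

Because these are counts of mentions in the news, the same incident reported by several outlets can appear more than once, so a simple sum like this one is best read as a measure of media attention rather than a verified toll.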

One of the most exciting elements of the Global Knowledge Graph is the vast array of non-event analyses it makes possible, from mapping themes, people, and terror groups over space, to looking at the connections among people and analyzing influencer networks around themes and space. Of special interest, take a look at the more than 150 themes currently available and look specifically at the “Taxonomy” themes. The GKG records any mention of a major terror group, major political party around the world, major infectious disease, etc., irrespective of any connection with an event. This can be used, for example, to create a media-based proxy of political competition by measuring how often each political party receives media coverage. Other applications include mapping a theme over space to capture change over time, plotting movements of political candidates during an election (along with the themes they are associated with at each location), and constructing influencer networks around the people and organizations associated with specific topics and locations.
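As a rough illustration of the political competition proxy, the sketch below computes each party's share of coverage from a handful of records. The theme names (PARTY_A, PARTY_B), the semicolon-delimited theme lists, and the sample records are hypothetical placeholders rather than real GKG themes or data.

```python
from collections import Counter

# Hypothetical records: each string stands in for the theme list of one
# GKG entry. Theme names and the ";" delimiter are assumptions.
RECORDS = [
    "PARTY_A;ELECTION",
    "PARTY_B;ELECTION;PROTEST",
    "PARTY_A;ECONOMY",
    "PARTY_A;PARTY_B",
]
PARTY_THEMES = {"PARTY_A", "PARTY_B"}


def coverage_share(records, tracked):
    """Return each tracked theme's share of all tracked-theme mentions."""
    counts = Counter()
    for themes in records:
        # Count a tracked theme at most once per record.
        for theme in set(themes.split(";")) & tracked:
            counts[theme] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {theme: counts[theme] / total for theme in tracked}


shares = coverage_share(RECORDS, PARTY_THEMES)
print(sorted(shares.items()))  # [('PARTY_A', 0.6), ('PARTY_B', 0.4)]
```

Computing the same shares day by day would give a simple time series of relative media attention, which is one possible reading of the proxy described above.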

Global Knowledge Graph files are currently available starting October 1, 2013, but we will be making historical files back to the start of daily GDELT event stream files (April 1, 2013) available by the end of November 2013.

The Global Knowledge Graph is extremely complex and requires highly advanced technical expertise to work with, so it is recommended only for the most advanced users. Please see the codebook for an introduction to the Global Knowledge Graph, the methodologies behind it, an overview of how to think about the view it captures of the world, how to work with it, and a technical readout on the data format and its fields. Also, please note that unlike the primary GDELT event stream, the GDELT Global Knowledge Graph is a highly experimental new capability that is still undergoing active development. It is currently made available as an ALPHA EXPERIMENTAL version release, meaning that specifics, especially the output format, may change with the next version release as the system evolves to incorporate community feedback and needs.