The GDELT Project

Announcing Our First API: GKG GeoJSON!

Today we're incredibly excited to announce the official debut of our new GDELT API suite, with our very first API endpoint being a tool to generate GeoJSON files from the GDELT Global Knowledge Graph (GKG) 2.0!  Using this API, you can now create live maps, updated every hour, of any of GDELT's thousands of themes, of a particular person (such as a head of state) or organization, of a particular news outlet, of a particular language, or any combination thereof – the sky is the limit!

For those of you who aren't familiar with the GDELT Global Knowledge Graph, it processes all of the worldwide news coverage that GDELT monitors every 15 minutes and compiles a list of all of the people, organizations, locations, themes, counts, and emotions, across 65 languages, into a powerful realtime metadata index over global society.  Of course, the problem with so much power is that it can be incredibly intimidating to actually use this massive firehose, so we've released our first API to make it possible to create quick maps of the world's news with just a few mouse clicks!

For those of you using CartoDB to map GDELT, this new API now gives you a URL to copy-paste into CartoDB's import dashboard to instantly create a new map from GDELT, updated to the last 15 minutes and covering up to the last 24 hours!  If you have CartoDB's "sync tables" feature enabled on your account (John Snow or greater accounts and all education/research accounts), you can simply click the "sync every hour" button in CartoDB when you import the table to instantly create a map that live updates every hour on the hour without you having to do anything!  We're incredibly excited by the awesome power of our debut API offering to enable a whole new way of creating live rich interactive animated maps of global society as seen through the world's news media, all with just a few mouse clicks – no programming needed!

The API allows you to filter the GKG by keyword/keyphrase in the themes and names fields, by source website domain, by language, or by any combination thereof!  You can make a map of Arabic coverage of food security, a map comparing the BBC to the New York Times, or a map of today's coverage from a specific news outlet or on a specific topic!

To make your searches as relevant as possible, we do an incredible amount of processing behind the scenes.  Using the proximity information contained in the GKG 2.0 files, we assign every mention of a recognized theme, person, or organization to the location mentioned in the article closest to it, arbitrating in the case of multiple mentions in close proximity to multiple locations, and performing windowing and falloff filtering.  What does this mean to you?  In a nutshell, it means that when you search for the GDELT Theme "FOOD_SECURITY", the locations that are returned are those that were mentioned in closest proximity and context with the topic, meaning your map should have as few false positives as possible.  You will still find a certain number of false positives, and this approach will eliminate some valid matches, but in the general case it should ensure that you get highly relevant results from your searches!
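To make the idea of proximity-based assignment concrete, here is a deliberately simplified Python sketch.  It is not GDELT's actual algorithm: the function name, the use of character offsets as the distance measure, and the "max_distance" window are all assumptions made for illustration, and the arbitration and falloff steps described above are left out.

```python
def assign_to_nearest_location(mentions, locations, max_distance=500):
    """Pair each mention with the location whose character offset is closest.

    mentions:     list of (name, char_offset) tuples
    locations:    list of (place_name, char_offset) tuples
    max_distance: hypothetical window size; a mention farther than this
                  from every location is dropped rather than assigned.
    """
    assignments = []
    if not locations:
        return assignments
    for name, offset in mentions:
        # Pick the location with the smallest absolute offset difference.
        place, place_offset = min(locations, key=lambda loc: abs(loc[1] - offset))
        if abs(place_offset - offset) <= max_distance:
            assignments.append((name, place))
    return assignments


# Hypothetical offsets within a single article.
example_mentions = [("FOOD_SECURITY", 120), ("FOOD_SECURITY", 2400)]
example_locations = [("Nairobi, Kenya", 150), ("Paris, France", 900)]
example_result = assign_to_nearest_location(example_mentions, example_locations)
print(example_result)
```

In this toy example the first mention is paired with "Nairobi, Kenya", while the second is discarded because no location falls inside the window, which is the kind of trade-off between false positives and lost valid matches described above.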

Some Quick Examples To Get You Started

If you're eager to get started and don't care about the technical details, here are some simple queries to help you create your first maps!  If you're using CartoDB, just go to your CartoDB "Datasets" dashboard on cartodb.com, click on the big green "New Dataset" button in the upper right, paste in one of the URLs below, click "Submit", check off the option to sync every hour (if your account has "sync tables" enabled) and then click the big green "Connect Dataset" button.  In a few seconds or tens of seconds (depending on the query), you'll have yourself a live-updating table that is ready for mapping and automatically updates every hour on the hour from now until the end of time!

Hopefully these queries have gotten you off to a quick start!  The rest of this blog post outlines all of the technical detail and the full capabilities of the API.

 

The Technical Details: How to Use the GKG GeoJSON API

NOTE: This section is for the technical folks that want to dig deeply into the API and understand how to use all of its features.

To use the new GKG GeoJSON API, you simply fetch the URL "https://api.gdeltproject.org/api/v1/gkg_geojson" into a tool like CartoDB, adding on the parameters you desire from below, and a few seconds later it will return a GeoJSON file containing the requested results.  There is no authentication or fancy footwork needed!  For example, to search for coverage of the GDELT Theme "FOOD_SECURITY" over the past hour, just use the URL "https://api.gdeltproject.org/api/v1/gkg_geojson?QUERY=FOOD_SECURITY".  To search for only Arabic-language FOOD_SECURITY coverage, use the URL "https://api.gdeltproject.org/api/v1/gkg_geojson?QUERY=lang:Arabic,FOOD_SECURITY".  It's that easy!

You can paste the URL into your browser to see what the GeoJSON stream looks like and, when you're happy, just paste it into CartoDB using the "New Dataset From URL" option!  If you want to download the GeoJSON into your own application, just fetch the URL above using any standard download tool (you access it via a standard HTTP GET).
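For those fetching the GeoJSON from their own code, a minimal Python sketch using only the standard library might look like the following.  The helper names "build_gkg_url" and "fetch_geojson" are our own for illustration, joining search terms with commas is an assumption based on the Arabic example URL above, and the 60-second timeout is an arbitrary choice.

```python
import json
import urllib.parse
import urllib.request

# Base endpoint as given in this post.
BASE_URL = "https://api.gdeltproject.org/api/v1/gkg_geojson"


def build_gkg_url(*terms):
    """Build a request URL with the terms comma-joined in the QUERY parameter."""
    query = ",".join(terms)
    # Leave ':' and ',' unescaped so the URL matches the form shown in the post.
    return BASE_URL + "?" + urllib.parse.urlencode({"QUERY": query}, safe=":,")


def fetch_geojson(url, timeout=60):
    """Issue a plain HTTP GET (no authentication) and parse the body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


arabic_food_url = build_gkg_url("lang:Arabic", "FOOD_SECURITY")
print(arabic_food_url)

# To actually download the data (requires network access), uncomment:
# data = fetch_geojson(arabic_food_url)
# print(len(data.get("features", [])), "features returned")
```

The commented-out lines assume the response is a standard GeoJSON FeatureCollection with a "features" array; inspect the output in your browser first to confirm its shape.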

There are two primary output modes.  The "article" mode operates at the article level, with each record representing a location mentioned in a specific article.  This is most useful for creating clickable map layers where you want a user to be able to click on a location and get back a link to the article mentioning that location.  Conversely, the "location+time" mode is optimized for creating animation layers.  It collapses all coverage in a given 15-minute interval by location, with each record representing a specific location in a specific 15-minute time period.  Thus, if 50 articles all mentioned Paris, France in a given 15-minute interval, there will be a single record in the GeoJSON for "Paris, France" with that timestamp and details about all of the coverage that mentioned Paris during that time interval.  This can be used for clickable maps with creative SQL, but is primarily aimed at making minimized GeoJSON files highly optimized for animation use, where the goal is to show change over time rather than to create a clickable interactive map layer.
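The relationship between the two modes can be illustrated with a small Python sketch that collapses article-level records into one record per location per 15-minute interval.  This only mimics the idea on made-up data: the record fields ("location", "timestamp", "url") and the output fields are hypothetical and are not the API's actual GeoJSON properties.

```python
from collections import defaultdict
from datetime import datetime


def floor_to_15_minutes(ts):
    """Round a datetime down to the start of its 15-minute interval."""
    return ts.replace(minute=ts.minute - ts.minute % 15, second=0, microsecond=0)


def collapse_by_location_and_time(article_records):
    """Group article-level records into one record per (location, interval)."""
    buckets = defaultdict(list)
    for rec in article_records:
        key = (rec["location"], floor_to_15_minutes(rec["timestamp"]))
        buckets[key].append(rec["url"])
    return [
        {"location": loc, "interval_start": start,
         "article_count": len(urls), "urls": urls}
        for (loc, start), urls in sorted(buckets.items())
    ]


# Hypothetical article-level records (one per location mentioned in an article).
articles = [
    {"location": "Paris, France", "timestamp": datetime(2015, 1, 1, 10, 2), "url": "http://example.com/a"},
    {"location": "Paris, France", "timestamp": datetime(2015, 1, 1, 10, 14), "url": "http://example.com/b"},
    {"location": "Paris, France", "timestamp": datetime(2015, 1, 1, 10, 16), "url": "http://example.com/c"},
]
collapsed = collapse_by_location_and_time(articles)
for record in collapsed:
    print(record["location"], record["interval_start"], record["article_count"])
```

Here the two articles published between 10:00 and 10:15 become a single "Paris, France" record, while the third falls into the next interval and gets its own record, which is what keeps the collapsed form compact enough for animation.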

The available parameters are listed below.  Note that they must be specified in all capital letters.