The GDELT Project

Announcing The GDELT Full Text Search API

We are extraordinarily excited to announce today the public debut of what has been perhaps the most requested feature of 2015: full text search, through the new GDELT Full Text Search API! The new API lets you search the full text of all monitored coverage from the last 24 hours and return any of the following: a list of matching articles sorted by relevance, date, or even sentiment; a timeline of media coverage; a timeline of the tone of that coverage; or a word cloud of the top words appearing in matching coverage (using either the coverage’s original native language or the English translations).

Because this is an alpha release, you may encounter a few bugs as you use the new GDELT Full Text Search API. Please bear with us as we work to improve and enhance the new API, and let us know about any particularly significant errors you encounter.

Searching Across The World’s Languages

Perhaps most powerfully, and utterly unlike any other news search system today, when you search using the GDELT Full Text Search API you are not just searching English news coverage: you are searching the English translations of coverage from 65 languages. Search for “genocide” and you see not just English-language Western news coverage, but perspectives from outlets across the entire globe in the world’s languages. GDELT today operates one of the largest streaming machine translation deployments in the world, translating every monitored article in real time from 65 different languages into English. Using these translations, the GDELT Full Text Search API can search across languages seamlessly and transparently, breaking down the language barrier to accessing the world’s events, narratives, and perspectives. With a traditional news search engine, you would have to manually translate your search term into all 65 languages using various dictionaries and translation tools, then conduct 65 separate searches and merge all of the results together. In addition, few search engines have comprehensive catalogs of the non-Western and non-English news landscape, so even if you did all that, you would get back only a very limited non-Western perspective.

With the new GDELT Full Text Search API, a single English keyword transparently searches across the world’s languages. GDELT currently translates 65 languages live: Afrikaans, Albanian, Arabic (MSA and many common dialects), Armenian, Azerbaijani, Bengali, Bosnian, Bulgarian, Catalan, Chinese (Simplified), Chinese (Traditional), Croatian, Czech, Danish, Dutch, Estonian, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Kannada, Kazakh, Korean, Latvian, Lithuanian, Macedonian, Malay, Malayalam, Marathi, Mongolian, Nepali, Norwegian (Bokmål), Norwegian (Nynorsk), Persian, Polish, Portuguese (Brazilian), Portuguese (European), Punjabi, Romanian, Russian, Serbian, Sinhalese, Slovak, Slovenian, Somali, Spanish, Swahili, Swedish, Tamil, Telugu, Thai, Turkish, Ukrainian, Urdu, and Vietnamese. Or, if there is a particular word or phrase you want to search for in another language, you can search for it natively in any of the languages above.

Best of all, this new capability is designed as an embeddable, machine-friendly API, meaning you can use it to generate a live CSV-formatted list of URLs to cross-reference against GDELT 2.0’s Global Knowledge Graph or Event Database, or to output article lists, timelines, or word cloud visualizations that you can embed right on your own website!
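As a rough illustration of the cross-referencing idea, the Python sketch below pulls URLs out of CSV text and keeps those that also appear in a second URL collection. The CSV column layout is an assumption (the sketch simply scans every cell for something that looks like a URL), the helper names `extract_urls` and `cross_reference` are hypothetical, and the sample rows are made-up placeholders rather than real API output.

```python
import csv
import io


def extract_urls(csv_text):
    """Collect every cell that looks like an http(s) URL, in order, without duplicates."""
    seen = set()
    urls = []
    for row in csv.reader(io.StringIO(csv_text)):
        for cell in row:
            cell = cell.strip()
            # Assumption: the column layout is unknown, so every cell is checked.
            if cell.startswith(("http://", "https://")) and cell not in seen:
                seen.add(cell)
                urls.append(cell)
    return urls


def cross_reference(csv_text, known_urls):
    """Return the URLs from the CSV text that also appear in known_urls."""
    known = set(known_urls)
    return [url for url in extract_urls(csv_text) if url in known]


# Hypothetical sample data, not real API output.
sample_csv = (
    "2015-12-01,https://example.com/story-a\n"
    "2015-12-01,https://example.org/story-b\n"
)
# Hypothetical URLs standing in for records taken from another GDELT dataset.
gkg_urls = ["https://example.org/story-b", "https://example.net/story-c"]
matches = cross_reference(sample_csv, gkg_urls)
```

In practice `sample_csv` would be the text returned by the API and `gkg_urls` would come from whichever dataset you are matching against.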

AVAILABLE VISUALIZATIONS AND OUTPUTS

The GDELT Full Text Search API currently offers a number of different output formats:

EMBEDDING THE API

You can embed any of the visualizations/outputs above into your own website, using the options in the next two sections to configure their display.

For example, to embed a live word cloud of the top words appearing in Nigerian news media in your own website, you can use a simple iframe via the HTML code below:

<iframe src="https://api.gdeltproject.org/api/v1/search_ftxtsearch/search_ftxtsearch?query=sourcecountry:nigeria&output=wordcloud&sort=desc" height="500" scrolling="no" width="500"></iframe>
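If you generate embeds programmatically, the same URL can be assembled from its parts. The Python sketch below rebuilds the Nigerian word cloud URL and wraps it in an iframe. The endpoint and the `query`, `output`, and `sort` values are taken from the iframe above; the helper names `build_search_url` and `build_iframe` are hypothetical, and the sketch assumes the API accepts standard URL encoding of the query value.

```python
import html
from urllib.parse import urlencode

# Endpoint copied from the iframe example above.
ENDPOINT = "https://api.gdeltproject.org/api/v1/search_ftxtsearch/search_ftxtsearch"


def build_search_url(query, output, sort):
    """Build a search URL from the three parameters used in the example."""
    params = {"query": query, "output": output, "sort": sort}
    # safe=":" keeps the command separator readable (sourcecountry:nigeria).
    return ENDPOINT + "?" + urlencode(params, safe=":")


def build_iframe(url, width=500, height=500):
    """Wrap a search URL in an iframe tag like the one shown above."""
    # Escape "&" as "&amp;" so the URL is valid inside an HTML attribute.
    src = html.escape(url, quote=True)
    return (
        f'<iframe src="{src}" height="{height}" '
        f'scrolling="no" width="{width}"></iframe>'
    )


url = build_search_url("sourcecountry:nigeria", "wordcloud", "desc")
snippet = build_iframe(url)
```

Here `url` matches the address in the iframe above, and `snippet` is the corresponding tag with the ampersands escaped for HTML.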

 

QUERY COMMANDS

The API currently recognizes the following query commands that you can pass in as part of the query itself (all commands MUST be lowercase and all commands are AND’d together):
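As a minimal sketch of the lowercase rule, the Python below normalizes command names before a query is sent. It assumes commands take the `name:value` form seen in the `sourcecountry:nigeria` example and that they are separated from other terms by whitespace; the helper `lowercase_commands` is hypothetical and leaves the values and any plain keywords untouched.

```python
import re

# A command name is a run of letters at the start of a token, followed by ":"
# and a value. This pattern is an assumption based on "sourcecountry:nigeria".
_COMMAND = re.compile(r"(?<!\S)([A-Za-z]+):(?=\S)")


def lowercase_commands(query):
    """Lowercase the command name in every name:value token of the query."""
    return _COMMAND.sub(lambda m: m.group(1).lower() + ":", query)


fixed = lowercase_commands("SourceCountry:nigeria")
```

Running a query through a helper like this before building the URL guards against commands that were typed with stray capitals.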

 

URL PARAMETERS

There are also several configuration options that must be passed as URL parameters rather than as part of the query:

EXAMPLES

Here are some examples to help you get started using the new API:

 

Happy Searching!