The GDELT Project

Television 2.0 AI API Debuts!

We're tremendously excited to announce today the debut of the new Television 2.0 AI API, which uses computer vision through Google's Cloud Video API and natural language understanding through Google's Cloud Natural Language API to let you visually search a decade of television evening news broadcasts and a month of CNN. You can search by the objects, activities and text visually depicted onscreen, as well as by a topical analysis of the captioning. In short, instead of keyword searching closed captioning, you can for the first time actually visually search what is shown onscreen!

The TV 2.0 AI API offers a rich interactive interface to the Visual Global Entity Graph 2.0, in which Google's Cloud Video API non-consumptively analyzes, in a secure research environment, the evening news broadcasts of ABC, CBS and NBC from July 2010 to present and CNN from January 25, 2020 to present (updated continually with a rolling 24-48 hour delay) from the Internet Archive's Television News Archive. It describes what it sees second by second, transcribes all of the onscreen text through OCR and generates an automatic spoken-word transcript, while the station-provided human captioning is analyzed through Google's Natural Language API as part of the Television Global Entity Graph 2.0. Working with this enormous dataset of more than 400 million annotations over more than 9,000 broadcasts has until now required analyzing a massive archive of JSON files. Today the new TV 2.0 AI API allows you to interactively search this massive dataset as easily as you search television closed captioning using the TV 2.0 API.

See Television News Through The Eyes Of AI

Using this API you can search for all news airtime depicting everything from vacuum cleaners to volcanic eruptions, protesters to police, goaltenders to golden retrievers. Rather than searching for what was said, you can search for what was seen. You can also search all of the onscreen text, searching everything from onscreen tweets to infographics. Even captioning can be searched by the underlying concepts it discusses rather than the literal words spoken. This is how AI sees the world of television news!

Difference Between TV 2.0 API & TV 2.0 AI API

What is the major difference between this API and the existing TV 2.0 API? The existing TV 2.0 API allows keyword searching of the station-provided closed captioning of more than a million broadcasts across samples of 150 stations over a decade. Its focus is on simple keyword search with advanced visualizations and analytics of the results. In contrast, the TV 2.0 AI API uses Google's Cloud Video AI API to "watch" television broadcasts and describe what it sees, transcribe the onscreen text, generate its own more complete spoken-word transcript and even topically analyze the captioning using Google's Cloud Natural Language API. It is a far more powerful API, but is limited to just ABC/CBS/NBC evening news broadcasts from July 2010 to present and CNN from January 25, 2020 to present. It also uses a more advanced query syntax and is intended for advanced users wishing to explore the frontiers of AI-assisted visual content understanding.

Human + Machine Output

The API is designed to generate both machine-friendly CSV and JSON output, suitable for analysis in any major platform, and beautiful visualizations for human consumption, optimized for embedding on your own website. The API is available over both HTTP and HTTPS, meaning it can be embedded in an iframe in any website.
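To illustrate how the two machine-friendly output modes might be requested, here is a minimal sketch. The endpoint path and the `format` parameter name are assumptions modeled on GDELT's other 2.0 APIs; consult the parameter documentation below for the authoritative names.

```python
from urllib.parse import urlencode

# Assumed endpoint path, following the naming convention of the TV 2.0 API.
BASE = "https://api.gdeltproject.org/api/v2/tvai/tvai"

def build_url(query, fmt="json"):
    """Build a request URL for the API.

    The 'format' parameter name is an assumption; GDELT's other 2.0 APIs
    use a similar switch to select CSV vs. JSON output.
    """
    return BASE + "?" + urlencode({"query": query, "format": fmt})

# JSON for analysis pipelines, CSV for spreadsheet tools:
json_url = build_url('"golden retriever"', fmt="json")
csv_url = build_url('"golden retriever"', fmt="csv")
```

The same URL pattern, with a visualization output mode instead, is what would be placed in an iframe's `src` attribute for embedding.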

Quick Start Examples

To get you started, here are a couple of example queries:

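The original example links are not reproduced here, but as an illustrative sketch, queries might be assembled as below. The endpoint path, parameter names and the `VisualEntity:`/`OCRText:` operators are hypothetical placeholders chosen for illustration, not confirmed syntax; see the full documentation that follows for the real operator list.

```python
from urllib.parse import urlencode

# Assumed endpoint path, modeled on GDELT's other 2.0 APIs.
BASE = "https://api.gdeltproject.org/api/v2/tvai/tvai"

# Hypothetical example queries: one against the visual annotations,
# one against the OCR'd onscreen text. Operator names are placeholders.
examples = {
    "visual": {"query": 'VisualEntity:"volcano"', "format": "json"},
    "ocr": {"query": 'OCRText:"breaking news"', "format": "csv"},
}

urls = {name: BASE + "?" + urlencode(params) for name, params in examples.items()}
for name, url in urls.items():
    print(name, url)
```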

Full Documentation

The GDELT TV 2.0 AI API is accessed via a simple URL with the following parameters. Under each parameter is the list of operators that can be used as the value of that parameter.