Television 2.0 AI API Debuts!

Kalev Leetaru

5 years ago

We're tremendously excited to announce today the debut of the new Television 2.0 AI API, which uses computer vision through Google's Cloud Video API and computer natural language understanding through Google's Cloud Natural Language API to let you visually search a decade of television evening news broadcasts and a month of CNN to search by the objects, activities and text visually depicted onscreen as annotated through and a topical analysis of their captioning. In short, instead of keyword searching closed captioning, you can for the first time actually visually search what is shown onscreen!

The TV 2.0 AI API offers a rich interactive API interface to the Visual Global Entity Graph 2.0, in which Google's Cloud Video API non-consumptively analyzes the evening news broadcasts of ABC, CBS and NBC from July 2010 to present and CNN from January 25, 2020 to present (updated continually with a rolling 24-48 hour delay) from the Internet Archive's Television News Archive in a secure research environment and describes what it sees second by second, transcribes all of the onscreen text through OCR, generates an automatic transcript and analyzes the station-provided human transcript through Google's Natural Language API as part of the Television Global Entity Graph 2.0. Working with this enormous dataset of more than 400 million annotations over more than 9,000 broadcasts has until now required analyzing a massive archive of JSON files. Today the new TV 2.0 AI API allows you to interactively search this massive dataset as easily as you search television closed captioning using the TV 2.0 API.

See Television News Through The Eyes Of AI

Using this API you can search for all news airtime depicting everything from vacuum cleaners to volcanic eruptions, protesters to police, goaltenders to golden retrievers. Rather than searching for what was said, you can search for what was seen. You can also search all of the onscreen text, searching everything from onscreen tweets to infographics. Even captioning can be searched by the underlying concepts it discusses rather than the literal words spoken. This is how AI sees the world of television news!

Difference Between TV 2.0 API & TV 2.0 AI API

What is the major difference between this API and the existing TV 2.0 API? The existing TV 2.0 API allows keyword searching of the station-provided closed captioning of more than a million broadcasts across samples of 150 stations over a decade. Its focus is on simple keyword search with advance visualizations and analytics of the results. In contrast, the TV 2.0 AI API uses Google's Cloud Video AI API to "watch" television broadcasts and describe what it sees, transcribe the onscreen text, generate its own more complete spoken word transcript and even topically analyze the captioning using Google's Cloud Natural Language API. It is a far more powerful API, but is limited to just ABC/CBS/NBC evening news broadcasts July 2010 to present and CNN from January 25, 2020 to present. It also uses a more advanced query syntax and is intended for advanced users wishing to explore the frontiers of AI-assisted visual content understanding.

Human + Machine Output

The API is designed to both generate machine-friendly CSV and JSON output, suitable for analysis in any major platform and beautiful visualizations for human consumption, optimized for embedding on your own website. The API is available with both HTTP and HTTPS endpoints, meaning it can be embedded in an iframe in any website.

Quick Start Examples

To get you started, here are a couple of example queries:

ABC/CBS/NBC airtime with the text "@realDonaldTrump" appearing somewhere on the screen. [View Live]
ABC/CBS/NBC airtime with the text "gas prices" appearing somewhere on the screen. [View Live]
ABC/CBS/NBC airtime where the Google Knowledge Graph concept "Climate Change" was discussed in the closed captioning. [View Live]
ABC/CBS/NBC airtime where the visual concept "Volcano" is seen onscreen. [View Live]
ABC/CBS/NBC airtime where the visual concept "Police" is seen onscreen. [View Live]
ABC/CBS/NBC airtime where the visual concept "Protest" or "Riot" is seen onscreen. [View Live]
ABC/CBS/NBC airtime where the visual concept "Golden Retriever" is seen onscreen. [View Live]

Full Documentation

The GDELT TV 2.0 AI API is accessed via a simple URL with the following parameters. Under each parameter is the list of operators that can be used as the value of that parameter.

QUERY. This contains your search query and supports keyword and keyphrase searches, OR statements and a variety of advanced operators. NOTE – all of the operators below must be used as part of the value of the QUERY field, separated by spaces, and cannot be used as URL parameters on their own.
- Asr. Google's Cloud Video API generates a machine transcript for the broadcast. This is typically more accurate and more complete than the human-transcribed station-provided closed captioning and includes all commercials, while the closed captioning does not include many commercials. Some names may be misspelled in this field. You can use this field to keyword/phrase search this field using words or phrases. All values must be enclosed in quote marks, even single words. Only phrases up to a maximum of 5 words are permitted and are counted towards the airtime second the phrase begins to be uttered, even if it spans multiple seconds of airtime.
  - asr:"trump"
  - asr:"donald trump"
- Cap. This allows keyword/phrase searching of the station-provided human-transcribed closed captioning just like the main TV 2.0 API. Closed captioning transcripts are automatically timecode-corrected using the ASR results so words are recorded at the precise second they are spoken and thus are directly comparable at the second level to ASR search results.
  - cap:"trump"
  - cap:"donald trump"
- Capnlp. The station-provided human-transcribed closed captioning is analyzed using Google's Cloud Natural Language API to create the Television News Global Entity Graph 2.0. This allows you to abstract away from the literal words spoken to search the underlying high-order concepts being described, including resolving pronouns back to the people they refer to. Only entities with an assigned Google Knowledge Graph MID code are included, see the complete list. To search, place the MID code within quote marks. NOTE: unlike the "cap" searches above these entities are computed on the original closed captioning prior to timecode correction and thus may be recorded as being mentioned up to 10 seconds after they actually were mentioned due to uncorrected closed captioning having a delay of 1-10 seconds.
  - capnlp:"/M/0D063V"
- Dow. By default all days within your selected timeframe are searched. You can limit to just specific days of the week, such as just Mondays or just weekends by specifying "dow:" and one of the numbers 0-7 representing the days of the week, starting with Sunday=0 t0 Saturday=7 (week starts on Saturday). Days are computed in PST timezone. Thus, to search just weekend shows you would specify "(dow:0 OR dow:7)" since Sunday is 0 and Saturday is 7.
  - dow:0
- Show. This narrows your search to a particular television show. This must be the complete show name as returned by the TV 2.0 AI API. To find a particular show, search the API and use the "clipgallery" mode to display matching clips and their source show. For example, to limit your search to the show Hardball With Chris Matthews, you'd search for " show:"Hardball With Chris Matthews" ". Note that you must surround the show name with quote marks. Remember that unlike the main TV 2.0 API, the TV 2.0 AI API only searches a small number of news shows. Note that the TV 2.0 AI API includes only a small subset of the total programming included in the main TV 2.0 API.
  - show:"Hardball With Chris Matthews"
- Station. This narrows your search to a particular television station. Remember that unlike the main TV 2.0 API, the TV 2.0 AI API only searches a small number of stations. Do not use quote marks around the name of the station. At present this value can be one of "CNN", "KGO" (for ABC), "KPIX" (for CBS) or "KNTV" (for NBC).
  - station:CNN
- OCR. Search for a specific keyword or phrase within the OCR'd onscreen text. All values must be contained within quote marks, even individual words and only phrases up to five words are searchable.
  - ocr:"realdonaldtrump"
  - ocr:"gas prices"
- Visual. Search for a specific Google Cloud Video API "label".
  - visual:"volcano"
  - visual:"police"
- (a OR b). You can specify a list of operators to be boolean OR'd together by enclosing them in parentheses and placing the capitalized word "OR" between each keyword or phrase. Boolean OR blocks cannot be nested at this time. For example, to search for mentions of Clinton, Sanders or Trump, in captioning you would use:
  - (cap:"clinton" OR cap:"sanders" OR cap:"trump")
- -. You can place a minus sign in front of any operator, word or phrase to exclude it. For example " -cap:"sanders" " would exclude results that contained "sanders" in the station-provided captioning from your results. NOTE that this MUST be combined with at least one non-negative operator.
  - cap:"biden" -cap:"sanders"
MODE. This specifies the specific output you would like from the API, ranging from timelines to word clouds to clip galleries.
- - CapNlpEntityDetails. This is a special mode that outputs the complete list of entries suitable for using with the "capnlp" search operator sorted by the number of times they appear in the data. Remember that you must use the "MID" field as your query – the "Label" field is just for display for human consumption. At this time Google's Cloud Natural Language API does not return a canonical label for each MID, so this is the most common label applied by the API over the television transcripts. Use the "MAXRECORDS" URL parameter to control the number of returned entries (ie "&maxrecords=100" will return the top 100 entries). Note that this mode is only available with CSV and JSON/JSONP output.
  - ClipGallery. This displays up to the top 50 most relevant seconds of airtime matching your search. Each returned second includes a thumbnail image captured at the beginning of that second, the name of the source show and station, the time the clip aired, the captioning, OCR'd text and visual labels for the second of airtime and a link to view a one minute clip beginning at that second on the Internet Archive's website. This allows you to see what kinds of clips are matching and view the full clip to gain context on your search results. In HTML output, this mode displays a "high design" visual layout suitable for creating magazine-style collages of matching coverage. When embedded as an iframe, the API uses the same postMessage resize model as the DOC 2.0 API.
  - ShowChart. This determines what percent of your search results were from each specific television show and displays the top 10 shows.
  - StationChart. This compares how many seconds of airtime your search generates from each of the selected stations over the selected time period, allowing you to assess the relative attention each is paying to the topic. Note that in HTML mode, you can use the button at the top right of the display to save it as a static image or save its underlying data.
  - StationDetails. This is a special mode that outputs the complete list of all stations that are available for searching. Note that this mode is only available with JSON/JSONP output.
  - TimelineVol. This tracks how many seconds of airtime matches your search by day/hour over the selected time period, allowing you to assess the relative attention each is paying to the topic and how that attention has varied over time. By default, the timeline will not display the most recent 24 hours, since those results are still being generated (it can take up to 24-48 hours for a show to be processed by the Internet Archive and ready for analysis), but you can include those if needed via the LAST24 option. You can also smooth the timeline using the TIMELINESMOOTH option and combine all selected stations into a single time series using the DATACOMB option. Note that in HTML mode, you can toggle the station legend using the button that appears at the top right of the display or export the timeline as a static image or save its underlying data. This displays a traditional line-based timeline.
  - TimelineVolHeatmap. Displays an hourly timeline of total airtime seconds matching the query, rapidly pinpointing hourly trends, where the X axis is days and Y axis is hours from 0 to 23. Each cell is color-coded from white (0) to dark blue (maximum value). Note that this visualization is very computationally expensive and thus may take several seconds or longer to return.
  - TimelineVolStream. Same as "TimelineVol" but displays as a streamgraph rather than a line-based timeline.
  - TimelineVolNorm. This displays the total airtime in seconds monitored from each of the stations in your query. It must be combined with a valid query, since it displays the airtime for the stations queried in the search. This mode can be used to identify brief monitoring outages or for advanced normalization, since it reports the total amount of clips monitored overall from each station in each day/hour.
  - VisualEntityDetails. This is a special mode that outputs the complete list of entries suitable for using with the "visual" search operator sorted by the number of times they appear in the data. Use the "MAXRECORDS" URL parameter to control the number of returned entries (ie "&maxrecords=100" will return the top 100 entries). Note that this mode is only available with CSV and JSON/JSONP output.
  - WordCloudAsr. This mode returns the top words that appear most frequently in the automated machine-generated transcript of the airtime seconds matching your search. It takes the 200 most relevant clips that match your search and displays a word cloud of up to the top 200 most frequent words that appeared in the ASR transcripts of those seconds (common stop words are automatically removed). This is a powerful way of understanding the topics and words dominating the relevant coverage and suggesting additional contextual search terms to narrow or evolve your search. Note that if there are too few matching clips for your query, the word cloud may be blank. Note that in HTML mode, you can use the options at the bottom right of the display to save it as a static image or save its underlying data.
  - WordCloudCap. This mode returns the top words that appear most frequently in the human-transcribed station-provided closed captioning of the airtime seconds matching your search. It takes the 200 most relevant clips that match your search and displays a word cloud of up to the top 200 most frequent words that appeared in the station closed captioning of those seconds (common stop words are automatically removed). This is a powerful way of understanding the topics and words dominating the relevant coverage and suggesting additional contextual search terms to narrow or evolve your search. Note that if there are too few matching clips for your query, the word cloud may be blank. Note that in HTML mode, you can use the options at the bottom right of the display to save it as a static image or save its underlying data.
  - WordCloudCapNLP. This mode returns the top Google Knowledge Graph concepts that appear most frequently in the human-transcribed station-provided closed captioning of the airtime seconds matching your search as annotated by Google's Natural Language API. It takes the 200 most relevant clips that match your search and displays a word cloud of up to the top 200 most frequent concepts that appeared in the station closed captioning of those seconds. This is a powerful way of understanding the topics and words dominating the relevant coverage and suggesting additional contextual search terms to narrow or evolve your search. Note that if there are too few matching clips for your query or if Google's API did not identify any entries with MID codes, the word cloud may be blank. Note that in HTML mode, you can use the options at the bottom right of the display to save it as a static image or save its underlying data.
  - WordCloudOCR. This mode returns the top words that appear most frequently in the OCR'd onscreen text of the airtime seconds matching your search. It takes the 200 most relevant clips that match your search and displays a word cloud of up to the top 200 most frequent words that appeared onscreen. This is a powerful way of understanding the topics and words dominating the relevant coverage and suggesting additional contextual search terms to narrow or evolve your search. Note that if there are too few matching clips for your query, the word cloud may be blank. Note that in HTML mode, you can use the options at the bottom right of the display to save it as a static image or save its underlying data.
  - WordCloudVisual. This mode returns the top words that appear most frequently in the OCR'd onscreen text of the airtime seconds matching your search. It takes the 200 most relevant clips that match your search and displays a word cloud of up to the top 200 most frequent words that appeared onscreen. This is a powerful way of understanding the topics and words dominating the relevant coverage and suggesting additional contextual search terms to narrow or evolve your search. Note that if there are too few matching clips for your query, the word cloud may be blank. Note that in HTML mode, you can use the options at the bottom right of the display to save it as a static image or save its underlying data.
FORMAT. This controls what file format the results are displayed in. Not all formats are available for all modes. To assist with website embedding, the CORS ACAO header for all output of the API is set to the wildcard "*", permitting universal embedding.
- HTML. This is the default mode and returns a browser-based visualization or display. Some displays, such as word clouds, are static images, some, like the timeline modes, result in interactive clickable visualizations, and some result in simple HTML lists of images or articles. The specific output varies by mode, but all are intended to be displayed directly in the browser in a user-friendly intuitive display and are designed to be easily embedded in any page via an iframe.
- CSV. This returns the requested data in comma-delimited (CSV) format. The specific set of columns varies based on the requested output mode. Note that since some modes return multilingual content, the CSV is encoded as UTF8 and includes the UTF8 BOM to work around Microsoft Excel limitations handling UTF8 CSV files.
- JSON. This returns the requested data in UTF8 encoded JSON. The specific fields varies by output mode.
- JSONP. This mode is identical to "JSON" mode, but accepts an additional parameter in the API URL "callback=XYZ" (if not present defaults to "callback") and wraps the JSON in that callback to return JSONP compliant JavaScript code.
- RSS. This is only available in ClipGallery mode and returns the results in RSS 2.0 format.
- JSONFeed. This is only available in ClipGallery mode and returns the results in JSONFeed 1.0 format.
DATACOMB. By default, both timeline and station chart modes separate out each matching station's data to make it possible to compare the relative attention paid to a given topic by each station. Sometimes, however, the interest is in overall media attention, rather than specific per-station differences in that coverage. Setting this parameter to "combined" will collapse all matching data into a single "Combined" synthetic station and make it much easier to understand macro-level patterns.
DATERES. By default results are returned in hourly resolution (for <7 day timespans), daily resolution for <3 year timespans and monthly resolution otherwise. You can override these settings and manually set the time resolution to any of the following values. NOTE that this will automatically adjust the STARTDATETIME and ENDDATETIME parameters to their start/stop dates/times at the given date resolution.
- Hour. This is only available for timespans of 7 or fewer days. Rounds start/stop dates to their nearest full hours.
- Day. Rounds start/stop dates to their nearest full day.
- Week. Rounds start/stop dates to their nearest full week.
- Month. Rounds start/stop dates to their nearest full month.
- Year. Rounds start/stop dates to their nearest full year.
LAST24. It can take the Internet Archive up to 24 hours to process a broadcast once it concludes. Thus, by default the TV API does not return results from the most recent 24 hours to ensure that analyses are not skewed by partial results. However, when tracking breaking news events, it may be desirable to view partial results with the understanding that any time or station-based trends may not accurately reflect the totality of their coverage. In such cases, ClipGallery and WordCloud modes may be particularly insightful. To include results from the most recent 24 hours, set this URL parameter to "yes".
MAXRECORDS. This option only applies to ClipGallery, CapNlpEntityDetails and VisualEntityDetails mode. By default 50 clips are displayed in HTML format for ClipGallery mode. In JSON, JSONP or CSV formats, up to 3,000 clips can be returned.
SORT. By default results are sorted by relevance to your query. Sometimes you may wish to sort by date or tone instead.
- DateDesc. Sorts matching clips by broadcast date, displaying the most recent clips first.
- DateAsc. Sorts matching clips by broadcast date, displaying the oldest clips first.
STARTDATETIME/ENDDATETIME. These parameters allow you to specify the precise start and end date/times to search, instead of using an offset like with TIMESPAN.
- STARTDATETIME. Specify the precise date/time in YYYYMMDDHHMMSS format to begin the search – only articles published after this date/time stamp will be considered. The earliest available date is July 2, 2009. If you do not specify an ENDDATETIME, the API will search from STARTDATETIME through the present date/time.
- ENDDATETIME. Specify the precise date/time in YYYYMMDDHHMMSS format to end the search – only articles published before this date/time stamp will be considered. The earliest available date is July 2, 2009. If you do not specify a STARTDATETIME, the API will search from July 2, 2009 through the specified ENDDATETIME.
TIMELINESMOOTH. This option is only available in timeline mode and performs moving window smoothing over the specified number of time steps, up to a maximum of 30. Timeline displays can sometimes capture too much of the chaotic noisy information environment that is the television landscape, resulting in jagged displays. Use this option to enable moving average smoothing up to 30 time steps to smooth the results to see the macro-level patterns. Note that since this is a moving window average, peaks will be shifted to the right, up to several days or weeks at the heaviest smoothing levels.
TIMESPAN. By default the TV 2.0 AI API searches the entirety of the analyzed broadcasts, which currently extends back to July 2010. You can narrow this range by using this option to specify the number of months, weeks, days or hours (minimum of 1 hour). The API then only searches airtime within the specified timespan backwards from the present time. If you would instead like to specify the precise start/end time of the search instead of an offset from the present time, you should use the STARTDATETIME/ENDDATETIME parameters.
- Hours. Specify a number followed by "h" or "hours" to provide the timespan in hours.
- Days. Specify a number followed by "d" or "days" to provide the timespan in days.
- Weeks. Specify a number followed by "w" or "weeks" to provide the timespan in weeks.
- Months. Specify a number followed by "m" or "months" to provide the timespan in months.
- Years. Specify a number followed by "y" or "years" to provide the timespan in years.
TIMEZONEADJ. By default STARTDATETIME, ENDDATETIME and all returned results are interpreted and reported in the UTC timezone. Use this parameter to adjust to any of the following major timezones: -12:00, -11:00, -10:00, -09:30, -09:00, -08:00, -07:00, -06:00, -05:00, -04:00, -03:30, -03:00, -02:30, -02:00, -01:00, +00:00, +01:00, +02:00, +03:00, +03:30, +04:00, +04:30, +05:00, +05:30, +05:45, +06:00, +06:30, +07:00, +08:00, +08:45, +09:00, +09:30, +10:00, +10:30, +11:00, +12:00, +12:45, +13:00, +13:45 or +14:00. Note that UTC offsets are treated as-is over the entire timespan, meaning that the offset is not adjusted to account for periods in which daylight savings is honored.
TIMEZOOM. This option is only available for timeline modes in HTML format output and enables interactive zooming of the timeline using the browser-based visualization. Set to "yes" to enable and set to "no" or do not include the parameter, to disable. By default, the browser-based timeline display allows interactive examination and export of the timeline data, but does not allow the user to rezoom the display to a more narrow time span. If enabled, the user can click-drag horizontally in the graph to select a specific time period. If the visualization is being displayed directly by itself (it is the "parent" page), it will automatically refresh the page to display the revised time span. If the visualization is being embedded in another page via iframe, it will use postMessage to send the new timespan to the parent page with parameters "startdate" and "enddate" in the format needed by the STARTDATETIME and ENDDATETIME API parameters. The parent page can then use these parameters to rewrite the URLs of any API visualizations embedded in the page and reload each of them. This allows the creation of dashboard-like displays that contain multiple TV API visualizations where the user can zoom the timeline graph at the top and have all of the other displays automatically refresh to narrow their coverage to that revised time frame.
ARTRETRES. This option applies only to ClipGallery mode and instructs the API to group results and return a single result from each group. By default, the API returns results at the resolution of a second of airtime, meaning that if a given query matches 3 hours of airtime across two shows, the API will display 3 hours * 60 minutes * 60 seconds = 10,800 results. For many queries this may be the desired behavior, but in other cases you may wish to group the results by a particular metric and see just a single matching result from each.
- byshow. This groups results by show such that only a single second of airtime from each distinct matching show is returned. The specific second of airtime returned for each show is non-deterministic and may change between queries.