When Google's Cloud Video API watches a decade of evening television news broadcasts on ABC, CBS and NBC using the Internet Archive's Television News Archive, what are the visual themes it identifies in all that coverage?
In all, the Video API recognized 10,592 distinct entities across the decade of television, with the top themes being "professional", "product", "official", "mode of transport" and "public relations". Others, like "newscaster", "newsreader" and "journalist", reflect the studio and field-based coverage typical of evening news broadcasts.
The complete list can be downloaded below. Remember that the Video API works by identifying appearances of a predefined collection of a few tens of thousands of topics, so the labels in this list represent entries in that taxonomy.
Creating this list took only a single query in BigQuery:
SELECT
  entity.name AS EntityName,
  COUNT(1) AS TotalClips,
  COUNT(DISTINCT iaShowId) AS DistinctShows,
  SUM(entity.numSeconds) AS TotAirtime
FROM `gdelt-bq.gdeltv2.vgeg_iatv`, UNNEST(entities) AS entity
GROUP BY entity.name
ORDER BY TotAirtime DESC
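For readers less familiar with UNNEST, the following is a minimal Python sketch of what the query computes, run against a handful of made-up rows rather than the real table. It assumes each table row carries an iaShowId and a repeated entities field whose items have name and numSeconds, as the query implies. The sample values and the summarize_entities helper are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows, shaped the way the query implies:
# one iaShowId per row plus a repeated list of recognized entities.
rows = [
    {"iaShowId": "SHOW_A", "entities": [
        {"name": "newscaster", "numSeconds": 12},
        {"name": "product", "numSeconds": 5},
    ]},
    {"iaShowId": "SHOW_A", "entities": [
        {"name": "newscaster", "numSeconds": 8},
    ]},
    {"iaShowId": "SHOW_B", "entities": [
        {"name": "newscaster", "numSeconds": 3},
        {"name": "official", "numSeconds": 7},
    ]},
]


def summarize_entities(rows):
    """Mimic the query: flatten entities, group by name, sort by airtime."""
    stats = defaultdict(lambda: {"TotalClips": 0, "shows": set(), "TotAirtime": 0})
    for row in rows:
        # The inner loop plays the role of UNNEST(entities): each entity
        # becomes its own record, paired with the parent row's iaShowId.
        for entity in row["entities"]:
            s = stats[entity["name"]]
            s["TotalClips"] += 1                      # COUNT(1)
            s["shows"].add(row["iaShowId"])           # COUNT(DISTINCT iaShowId)
            s["TotAirtime"] += entity["numSeconds"]   # SUM(entity.numSeconds)
    result = [
        {
            "EntityName": name,
            "TotalClips": s["TotalClips"],
            "DistinctShows": len(s["shows"]),
            "TotAirtime": s["TotAirtime"],
        }
        for name, s in stats.items()
    ]
    # ORDER BY TotAirtime DESC
    result.sort(key=lambda r: r["TotAirtime"], reverse=True)
    return result


summary = summarize_entities(rows)
for record in summary:
    print(record)
```

With these toy rows, "newscaster" ranks first because its seconds are summed across all three rows, while its DistinctShows is 2 because two of those rows share the same show.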