Excluding Advertisements From VGEG Video AI Analysis Of Television News

Last April we showed how to analyze top visual trends by day across television news using the Visual Global Entity Graph 2.0. One of the findings was that bursty advertising campaigns like "chicken nuggets" could skew the results if an ad campaign contained very distinctive imagery and ran heavily on a few days without appearing much on the days prior. In December of last year we announced the Advertising Inventory Files (AIF) dataset, which identifies every second of airtime as either news or advertising, apparently with 100% accuracy, using a special data channel provided by the television stations themselves. Unfortunately for visual analysis, this information is recorded in "caption time," meaning it lags the onscreen imagery by a variable delay of several seconds. Earlier this year we announced a special "video time" version of this dataset that uses ASR data to align it with the onscreen visuals.

Using this "video time" AIF dataset, it is trivial to exclude advertisements from VGEG analyses. For example, the query below tallies the top visual labels on February 24, 2020 (the day we saw a surge in chicken nuggets) without using advertising filtering:

SELECT entity.name entity, COUNT(1) cnt
FROM `gdelt-bq.gdeltv2.vgegv2_iatv`, UNNEST(entities) entity
WHERE entity.name IS NOT NULL
  AND station = 'CNN'
  AND DATE(date) = "2020-02-24"
GROUP BY entity
ORDER BY cnt DESC

The top ten most common entities are:

entity              cnt
person              61265
people              53745
phenomenon          48166
snapshot            43073
photography         37504
photograph          36640
news                34063
facial expression   33217
font                30390
public event        29010

In contrast, the query below uses the video time AIF dataset to remove ads:

SELECT entity.name entity, COUNT(1) cnt
FROM `gdelt-bq.gdeltv2.vgegv2_iatv`, UNNEST(entities) entity
WHERE entity.name IS NOT NULL
  AND station = 'CNN'
  AND DATE(date) = "2020-02-24"
  AND date NOT IN (
    SELECT date
    FROM `gdelt-bq.gdeltv2.iatv_aif_vidtime`
    WHERE DATE(date) = "2020-02-24"
      AND station = 'CNN'
      AND type != 'NEWS'
  )
GROUP BY entity
ORDER BY cnt DESC

It yields the following ten most common entities:

entity              cnt
person              48182
people              45040
phenomenon          40471
news                33075
snapshot            31364
facial expression   27636
public event        27277
photography         27063
event               26129
photo caption       25215
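
For readers who want to compare the two top-ten lists programmatically rather than by eye, here is a minimal sketch in Python. The entity names and their order are copied from the two result tables above. The helper name `compare_rankings` is hypothetical and is not part of any GDELT tooling.

```python
# Top-ten entities in rank order, copied from the two result tables.
unfiltered = [
    "person", "people", "phenomenon", "snapshot", "photography",
    "photograph", "news", "facial expression", "font", "public event",
]
filtered = [
    "person", "people", "phenomenon", "news", "snapshot",
    "facial expression", "public event", "photography", "event",
    "photo caption",
]


def compare_rankings(before, after):
    """Return (dropped, added, shifts) between two ranked lists.

    shifts maps each entity present in both lists to the number of
    places it moved; a positive value means it moved up the ranking.
    """
    dropped = [e for e in before if e not in after]
    added = [e for e in after if e not in before]
    shifts = {e: before.index(e) - after.index(e) for e in before if e in after}
    return dropped, added, shifts


dropped, added, shifts = compare_rankings(unfiltered, filtered)
print("dropped:", dropped)
print("added:", added)
print("moved:", {e: s for e, s in shifts.items() if s != 0})
```

The top three entities keep their positions, so only the entities that moved appear in the final line of output.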

Note that "photograph" and "font" have dropped out of the top ten, replaced by "event" and "photo caption," while "news" has moved up from seventh to fourth place and "public event" has overtaken "photography." Meanwhile, "chicken nuggets" received 142 seconds of airtime in the unfiltered results, but just 10 seconds in the filtered results. The count is not zero because using ASR to realign the caption time dataset into video time relies on the timecode of each spoken word and the grouping size of closed captioning lines. If a chicken nugget appears onscreen with no spoken words nearby, during a suitably long silent period coupled with longer captioning lines, a second here and there may slip through, since the ad data channel is natively encoded in caption time without thought to video time analysis.
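
To make the per-second filtering and the occasional leaked second concrete, the sketch below runs a query of the same shape as the filtered query on toy data. Everything in it is an assumption made for illustration: SQLite stands in for BigQuery so the example runs locally, the `vgeg` and `aif` tables are simplified stand-ins for the real tables, the rows are invented, and the 'AD' type value is hypothetical (the real query only relies on type != 'NEWS').

```python
import sqlite3

# Toy stand-ins for the VGEG and video-time AIF tables (assumed schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vgeg (date TEXT, station TEXT, entity TEXT);
    CREATE TABLE aif  (date TEXT, station TEXT, type TEXT);
""")
# One row per (second, visual label).
conn.executemany("INSERT INTO vgeg VALUES (?, ?, ?)", [
    ("2020-02-24 10:00:00", "CNN", "person"),
    ("2020-02-24 10:00:00", "CNN", "news"),
    ("2020-02-24 10:00:01", "CNN", "person"),
    ("2020-02-24 10:00:02", "CNN", "chicken nugget"),
    ("2020-02-24 10:00:03", "CNN", "chicken nugget"),
    ("2020-02-24 10:00:04", "CNN", "chicken nugget"),
])
# One row per second; the last ad second is mislabeled NEWS to mimic
# an imperfect caption-time to video-time realignment.
conn.executemany("INSERT INTO aif VALUES (?, ?, ?)", [
    ("2020-02-24 10:00:00", "CNN", "NEWS"),
    ("2020-02-24 10:00:01", "CNN", "NEWS"),
    ("2020-02-24 10:00:02", "CNN", "AD"),
    ("2020-02-24 10:00:03", "CNN", "AD"),
    ("2020-02-24 10:00:04", "CNN", "NEWS"),
])

AD_FILTER = """
    AND date NOT IN (
        SELECT date FROM aif
        WHERE station = 'CNN' AND date(date) = '2020-02-24'
          AND type != 'NEWS')
"""


def top_entities(conn, exclude_ads):
    """Tally labels per second, optionally dropping non-news seconds."""
    query = f"""
        SELECT entity, COUNT(1) AS cnt FROM vgeg
        WHERE station = 'CNN' AND date(date) = '2020-02-24'
        {AD_FILTER if exclude_ads else ''}
        GROUP BY entity ORDER BY cnt DESC, entity
    """
    return conn.execute(query).fetchall()


print(top_entities(conn, exclude_ads=False))
# [('chicken nugget', 3), ('person', 2), ('news', 1)]
print(top_entities(conn, exclude_ads=True))
# [('person', 2), ('chicken nugget', 1), ('news', 1)]
```

In this toy data the filter removes the two seconds flagged as advertising, but the third nugget second survives because its AIF row is labeled NEWS, which mirrors how a few seconds can slip through in the real dataset.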

Nevertheless, this analysis shows how easy it is to remove advertisements from visual analyses now!