Yesterday we showcased how Google's Cloud Vision API could be used to rapidly and non-consumptively annotate a television news broadcast using the TV News Visual Explorer's downloadable preview image ZIP file, which contains the full-resolution versions of the images in the thumbnail grid that sample the broadcast at one frame every 4 seconds. In our demo, we asked the Vision API to identify appearances of major logos anywhere in the frames. How might we use this to identify appearances of Fox News clips on Russian television news?
In yesterday's demo, we used the broadcast "Факты" that aired on Russia24 this past Tuesday at 7PM Moscow time, in which a quick visual skim shows Fox News clips appearing in two different places. How might we locate those two appearances using a fully automated workflow instead?
Download the ZIP file containing all of the Vision API's annotations of yesterday's broadcast:
And make sure you have the jq JSON query utility installed:
apt-get install -y jq
Now unzip the file and compile a master inventory of all of the logos that Cloud Vision recognized in this broadcast:
unzip RUSSIA24_20220830_160000_Fakti-CVAPI.zip
cd RUSSIA24_20220830_160000_Fakti-CVAPI
cat *.json | jq -r '.responses[0].logoAnnotations[]?.description' | sort | uniq
This yields 83 unique logos:
1492 Pictures
AEG
Aeroflot
African Development Bank
American Airlines
Aptara
Arizona Department of Transportation
Association of Asia Pacific Airlines
BMR Group
Bangalore Institute of Technology
Bangor Savings Bank
Big 5 Sporting Goods
Blurb, Inc.
Bohemians 1905
Breitling SA
Brilliance Auto
CNN
Charms Blow Pops
Clandestine Colombian Communist Party
Corner Bakery Cafe
Croker
Deccan TV
Decon
Delta Air Lines
DiMarzio
Dover Street Market
DuPont
EFAO Zografou B.C.
EVA Air
Eider
Eindhoven University of Technology
Emirates Transport
Endless Computer
Engelbert Strauss
Equinix
European Union
FK Poprad
Father Ryan High School
Fox News
Gazprom
Gazprom Neft
Houghton International
IndyCar
LHV Pank
Lewis Road Creamery
Lightspeed
MNC News
MNP LLP
Mad for Garlic
Magnit
Mahatma Jyotiba Phule Rohilkhand University, Bareilly
Metka
Mindshare
Mission Federal Credit Union
NBC News
NOS
National Anti-Corruption Bureau of Ukraine
National Grid Corporation of the Philippines
Nexity
Paccar
PhosAgro
PlayStation
Qwirkle
Rosatom
Siam Commercial Bank
Slok Air International
SolarEdge
Spar
Stada Arzneimittel
Tata Motors
Tele2
Tesla, Inc.
Texas Rangers
The Beck Group
United Nations Global Compact
University of Huelva
VTB Bank
Vejle Idrætshøjskole
Volvo
Wesley College
Whittier College
WikiLeaks
Yokohama Rubber Company
Now let's create a lookup file for each frame that contains just the list of logos found in that frame, which will make it easier for us to work with the list:
find *.json | parallel --eta 'cat {} | jq -r ".responses[0].logoAnnotations[]?.description" > {}.logos'
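With the per-frame lookups in place, a quick way to get a feel for the data is to rank every logo by how many sampled frames it appears in. The following is a minimal sketch rather than part of the original workflow: the function name `logo_frame_counts` is our own invention, and it assumes the `.logos` files created above sit in the directory passed as its first argument (defaulting to the current one).

```shell
# Rank logos by the number of sampled frames they appear in.
# ASSUMPTION: the per-frame *.logos files live in directory $1 (default: current directory).
logo_frame_counts() {
  dir="${1:-.}"
  for f in "$dir"/*.logos; do
    [ -e "$f" ] || continue
    # De-duplicate within a frame so a logo detected twice in one image counts once
    sort -u "$f"
  done |
    # Skip blank lines, tally frames per logo, and print "count<TAB>logo"
    awk 'NF { n[$0]++ } END { for (l in n) printf "%d\t%s\n", n[l], l }' |
    # Highest frame count first, ties broken alphabetically by logo name
    sort -t "$(printf '\t')" -k1,1nr -k2,2
}
```

Running `logo_frame_counts . | head` inside the unzipped directory would list the most frequently seen logos first. Logos that appear in only a single frame are worth a manual look before drawing conclusions from them.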
Now let's grep those logo lookups to find all of the Fox News appearances:
grep 'Fox News' *.logos
This returns 12 sample frames, each representing 4 seconds of airtime, meaning that roughly 12 * 4 = 48 seconds of airtime in this hour-long broadcast featured a Fox News clip:
RUSSIA24_20220830_160000_Fakti-000087.json.logos:Fox News
RUSSIA24_20220830_160000_Fakti-000088.json.logos:Fox News
RUSSIA24_20220830_160000_Fakti-000090.json.logos:Fox News
RUSSIA24_20220830_160000_Fakti-000091.json.logos:Fox News
RUSSIA24_20220830_160000_Fakti-000092.json.logos:Fox News
RUSSIA24_20220830_160000_Fakti-000428.json.logos:Fox News
RUSSIA24_20220830_160000_Fakti-000430.json.logos:Fox News
RUSSIA24_20220830_160000_Fakti-000431.json.logos:Fox News
RUSSIA24_20220830_160000_Fakti-000432.json.logos:Fox News
RUSSIA24_20220830_160000_Fakti-000433.json.logos:Fox News
RUSSIA24_20220830_160000_Fakti-000434.json.logos:Fox News
RUSSIA24_20220830_160000_Fakti-000435.json.logos:Fox News
We can see that the first span runs from sample frame 87 to sample frame 92 (note that despite the logo being cut off, the API still recognized it), while the second clip runs from sample frame 428 to sample frame 435 and is from a Tucker Carlson episode.
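Reading spans off the grep output by eye works for one broadcast, but the same grouping can be scripted. The sketch below is one possible approach, not the method used for this post: the function name `logo_spans` and its two optional parameters are hypothetical, it tolerates one missed frame inside a span by default (frames 89 and 429 are absent from the list above), and its time offsets rest on the assumption that frames are numbered from 1 so that frame N covers seconds (N-1)*4 through N*4 of the broadcast.

```shell
# Collapse grep output such as "…-000087.json.logos:Fox News" into contiguous spans.
# Usage: grep 'Fox News' *.logos | logo_spans [max_gap] [seconds_per_frame]
logo_spans() {
  max_gap="${1:-1}"   # number of undetected frames tolerated inside one span
  step="${2:-4}"      # seconds of airtime represented by each sampled frame
  # Pull out the frame number (leading zeros stripped) from each matching filename
  sed -n 's/.*-0*\([0-9][0-9]*\)\.json\.logos:.*/\1/p' |
    sort -n | uniq |
    awk -v gap="$max_gap" -v step="$step" '
      function mmss(s) { return sprintf("%02d:%02d", int(s / 60), s % 60) }
      function emit() {
        # ASSUMPTION: frames are numbered from 1, so frame N covers (N-1)*step .. N*step seconds
        printf "frames %d-%d\t%s-%s\t%d matched\n", first, last, mmss((first - 1) * step), mmss(last * step), hits
      }
      # A jump larger than the tolerated gap closes the current span and opens a new one
      NR > 1 && $1 - last > gap + 1 { emit(); first = $1; hits = 0 }
      NR == 1 { first = $1 }
      { last = $1; hits++ }
      END { if (NR) emit() }
    '
}
```

Fed the twelve Fox News lines above, this groups them into two spans, frames 87-92 and 428-435, which under the numbering assumption correspond to roughly 05:44-06:08 and 28:28-29:00 into the broadcast. Passing `0` as the first argument requires strictly consecutive frames instead.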
We can similarly search for excerpted CNN clips:
grep 'CNN' *.logos
Which yields 5 sample frames across 3 separate segments (72-74, 135 and 405):
RUSSIA24_20220830_160000_Fakti-000072.json.logos:CNN
RUSSIA24_20220830_160000_Fakti-000073.json.logos:CNN
RUSSIA24_20220830_160000_Fakti-000074.json.logos:CNN
RUSSIA24_20220830_160000_Fakti-000135.json.logos:CNN
RUSSIA24_20220830_160000_Fakti-000405.json.logos:CNN
You can see the three clips here:
Similarly, searching for NBC News clips:
grep 'NBC News' *.logos
Yields:
RUSSIA24_20220830_160000_Fakti-000465.json.logos:NBC News
RUSSIA24_20220830_160000_Fakti-000466.json.logos:NBC News
RUSSIA24_20220830_160000_Fakti-000467.json.logos:NBC News
RUSSIA24_20220830_160000_Fakti-000471.json.logos:NBC News
Which can be seen:
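The three searches above can also be folded into a single pass that estimates airtime per outlet, using the same "frames times 4 seconds" arithmetic as the Fox News estimate. This is a rough sketch under stated assumptions: the function name `outlet_airtime` is hypothetical, and unlike the plain `grep` commands above it matches the whole line exactly, so that a search for one outlet does not also pick up a longer logo name that merely contains it.

```shell
# Estimate airtime per outlet from the per-frame *.logos files.
# Usage: outlet_airtime DIR OUTLET [OUTLET...]
outlet_airtime() {
  dir="$1"; shift
  step=4   # seconds of airtime represented by each sampled frame
  for outlet in "$@"; do
    # -l lists each matching file once, -x requires a whole-line match, -F disables regex
    frames=$(grep -lxF -- "$outlet" "$dir"/*.logos 2>/dev/null | wc -l | tr -d ' ')
    printf '%s\t%d frames\t~%d seconds\n' "$outlet" "$frames" "$((frames * step))"
  done
}
```

For example, `outlet_airtime . 'Fox News' 'CNN' 'NBC News'` run inside the unzipped directory should reproduce the frame counts reported above (12, 5 and 4), or roughly 48, 20 and 16 seconds of airtime. These are estimates only, since each figure treats every sampled frame as a full 4 seconds of the outlet's footage.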
Performing logo detection like this costs just $1.50 per hour-long broadcast, or $0.60 per broadcast at scale, making it eminently tractable for researchers looking to identify how American news coverage is being repurposed by the Russian state to advance its war propaganda efforts. Not all logos may be recognized (though custom logos can be added through AutoML Vision), and logos that are overly clipped or obscured may not be recognizable, so this approach may not return 100% of the appearances of clips from a given media outlet. Overall, though, it offers an exceptionally powerful and low-cost method of rapidly scanning global media coverage at scale.
You could also easily construct your own bespoke models using TensorFlow or other modeling environments and run them on your own CPU or GPU hardware to perform customized recognition.
We are tremendously excited about the kinds of pioneering new forms of at-scale media analysis the Visual Explorer's preview images make possible and would love to hear from you with your own creative applications using the Visual Explorer!