Kalev spoke today as part of the DOD SMA (Strategic Multilayer Assessment) EUCOM speaker series in a talk titled "Exploring A Year Of Russian Television News’ Propaganda Landscape Through AI: From Speech & Visual Analysis To ChatGPT":
The Internet Archive’s TV News Archive today encompasses more than 5.2 million television news broadcasts totaling 3.4 million hours from 108 channels spanning 50 countries and territories in 35 languages and dialects over 20 years on 5 continents. Last year the Archive expanded this collection to include a selection of Belarusian, Iranian, Russian and Ukrainian channels to enable journalists and scholars to better understand the domestic narratives surrounding Russia’s invasion of Ukraine and Iran’s protests. In collaboration with the Archive, the GDELT Project is machine transcribing and translating these channels, resulting in more than a quarter-million broadcasts totaling 1.1 billion words of English text. We explore a year of the propaganda landscape of Russian television news using a range of AI tools: OCR text extraction, speech recognition and machine translation; object and activity detection, visual geocoding and logo detection; at-scale clip and scene tracking; visual thematic analysis (such as how the prevalence of military imagery has changed dramatically over the past year in concert with battlefield performance); at-scale network analyses of who appears alongside whom; and visual search, automatic summarization and Q&A through OpenAI’s CLIP and ChatGPT and Google’s Bard.