CNS Research Showcase: Television As Data: Opening The Internet Archive’s Two Decade Archive Of Global Television News Spanning 50 Countries To Journalists & Scholars Through AI, Analytics, Search & Visualization

Kalev will be speaking at Indiana University's Cyberinfrastructure for Network Science Center (CNS) today on GDELT's collaborations with the Internet Archive's TV News Archive around interface development for interacting with large video archives:

How can treating television news as data create fundamentally new kinds of opportunities for journalists and scholars to conduct at-scale computational analysis of the global narrative landscape and to build new kinds of search and analytic tools that render the traditionally impenetrable linear format of video into a rich source of insights on human society? How can AI tools like OCR, object detection, embeddings, language understanding, knowledge graphs, transcription, translation and visual search make it possible to search television news in powerful new ways aligned with the needs of journalists, fact checkers and scholars? How can such tools help connect television news to social media, online news and radio news, allowing narratives to be traced as they move across the media ecosystem and even visualized at scale? How can video be made “skimmable” and the hundreds of terabytes of annotations from such tools turned into actionable insights and tools usable by everyone from data scientists to journalists, fact checkers and even ordinary citizens?

In collaboration with the Internet Archive’s TV News Archive, GDELT is working to help scholars and journalists understand and visualize the Archive’s extraordinary public interest library of global television news spanning more than 5.2 million broadcasts totaling 3.4 million hours of airtime from over 100 channels representing 50 countries in 35 languages over 20 years on 5 continents. Through an ever-growing landscape of non-consumptive algorithms, datasets and interfaces, from keyword search of closed captioning and OCR’d onscreen text, to experiments with cutting-edge visual search ranging from logo detection to YOLO’s object detection and OpenAI’s natural language CLIP search, to 3 billion “video ngrams” totaling 1 quadrillion pixels, to making television “skimmable” through the Visual Explorer’s unique user interface, we are exploring how millions of hours of television can yield unprecedented insights into the heartbeat of Planet Earth.