Snopes: Fact Check: Trump WH Falsely Claimed USAID Funded 'Transgender Comic Book' In Peru
Snopes uses the TV News Archive in its fact check of claims about USAID funding.
Behind The Scenes: A First Glimpse At ASR Statistics From 2.5 Million Hours Of Global TV News Spanning 50 Countries & A Quarter Century
Last year we announced the successful completion of Large Speech Model (LSM)-powered ASR over the totality of the uncaptioned Television…
Behind The Scenes: A Look At 16 Years Of Advertising Density On Television News
We are tremendously excited to announce today the completion of our analysis of captioning mode information across the totality of…
Behind The Scenes: 1.9 Million Hours & 13.8 Billion Words Of Closed Captioning Spanning 17 Years Of Television News
Yesterday we previewed some initial statistics from our work identifying and removing advertisements from closed captioning transcripts across the TV…
Behind The Scenes: Some Initial Archive-Scale Closed Captioning Statistics
Only a portion of the TV News Archive's broadcasts contain broadcaster-provided closed captioning, but by virtue of being largely human-transcribed…
At-Scale OCR Of Television News: 18.8 Billion Seconds Of Global Television News OCR'd For $71K Vs $47M
We are tremendously excited to announce today that in collaboration with the Internet Archive's Television News Archive, we have completed…
Behind The Scenes: Identifying Mismatches Between Expected And Real Video File Durations & Single Version Of The Truth (SVOT)
One of the most complex and time-consuming aspects of working with vast historical archives is diagnosing and addressing the myriad…
At-Scale OCR Of Television News Experiments: OCR Of Interlaced Video Using GCP's Cloud Vision
Amongst the TV News Archive's quarter-century of global broadcasts are interlaced broadcasts, which produce the tell-tale jagged ghosting seen below…
At-Scale OCR Of Television News Experiments: Optimizing The Still Frame File Storage Format
Analyzing petascale video archives poses unique computational challenges, from the underlying processor and accelerator requirements to simply moving that much…
From LSMs To LMMs For ASR: Evaluating Gemini's Performance At Transcribing An Evening News Broadcast
As we continue to evaluate the rapid progress of large model ASR systems, from lightly to heavily generative LSMs to…
Comparing GCP's Chirp & Chirp 2 ASR Models: Dropping Entire Passages
Yesterday we examined how GCP's new Chirp 2 ASR model hallucinates speech during non-verbal musical interludes in news broadcasts, resulting…
Comparing GCP's Chirp & Chirp 2 ASR Models: Hallucinating Speech During Music
Over the past six months we have continued to compare GCP's Chirp and Chirp 2 ASR models, each time finding…
Audience-Specific Podcasts: Customizing Our Daily "Top Stories" Biosurveillance Podcast Concept For Experts, Policymakers & The American Public
Yesterday we demonstrated feeding a daily roundup of global disease outbreak news headlines from around the world into a "thinking"…
A Daily "Top Stories" Global Disease Outbreak Podcast Concept Using GCP's Gemini 2.0 Thinking + Text-to-Speech API
What might it look like to feed a daily roundup of global disease outbreak news headlines in all the world's…
Using GCP's Chirp + Gemini 1.5 Pro + Speech-To-Text API To Summarize A Day Of Russian TV News Into A 3 Minute "Top Stories" Podcast
What might it look like to use GCP's Speech-to-Text API's Chirp LSM model to machine transcribe a full day of…
A Daily "Top Stories" Global Investment News Podcast Concept Using GCP's Gemini 2.0 Thinking + Text-to-Speech API
What might it look like to feed a daily roundup of global investment news headlines in all the world's languages…
A Daily "Top Stories About NVIDIA" News Podcast Concept Using GCP's Gemini 2.0 Thinking + Text-to-Speech API
What might it look like to feed a daily roundup of news headlines about NVIDIA from across the world in…
At-Scale OCR Of Television News Experiments: OCR'ing 10 Billion Seconds Of Global TV News For Just $47.5K Vs $26.9M
In collaboration with the Internet Archive's Television News Archive, we have successfully OCR'd 4.2 million television news broadcasts from around…
Behind The Scenes: Identifying Failed Recordings: Using Large Multimodal Models Like ChatGPT & Gemini: Part 3
Continuing our series examining whether Large Multimodal Models (LMMs) like ChatGPT and Gemini might be able to help us identify…
Behind The Scenes: Identifying Failed Recordings: Using Large Multimodal Models Like ChatGPT & Gemini: Part 2
Earlier this week we demonstrated the limitations of using Large Multimodal Models (LMMs) like ChatGPT and Gemini to detect corrupted…
Behind The Scenes: Identifying Failed Recordings: Using Large Multimodal Models Like ChatGPT & Gemini
As we continue our efforts to scan the TV News Archive for failed recordings, how might Large Multimodal Models (LMMs)…
The Influence of Media Propaganda on Green Housing Consumption in China Based on GDELT Big Data
As part of China's two-carbon strategy, green buildings are a vital component in addressing climate change. Formulating a media propaganda strategy to…
Behind The Scenes: Identifying Failed Recordings: Examining Curation Metadata
Any large longitudinal audiovisual archive will have some number of recordings that suffer from technical errors, ranging from minor audio…
Behind The Scenes: API Quotas & The Impact Of A Fraction Of A QPS
All hosted APIs have rate-limited quotas of some form to protect them from abuse and to ensure equal sharing of…