Author: Kalev Leetaru
Behind The Scenes: Using Our Bigtable + BigQuery + GCS Digital Twin To Queue Missing Broadcasts
Our massive new collaboration with the Internet Archive to OCR its complete quarter-century Television News Archive spans ten million broadcasts…
At-Scale OCR Of Television News Experiments: Comparing 1FPS Frame Extraction Quality Levels
One of the most remarkable and difficult challenges of scaling novel workflows from one-off experiments to massive archive-scale production implementations…
At-Scale OCR Of Television News Experiments: Ornamental Vs Chyron Text
One of the great tradeoffs in using image montaging to achieve a 100-200x performance increase and cost reduction for at-scale…
At-Scale OCR Of Television News Experiments: Vertical + Horizontal Text – Results From A Taiwanese Broadcast
Unlike most of the world, television news broadcasts in Taiwan make regular use of vertical onscreen text that appears alongside…
At-Scale OCR Of Television News Experiments: Results From A Sample Circa-2018 South Sudan Broadcast
Below is a transcribed OCR excerpt from a circa-2018 South Sudanese broadcast. As with our previous examples, the difficulties of…
Politifact: “After meeting with Elon Musk, Republican leader Sen. John Thune announces plans to cut Social Security.”
Politifact examines coverage of Sen. Thune's comments.
Behind The Scenes: Managing The Unpredictability Of Cloud AI API Latency At Archive Scale
While all APIs can exhibit variable response times and error rates, AI APIs demonstrate uniquely complex response behaviors. The scarcity…
At-Scale OCR Of Television News Experiments: Results From A Sample Circa-2018 Congolese Broadcast – Part 2
In contrast to yesterday's Congolese television news example, in which Cloud Vision API was unable to transcribe the onscreen chyron…
The Guardian: The unhinged presentation of Muslims on GB News has been exposed. What will Ofcom do about it?
An analysis of GB News coverage of Muslims.
At-Scale OCR Of Television News Experiments: Results From A Sample Circa-2018 Congolese Broadcast
While our Cloud Vision montaging workflow has yielded highly robust results across the majority of broadcasts we've tested it on,…
At-Scale OCR Of Television News Experiments: Results From A Sample Circa-2018 Nigerian Broadcast
Below is an OCR excerpt from a circa-2018 Nigerian news broadcast, with a fast-paced white textual "crawl" over a red…
At-Scale OCR Of Television News Experiments: Results From A Sample Circa-2012 Venezuelan Broadcast
How does our OCR workflow perform on a circa-2012 Venezuelan broadcast? Of interest, it even captures the "Reuters" byline of…
At-Scale OCR Of Television News Experiments: Results From A Sample Circa-2012 Vietnamese Broadcast
Continuing our OCR series, below is a circa-2012 Vietnamese broadcast. In keeping with the highly multilingual world of global broadcast…
At-Scale OCR Of Television News Experiments: Results From A Sample Circa-2018 Amharic Broadcast – Part 2: Onscreen Documents
Continuing yesterday's examination of OCRing an Amharic-language broadcast, towards the end of that broadcast is a fascinating example of a…
At-Scale OCR Of Television News Experiments: Results From A Sample Circa-2018 Amharic Broadcast – Part 1
Continuing our OCR experiments applying GCP's Cloud Vision API to global television news broadcasts using montaging, below is an excerpt…
At-Scale OCR Of Television News Experiments: What Have We Learned So Far?
In collaboration with the Internet Archive's TV News Archive, we are working to OCR the Archive's entire 7-million-hour quarter-century archive…
HuffPost: Now There's Resurfaced Video Of Pete Hegseth Completely Trashing Trump
A HuffPost analysis of Pete Hegseth.
AFP: Teary-Eyed Trudeau Video is Years Old, Unrelated to Trump Tariffs
AFP Fact Check on Trudeau.
CNN: Hegseth Has A History of Supporting Controversial Policies Involving The Military
A CNN deep dive on Pete Hegseth.
Digital Innovation Towards The Sustainable Development Goals: A Mass Media Analysis
Innovation and technology are essential to reduce the environmental impact of human activities and face the derived environmental and social…
Behind The Scenes: The Perils Of AI-Powered Autonomous Agents In The Real World
We continue to explore the landscape of AI-powered autonomous agents. Despite their immense hype and ubiquitous mediagenic demos on social…
GCP Tips & Tricks: Using The Cloud Monitoring API To Track AI API Usage In Realtime
Yesterday we discussed how we massively optimized our archive-scale OCR throughput by splitting montaging and OCR workloads. When working at the…
Behind The Scenes: Splitting Workloads: OCR Montage Generation Vs API Calls
As we continue to scale up our work OCR'ing a quarter century of global television news broadcasts, one of the…
At-Scale OCR Of Television News Experiments: Results From A Tunisian Broadcast
Below is an example of our Cloud Vision montage pipeline's transcription of a Tunisian broadcast from 2014, showing how it…