A Daily "Top Stories" Global Investment News Podcast Concept Using GCP's Gemini 2.0 Thinking + Text-to-Speech API

What might it look like to feed a daily roundup of global investment news headlines in all the world's languages…

A Daily "Top Stories About NVIDIA" News Podcast Concept Using GCP's Gemini 2.0 Thinking + Text-to-Speech API

What might it look like to feed a daily roundup of news headlines about NVIDIA from across the world in…

AFP: US Conservatives Baselessly Tie New Orleans Attacker to Illegal Immigration

AFP Fact Check about the New Orleans attack.

At-Scale OCR Of Television News Experiments: OCR'ing 10 Billion Seconds Of Global TV News For Just $47.5K Vs $26.9M

In collaboration with the Internet Archive's Television News Archive, we have successfully OCR'd 4.2 million television news broadcasts from around…

Behind The Scenes: Identifying Failed Recordings: Using Large Multimodal Models Like ChatGPT & Gemini: Part 3

Continuing our series examining whether Large Multimodal Models (LMMs) like ChatGPT and Gemini might be able to help us identify…

Behind The Scenes: Identifying Failed Recordings: Using Large Multimodal Models Like ChatGPT & Gemini: Part 2

Earlier this week we demonstrated the limitations of using Large Multimodal Models (LMMs) like ChatGPT and Gemini to detect corrupted…

Behind The Scenes: Identifying Failed Recordings: Using Large Multimodal Models Like ChatGPT & Gemini

As we continue our efforts to scan the TV News Archive for failed recordings, how might Large Multimodal Models (LMMs)…

The Influence of Media Propaganda on Green Housing Consumption in China Based on GDELT Big Data

As part of China's two-carbon strategy, green buildings are a vital component in addressing climate change. Formulating a media propaganda strategy to…

Behind The Scenes: Identifying Failed Recordings: Examining Curation Metadata

Any large longitudinal audiovisual archive will have some number of recordings that suffer from technical errors, ranging from minor audio…

Behind The Scenes: API Quotas & The Impact Of A Fraction Of A QPS

All hosted APIs have rate-limited quotas of some form to protect them from abuse and to ensure equal sharing of…

Behind The Scenes: A Look Back At A Month Of Real-World AI API Latency At Scale

Last week we examined a 24-hour period of real-world AI API latency and error rates as an illustration of the…

Behind The Scenes: GCP's Network Intelligence Performance Dashboard

The sheer scale of GCP's core service offerings is such that there is a wealth of hidden gems buried within…

Behind The Scenes: GCP Network Intelligence Topology Mapping Of Our OCR Cluster At Startup

GCP's Network Intelligence service offers an incredibly powerful Network Topology visualization that shows all of the various GCP services being…

Behind The Scenes: Using Our Bigtable + BigQuery + GCS Digital Twin To Queue Missing Broadcasts

Our massive new collaboration with the Internet Archive to OCR its complete quarter-century Television News Archive spans ten million broadcasts…

At-Scale OCR Of Television News Experiments: Comparing 1FPS Frame Extraction Quality Levels

One of the most remarkable and difficult challenges of scaling novel workflows from one-off experiments to massive archive-scale production implementations…

At-Scale OCR Of Television News Experiments: Ornamental Vs Chyron Text

One of the great tradeoffs in using image montaging to achieve a 100-200x performance increase and cost reduction for at-scale…

At-Scale OCR Of Television News Experiments: Vertical + Horizontal Text – Results From A Taiwanese Broadcast

Unlike most of the world, television news broadcasts in Taiwan make regular use of vertical onscreen text that appears alongside…

At-Scale OCR Of Television News Experiments: Results From A Sample Circa-2018 South Sudan Broadcast

Below is a transcribed OCR excerpt from a circa-2018 South Sudanese broadcast. As with our previous examples, the difficulties of…

Behind The Scenes: Managing The Unpredictability Of Cloud AI API Latency At Archive Scale

While all APIs can exhibit variable response times and error rates, AI APIs demonstrate uniquely complex response behaviors. The scarcity…

At-Scale OCR Of Television News Experiments: Results From A Sample Circa-2018 Congolese Broadcast – Part 2

In contrast to yesterday's Congolese television news example, in which Cloud Vision API was unable to transcribe the onscreen chyron…

At-Scale OCR Of Television News Experiments: Results From A Sample Circa-2018 Congolese Broadcast

While our Cloud Vision montaging workflow has yielded highly robust results across the majority of broadcasts we've tested it on,…

At-Scale OCR Of Television News Experiments: Results From A Sample Circa-2018 Nigerian Broadcast

Below is an OCR excerpt from a circa-2018 Nigerian news broadcast, with a fast-paced white textual "crawl" over a red…

At-Scale OCR Of Television News Experiments: Results From A Sample Circa-2012 Venezuelan Broadcast

How does our OCR workflow perform on a circa-2012 Venezuelan broadcast? Of interest, it even captures the "Reuters" byline of…

At-Scale OCR Of Television News Experiments: Results From A Sample Circa-2012 Vietnamese Broadcast

Continuing our OCR series, below is a circa-2012 Vietnamese broadcast. In keeping with the highly multilingual world of global broadcast…