What I Learned At Google's Cloud Next 25

I was in Las Vegas two weeks ago for Google's Cloud Next 25. What few people realize about Next is…

Using Gemini 2.5 Pro As A Persona-Based News Recommender Service: A US Government Policy Analyst Focused On China's Economy

Following our work yesterday using Gemini to create a persona-based news recommender service, let's try an even more powerful model:…

Using Gemini 2.5 Pro As A Persona-Based News Recommender Service: A Day Of Automotive Supply Chains & Tariffs

How do we learn about the major news stories of each day? Increasingly this is through algorithmic filters ranging from…

At-Scale OCR Of Television News Experiments: How OCR And Captioning Tell Different Stories About PBS In One Broadcast

Yesterday we offered the first statistics of just how much onscreen text there can be in a single hour-long American…

At-Scale OCR Of Television News Experiments: First Results & Broadcast-Level Statistics

To date, we have OCR'd more than 18.8 billion seconds of global television news spanning 300 channels from 50 countries…

Frontier AI Grand Challenge Problems: Grounding Vs Recency In The Hallucination Fight

As the existential challenges of AI hallucination have become ever more apparent, model vendors have increasingly moved to offer "grounding"…

At-Scale OCR Of Television News Experiments: Using SRT Files For Scholarly Analysis Of OCR Text Of Video

GDELT represents one of the largest initiatives in the world devoted to understanding global society through data. The sheer magnitude…

At-Scale OCR Of Television News Experiments: You Only Get What's In The Frame

At the top of this page you can see an interesting frame from our efforts to index the complete onscreen…

WashPost: Trump’s D.C. U.S. attorney pick appeared on Russian state media over 150 times

The Washington Post uses the TV News Archive's Russia Today archives to identify more than 150 appearances of Ed Martin…

Behind The Scenes: Comparing Bigtable's Python & Go Libraries & Using Gemini 2.5 Pro To Translate Python To Go

GDELT brings together myriad tools, APIs, libraries, scripts and binaries written across a range of programming languages, of which only…

LMMs & Gemini 2.5 Pro Watching Television News: Visually Summarizing & Segmenting TV News Into Stories: A Year Later Part 3

While false positives in Gemini 2.5 Pro's safety filters prevented us from examining how well it could identify the major…

LMMs & Gemini 2.5 Pro Watching Television News: Visually Summarizing & Segmenting TV News Into Stories: A Year Later Part 2

Yesterday we found that Gemini's visual understanding capabilities were roughly where we left them a year ago. However, we were…

LMMs & Gemini Watching Television News: Visually Summarizing & Segmenting TV News Into Stories: A Year Later Part 1

Just over a year ago we explored having the then-state-of-the-art LMM Gemini 1.5 Pro "watch" an…

This Week: Google Cloud Next 25

We'll be at Google Cloud Next 25 this week and look forward to talking. Drop us a line if you're…

Frontier AI Grand Challenge Problems: Corpus-Scale Reasoning Over A Global 200GB 150+ Language Archive

The most powerful generally available production AI models today max out at around 1-2M total context window tokens, with an…

Television News Visual Explorer: Continual Ongoing ASR Now Live

We are excited to announce today that continual ongoing ASR of the entire TV News Archive is now live! As…

Television News Visual Explorer: ASR Of All Uncaptioned Broadcasts 2001-Present Complete

We are excited today to announce that we have completed machine transcription of every single uncaptioned broadcast in the entire…

Behind The Scenes: Detecting Broadcasts Missing Audio Streams & The Inadvertent Challenges Of Improved Resilience

Our longstanding ASR workflow was based on a submit-once model in which each broadcast needing transcription was submitted a single…

Behind The Scenes: Speeding GCS Inventories By 10X By Dropping Wildcards

A key part of our ASR pipeline involves determining which of the inflight broadcasts has completed processing through GCP Chirp….

CNN: ‘Nobody is above the law’: Trump officials who criticized Clinton’s emails now under scrutiny for leaked war plans

Several top Trump administration officials are facing scrutiny for sending detailed operational plans and other likely highly classified information about…

Death Of A Unicorn & Why Search Quality Is The Make-Or-Break Centerpiece Of The AI Era

Last night I went to AMC's early access screening of Death of a Unicorn (WARNING: spoilers below) and had some…

Washington Post: The new boundary of dangerous DEI in the military: Jackie Robinson

The Washington Post's Philip Bump examines media coverage of the DOD and DEI.

Behind The Scenes: The Strange Case Of An Unhappy Service Account & The Unique Complexities Of AI Infrastructure

Modern cloud-based computing infrastructure has grown so reliable over the years that even larger-scale workflows rarely encounter error rates sufficiently…

Behind The Scenes: Scaling From Proof Of Concept To Realtime Production To Archive Scale Retrospective

Migrating a new idea from proof of concept one-off experiments powered by a hodgepodge of scripts, notebooks and cobbled-together tools…