A Transformative New Chapter: Translating The Entire Quarter-Century TV News Archive Through Gemini For Just $54K + All Channels Now Translated

Today we are announcing a transformative new chapter in planetary-scale translation and a major step towards our vision of making it possible to see the world through others' eyes. In collaboration with the Internet Archive's Television News Archive, we have completed the Gemini-powered machine translation of the more than 2.4 million non-English broadcasts in the Archive's quarter-century of holdings, and we are now translating all non-English broadcasts from all monitored channels each day. For the first time since the Archive's creation a quarter-century ago, you can now view machine-translated English captioning for any broadcast from any country in any language across the entire TV News Archive!

In total, we translated more than 2.4M broadcasts that we had previously transcribed in their original languages using GCP's Chirp ASR, totaling 4.5B seconds (75M min / 1.2M hours) of airtime and more than 6.8 billion words across 38 billion characters (46GB of text). Using Google Translate, this would have cost more than $760K, but using Gemini 2.5 Flash Non-Thinking it cost just $53,866 (consuming 79 billion input + output tokens). Only the public enterprise Vertex AI Gemini API was used, and no data was used to train or tune any model.
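The cost comparison above can be reproduced as a rough back-of-envelope calculation. The sketch below uses only the rounded figures quoted in this post, plus one assumption that the post does not state: a Google Translate list price of $20 per million characters, which is what makes 38 billion characters come to roughly $760K. The constant and function names are illustrative only.

```python
# Rounded figures quoted in the post.
CHARACTERS = 38_000_000_000        # characters of transcript text
AIRTIME_SECONDS = 4_500_000_000    # seconds of airtime
TOKENS = 79_000_000_000            # Gemini input + output tokens
GEMINI_COST_USD = 53_866           # reported Gemini cost

# ASSUMPTION (not stated in the post): Google Translate list price
# of $20 per million characters.
ASSUMED_TRANSLATE_USD_PER_M_CHARS = 20


def translate_cost_usd(characters, usd_per_m_chars=ASSUMED_TRANSLATE_USD_PER_M_CHARS):
    """Estimated cost of translating `characters` at a per-million-character rate."""
    return characters / 1_000_000 * usd_per_m_chars


def summarize():
    """Derive unit costs and the savings ratio from the quoted figures."""
    hours = AIRTIME_SECONDS / 3600
    translate_usd = translate_cost_usd(CHARACTERS)
    return {
        "airtime_hours": hours,
        "translate_cost_usd": translate_usd,
        "savings_ratio": translate_usd / GEMINI_COST_USD,
        "gemini_usd_per_hour": GEMINI_COST_USD / hours,
        "gemini_usd_per_m_chars": GEMINI_COST_USD / (CHARACTERS / 1_000_000),
        "gemini_usd_per_m_tokens": GEMINI_COST_USD / (TOKENS / 1_000_000),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:,.4f}")
```

Because the inputs are rounded, the outputs are approximate: under the assumed rate the Gemini run works out to roughly one-fourteenth of the Google Translate estimate, or a few cents per hour of airtime.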

The end result is that as of today decades of global television, once locked behind language barriers, are now searchable, readable, and understandable by journalists, researchers, historians, policymakers, and the public at large. Events that shaped societies, narratives that influenced nations, and voices that were previously inaccessible can now be examined, compared, and understood across languages and borders.

Moreover, the ability to translate 1.2 million hours of speech in over 150 languages and dialects for just under $54,000 brings the cost of at-scale translation down to a level where libraries and archives can realistically contemplate translating their holdings into the languages of their patrons, making voices from the rest of the world vastly more accessible to journalists and scholars.

As we discussed earlier this month, this milestone required teaching Gemini to understand the concept of time in ASR broadcast transcripts, so that translations can be positioned in time at 2s resolution, and to align sentences whose word order differs sharply between the source language and English (such as languages with ending negation). The move to Gemini has also enabled us, for the first time, to translate highly multilingual and codeswitching broadcasts and "sounds like" ASR mistranscriptions. Critically, in our extensive testing over the last several months, the unique prompting and architecture we use here has reduced hallucination for the narrow task of time-coded translation to below measurable levels. The end result is human-like fluency in translations across more than 150 languages spanning 300 channels from 50 countries on 5 continents over more than 25 years.
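To make "positioned in time at 2s resolution" concrete, here is a minimal sketch of one way word-level ASR timestamps could be grouped into 2-second time-coded lines before translation. This is an illustration only, not the pipeline described above: the input format (pairs of start time in seconds and word), the function names, and the `[MM:SS]` line format are all assumptions.

```python
from collections import defaultdict

BUCKET_SECONDS = 2  # time resolution mentioned in the post


def bucket_words(words, bucket_seconds=BUCKET_SECONDS):
    """Group (start_seconds, word) pairs into fixed-width time buckets.

    Returns a list of (bucket_start_seconds, text) tuples in time order.
    The input format is hypothetical; real ASR output will differ.
    """
    buckets = defaultdict(list)
    # Sort by start time so words stay in spoken order within each bucket.
    for start, word in sorted(words, key=lambda w: w[0]):
        bucket_start = int(start // bucket_seconds) * bucket_seconds
        buckets[bucket_start].append(word)
    return [(t, " ".join(buckets[t])) for t in sorted(buckets)]


def format_timecoded(lines):
    """Render bucketed lines as '[MM:SS] text', one line per bucket."""
    return "\n".join(f"[{t // 60:02d}:{t % 60:02d}] {text}" for t, text in lines)


if __name__ == "__main__":
    sample = [
        (0.2, "bonjour"), (0.9, "à"), (1.5, "tous"),
        (2.1, "voici"), (3.4, "les"), (4.0, "nouvelles"),
    ]
    print(format_timecoded(bucket_words(sample)))
```

Fixed buckets like these cut across sentence boundaries, which is why word-order differences between the source language and English make the alignment problem described above hard: a translated sentence often cannot be split at the same points as the original.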

Explore The Visual Explorer.