As demand for our APIs has grown exponentially, powering an increasingly diverse, global ecosystem of applications, we have reached the practical limits of our legacy Elasticsearch-based architecture. Despite years of optimization, the sheer scale of incoming requests has begun to outpace what this system can sustainably support. The result has been an increasing need to shed requests under peak load, degrading both programmatic access and the interactive user experience.
With the upcoming launch of GDELT 5, we are ushering in a fundamentally new era for our infrastructure. In the coming weeks, we will begin migrating our services to Spanner, Google’s globally distributed hyperdatabase platform. This transition will begin with the relaunch of the Television News Explorer, followed by the Online News Explorer, and ultimately our entire API stack, all powered by Spanner’s massive global scalability. Beyond pure scale, Spanner offers something even more transformative: seamless integration of semantically enriched keyword search, vector search, and graph-based querying. This convergence positions us to introduce a new generation of interfaces that enable agentic, personalized, and interactive search, allowing researchers to dynamically refine queries, explore evolving narratives, and uncover deeper thematic structures across global media.
Rethinking Search at Planetary Scale
The relaunch of the Television News Explorer marks the first major deployment of this new architecture. Unlike traditional keyword search engines, which simply return a list of documents containing a given keyword, our research-oriented platform requires a far more granular capability: the ability to identify and count every individual mention of a keyword or phrase, precisely timestamped, spanning:
- More than 7 million hours of broadcasts from 300 channels in 50 countries
- Over 150 languages and dialects
- Both native-language and translated transcripts
- Future layers of semantic annotation, vector embeddings, and graph relationships
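The distinction between this requirement and conventional search is easy to miss, so here is a toy Python sketch (illustrative only, with made-up text) contrasting document-level matching with the exhaustive, per-mention enumeration the Explorer needs:

```python
# Toy illustration: document-level matching vs. exhaustive mention enumeration.
transcript = "the white house said the white house budget passed"

# A document-level engine answers: "does this transcript match?" -> one hit.
doc_hit = "white house" in transcript

def enumerate_mentions(text, phrase):
    """Return the character offset of every occurrence of phrase."""
    offsets, start = [], 0
    while (i := text.find(phrase, start)) != -1:
        offsets.append(i)
        start = i + 1
    return offsets

# Mention-level search must count and locate every occurrence so each one
# can be timestamped and tallied.
mentions = enumerate_mentions(transcript, "white house")
print(doc_hit)        # True  (one matching document)
print(len(mentions))  # 2     (two distinct mentions to count)
```

In production the offsets map back to precise broadcast timestamps rather than character positions, but the counting requirement is the same.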
Designing an architecture that can handle this level of detail at global scale required rethinking nearly every assumption of traditional search infrastructure. While we have more than a decade of experience running Elasticsearch under high load and more than 30 years of high-performance database design on classic "big iron" NUMA HPC systems, Spanner's extreme scale and globally distributed architecture presented unique opportunities to reimagine how we think about search.
Gemini As Our In-House Spanner Expert
Traditionally, even with extensive Spanner experience, a transition of this scale would involve months of deep documentation review, prototyping, and iterative experimentation. Instead, given the massive scale of the architectural design we needed to create, the nearly infinite number of possible approaches, and the fact that we had never used Spanner before, we wanted to see how Gemini 3 might act as our in-house Spanner expert.
Initially, we tried a traditional LLM-assisted coding approach: we worked interactively with Gemini to translate our existing Elasticsearch-based design into Spanner equivalents. However, this approach quickly proved limiting. By anchoring too closely to legacy assumptions, the resulting architectures failed to fully exploit Spanner’s strengths.
We then shifted strategy. Rather than describing how the system worked, we described what the data looked like and what queries needed to be supported.
Phase 1: Gemini-Assisted Design
The first phase involved iterative, human-in-the-loop refinement: reviewing Gemini’s proposals, providing feedback, and guiding its design decisions.
A key challenge quickly emerged: Gemini consistently reverted to conventional document-centric search paradigms, returning lists of documents with matching snippets. Our problem, however, is fundamentally different: we operate on time-coded transcripts and require exhaustive, timestamp-level enumeration of matches.
Even with transcript chunking (e.g., 15-second segments), Gemini struggled to design indexing schemes that could correctly handle phrase searches spanning chunk boundaries while maintaining efficiency at scale under high query loads and across all of the languages we have to support.
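The boundary problem is easiest to see with a concrete (hypothetical) pair of adjacent 15-second chunks: a phrase that straddles the cut is invisible to any scheme that searches each chunk in isolation.

```python
# Hypothetical adjacent 15-second chunks (illustrative text, not real data).
chunks = ["the president left the white", "house announced new sanctions"]
phrase = "white house"

# Searching each chunk independently misses the phrase entirely:
per_chunk_hits = [phrase in c for c in chunks]  # [False, False]

# The phrase only appears once the chunk boundary is bridged:
bridged = chunks[0] + " " + chunks[1]
print(per_chunk_hits)       # [False, False]
print(phrase in bridged)    # True
```

Any indexing scheme therefore has to bridge chunk boundaries somehow, without double-counting matches or blowing up index size across 150+ languages.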
Phase 2: Gemini Autonomous Agentic Iteration
To explore the potential of fully autonomous agentic development, we ran Gemini in an agentic framework that let it iterate across a vast landscape of potential designs. We even had it construct large, highly complex synthetic test datasets filled with edge cases to test its proposed architectures.
The results were incredibly impressive, though they also highlighted the current boundaries of agentic AI development. We found that the agentic workflow was not a "black box" that could run autonomously and independently return a perfectly finished product. Instead, it was an unparalleled brainstorming engine. It generated suites of highly creative ideas and recommendations that we were able to manually review and eventually turn into a final production architecture using Gemini back in interactive mode.
The Limits Of Agentic Development
While agentic development has immense potential, we encountered two key challenges that require further consideration:
- Hallucinations: Even with grounding enabled, Gemini would occasionally derail a promising design by hallucinating Spanner features or misunderstanding its own design. In one instance, a highly promising optimization chain was abruptly abandoned when Gemini hallucinated that Spanner tables containing inverted indexes could not be partitioned. In other cases it steadfastly believed that replacing Spanner's purpose-built inverted index with its own bespoke term and doc-list tables would be orders of magnitude faster, or that counting a single match per document met the design requirement of counting *all* matches in each document.
- Safe Scalable Sandboxing: Perhaps the greatest hurdle was resource and cost sandboxing. Given the current state of agentic development, we were not yet comfortable handing Gemini the keys to an empty Enterprise Spanner instance and allowing an autonomous agent to endlessly ingest hundreds of terabytes of data over and over again to run infinite design and load-testing experiments. This meant that Gemini had to rely more on theoretical reasoning about the performance of its proposed designs under various load conditions than on actually running those load tests. Even with improved agentic resource and cost sandboxing, load and stress testing for hyperscale applications poses unique cost challenges for fully autonomous agentic testing at this time, given its relative inefficiency and the high cost of each run.
A First Glimpse
Over the coming weeks we'll be talking a lot more about all of these steps and our journey, but below you can see a few glimpses of the final architecture Gemini 3 built for the new Spanner-powered Television News Explorer. Note the use of a trio of indexed fields to elegantly handle phrase searches that span chunk boundaries. A phrase that is entirely contained within the chunk matches via the TXT field. However, a phrase that begins in the current chunk and ends in the next (CHUNK1: "the white", CHUNK2: "house announced", SEARCH PHRASE: "white house") is handled by TXTAFTER and TXTMERGED in a very creative solution that Spanner is able to support with only minimal additional cost of index merging.
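To make the dedup logic concrete, here is a toy Python simulation of the three-field rule, with plain substring matching standing in for Spanner's tokenized SEARCH. The field semantics below are our reading of the design, not spelled out in the schema itself: TXTMERGED is the chunk plus the opening words of the next chunk, and TXTAFTER is just those borrowed opening words.

```python
# Toy simulation of the three-field boundary rule (substring matching is a
# stand-in for Spanner's tokenized SEARCH; field semantics are our assumption).
def chunk_matches(txt, txt_after, txt_merged, phrase):
    # Mirrors: SEARCH(TXT) OR (SEARCH(TXTMERGED) AND NOT SEARCH(TXTAFTER))
    return phrase in txt or (phrase in txt_merged and phrase not in txt_after)

chunk1_txt = "the white"         # the phrase starts here...
chunk2_txt = "house announced"   # ...and finishes here
txt_after  = "house announced"   # opening words borrowed from chunk 2
txt_merged = chunk1_txt + " " + txt_after

# Chunk 1 claims the boundary-spanning phrase via TXTMERGED:
print(chunk_matches(chunk1_txt, txt_after, txt_merged, "white house"))  # True
# Chunk 2 does not match on its own, so the mention is counted exactly once:
print(chunk_matches(chunk2_txt, "", chunk2_txt, "white house"))         # False

# A phrase wholly inside the borrowed window is excluded from chunk 1 (the
# NOT SEARCH(TXTAFTER) clause) and attributed to chunk 2 instead:
print(chunk_matches(chunk1_txt, txt_after, txt_merged, "house announced"))  # False
print(chunk_matches(chunk2_txt, "", chunk2_txt, "house announced"))         # True
```

The NOT SEARCH(TXTAFTER) clause is what prevents a mention from being counted twice when it falls entirely inside the overlap window.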
These early results demonstrate the enormous potential of combining globally distributed infrastructure like Spanner with agentic and AI-assisted architectural design. While limitations remain, the ability to go from a textual narrative of an application and its user journeys to final technical architecture designed for hyperscale load represents a profound shift in how large-scale systems can be built.
CREATE TABLE TRANSCRIPTS (
CHUNKID STRING(MAX) NOT NULL,
CHUNKTIME TIMESTAMP NOT NULL,
CHUNKTIME_INT INT64 NOT NULL AS (UNIX_MILLIS(CHUNKTIME)) STORED,
CHUNKOFFSET INT64 NOT NULL,
CHANNEL STRING(MAX) NOT NULL,
SHOWID STRING(MAX) NOT NULL,
SHOWNAME STRING(MAX) NOT NULL,
PRIMARYLANG STRING(MAX) NOT NULL,
TYPE STRING(MAX) NOT NULL, --(N)ATIVE/(T)RANSLATED
TXT STRING(MAX),
TXTAFTER STRING(MAX),
TXTMERGED STRING(MAX),
TXT_TOKENS TOKENLIST AS (TOKENIZE_FULLTEXT(TXT)) HIDDEN,
TXTMERGED_TOKENS TOKENLIST AS (TOKENIZE_FULLTEXT(TXTMERGED)) HIDDEN,
TXTAFTER_TOKENS TOKENLIST AS (TOKENIZE_FULLTEXT(TXTAFTER)) HIDDEN
) PRIMARY KEY (CHUNKID);
CREATE SEARCH INDEX TRANSCRIPTSIDX_FTXTSEARCH
ON TRANSCRIPTS(TXT_TOKENS, TXTMERGED_TOKENS, TXTAFTER_TOKENS)
STORING (SHOWID, SHOWNAME, CHUNKOFFSET, CHUNKTIME, TXT)
PARTITION BY CHANNEL, TYPE
ORDER BY CHUNKTIME_INT DESC;
CREATE INDEX TRANSCRIPTSIDX_SHOWID
ON TRANSCRIPTS(SHOWID);
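The article does not show the ingest side, but the schema implies each row's three searchable fields are derived from consecutive chunks. Below is a hypothetical Python sketch of that derivation; BORROW_WORDS is an assumed tuning parameter, not something specified above.

```python
# Hypothetical ingest sketch: derive TXT / TXTAFTER / TXTMERGED for each row
# from a list of consecutive chunk texts. BORROW_WORDS is an assumed knob
# controlling how many leading words of the next chunk are borrowed.
BORROW_WORDS = 4

def build_fields(chunk_texts):
    rows = []
    for i, txt in enumerate(chunk_texts):
        if i + 1 < len(chunk_texts):
            txt_after = " ".join(chunk_texts[i + 1].split()[:BORROW_WORDS])
        else:
            txt_after = ""  # the final chunk has nothing to borrow
        txt_merged = (txt + " " + txt_after).strip()
        rows.append({"TXT": txt, "TXTAFTER": txt_after, "TXTMERGED": txt_merged})
    return rows

rows = build_fields(["the white", "house announced new sanctions today"])
print(rows[0]["TXTMERGED"])  # "the white house announced new sanctions"
```

In a real pipeline BORROW_WORDS would need to be at least as long as the longest phrase search supported, since only phrases that fit inside the merged window can be found.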
Simple search results:
SELECT
CHANNEL,
CHUNKTIME,
SHOWID,
SHOWNAME,
CHUNKOFFSET,
TXT
FROM TRANSCRIPTS
WHERE CHANNEL IN UNNEST(['CHANNELS']) AND TYPE='N/T'
AND (
SEARCH(TXT_TOKENS, 'white house')
OR (SEARCH(TXTMERGED_TOKENS, 'white house') AND NOT SEARCH(TXTAFTER_TOKENS, 'white house')) --ONLY FOR PHRASE SEARCHES
)
ORDER BY CHUNKTIME_INT DESC LIMIT 50;
Normalized timeline:
WITH DailyMentions AS (
SELECT
TIMESTAMP_TRUNC(CHUNKTIME, DAY) AS time_bucket,
CHANNEL,
COUNT(*) AS mention_count
FROM TRANSCRIPTS
WHERE CHANNEL IN ('CNN', 'FOX') AND TYPE='N'
AND (
SEARCH(TXT_TOKENS, @search_phrase)
OR (SEARCH(TXTMERGED_TOKENS, @search_phrase) AND NOT SEARCH(TXTAFTER_TOKENS, @search_phrase))
)
GROUP BY
time_bucket,
CHANNEL
)
SELECT
n.BUCKET_TIME AS time_bucket,
n.CHANNEL,
IFNULL(m.mention_count, 0) AS mention_count,
n.tot AS total_chunks,
-- Calculate percentage, using 0 if mention_count is NULL
SAFE_DIVIDE(IFNULL(m.mention_count, 0), n.tot) * 100 AS normalized_pct
FROM NORMCOUNTS_1DAY n
LEFT JOIN DailyMentions m
ON n.CHANNEL = m.CHANNEL
AND n.BUCKET_TIME = m.time_bucket
WHERE n.CHANNEL IN UNNEST(@all_channels)
ORDER BY
n.BUCKET_TIME DESC,
n.CHANNEL;
-- THE NORM TABLES (15S/1H/1D/1W/1M/1Y)
CREATE TABLE NORMCOUNTS_15S (
CHANNEL STRING(MAX) NOT NULL,
BUCKET_TIME TIMESTAMP NOT NULL,
tot INT64 NOT NULL
) PRIMARY KEY (CHANNEL, BUCKET_TIME);
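The LEFT JOIN against the norm tables guarantees that every (channel, bucket) appears in the timeline even on days with zero matches, with mention volume expressed as a percentage of all airtime chunks. A toy Python re-implementation of that math (with made-up counts; 5,760 is one day of 15-second chunks):

```python
# Toy normalization: mirror the LEFT JOIN + IFNULL + SAFE_DIVIDE logic.
norm = {("CNN", "2025-01-01"): 5760,   # total 15s chunks aired that day
        ("FOX", "2025-01-01"): 5760}
mentions = {("CNN", "2025-01-01"): 144}  # FOX had no matches that day

timeline = []
for key, tot in norm.items():
    count = mentions.get(key, 0)               # IFNULL(m.mention_count, 0)
    pct = (count / tot) * 100 if tot else 0.0  # SAFE_DIVIDE(...) * 100
    timeline.append((key[0], key[1], count, pct))

print(timeline)
```

Driving the output from the norm table rather than from the matches is what makes the zero-mention days show up as explicit 0% points instead of gaps.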
Word clouds:
WITH BaseQuery AS (
SELECT
TXT
FROM TRANSCRIPTS
WHERE CHANNEL IN UNNEST(@all_channels)
AND (
SEARCH(TXT_TOKENS, @search_phrase)
OR (SEARCH(TXTMERGED_TOKENS, @search_phrase) AND NOT SEARCH(TXTAFTER_TOKENS, @search_phrase))
)
ORDER BY CHUNKTIME_INT DESC
LIMIT 50
),
Words AS (
SELECT REGEXP_EXTRACT_ALL(LOWER(TXT), r'[\p{L}\p{N}]+') AS word_array
FROM BaseQuery
)
SELECT
word,
COUNT(*) AS word_count
FROM Words, UNNEST(word_array) AS word
WHERE word NOT IN (
-- STOP WORDS: You will want to filter out common words, otherwise your top
-- results will always be "the", "and", "a", etc. Add/remove words as needed.
'the', 'and', 'to', 'of', 'a', 'in', 'is', 'that', 'it', 'for', 'on', 'with',
'as', 'this', 'was', 'at', 'by', 'an', 'be', 'from', 'or', 'are', 'we', 'i', 'you'
)
AND LENGTH(word) > 1
GROUP BY word
ORDER BY word_count DESC
LIMIT 20;
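The same tokenize-filter-count pipeline is easy to prototype outside the database. A Python analogue (note `\w+` is only an approximation of the Unicode-aware `[\p{L}\p{N}]+` class above, and this stop-word list is an abbreviated illustration):

```python
import re
from collections import Counter

# Abbreviated stop-word list for illustration; extend as needed.
STOP_WORDS = {"the", "and", "to", "of", "a", "in", "is", "that",
              "it", "for", "on", "with"}

def word_cloud(snippets, top_n=20):
    """Tokenize matched snippets, drop stop words and single characters,
    and return the most frequent remaining terms."""
    words = []
    for txt in snippets:
        # Rough equivalent of REGEXP_EXTRACT_ALL(LOWER(TXT), r'[\p{L}\p{N}]+')
        words += re.findall(r"\w+", txt.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 1)
    return counts.most_common(top_n)

print(word_cloud(["The White House announced sanctions",
                  "sanctions on imports, the White House said"]))
```

As the SQL comment notes, without the stop-word filter the top terms of any query collapse into "the", "and", and friends.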