Cataloging The TV News Archive By Channel Over The Past Quarter-Century: 8 Million Broadcasts & 5.65M Hours From 214 Channels

As we continue to analyze, explore and examine the Internet Archive's TV News Archive's insights into our global world, we…

Charting The Internet Archive TV News Archive's Collection By Location Over The Past Quarter-Century

The Internet Archive's TV News Archive has preserved television news coverage from more than 50 countries and territories over the…

Charting The TV News Archive's Transition From SD To HD Video Resolution

Earlier this week we examined the storage growth of the Internet Archive's TV News Archive over the past quarter-century, charting…

Charting The TV News Archive's Belarusian, Russian & Ukrainian Archive

Continuing our series using BigQuery to analyze our Bigtable-based GCS digital twin, how can we use this same approach to…

What Using LLMs To Filter Television News EPG Show Names For "News/Not News" Teaches Us About The Severe Limits Of GenAI

One of the most basic and mind-numbing tasks in analyzing global television news coverage involves cataloging each broadcast show as…

Generative AI Experiments: Asking Gemini Ultra To Reformat BigTable's CBT CLI Output Into JSON

GCP has a wealth of data storage offerings, ranging from full-fledged warehouse and database platforms like BigQuery and Spanner to…

Our Journey Towards User-Facing Vector Search: Evaluating Elasticsearch's ANN Vector Search RAM Costs

As we continue our journey towards offering realtime user-facing semantic search over our growing collection of embedding datasets, we are…

GCP Tips & Tricks: Observations On A Decade Of Running Elasticsearch On GCP: Part 3 – Future Storage Options

We've run Elasticsearch clusters on GCP for almost a decade across many different iterations of hardware and cluster configurations. Earlier…

The Hidden Dangers Of Generative Coding (CodeGen) Guardrails

One of the most intriguing findings from our generative code modernization (codegen) experiment earlier today was the degree to which…

A Look Back At Mapping 2013's "The Global Conversation" And Ahead To The Future

One decade ago we mapped "The Global Conversation" for the December 2013 print edition of Foreign Policy Magazine that coincided…

Experiments In Summarizing Global Media Tenor: Views Towards China – Part 1

How might we use LLMs to summarize at-scale media tone and portrayals of countries? Let's look at a few different…

Tracing Global Media Tone & Anxiety Towards China & Ukraine 2020-Present

Using the GKG, what might we be able to learn about global media tone towards China and Ukraine? Let's first…

The Perils Of LLMs For Translation Tasks On Lower-Resource Languages: Estonian Noun Declension

One of the great promises of large language models (LLMs) is their ability to revolutionize translation and linguistic tasks. A…

The Erdogan Heart Attack Rumor On Social Vs Mainstream Media: The Perils Of Prioritizing Speed Over Verification

One of the most-touted aspects of social media when it comes to realtime global warning is its claimed ability to…

GEN4: GCE Networking: VPC Networks, Firewalls, IAP Tunneling & PGA

Earlier this year we explored how Google Compute Engine (GCE)'s VPC networks and especially GCP's "Private Google Access" (PGA) allows…

GEN4: Building A Complete Near-Realtime Live Stream Video Analytics Platform In The Cloud In Just A Few Lines Of Code

Given the growing use of live streaming video across the world, from speeches by heads of state to news programming,…

GDELT Opening Keynotes: Watching, Visualizing And Forecasting The World In Realtime

For organizations and conferences looking for aspirational "grand challenge" opening keynotes and workshops to inspire their audiences, the next in…

Web Archives As Digital History: Methodologies, Workflows And Technological Needs

Within GDELT's vast archives lie decades of global human history. A library of open datasets spanning more than 8 trillion…

GDELT Keynotes & Workshops: In-Person & Virtual

For organizations interested in everything from inspirational opening keynotes on the incredible new insights we gain into the functioning of…

Using The Global Similarity Graph To Bootstrap Categorization Models Using Web NGrams 3.0

A common question from organizations building document classifiers on top of the Web NGrams 3.0 dataset is how to accelerate…

Using Chyrons To Understand Diversity In Television News

When television news channels turn to outside experts to interview on the major stories of the moment, who do they…

Mapping News: The Hidden Geography Of The World's News Media

Ever since Culturomics 2.0 showcased the immense insights and hidden predictive power of the geography of the world's news media,…

"You Are Here": Helping The Public Navigate Their Informational Choices

Over the years we've explored how the GKG's outlink graph can be used to construct "you are here" maps of…

Creating A New Generation Of Recommender Services: From Cohorts & Classification To Context & Trustworthiness

As we continue to explore how to help the world's citizenry access trustworthy information, one key area of research we…