Computing Quadgrams At BigQuery Scale Through ML.NGRAMS

Many questions in computational linguistics require the computation of character sequences over vast corpora, demanding strong scalability and robust distributed…

Experiments With Machine Translation: KWIC Through BigQuery's ML.NGRAMS

As we carefully construct the training and test corpora for our machine translation models, one tool we rely heavily upon…

Experiments With Machine Translation: From RAM Disks To BigQuery

At the core of all machine translation systems lie data. Vast archives of monolingual and bilingual training and testing data…

GSG Embeddings + GKG + BigQuery + Tensorflow Embedding Projector = Visualizing The Covid-19 Vaccine News Landscape

What would it look like to visualize a day of worldwide online news coverage about a given topic, using document-level…

Global Similarity Graph Document Embeddings & BigQuery UDFs: Semantic Multilingual Search Over The News

The new Global Similarity Graph Document Embeddings dataset uses the Universal Sentence Encoder V4 to compute document-level embeddings for each news…

Using Global Similarity Graph Document Embeddings & BigQuery For "More Like This" Search: Cross Language Search

Earlier today we showed how the new Global Similarity Graph Document Embeddings dataset can be used to take an arbitrary…

Using Global Similarity Graph Document Embeddings & BigQuery For "More Like This" Search

Earlier today we announced the new Global Similarity Graph Document Embeddings dataset that uses the Universal Sentence Encoder V4 to…

Using BigQuery's UNNEST To Unroll Count-Based Datasets

Some applications like Google's Timeseries Insights API require that count-based datasets be unrolled since they examine discrete events. For example,…

A Daily Timeline Of Key Vaccine Topics In 2021 Through A TF-IDF BigQuery Analysis Of The Global Relationship Graph

What are the most significant words and phrases associated with vaccines by day thus far this year? To explore this…

BigQuery + UDF = Identifying The Earliest Glimmers Of Covid-19

The GKG 2.0 is essentially a realtime metadata index over the world's news in 65 languages dating back to 2015…

Google Cloud & Elastic: BigQuery And Elasticsearch – Insights At Scale

This fantastic talk by Elastic's Adam Quan, Principal Solutions Architect, and Google's Matt Lescohier, Strategic Partner Manager, Databases, Global Partner…

Flattening TV News NGrams Using BigQuery

The Television News NGrams 2.0 dataset records how many times a given word was spoken in a given 10 minute…

TFIDF Using BigQuery + Radio News NGrams To Chart The Most Significant Phrases Per Day On BBC World Service In 2020

Earlier today we showed how TFIDF calculation over the Radio News NGrams dataset could be used to surface the most…

TFIDF Using BigQuery + Radio News NGrams To Chart The Most Significant Words Per Day On BBC World Service In 2020

How might we use the new Radio News NGrams dataset to examine the Internet Archive's Radio News Archive's ASR of…

Carto Blog: Google BigQuery Visualization: Mapping Big Spatial Data

Carto republished our new BigQuery+Carto tutorial and video on their blog! Read The Full Post.

GDELT Tutorial: Using Carto's BigQuery Connector To Seamlessly Map Covid-19 News

The third in our GDELT Tutorial series, this video walks through our blog post from earlier this week, Using Carto's…

Using Carto's BigQuery Connector To Seamlessly Map The Global Geographic Graph

The Global Geographic Graph now spans more than 1.7 billion location mentions in worldwide English language news coverage back to…

Measuring Shot-Level Visual Similarity Of TV News Using The Video AI API & BigQuery

As we continue in our efforts to automatically segment television news broadcasts into their component stories, we've looked at shot-level…

Advanced OCR Similarity Metrics Using BigQuery + Video AI To Segment Television Evening News

Yesterday we showed how a single SQL query in BigQuery can be used to segment a television news show by…

Using BigQuery To Segment A Television Evening News Show Using Shot Changes + OCR Similarity

As we continue to look at new ways to segment television shows into their component stories, it's worth noting the…

GDELT Tutorials: Interactive Sentiment Mining With BigQuery!

The second in our new GDELT Tutorial series, this 6-minute video walks through sentiment mining using BigQuery!

Using BigQuery DML & External Temporary Tables To Perform Realtime Reformatting Inserts For Television News Ngrams

As we prepare to release version 2.0 of our television news ngrams dataset, a guiding design goal has been…

BigQuery Turns 10!

Google's BigQuery platform, which underlies so much of GDELT's work on understanding the world around us, turned 10 today! As…

Using BigQuery To Compile A Decade-Long Chronology Of Dr.'s On Television News

Yesterday we showed how a single SQL query in BigQuery could be used to compile a list of all of…