Experiments With Generative Coding: Modernizing Legacy BigQuery Code & CodeGen Guardrails

Modern generative coding systems have garnered immense hype, frequently presented as drop-in replacements for human coders. Yet, the majority of…

Tracking Infections, Death & Vaccination Over The Covid-19 Pandemic Using NGrams & BigQuery

How can the Web News NGrams 3.0 dataset be used to extract and track trends in numeric quantities? For example,…

Creating A Daily Global Shortage Timeline Using Web NGrams 3.0 & BigQuery In One SQL Query

Earlier today we showed how to use Web NGrams 3.0 and BigQuery to track mentions of "shortages of" across English…

Using Web NGrams 3.0 & BigQuery To Track "Shortages of …"

We've published a growing collection of tutorials on how to use the Web News NGrams 3.0 dataset for a range…

Commodities & Financial Early Warning Using Web NGrams + GCP Timeseries Insights API + Translate + BigQuery

On Friday, we combined GDELT's Web NGrams 3.0 dataset with GCP's Timeseries Insights API, Translate API and BigQuery to create…

Monkeypox & Disease Early Warning: Planetary-Scale Anomaly Detection With Web NGrams + GCP Timeseries Insights API + Translate + BigQuery

From capturing the first flickers of 2014's Ebola outbreak to powering one of the earliest alerts of the Covid-19 pandemic,…

Timeseries Insights API + BigQuery + Translate + Web NGrams = Monkeypox Early Warning Demo Coming This Week

Stay tuned for a really exciting new demo coming later this week using the GCP Timeseries Insights API, BigQuery, Google…

Performing At-Scale Entity Extraction Over The News Using BigQuery UDFs & Web NGrams 3.0

Earlier this week we showed how to write a simple Perl script to download the latest Web NGrams 3.0 dataset…

Computing Quadgrams At BigQuery Scale Through ML.NGRAMS

Many questions in computational linguistics require the computation of character sequences over vast corpora, requiring strong scalability and robust distributed…

Experiments With Machine Translation: KWIC Through BigQuery's ML.NGRAMS

As we carefully construct the training and test corpora for our machine translation models, one tool we rely heavily upon…

Experiments With Machine Translation: From RAM Disks To BigQuery

At the core of all machine translation systems lie data. Vast archives of monolingual and bilingual training and testing data…

GSG Embeddings + GKG + BigQuery + Tensorflow Embedding Projector = Visualizing The Covid-19 Vaccine News Landscape

What would it look like to visualize a day of worldwide online news coverage about a given topic, using document-level…

Global Similarity Graph Document Embeddings & BigQuery UDFs: Semantic Multilingual Search Over The News

The new Global Similarity Graph Document Embeddings dataset uses the Universal Sentence Encoder V4 to compute document-level embeddings for each news…

Using Global Similarity Graph Document Embeddings & BigQuery For "More Like This" Search: Cross Language Search

Earlier today we showed how the new Global Similarity Graph Document Embeddings dataset can be used to take an arbitrary…

Using Global Similarity Graph Document Embeddings & BigQuery For "More Like This" Search

Earlier today we announced the new Global Similarity Graph Document Embeddings dataset that uses the Universal Sentence Encoder V4 to…

Using BigQuery's UNNEST To Unroll Count-Based Datasets

Some applications like Google's Timeseries Insights API require that count-based datasets be unrolled since they examine discrete events. For example,…

A Daily Timeline Of Key Vaccine Topics In 2021 Through A TF-IDF BigQuery Analysis Of The Global Relationship Graph

What are the most significant words and phrases associated with vaccines by day thus far this year? To explore this…

BigQuery + UDF = Identifying The Earliest Glimmers Of Covid-19

The GKG 2.0 is essentially a realtime metadata index over the world's news in 65 languages dating back to 2015…

Google Cloud & Elastic: BigQuery And Elasticsearch – Insights At Scale

This fantastic talk by Elastic's Adam Quan, Principal Solutions Architect, and Google's Matt Lescohier, Strategic Partner Manager, Databases, Global Partner…

Flattening TV News NGrams Using BigQuery

The Television News NGrams 2.0 dataset records how many times a given word was spoken in a given 10 minute…

TFIDF Using BigQuery + Radio News NGrams To Chart The Most Significant Phrases Per Day On BBC World Service In 2020

Earlier today we showed how TFIDF calculation over the Radio News NGrams dataset could be used to surface the most…

TFIDF Using BigQuery + Radio News NGrams To Chart The Most Significant Words Per Day On BBC World Service In 2020

How might we use the new Radio News NGrams dataset to examine the Internet Archive's Radio News Archive's ASR of…

Carto Blog: Google BigQuery Visualization: Mapping Big Spatial Data

Carto republished our new BigQuery+Carto tutorial and video on their blog!

GDELT Tutorial: Using Carto's BigQuery Connector To Seamlessly Map Covid-19 News

The third in our GDELT Tutorial series, this video walks through our blog post from earlier this week, Using Carto's…