Search Results for: bigquery
Using Our BigQuery + Bigtable + GCS Digital Twin To Track Historical Backfilling Progress
With our new BigQuery + Bigtable digital twin over our GCS archive, we can trivially compile ongoing inventories of our…
Experiments With CCExtractor Using Our BigQuery + Bigtable + GCS Digital Twin
In December 2020 we unveiled a massive new initiative in collaboration with the Internet Archive's TV News Archive to catalog…
Using Our BigQuery + Bigtable + GCS Digital Twin To Make Date-Based Random Samples For Content Analysis & Testing
A key concept in "content analysis" methodologies over large temporally diverse archives is the notion of time-based random samples: creating…
Using Our BigQuery + Bigtable + GCS Digital Twin To Identify Missing Channels
One of the most powerful aspects of our BigQuery-analyzable Bigtable-based GCS digital twin is the capability it makes possible to…
Using Our BigQuery + Bigtable + GCS Digital Twin To Map The Status & Error Codes Of Analyzing A Quarter-Century Of The TV News Archive
Behind our ability to perform archive-scale analyses over the massive Internet Archive TV News Archive lies a powerful…
Plotting Cumulative Archival Growth Using Our BigQuery + Bigtable + GCS Digital Twin
On Monday, we explored how BigQuery can be combined with Bigtable to create a digital twin over a vast GCS…
Using BigQuery's Bigtable Connector To Analyze A Petabyte GCS Archive Digital Twin
Powering the TV, TV AI and Visual Explorers is a petabyte-scale GCS archive consisting of hundreds of millions of discrete…
Experiments With Generative Coding: Modernizing Legacy BigQuery Code & CodeGen Guardrails
Modern generative coding systems have garnered immense hype, frequently presented as drop-in replacements for human coders. Yet, the majority of…
Tracking Infections, Death & Vaccination Over The Covid-19 Pandemic Using NGrams & BigQuery
How can the Web News NGrams 3.0 dataset be used to extract and track trends in numeric quantities? For example,…
Creating A Daily Global Shortage Timeline Using Web NGrams 3.0 & BigQuery In One SQL Query
Earlier today we showed how to use Web NGrams 3.0 and BigQuery to track mentions of "shortages of" across English…
Using Web NGrams 3.0 & BigQuery To Track "Shortages of …"
We've published a growing collection of tutorials on how to use the Web News NGrams 3.0 dataset for a range…
Commodities & Financial Early Warning Using Web NGrams + GCP Timeseries Insights API + Translate + BigQuery
On Friday, we combined GDELT's Web NGrams 3.0 dataset with GCP's Timeseries Insights API, Translate API and BigQuery to create…
Monkeypox & Disease Early Warning: Planetary-Scale Anomaly Detection With Web NGrams + GCP Timeseries Insights API + Translate + BigQuery
From capturing the first flickers of 2014's Ebola outbreak to powering one of the earliest alerts of the Covid-19 pandemic,…
Timeseries Insights API + BigQuery + Translate + Web NGrams = Monkeypox Early Warning Demo Coming This Week
Stay tuned for a really exciting new demo coming later this week using the GCP Timeseries Insights API, BigQuery, Google…
Performing At-Scale Entity Extraction Over The News Using BigQuery UDFs & Web NGrams 3.0
Earlier this week we showed how to write a simple Perl script to download the latest Web NGrams 3.0 dataset…
Computing Quadgrams At BigQuery Scale Through ML.NGRAMS
Many questions in computational linguistics require the computation of character sequences over vast corpora, requiring strong scalability and robust distributed…
Experiments With Machine Translation: KWIC Through BigQuery's ML.NGRAMS
As we carefully construct the training and test corpora for our machine translation models, one tool we rely heavily upon…
Experiments With Machine Translation: From RAM Disks To BigQuery
At the core of all machine translation systems lie data. Vast archives of monolingual and bilingual training and testing data…
GSG Embeddings + GKG + BigQuery + Tensorflow Embedding Projector = Visualizing The Covid-19 Vaccine News Landscape
What would it look like to visualize a day of worldwide online news coverage about a given topic, using document-level…
Global Similarity Graph Document Embeddings & BigQuery UDFs: Semantic Multilingual Search Over The News
The new Global Similarity Graph Document Embeddings dataset uses the Universal Sentence Encoder V4 to compute document-level embeddings for each news…
Using Global Similarity Graph Document Embeddings & BigQuery For "More Like This" Search: Cross Language Search
Earlier today we showed how the new Global Similarity Graph Document Embeddings dataset can be used to take an arbitrary…
Using Global Similarity Graph Document Embeddings & BigQuery For "More Like This" Search
Earlier today we announced the new Global Similarity Graph Document Embeddings dataset that uses the Universal Sentence Encoder V4 to…
Using BigQuery's UNNEST To Unroll Count-Based Datasets
Some applications like Google's Timeseries Insights API require that count-based datasets be unrolled since they examine discrete events. For example,…
A Daily Timeline Of Key Vaccine Topics In 2021 Through A TF-IDF BigQuery Analysis Of The Global Relationship Graph
What are the most significant words and phrases associated with vaccines by day thus far this year? To explore this…