Why We Need More Domain Experts In The Data Sciences

Kalev's latest piece for Forbes explores the dangerous gap between the producers and consumers of data sciences today and why we need more domain experts for the field to reach its full potential. Read the Full Article.

WATSON, GDELT, Jigsaw etc

Tiffany Trofino, Principal Strategist at IBM, wrote recently on LinkedIn about the future of using data for understanding global conflict. Read The Full Article.

Here’s the data that told us Bernie Sanders would lose

Putting together data from the television tracker, Cloud Vision analysis of political imagery, and Jordan and Felipe's talk from Google I/O 2016, Kalev's latest piece for the Washington Post focuses on the state of the Democratic race and how the data clearly showed Bernie on the wane long before his eventual loss. Read the Full […]

Using Google's Deep Learning To Model Visual Portrayal In The News

With the debut of the GDELT Visual Global Knowledge Graph (VGKG), which uses Google's Cloud Vision API deep learning algorithms to catalog global news imagery, we've been immensely excited about the ways this incredible technology can be used to catalog and understand global visual narratives. This time, Felipe Hoffa came up with an incredible way of combining the GKG […]

Bayesian Poisson Tucker Decomposition for Learning the Structure of International Relations

Researchers Aaron Schein (University of Massachusetts Amherst), Mingyuan Zhou (University of Texas at Austin), David Blei (Columbia University) and Hanna Wallach (Microsoft Research) presented "Bayesian Poisson Tucker Decomposition for Learning the Structure of International Relations" at the 33rd International Conference on Machine Learning. Read the Full Paper.

NGramming 9.5 Billion Words of Arabic News

One of the amazing things that happens when you monitor the world at GDELT's scale is that you begin to observe language evolving in realtime. New words and grammatical constructs come into being, older words and structures fade from use, punctuation rules change, and the topics, contexts and ways in which language is used are in constant flux. […]

Making NGrams At BigQuery Scale

Ever since 2010's Culturomics paper and with the rise of ever-more powerful linguistic modeling systems, ngrams have surged back into popularity as a quick, but primitive, way of studying language. On the surface, ngrams would appear to be quite simplistic to compute: just split each document into words and count up how many times each […]
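As a rough illustration of the naive approach the excerpt describes (split each document into words and count them), here is a minimal Python sketch. The function name count_ngrams, the lowercasing, and the regex tokenizer are assumptions made for this example; it is not GDELT's or BigQuery's actual pipeline, and it ignores the scale issues the post goes on to discuss.

```python
import re
from collections import Counter


def count_ngrams(documents, n=1):
    """Count word n-grams across an iterable of document strings.

    Hypothetical helper for illustration only: tokenization is a simple
    lowercase split on runs of word characters.
    """
    counts = Counter()
    for text in documents:
        # Split the document into lowercase word tokens.
        words = re.findall(r"\w+", text.lower())
        # Slide a window of n words across the token list.
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts


if __name__ == "__main__":
    docs = ["the cat sat", "the cat ran"]
    print(count_ngrams(docs, n=1).most_common(3))
    print(count_ngrams(docs, n=2).most_common(3))
```

Holding every count in a single in-memory Counter is fine for a handful of documents, which is exactly why the computation only appears simple on the surface.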

Google I/O 2016: Tracking The Election Through GDELT / Reddit / Wikipedia

In their lively and fast-paced Google I/O 2016 talk, Felipe Hoffa and Jordan Tigani trace mainstream and social media coverage of the 2016 US presidential election through the eyes of GDELT and several other datasets. One particularly striking graph is the timeline below, showing the intensity of mentions of Bernie Sanders in the mainstream media (via […]

Google I/O 2016: Election 2016 The Big Data Showdown

Felipe Hoffa and Jordan Tigani offered a fantastic fast-paced look at Google BigQuery in their session at Google I/O 2016, showcasing how GDELT can be used to understand the media dynamics of Election 2016. Check out their talk below or read their session description. Fast forward to around 17:24 to see the television tracker at work and […]

BBVA: EAGLEs Economic Outlook Annual Report 2016

BBVA's latest report, the EAGLEs Economic Outlook Annual Report 2016, has a wide array of quite fascinating sentiment and other analyses based on GDELT and shows how the GDELT Global Knowledge Graph can be used to dive deeply into complex topics. Read the Full Report.

Paris, Georgia and Trump Fixes

We recently made three bug fixes to address two geographic issues and one person name extraction issue. Mapping the textual geography of the world's news media is an incredibly difficult task and contextual disambiguation plays a critical role in generating robust results. In some cases the geocoding infrastructure makes use of additional external domain knowledge beyond that contained in […]

CuriousGnu: How The World Sees Hillary Clinton & Donald Trump

CuriousGnu used BigQuery to put together a set of fantastic maps in CartoDB using GDELT to map the average tone of each country's media coverage of Hillary Clinton and Donald Trump, producing some fascinating findings! At the most obvious level, the world's media seems to like Trump a lot less than […]

Opening Keynote Data Summit 2016

Kalev gave the opening keynote of the second day of Data Summit 2016 in New York City last week, speaking on the power of big data to reshape how we understand the world around us. Learn More. Conference Program. Keynote Writeup.

Mapping Global Bias In Facebook's Media List

Kalev's latest Forbes story "Is Facebook's Trending Topics Biased Against Africa And The Middle East?" includes a map of the geographic footprint of the 1,000 media outlets powering Facebook's Trending Topics module, visualizing the tremendous Western bias in the list and the paucity of outlets in Africa and the Middle East. Read the Article […]

What Facebook's Media List Tells Us About Monitoring News Versus Monitoring Society

Kalev's latest story for Forbes explores how the media list that powers Facebook's Trending Topics module appears to have been constructed as the top 1,000 media properties ranked by total online traffic, necessarily creating an enormous bias towards Western news outlets, and examines the perils of monitoring news versus monitoring society. Read the Full Article.