The Official GDELT Project Blog

The Official Blog of the Global Database of Events, Language, and Tone (GDELT) Project!

GDELT 2.0: The Planet in Realtime in 65 Languages and 2,300 Emotions and Themes

A Compilation Of GDELT BigQuery Demos

With the debut of GDELT 2.0 earlier this year and the general availability of the GDELT Global Knowledge Graph (GKG) in Google BigQuery, we've seen an incredible boom in the diversity and complexity of analyses being performed on GDELT that leverage BigQuery's ability to perform massive and highly complex queries in near-realtime.  The examples below offer a small cross-section of […]


O'Reilly Ideas: Analyzing The World's News

Kalev and Felipe wrote an article for O'Reilly Ideas titled "Analyzing the world’s news: Exploring the GDELT Project through Google BigQuery: What it looks like to analyze, visualize, and even forecast human society using global news coverage" that summarizes some of the big data challenges facing the GDELT Project and how we use Google BigQuery […]


O'Reilly Strata+Hadoop World Singapore Next Week

Kalev and Felipe Hoffa of Google will be presenting at the O'Reilly Strata+Hadoop World conference in Singapore next week. If you're attending the conference, come see their talk at 11AM SGT on Thursday. Felipe will be presenting on behalf of both of us. Learn More.

Kalev Named Google Developer Expert for Google Cloud Platform

Google Developer Experts Summit 2015

Last week Kalev was at Google's Mountain View campus for the Google Developer Experts Summit, as one of its Cloud GDEs, together with more than 250 other GDEs and a cross-section of Google technical and program staff.


Why It's So Important To Understand What's In Our Web Archives

Kalev's latest piece for Forbes continues his exploration of web archives and why it is so important to understand what constitutes their holdings, along with a survey of the myriad decisions that shape the archival of an infinite stream with finite resources. Read the Full Article.


Terascale Sentiment Analysis: BigQuery + Tone Coding Books

As we continue to add new sentiment dictionaries to GDELT on a regular basis, a common request has been the ability to extend the dictionaries backwards over time, especially over historical collections like books. It turns out that Google BigQuery is exceptionally adapted to this kind of massive coding workflow and with just a single SQL […]


Using BigQuery To Explore Large Log Files: Exploring the Wayback Machine

Kalev's latest Forbes piece explores what is really in the Internet Archive's Wayback Machine and the nuance and biases of its window onto the last 20 years of the evolution of the web. Executing this study involved a substantial amount of log file analysis over very large log files. The study began by downloading the Alexa […]


How Much of the Internet Does The Wayback Machine Really Archive?

Kalev's latest piece for Forbes explores the Internet Archive's Wayback Machine, which turns 20 years old next year and holds over 450 billion web pages and 22 petabytes of data. The piece uses several techniques to peer inside the archive as a whole, exploring its holdings and finding a significant need for a better understanding […]


When Our Televisions Watch Us In Our Homes

Kalev's latest piece for Forbes explores the rise of the "surveillance economy" and how last week's revelations about Vizio televisions are simply the latest in a stream of unanticipated applications of our digital breadcrumbs towards advertising.  The piece also explores where the trends are heading vis-a-vis "big data" understanding of individuals through such techniques and […]


Complex Queries: Combining Events, EventMentions, and GKG

A common request revolves around filtering events to just those mentioned in articles that focus on certain themes or groups. GDELT 2.0 uses a combination of three tables: the EVENTS table that stores distinct events, the EVENTMENTIONS table that stores every mention of every event, one row per mention of an event, and the GKG […]
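
As a rough illustration of the filtering pattern described above, the sketch below joins three tiny in-memory stand-ins for the tables: it picks the articles tagged with a theme, finds the events mentioned in those articles, and returns the matching event rows. The rows, URLs, and field names (GLOBALEVENTID, MentionIdentifier, DocumentIdentifier, Themes) are assumptions made for illustration, not the project's actual query or schema.

```python
# Toy rows standing in for the three tables. All values and field
# names here are illustrative assumptions, not real GDELT data.
events = [
    {"GLOBALEVENTID": 1, "EventCode": "141"},
    {"GLOBALEVENTID": 2, "EventCode": "042"},
    {"GLOBALEVENTID": 3, "EventCode": "190"},
]
mentions = [  # one row per mention of an event
    {"GLOBALEVENTID": 1, "MentionIdentifier": "http://example.com/a"},
    {"GLOBALEVENTID": 1, "MentionIdentifier": "http://example.com/b"},
    {"GLOBALEVENTID": 2, "MentionIdentifier": "http://example.com/b"},
    {"GLOBALEVENTID": 3, "MentionIdentifier": "http://example.com/c"},
]
gkg = [  # one row per article, with the themes found in it
    {"DocumentIdentifier": "http://example.com/a", "Themes": ["PROTEST"]},
    {"DocumentIdentifier": "http://example.com/b", "Themes": ["ECON_INFLATION"]},
    {"DocumentIdentifier": "http://example.com/c", "Themes": ["PROTEST", "ELECTION"]},
]


def events_in_articles_about(theme, events, mentions, gkg):
    """Return events mentioned in at least one article tagged with `theme`."""
    # Step 1: articles whose theme list contains the requested theme.
    docs = {row["DocumentIdentifier"] for row in gkg if theme in row["Themes"]}
    # Step 2: events mentioned in any of those articles.
    event_ids = {m["GLOBALEVENTID"] for m in mentions if m["MentionIdentifier"] in docs}
    # Step 3: the distinct event rows themselves, in their original order.
    return [e for e in events if e["GLOBALEVENTID"] in event_ids]


protest_events = events_in_articles_about("PROTEST", events, mentions, gkg)
print([e["GLOBALEVENTID"] for e in protest_events])  # prints [1, 3]
```

In BigQuery the same three steps would be expressed as joins between the tables rather than Python sets; the sketch only shows the shape of the logic.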


Latest Internet Archive Television & Politics Visualizations: November 2015

By popular request, we've compiled a roundup of the latest Television & Politics visualizations we've been creating in collaboration with the Internet Archive's Television News Archive. This page will update throughout the month of November to reflect the latest set of visualizations as we roll them out. DEM Debate #2 GOP Debate #4 Primetime GOP […]


IRIN News: Paris vs Kenya Attacks (Terror Around The World)

Because a high volume of social media posts misreported that the Garissa attack occurred last week, rather than this past April, IRIN News created a timeline comparing the normalized volume of all mentions of Kenya against all mentions of France from February 19, 2015 through 2:30PM EST November 14, 2015. Read the Full […]


Online Social Networks And Offline Protests

A new paper by Zachary C Steinert-Threlkeld, Delia Mocanu, Alessandro Vespignani, and James Fowler of the Department of Political Science at the University of California – San Diego, Laboratory for the Modeling of Biological and Socio-Technical Systems at Northeastern University, and Department of Medicine at the University of California – San Diego, uses GDELT to explore the […]


TV Networks Shortchanging Ted Cruz

Jim Tankersley's article for the Washington Post uses our interactive campaign 2016 television tracker to compare television attention and polling results for the GOP candidates. Read the Full Article.


Visualizing The Global Influencer Network

Kalev's latest Forbes piece explores the connections that defined 2015 by constructing a network diagram over the people mentioned in the more than 150 million articles in 65 languages in the GKG thus far this year. The underlying network structure for all of the visualizations was computed using BigQuery and rendered using Gephi. The query for […]


Who's Connected To Whom In The Global Media

Kalev's latest piece for Forbes constructs a global influencer network of the top 100,000 pairs of newsmakers mentioned most frequently together in 150 million news articles in 65 languages, exploring the connections that defined 2015. Read the Full Article.
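
To make the idea of "pairs of newsmakers mentioned most frequently together" concrete, here is a minimal sketch of co-occurrence counting in Python. It assumes each article has already been reduced to a list of person names; the names and the `top_pairs` helper are hypothetical, and this is not the query used for the article.

```python
from collections import Counter
from itertools import combinations


def top_pairs(articles, n):
    """Count how many articles mention each pair of people together.

    `articles` is a list of name lists, one per article. Returns the
    `n` most frequent pairs as ((name_a, name_b), count) tuples.
    """
    counts = Counter()
    for people in articles:
        # Deduplicate and sort so each pair is counted once per article
        # and ("Alice", "Bob") is the same key as ("Bob", "Alice").
        for pair in combinations(sorted(set(people)), 2):
            counts[pair] += 1
    return counts.most_common(n)


# Hypothetical input: the people mentioned in four articles.
articles = [
    ["Alice", "Bob", "Carol"],
    ["Bob", "Alice"],
    ["Carol", "Dave"],
    ["Alice", "Bob", "Bob"],
]

print(top_pairs(articles, 1))  # prints [(('Alice', 'Bob'), 3)]
```

Each resulting pair and its count can be treated as a weighted edge between two nodes when building a network diagram.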


Why We Need To Verify Our Big Data Results

Kalev's latest piece for Forbes explores how triangulating word patterns drawn from the Google Books ngram viewer against those from the New York Times ngram viewer offers the ability to determine which patterns are genuine artifacts of books versus which reflect general linguistic changes in the English language. In doing so, it suggests that recent studies […]


One-Click Network Visualization With BigQuery+Gephi

For his latest Forbes piece, Kalev took a network visualization developed by the BBVA Research Emerging Markets Unit of how often countries are mentioned together in global news coverage of the Russian economic sanctions, and revisualized it in Gephi using modularity to group it into communities, PageRank to size each node by its importance in […]


Is The Era of Reader Comments On News Websites Fading?

Kalev's latest piece for Forbes explores the changing landscape of how citizens engage and interact with the news media, from online reader comment sections to the shift towards social media engagement and the benefits in audience broadening and sharing that occur in the social space. Read the Full Article.


Graph-Based Detection of Occupy Protests

This paper by Fengcai Qiao, Pei Li, Jingsheng Deng, Zhaoyun Ding, and Hui Wang of the College of Information Systems and Management of the National University of Defense Technology in China presents a novel graph-based event detection framework using subgraph pattern mining to identify Occupy protest events. Read the Full Paper.


How Drones Are Changing Humanitarian Disaster Response

Kalev's latest piece for Forbes explores how drones are changing the humanitarian disaster response landscape and presents a vision of how advances in autonomous flight and coordination could be used to revolutionize how we respond to natural disasters. Read the Full Article.


Digitizing The World's Libraries Using Smartphones

Kalev's latest piece for Forbes explores how we could leverage advances in smartphone technology and crowdsourcing to collaboratively digitize the world's libraries, especially collections, languages, and libraries for which digitization funding has not historically been available. Read the Full Article.