The Official GDELT Project Blog
The Official Blog of the Global Database of Events, Language, and Tone (GDELT) Project! www.gdeltproject.org
With the debut of GDELT 2.0 earlier this year and the general availability of the GDELT Global Knowledge Graph (GKG) in Google BigQuery, we've seen an incredible boom in the diversity and complexity of analyses being performed on GDELT that leverage BigQuery's ability to perform massive and highly complex queries in near-realtime. The examples below offer a small cross-section of […]
Kalev and Felipe wrote an article for O'Reilly Ideas titled "Analyzing the world’s news: Exploring the GDELT Project through Google BigQuery: What it looks like to analyze, visualize, and even forecast human society using global news coverage" that summarizes some of the big data challenges facing the GDELT Project and how we use Google BigQuery […]
Kalev and Felipe Hoffa of Google will be presenting at the O'Reilly Strata+Hadoop World conference in Singapore next week. If you're attending the conference, come see their talk at 11AM SGT on Thursday. Felipe will be presenting on behalf of both of us. Learn More.
Last week Kalev was at Google's Mountain View campus for the Google Developer Experts Summit, as one of its Cloud GDEs, together with more than 250 other GDEs and a cross-section of Google technical and program staff.
Kalev's latest piece for Forbes continues his exploration of web archives and why it is so important to understand what constitutes their holdings, along with a survey of the myriad decisions that shape the archival of an infinite stream with finite resources. Read the Full Article.
Kalev's latest piece for Forbes explores the encryption debate through historical and pragmatic lenses, examining just what options are truly available in the balance between security and privacy. Read the Full Article.
As we continue to add new sentiment dictionaries to GDELT on a regular basis, a common request has been the ability to extend the dictionaries backwards over time, especially over historical collections like books. It turns out that Google BigQuery is exceptionally well suited to this kind of massive coding workflow and with just a single SQL […]
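As a rough illustration of what applying a sentiment dictionary to a historical collection involves, here is a minimal Python sketch. It is not the SQL from the post: the tiny dictionary, the sample texts, and the `score_text` helper are all hypothetical, and averaging word scores over all words is just one simple scoring choice.

```python
import re

# Hypothetical miniature sentiment dictionary: word -> score.
tone = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}


def score_text(text, dictionary):
    """Average dictionary score over all words in the text (0.0 if empty)."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    # Words missing from the dictionary contribute a score of zero.
    return sum(dictionary.get(w, 0.0) for w in words) / len(words)


# Hypothetical stand-in for a collection of books keyed by publication year.
books_by_year = {
    1900: "A great and good year",
    1901: "A terrible bad storm",
}

scores = {year: score_text(text, tone) for year, text in books_by_year.items()}
print(scores)
```

The same word lookup and aggregation could be applied to any year range, which is what makes extending a dictionary backwards over time a bulk scoring job.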
Kalev's latest Forbes piece explores what is really in the Internet Archive's Wayback Machine and the nuance and biases of its window onto the last 20 years of the evolution of the web. Executing this study involved a substantial amount of log file analysis over very large log files. The study began by downloading the Alexa […]
Kalev's latest piece for Forbes explores the Internet Archive's Wayback Machine, which turns 20 years old next year and holds over 450 billion web pages and 22 petabytes of data. The piece uses several techniques to peer inside the archive as a whole, exploring its holdings and finding a significant need for a better understanding […]
Kalev's latest piece for Forbes explores the rise of the "surveillance economy" and how last week's revelations about Vizio televisions are simply the latest in a stream of unanticipated applications of our digital breadcrumbs towards advertising. The piece also explores where the trends are heading vis-a-vis "big data" understanding of individuals through such techniques and […]
A common request revolves around filtering events to just those mentioned in articles that focus on certain themes or groups. GDELT 2.0 uses a combination of three tables: the EVENTS table that stores distinct events, the EVENTMENTIONS table that stores every mention of every event, one row per mention of an event, and the GKG […]
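The relationship among the three tables can be sketched in a few lines of Python. This is a minimal illustration using tiny in-memory stand-ins, not the query from the post. The field names (`GLOBALEVENTID`, `MentionIdentifier`, `DocumentIdentifier`, `V2Themes`), the simplified semicolon-separated theme format, and all sample values are assumptions made for the example.

```python
# Hypothetical in-memory stand-ins for the three tables. Field names and
# values are assumptions for illustration, not real GDELT records.
events = [  # one row per distinct event
    {"GLOBALEVENTID": 1, "EventCode": "141"},
    {"GLOBALEVENTID": 2, "EventCode": "042"},
]
mentions = [  # one row per mention of an event in an article
    {"GLOBALEVENTID": 1, "MentionIdentifier": "http://example.com/a"},
    {"GLOBALEVENTID": 1, "MentionIdentifier": "http://example.com/b"},
    {"GLOBALEVENTID": 2, "MentionIdentifier": "http://example.com/c"},
]
gkg = [  # one row per article, with the themes found in it
    {"DocumentIdentifier": "http://example.com/a", "V2Themes": "PROTEST;ELECTION"},
    {"DocumentIdentifier": "http://example.com/b", "V2Themes": "ECON_INFLATION"},
    {"DocumentIdentifier": "http://example.com/c", "V2Themes": "ELECTION"},
]


def events_mentioned_with_theme(events, mentions, gkg, theme):
    """Return the events mentioned in at least one article tagged with `theme`."""
    # Step 1: find the articles whose theme list contains the theme.
    docs = {
        row["DocumentIdentifier"]
        for row in gkg
        if theme in row["V2Themes"].split(";")
    }
    # Step 2: collect the IDs of events mentioned in those articles.
    event_ids = {
        m["GLOBALEVENTID"] for m in mentions if m["MentionIdentifier"] in docs
    }
    # Step 3: keep only the matching rows of the events table.
    return [e for e in events if e["GLOBALEVENTID"] in event_ids]


protest_events = events_mentioned_with_theme(events, mentions, gkg, "PROTEST")
print([e["GLOBALEVENTID"] for e in protest_events])
```

The mentions table acts as the bridge: it links each event ID to the articles that mention it, and the article identifier links in turn to the themes recorded for that article.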
By popular request, we've compiled a roundup of the latest Television & Politics visualizations we've been creating in collaboration with the Internet Archive's Television News Archive. This page will update throughout the month of November to reflect the latest set of visualizations as we roll them out. DEM Debate #2 GOP Debate #4 Primetime GOP […]
Because of the high volume of social media posts misreporting that the Garissa attack occurred last week, rather than this past April, IRIN News created a timeline comparing the normalized volume of all mentions of Kenya against all mentions of France from February 19, 2015 through 2:30PM EST November 14, 2015. Read the Full […]
A new paper by Zachary C Steinert-Threlkeld, Delia Mocanu, Alessandro Vespignani, and James Fowler of the Department of Political Science at the University of California – San Diego, Laboratory for the Modeling of Biological and Socio-Technical Systems at Northeastern University, and Department of Medicine at the University of California – San Diego, uses GDELT to explore the […]
Jim Tankersley's article for the Washington Post uses our interactive campaign 2016 television tracker to compare television attention and polling results for the GOP candidates. Read the Full Article.
Kalev's latest Forbes piece explores the connections that defined 2015 by constructing a network diagram over the people mentioned in the more than 150 million articles in 65 languages in the GKG thus far this year. The underlying network structure for all of the visualizations was computed using BigQuery and rendered using Gephi. The query for […]
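The core idea behind a co-mention network, counting how often two people appear in the same article, can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the BigQuery query from the post: the `cooccurrence_pairs` helper and the sample names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_pairs(articles, top_n):
    """Count how often each pair of names appears together in an article.

    `articles` is an iterable of name lists, one list per article.
    Returns the `top_n` most frequent (pair, count) tuples.
    """
    counts = Counter()
    for names in articles:
        # Deduplicate within an article and sort so that (a, b) and (b, a)
        # are counted as the same undirected edge.
        for pair in combinations(sorted(set(names)), 2):
            counts[pair] += 1
    return counts.most_common(top_n)


# Hypothetical sample: the people mentioned in three articles.
articles = [
    ["Person A", "Person B", "Person C"],
    ["Person A", "Person B"],
    ["Person B", "Person C", "Person B"],
]

for pair, count in cooccurrence_pairs(articles, 2):
    print(pair, count)
```

Each resulting pair is a weighted edge, so the output can be written out as an edge list and loaded into a graph tool such as Gephi for layout and rendering.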
Kalev's latest piece for Forbes constructs a global influencer network of the top 100,000 pairs of newsmakers mentioned most frequently together in 150 million news articles in 65 languages, exploring the connections that defined 2015. Read the Full Article.
Kalev's latest piece for Forbes explores how triangulating word patterns drawn from the Google Books ngram viewer against those from the New York Times ngram viewer offers the ability to determine which patterns are genuine artifacts of books versus which reflect general linguistic changes in the English language. In doing so, it suggests that recent studies […]
For his latest Forbes piece, Kalev took a network visualization developed by the BBVA Research Emerging Markets Unit of how often countries are mentioned together in global news coverage of the Russian economic sanctions, and revisualized it in Gephi using modularity to group it into communities, PageRank to size each node by its importance in […]
Kalev's latest piece for Forbes builds upon the work of the BBVA Research Emerging Markets Unit to explore the state of Russian sanctions and the Chinese economic slowdown. Read the Full Article.
Kalev's latest piece for Forbes explores the changing landscape of how citizens engage and interact with the news media, from online reader comment sections to the shift towards social media engagement and the benefits in audience broadening and sharing that occur in the social space. Read the Full Article.
This paper by Fengcai Qiao, Pei Li, Jingsheng Deng, Zhaoyun Ding, and Hui Wang of the College of Information Systems and Management of the National University of Defense Technology in China, presents a novel graph-based event detection framework using subgraph pattern mining to identify occupy protest events. Read the Full Paper.
Kalev's latest piece for Forbes explores how drones are changing the humanitarian disaster response landscape and presents a vision of how advances in autonomous flight and coordination could be used to revolutionize how we respond to natural disasters. Read the Full Article.
Kalev's latest piece for Forbes explores how we could leverage advances in smartphone technology and crowd sourcing to collaboratively digitize the world's libraries, especially collections, languages, and libraries for which digitization funding has not historically been available. Read the Full Article.