The GDELT Project

Using The Cloud To Explore The Linguistic Patterns Of Half A Trillion Words Of News Homepage Hyperlinks

 September 2, 2019

What would it look like to convert a year and a half of homepage links from news sites worldwide, totaling more than half a trillion words in 110 languages, into ngram datasets using just three SQL queries, an open-source language detector, one script, and the power of Google’s BigQuery platform?

Read The Full Article.
