The GDELT Project

Using The Cloud To Explore The Linguistic Patterns Of Half A Trillion Words Of News Homepage Hyperlinks

What would it look like to convert a year and a half of links from worldwide news homepages in 110 languages, totaling more than half a trillion words, into ngram datasets using just three SQL queries, an open-source language detector, one script and the power of Google’s BigQuery platform?

Read The Full Article.