We're excited to announce that earlier today the GDELT 2.0 Global Knowledge Graph (GKG) reached a total of just over 150 million records since February 19, 2015. The GDELT 2.0 Events table now has 70 million records and the GDELT 2.0 Event Mentions table has 224 million rows. The GDELT 1.0 Events table now contains 342 million records.
The total size of the GDELT 2.0 GKG table, when stored in its original compact nested delimited format, now stands at 1.47TB. All GDELT 2.0 tables are available in Google BigQuery, allowing interactive analysis.
On a typical day, complex queries involving a majority of the GKG columns execute at around 20-30GB/sec. Because BigQuery is a columnar database that reads only the fields a query actually references, a typical query touching just one or two columns tends to finish in 5-10 seconds, while even the most expensive queries, which examine every field, usually finish in around 30-60 seconds.
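As a rough sanity check on these figures, the sketch below divides the data scanned by the quoted throughput. It is a back-of-envelope estimate, not a model of how BigQuery schedules work: it assumes 1.47TB means 1,470GB in decimal units and that throughput stays constant for the whole scan, and the function names are purely illustrative.

```python
# Back-of-envelope scan-time arithmetic using the figures quoted in this post.
# Assumptions (not from BigQuery itself): decimal units (1TB = 1000GB) and a
# constant throughput for the duration of the query.

GKG_SIZE_GB = 1470.0          # 1.47TB GKG table, assumed decimal units
SLOW_GB_PER_SEC = 20.0        # lower end of the quoted 20-30GB/sec
FAST_GB_PER_SEC = 30.0        # upper end of the quoted 20-30GB/sec


def scan_seconds(size_gb, throughput_gb_per_sec):
    """Time to scan size_gb at a steady throughput."""
    return size_gb / throughput_gb_per_sec


def gb_scanned(seconds, throughput_gb_per_sec):
    """Data a query would cover in the given time at a steady throughput."""
    return seconds * throughput_gb_per_sec


# Scanning every column of the table.
full_best = scan_seconds(GKG_SIZE_GB, FAST_GB_PER_SEC)
full_worst = scan_seconds(GKG_SIZE_GB, SLOW_GB_PER_SEC)
print(f"Full-table scan: {full_best:.1f}-{full_worst:.1f} seconds")

# Data volume implied by a 5-10 second query at the same rates.
small_low = gb_scanned(5, SLOW_GB_PER_SEC)
small_high = gb_scanned(10, FAST_GB_PER_SEC)
print(f"A 5-10 second query covers roughly {small_low:.0f}-{small_high:.0f}GB")
```

Under these assumptions a full scan works out to somewhere between 49 and 73.5 seconds, the same order of magnitude as the observed 30-60 seconds, and a 5-10 second query corresponds to reading only a fraction of the table, which is what a columnar layout would lead one to expect.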
This is the power of the modern cloud upon which GDELT is built – systems like BigQuery make it possible to harness tens of thousands of processors and query multi-petabyte datasets or tables with tens of trillions of rows, all with just a single line of SQL. Scale is no longer a limiting factor in trying to understand the world – the tools and computing power are there, and the only boundary today is the imagination to ask interesting questions.