The rising use of GDELT in contexts such as conflict early warning and the forecasting of emerging atrocities brings with it the need for new methodologies, mindsets, and techniques for working with this "big data" study of human society, as well as an understanding of how the global news media system functions and of the underlying technologies that make GDELT's "societal scale" codification possible. Using large, real-world, automatically constructed datasets like GDELT requires a fundamentally different approach to monitoring, modeling, and filtering than most researchers are used to.
When searching for emerging atrocities, the goal is to catch their earliest traces. By the time the international press is blaring a collective headline about thousands of civilians killed for their religious beliefs, that is no longer early warning; it is simply post-hoc notification. Pattern detection, especially of geographically centered bursts of events that share common attributes such as perpetrators, victims, religion, ethnicity, or actor roles, becomes a critical tool for uncovering emergent unrest in its earliest stages. For example, a surge in events near a refugee camp in Africa over a period of days, or a rise in attacks on civilians by gunmen in a specific region, are both potential indicators of an impending atrocity, visible even before the world’s major newspapers run a headline announcing it weeks or months later.
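To make the idea of a geographically centered burst concrete, here is a minimal Python sketch. It is not a description of any method GDELT itself uses. It assumes the events have already been filtered to a category of interest (for example, attacks on civilians) and reduced to hypothetical `(day, lat, lon)` tuples; the function name `find_bursts`, the one-degree grid, the seven-day baseline, and the thresholds are all illustrative assumptions rather than recommended values.

```python
import math
from collections import Counter, defaultdict
from datetime import date, timedelta


def find_bursts(events, cell_size=1.0, window=7, ratio=3.0, min_count=5):
    """Flag (cell, day) pairs whose event count far exceeds the recent baseline.

    events: iterable of (day, lat, lon) tuples, where day is a datetime.date.
    Returns a list of (cell, day, count, baseline) tuples.
    """
    # Count events per grid cell per day. A cell is identified by the
    # integer index of its south-west corner on a cell_size-degree grid.
    counts = defaultdict(Counter)
    for day, lat, lon in events:
        cell = (math.floor(lat / cell_size), math.floor(lon / cell_size))
        counts[cell][day] += 1

    bursts = []
    for cell, per_day in counts.items():
        for day, n in sorted(per_day.items()):
            if n < min_count:
                continue  # too few events to be meaningful on their own
            # Mean daily count over the preceding `window` days,
            # treating days with no recorded events as zero.
            prior = [per_day.get(day - timedelta(days=k), 0)
                     for k in range(1, window + 1)]
            baseline = sum(prior) / window
            # The +1 smoothing keeps normally quiet cells from being
            # flagged on the strength of a near-zero baseline alone.
            if n >= ratio * (baseline + 1):
                bursts.append((cell, day, n, baseline))
    return bursts


# Synthetic data (made up for illustration): one event per day for a week
# in one cell, then eight events on the eighth day, plus two unrelated
# events elsewhere.
events = [(date(2014, 1, d), 3.2, 29.5) for d in range(1, 8)]
events += [(date(2014, 1, 8), 3.2, 29.5)] * 8
events += [(date(2014, 1, 8), 10.1, 20.2)] * 2

bursts = find_bursts(events)
print(bursts)
```

On this synthetic data only the eighth day in the first cell is flagged, since its count is well above the smoothed baseline of the preceding week. With real data the grid size, window, and thresholds would need tuning, and media coverage itself varies by region and over time, so a flagged burst is a prompt for closer inspection rather than a finding.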
At the most basic level, one could liken using GDELT for atrocity early warning to using the Google Books NGram collection to understand our literary history: the greatest value of each lies in revealing broad trends that would otherwise never be visible. Ngrams don't replace intensive reading of a single book if one is trying to deconstruct that book's views on a topic, but rather let one place that book in the context of millions of other books, on that topic and others, published through time and space, in order to understand its broader significance and overarching themes and patterns. In both cases the goal is not to focus on or dissect a single incident or book in considerable detail, but rather to look across the data for macro-level patterns, such as the spatial diffusion of conflict, or to examine the broader conditions under which certain types of behavior occur more or less often.
We've prepared the first in a forthcoming series of guides that will offer advice and approaches for working with GDELT and the "big data" and global perspective it requires.