While GDELT is best known as a collection of realtime and historical open metadata streams over global news media, it also plays a critical role as an archive of global events. Rather than merely preserving events once they are recognized, GDELT's continual monitoring of local media across the entire planet in more than 150 languages ensures that the full path leading up to an event is preserved.
Whether the event is a spontaneous natural disaster like an earthquake or political violence with a long leadup like a coup, GDELT captures and preserves the state of society beforehand. The period leading up to a major event is often the most important for understanding the initial societal reaction. Even a natural disaster like an earthquake can only be fully understood in light of the societal context in which it occurs, since that context shapes both the societal and the state response. Events like the January 6th storming of the US Capitol must be understood not just in terms of the events of that day, but also in terms of the rhetoric over the days, weeks, months and years leading up to it.
Traditional event archiving tends to be tactical: archivists make judgement calls as events occur, deciding whether they are important enough to archive and only then searching for how best to preserve them. As a result, they are unable to capture the period leading up to major events. In contrast, GDELT continually monitors the planet, ensuring it captures this leadup period.
Similarly, GDELT's global monitoring allows it to capture and preserve the earliest glimmers of events. In the earliest moments of the Covid-19 pandemic, the Chinese government had yet to recognize the threat and it was openly described as a "SARS-like pneumonia of unknown origin." Once the Chinese government recognized the political risk of the growing pandemic, however, it ordered much of that early coverage removed, meaning the majority of the leadup has vanished from the web. Yet, because GDELT was monitoring it at these earliest stages, it was able to capture all of that coverage.
Once events occur, GDELT is able to capture their entire course and aftermath, chronicling their long-term impact in the years and decades that follow. While most tactical archiving projects end shortly after an event or are constrained to a small sampling of outlets deemed "most relevant" by subject matter experts (SMEs), GDELT is able to capture the entire global reaction into the years that follow.
Most importantly, through its partnership with the Internet Archive's "No More 404" program, GDELT provides the Archive with a live stream of the URLs of all the online coverage it monitors across the web, for the Archive to preserve in the Wayback Machine. In just its first few years, GDELT provided the Archive with more than 5.4 billion URLs totaling 221 TB of global news content that the Archive preserved for posterity.
We are excited to see how this incredible archive of global history can be used by scholars and historians to better understand the courses that have led us to our present world.