The GDELT Project

Born Out Of The 2014 Ebola Epidemic, GDELT's Mass Translation Infrastructure Helped BlueDot Identify 2019's Coronavirus

On March 13, 2014, GDELT's global monitoring infrastructure detected the first local media reports of what would become the Ebola epidemic of 2014-2016. Unfortunately, those reports were in French, and GDELT's English-only processing at the time never sent an alert. GDELT was then monitoring global news media in more than 100 languages (a number that has since grown to more than 150), but high-quality, robust machine translation of even a fraction of each day's global news output was computationally intractable at the time. Experimental work supported by Google Translate reinforced just how many global events and narratives were missing from the world's English-language press, and that to truly understand the world, GDELT would have to find a way to machine translate everything it monitored around the globe in as many languages as possible.

The result, later that year, was GDELT Translingual, a pioneering infrastructure that introduced the world to the concept of at-scale mass machine translation of the news, a model that GDELT has since helped bring to countless industries. Powered by Translingual's global infrastructure, GDELT today translates everything it monitors in 65 languages, allowing it to surface the most nuanced narratives and subtle indicators about the least expected events.

Fast forward to December 2019, when the Chinese coronavirus first emerged. This vision of mass machine translation of the planet made it possible for BlueDot Global to use its machine learning algorithms to identify some of the earliest reports of the virus from GDELT's feeds, when it was still just a handful of cases of "viral pneumonia … of unknown cause". BlueDot sent out an alert nearly a full week before the official warnings from the CDC and WHO.

GDELT's mass machine translation initiative was born in 2014, when its event and knowledge graph systems, unable to process content beyond English, failed to identify the first French-language domestic media reports of the Ebola outbreak that it had monitored.

Fast forward to December 2019: the immense translation infrastructure that resulted from that outbreak allowed BlueDot Global's machine learning algorithms to flag the Chinese coronavirus outbreak and send out alerts almost a week before official warnings.

That's a pretty incredible outcome.