GDELT monitors worldwide news coverage in 152 languages and live-translates everything it monitors in 65 of those languages. A large portion of the coverage GDELT monitors each day is therefore in a language other than English, reflecting GDELT's deep emphasis on local coverage in local languages. While machine translation has advanced by leaps and bounds in recent years, its output still falls well short of fluent English in most cases, which makes it hard to understand the nuanced details of an event.
In an ideal world, storylines would be connected across languages, allowing you to select a major storyline and instantly see coverage of it in your own language.
For example, this Greek-language article about a Covid-19 outbreak in Mallorca is translated fairly fluently by Google Translate. While the site itself does not identify it as such, it is a translation of a Reuters news article. Using storyline clustering, we instantly find the original English-language Reuters wire story, as well as this EuroNews republication.
Similarly, storyline clustering groups this Bosnian, Bulgarian, Greek, and Polish coverage of an LGBTQ protest in Istanbul alongside an English-language Reuters story.
Storyline clustering makes it possible to look across languages when trying to understand major news stories.
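To make the idea concrete, here is a minimal sketch of how articles in different languages might be grouped into storylines once they have been machine translated into English. It is not GDELT's actual clustering method, which is not described here. It assumes each article carries a `translated_title` field, compares titles by simple word overlap (Jaccard similarity), and merges any pair above a hypothetical `threshold` of 0.4. The function names, the stopword list, and the sample titles are all made up for illustration.

```python
import re
from itertools import combinations

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "an", "the", "of", "in", "on", "at", "to", "and", "for", "after", "as"}


def tokens(text):
    """Return the set of lowercase words in text, minus common stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def jaccard(a, b):
    """Word-overlap similarity between two token sets, from 0.0 to 1.0."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_storylines(articles, threshold=0.4):
    """Group articles whose translated titles overlap by at least `threshold`.

    Uses union-find, so articles linked through a chain of similar
    titles end up in the same storyline (single-link clustering).
    """
    parent = list(range(len(articles)))

    def find(i):
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    token_sets = [tokens(a["translated_title"]) for a in articles]
    for i, j in combinations(range(len(articles)), 2):
        if jaccard(token_sets[i], token_sets[j]) >= threshold:
            parent[find(j)] = find(i)

    clusters = {}
    for idx, article in enumerate(articles):
        clusters.setdefault(find(idx), []).append(article)
    return list(clusters.values())


# Hypothetical machine-translated titles, invented for this example.
articles = [
    {"lang": "en", "translated_title": "Turkish police break up LGBTQ protest in Istanbul"},
    {"lang": "el", "translated_title": "Police in Istanbul break up LGBTQ protest"},
    {"lang": "pl", "translated_title": "Istanbul police disperse LGBTQ protest march"},
    {"lang": "el", "translated_title": "Covid-19 outbreak in Mallorca linked to student trips"},
    {"lang": "en", "translated_title": "Mallorca Covid-19 outbreak tied to student trips"},
]

clusters = cluster_storylines(articles)
for storyline in clusters:
    print(sorted({a["lang"] for a in storyline}), len(storyline), "articles")
```

With these sample titles, the three Istanbul articles fall into one storyline and the two Mallorca articles into another, so a reader who starts from the Greek or Polish coverage can be pointed to the English-language version of the same story. Word overlap on titles is a deliberately crude stand-in; a production system would more plausibly compare full text or learned embeddings.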