Kalev will be giving a talk to the Credibility Coalition next week titled "Combatting Misinformation And Lending Context Through GDELT’s Global Open News Annotation Datasets":
The open data GDELT Project (https://www.gdeltproject.org/) monitors worldwide online news coverage in real time in 152 languages, creating rich live-updating annotation datasets using both statistical and ML approaches, including Google’s Cloud NLP, Video, Vision and Speech-to-Text APIs, to catalog global events and narratives. Live machine translation of 65 languages allows looking across languages to connect stories to local sources; front-page scanning allows construction of agenda-setting datasets; outlink archives enable “you are here” media maps; 24-hour and one-week differencing scans catalog stealth editing at global scale; and embed scanning catalogs the videos and social media posts being discussed by the media. A rich array of emotional metrics can be used to map coverage of a story from clinical to emotional and from supportive to condemning, and to identify the kinds of emotions being appealed to, helping to understand differing contextualizations. Quotation and claim extraction can be used to automatically identify contested narratives and their sources, while temporal data allows mapping of how narratives spread through the global media ecosystem. Moving beyond text, worldwide news imagery is cataloged in real time, including labels and EXIF metadata. Most importantly, reverse image search catalogs where each image has previously appeared on the web and how it was labeled, allowing automated detection of contested repurposing. Most recently, a massive new collaboration with the Media-Data Research Consortium and the Internet Archive’s Television News Archive is non-consumptively cataloging television news coverage of COVID-19 and past disease outbreaks, allowing connection of the offline and online news worlds. Learn how GDELT’s myriad open datasets and APIs can be used to lend powerful new signals to misinformation, disinformation and contested narratives research.