Earlier this week we unveiled a massive new dataset of more than 35.6 million citations in worldwide online news coverage to social media posts on Facebook, Instagram, QQ, Twitter, Vimeo, VK and YouTube from the start of GDELT's outlink monitoring on April 20, 2016 through the end of September 7, 2019.
This morning we announced that the dataset will now update each morning, creating a live compilation of the social media content that is reported on in the news across the world in the 65 languages GDELT monitors.
This new dataset has incredible applications in misinformation, disinformation, "fake news," digital falsehood and foreign influence research, providing a glimpse at the social media that has crossed over into mainstream journalism. Posts appearing in this dataset reflect a unique subset of the deluge of daily digital social commentary, one that is of particular interest to fact checkers and misinformation researchers.
Social media content that appears in the news has enhanced reach and credibility, both with the public and with the myriad automated systems that power today's digital world. Identifying questionable content that has made the leap can help stop falsehoods before they spread further. In other cases, journalists may be calling attention to questionable content themselves, allowing researchers to leverage local insights into potential misinformation and falsehoods.
Here is an example of how the impact of Donald Trump's tweets on the news cycle can be understood through this data.
Local journalists are often in the best position to verify and vet social media content relating to the locations and events they are most familiar with or have witnessed, identifying subtle nuances that allow them to confirm or refute that content. Thus, the links in this dataset to local social media posts referenced by local journalists may be a particularly interesting information stream.
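As a starting point for exploring these links, here is a minimal sketch of one way to tally cited posts by platform. It assumes only that you have already extracted the social media URLs into a plain list. The domain table, the function names `platform_for` and `count_by_platform`, and the sample URLs are illustrative assumptions, not part of the dataset or its schema.

```python
from collections import Counter
from urllib.parse import urlparse

# Assumed mapping from hostnames to the platforms named in this post.
# This is an illustrative table, not a list taken from the dataset itself.
PLATFORM_DOMAINS = {
    "facebook.com": "Facebook",
    "instagram.com": "Instagram",
    "qq.com": "QQ",
    "twitter.com": "Twitter",
    "vimeo.com": "Vimeo",
    "vk.com": "VK",
    "youtube.com": "YouTube",
    "youtu.be": "YouTube",
}


def platform_for(url):
    """Return the platform name for a URL, or None if no known domain matches."""
    host = (urlparse(url).hostname or "").lower()
    for domain, name in PLATFORM_DOMAINS.items():
        # Match the bare domain or any subdomain (www., mobile., m., ...).
        if host == domain or host.endswith("." + domain):
            return name
    return None


def count_by_platform(urls):
    """Count how many URLs point at each recognized platform."""
    platforms = (platform_for(url) for url in urls)
    return Counter(p for p in platforms if p is not None)


if __name__ == "__main__":
    # Hypothetical sample links, for illustration only.
    sample = [
        "https://twitter.com/someuser/status/1",
        "https://www.youtube.com/watch?v=abc",
        "https://youtu.be/abc",
        "https://example.com/article",
    ]
    print(count_by_platform(sample))
```

Matching on the parsed hostname rather than searching the whole URL string avoids counting an ordinary news article whose path merely mentions a platform name.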
In all, we are tremendously excited to see what you are able to do with this powerful new dataset!