With the merger of document sentiment scores into the core Global Entity Graph (GEG), each GEG record now includes both the full list of entities and the sentiment score of the document as a whole. This means we can now use a single line of SQL to plot the average document-level sentiment of news coverage mentioning a particular entity.
For example, let's say we wanted to compute the sentiment of media coverage of Tesla over the past three years. With a single line of SQL we can request all articles in which Google's Cloud Natural Language API identified a mention of Tesla and average, by day, the document-level sentiment scores the API computed for those articles. At present we are only recording document-level sentiment scores, but as the Natural Language API rolls out entity-level sentiment scores across more languages, we will eventually integrate them.
The final result can be seen in the timeline below, showing the average document-level sentiment of the sampling of worldwide English-language news coverage mentioning Tesla over the past three years that was processed into the GEG.
To make the underlying trends clearer, here is a smoothed version that uses a 7-day rolling average.
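For readers who want to reproduce this kind of smoothing themselves, a 7-day rolling average over the daily scores could be sketched in Python roughly as follows. This is only an illustration and not necessarily how the chart was produced: the function name `rolling_average`, the choice of a trailing (rather than centered) window, and the decision to skip days with no coverage are all assumptions, and the sample values are made up.

```python
from collections import deque
from datetime import date, timedelta


def rolling_average(daily, window_days=7):
    """Trailing rolling mean over (day, score) pairs sorted by day.

    Assumption: the window covers the current day plus the six calendar
    days before it; days with no rows are skipped, not counted as zero.
    """
    window = deque()  # (day, score) pairs currently inside the window
    total = 0.0
    smoothed = []
    for day, score in daily:
        window.append((day, score))
        total += score
        # Drop entries that fall before the start of the window.
        cutoff = day - timedelta(days=window_days - 1)
        while window[0][0] < cutoff:
            _, old = window.popleft()
            total -= old
        smoothed.append((day, total / len(window)))
    return smoothed


# Made-up daily averages, purely for illustration.
start = date(2016, 11, 14)
sample = [(start + timedelta(days=i), s)
          for i, s in enumerate([0.1, 0.3, -0.1, 0.1, 0.2, 0.0, 0.1, 0.8])]
for day, value in rolling_average(sample):
    print(day.isoformat(), round(value, 3))
```

Early days in the series are averaged over fewer than seven values, so the first week of a smoothed line is noisier than the rest.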
For those interested in diving more deeply into this data, the underlying Excel spreadsheet is available for download.
TECHNICAL DETAILS
Constructing the graphs above took just a single line of SQL. We use UNNEST() to flatten the entity list and limit the results to articles published on or after November 14, 2016, which is when Google's Natural Language API introduced its new "score" sentiment field. We also limit to English in this case to ensure that the scores reflect a single language's sentiment scoring model.
SELECT DATE(date) day, COUNT(1), AVG(score) score, AVG(magnitude) magnitude FROM `gdelt-bq.gdeltv2.geg_gcnlapi`, UNNEST(entities) entity WHERE entity.name='Tesla' AND lang='en' AND DATE(date) >= "2016-11-14" GROUP BY day ORDER BY day ASC
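To make the query's logic concrete, here is a rough Python equivalent that runs over a handful of in-memory records instead of BigQuery. The field names mirror the columns referenced in the query (`date`, `lang`, `score`, `magnitude`, `entities`, `name`), but the function name `daily_sentiment`, the representation of records as Python dicts, and the sample records themselves are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from datetime import date, datetime


def daily_sentiment(records, name="Tesla", lang="en", start=date(2016, 11, 14)):
    """Mimic the query: flatten entities, filter, then average per day."""
    buckets = defaultdict(list)  # day -> list of (score, magnitude)
    for rec in records:
        day = rec["date"].date()
        if rec["lang"] != lang or day < start:
            continue
        # Like the UNNEST cross join, each matching entity yields one row,
        # so a document that lists the entity twice is counted twice.
        for entity in rec["entities"]:
            if entity["name"] == name:
                buckets[day].append((rec["score"], rec["magnitude"]))
    rows = []
    for day in sorted(buckets):
        pairs = buckets[day]
        n = len(pairs)
        rows.append({
            "day": day,
            "count": n,
            "score": sum(s for s, _ in pairs) / n,
            "magnitude": sum(m for _, m in pairs) / n,
        })
    return rows


# Hypothetical records, invented for illustration only.
records = [
    {"date": datetime(2017, 1, 2, 9, 0), "lang": "en", "score": 0.5,
     "magnitude": 2.0, "entities": [{"name": "Tesla"}, {"name": "Example Corp"}]},
    {"date": datetime(2017, 1, 2, 15, 30), "lang": "en", "score": -0.5,
     "magnitude": 4.0, "entities": [{"name": "Tesla"}]},
    {"date": datetime(2017, 1, 3, 8, 0), "lang": "es", "score": 0.25,
     "magnitude": 1.0, "entities": [{"name": "Tesla"}]},
    {"date": datetime(2016, 11, 1, 8, 0), "lang": "en", "score": 0.25,
     "magnitude": 1.0, "entities": [{"name": "Tesla"}]},
]
for row in daily_sentiment(records):
    print(row)
```

In this made-up sample only the two English-language articles from January 2, 2017 survive the filters, so the result is a single daily row averaging their scores and magnitudes.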
With a single line of SQL, it is now possible to harness neural entity extraction and neural sentiment scoring to chart the overall document-level tenor of media coverage of an organization.