This study by Mohammed Elshendy, Andrea Fronzetti Colladon, Elisa Battistoni and Peter A Gloor examines crude oil price forecasting:
This study looks for signals of economic awareness on online social media and tests their significance in economic predictions. Over a 2-year period, the study analyses the relationship between the West Texas Intermediate daily crude oil price and multiple predictors extracted from Twitter; Google Trends; Wikipedia; and the Global Data on Events, Location and Tone (GDELT) database. Semantic analysis is applied to study the sentiment, emotionality and complexity of the language used. Autoregressive Integrated Moving Average with Explanatory Variable (ARIMAX) models are used to make predictions and to confirm the value of the study variables. Results show that the combined analysis of the four media platforms carries valuable information for financial forecasting. Twitter language complexity, GDELT number of articles and Wikipedia page reads have the highest predictive power. This study also allows a comparison of the foresight abilities of each platform, in terms of how many days ahead a platform can predict a price movement before it happens. In comparison with previous work, more media sources and more dimensions of the interaction and of the language used are combined in a joint analysis.
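The idea of comparing platforms by how many days ahead they signal a price movement can be illustrated with a simple lead-lag correlation. The sketch below is not the ARIMAX procedure used in the study; it is a deliberately simpler stand-in, run on synthetic data, and the names `pearson`, `best_lead`, `signal` and `price_change` are hypothetical. It assumes a daily predictor series and a daily price-change series of equal length.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def best_lead(predictor, price, max_lag=7):
    """Return (lag, r): the lag in days at which predictor[t] is most
    strongly correlated with price[t + lag], and that correlation."""
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        x = predictor[: len(predictor) - lag]  # predictor values at day t
        y = price[lag:]                        # price values at day t + lag
        r = pearson(x, y)
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best


# Synthetic example (an assumption, not data from the study):
# the price change follows the media signal with a 3-day delay plus noise.
rng = random.Random(0)
true_lead = 3
signal = [rng.gauss(0, 1) for _ in range(500)]
price_change = [rng.gauss(0, 0.3) for _ in range(500)]
for t in range(true_lead, 500):
    price_change[t] += 0.8 * signal[t - true_lead]

lag, r = best_lead(signal, price_change)
print(f"strongest lead: {lag} day(s), correlation {r:.2f}")
```

On this synthetic series the scan recovers the 3-day lead that was built into the data. With real series, a lag found this way would only be a starting point; an ARIMAX model, as used in the study, additionally accounts for the autocorrelation of the price series itself.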