JRSI: Integrating deep-learning methods and web-based data sources for surveillance, forecasting and early warning of avian influenza

Highly pathogenic avian influenza (HPAI), especially the H5N1 subtype, has caused repeated global outbreaks, primarily affecting birds but occasionally spreading to humans. These events pose serious public health and economic risks and demand enhanced surveillance. This study evaluates novel web-based data for predicting HPAI outbreaks using machine learning models, with Canada as a case study. Seven web-based sources were automatically collected and integrated through an application programming interface (API)-driven pipeline and combined with historical HPAI cases: Google Trends; Google News; the Global Database of Events, Language, and Tone (GDELT); Reddit; Facebook; minimum temperature; and air quality (UV index and CO levels). Forecasting was performed using deep-learning models: gated recurrent unit (GRU), long short-term memory (LSTM) and their combinations with convolutional neural networks (CNN). Classical machine learning models, namely random forest (RF), support vector machine (SVM) and naive Bayes (NB), were included for comparison. Model performance was evaluated using root mean square error (RMSE) and correlation. Feature importance was assessed using permutation methods and the Mann–Whitney U test. GRU delivered the most accurate forecasts. Historical case data were the most important factor (p < 0.01), followed by Facebook activity and minimum temperature. These findings suggest that integrating diverse data sources with machine learning can enhance early HPAI detection, enabling timely public health responses and mitigating economic impacts.
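The evaluation steps named in the abstract (RMSE, correlation and permutation-based feature importance) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the stand-in `toy_predict` model and the synthetic data below are all assumptions made for illustration, and the abstract does not specify the correlation measure, so Pearson correlation is assumed here.

```python
import math
import random
from typing import Callable, List, Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error between observed and forecast values."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation (an assumed choice of correlation measure)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (sx * sy)


def permutation_importance(
    predict: Callable[[Sequence[float]], float],
    X: List[List[float]],
    y: Sequence[float],
    n_repeats: int = 20,
    seed: int = 0,
) -> List[float]:
    """Mean increase in RMSE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = rmse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Replace only column j, leaving all other features untouched.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(rmse(y, [predict(r) for r in X_perm]) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Synthetic toy data (hypothetical, not from the study): the target depends
# only on feature 0, so shuffling feature 1 should not change the error.
def toy_predict(row: Sequence[float]) -> float:
    return 2.0 * row[0]


X_toy = [[float(i), float((i * 7) % 5)] for i in range(10)]
y_toy = [2.0 * i for i in range(10)]
imp = permutation_importance(toy_predict, X_toy, y_toy)
print("baseline RMSE:", rmse(y_toy, [toy_predict(r) for r in X_toy]))
print("permutation importances:", imp)
```

In this toy setup the shuffled-column error rises only for the feature the model actually uses, which is the intuition behind ranking inputs such as historical cases, Facebook activity and minimum temperature by how much forecast error grows when each is permuted.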
