Watching The Media Wake Up In The Morning Through Crawler CPU Graphs

The global media has a distinct circadian rhythm of its own, with the total journalistic output of each country following fairly consistent local daytime working hours. This rhythm can be measured in many ways, the simplest of which is the total number of new articles published every 15 minutes. Yet it is also fascinating to see how that rhythm is reflected in other ways, such as CPU graphs. The banner image at the top of this post shows the total CPU utilization of one of our US East Coast crawlers as North, Central and South American media woke up this morning.
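As a rough illustration of the simplest measurement mentioned above, counting new articles per 15-minute interval, here is a minimal Python sketch. It is not GDELT's actual pipeline: the function names `bucket_start` and `articles_per_interval` and the sample timestamps are hypothetical, chosen only to show the idea of rounding each publication time down to its 15-minute interval and tallying the results.

```python
from collections import Counter
from datetime import datetime, timezone

BUCKET_MINUTES = 15  # matches the 15-minute interval described in the text


def bucket_start(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its 15-minute interval."""
    floored_minute = ts.minute - ts.minute % BUCKET_MINUTES
    return ts.replace(minute=floored_minute, second=0, microsecond=0)


def articles_per_interval(timestamps):
    """Count how many publication times fall into each 15-minute interval."""
    return Counter(bucket_start(ts) for ts in timestamps)


# Hypothetical publication times (UTC), for illustration only.
sample = [
    datetime(2024, 1, 1, 10, 2, tzinfo=timezone.utc),
    datetime(2024, 1, 1, 10, 14, tzinfo=timezone.utc),
    datetime(2024, 1, 1, 10, 15, tzinfo=timezone.utc),
    datetime(2024, 1, 1, 11, 59, 30, tzinfo=timezone.utc),
]

counts = articles_per_interval(sample)
for start in sorted(counts):
    print(start.strftime("%H:%M"), counts[start])
# prints:
# 10:00 2
# 10:15 1
# 11:45 1
```

Plotting these counts over a day, per country or per region, would be one way to draw the kind of rhythm described here.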

GDELT 2.0 operates on a 15-minute heartbeat, and this can be seen in the graph in the early hours of the morning, through around 5AM EST. There is a large dump of news at 5AM, often from automated CMS publishing systems releasing the morning's scheduled articles, followed by a lull at 5:30AM. The real morning begins around 6AM EST, as journalism across this portion of the world wakes up and begins pouring out its daily deluge of coverage. From then on the CPU graph begins to stabilize as the crawler receives a steady stream of URLs to scan and process.

Just an interesting visual reminder of the natural human rhythms of our planet.