GDELT 3.0's New Adaptive Geographic Routing: A First Look

The Global Difference Graph (GDG) is now using a prototype version of our new adaptive geographic routing, and the results from the first day have been extraordinary. In just its first few hours the system automatically learned geographic placement requirements for more than 570 domains, routing all requests to those domains away from crawlers physically located in EU countries due to geotargeting. We've yet to observe 451 errors being systematically enforced by or against other regions, though work is underway to examine other, more subtle kinds of geotargeting. Some sites also track visitor behavior for abnormal activity in ways that can result in elevated 429 errors from just a few requests a minute. Our new routing system now assists with this as well, randomizing retries geographically to the machines with the fewest errors to try to even out requests across GCP's region-specific IP ranges. We're tremendously excited by this new system, which has made our entire global URL queuing fabric geographically aware and able to reason about the physical location of data centers in its routing decisions.
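
To make the two behaviors above concrete, here is a minimal Python sketch of how such routing could work. It is an illustration under stated assumptions, not GDELT's actual implementation: the GeoRouter class, its method names, the region list and the grouping of regions into "US", "EU" and "ASIA" are all hypothetical. The only facts taken from outside the sketch are that HTTP 451 means "Unavailable For Legal Reasons" and HTTP 429 means "Too Many Requests".

```python
import random
from collections import defaultdict

# Hypothetical mapping of crawler regions to coarse geographic groups.
# The real fleet layout is not described in the post.
REGIONS = {
    "us-central1": "US",
    "us-east1": "US",
    "europe-west1": "EU",
    "europe-west4": "EU",
    "asia-east1": "ASIA",
}


class GeoRouter:
    """Illustrative router: learns per-domain geographic exclusions from
    451 responses and spreads retries toward regions with the fewest 429s."""

    def __init__(self, regions, rng=None):
        self.regions = dict(regions)
        self.rng = rng or random.Random()
        # domain -> set of geographic groups that returned 451
        self.blocked = defaultdict(set)
        # domain -> region -> number of 429 responses seen
        self.errors = defaultdict(lambda: defaultdict(int))

    def record(self, domain, region, status):
        """Update what we know about a domain from one response code."""
        if status == 451:
            # Assumption: one 451 is enough to exclude the whole group.
            self.blocked[domain].add(self.regions[region])
        elif status == 429:
            self.errors[domain][region] += 1

    def candidates(self, domain):
        """Regions whose geographic group has not been excluded."""
        blocked = self.blocked.get(domain, set())
        return [r for r, group in self.regions.items() if group not in blocked]

    def pick_region(self, domain):
        """Choose at random among allowed regions with the fewest 429s."""
        allowed = self.candidates(domain)
        if not allowed:
            return None
        counts = self.errors.get(domain, {})
        fewest = min(counts.get(r, 0) for r in allowed)
        best = [r for r in allowed if counts.get(r, 0) == fewest]
        return self.rng.choice(best)


router = GeoRouter(REGIONS)
router.record("example.com", "europe-west1", 451)  # learn the EU exclusion
router.record("example.com", "us-central1", 429)   # rate limited from here
print(router.pick_region("example.com"))           # never an EU region
```

In this sketch a single 451 excludes every region in the same group, and ties between equally clean regions are broken at random so that retries do not pile onto one IP range. A production system would presumably also need thresholds, decay of old error counts and persistence, none of which are modeled here.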