GEN4: Compute Engine + VPC PGA

GDELT's new GEN4 crawling architecture makes extensive use of the immense leaps in both performance and capability of GCP's Compute Engine networking. One advance in particular has allowed us to eliminate a global infrastructure fabric that previously underlay how our crawler fleets were managed: a VPC service called "Private Google Access."

Historically, under Legacy GCE networks, when a VM deaccessioned its external network connectivity (retaining only its internal IP address), it was cut off not only from the external internet but also from all GCP services. A VM that released its external IP address could therefore no longer communicate with GCP APIs, from management layers to services like BigQuery or Vertex AI.

To work around this, GDELT historically maintained a global proxy-like management fabric that communicated with VMs over their internal networking and acted as a relay to the broader GCP environment whenever those VMs had deaccessioned their external connectivity. This was extremely cumbersome and required a huge infrastructure to manage and to deal with unexpected behaviors and conditions. Because the VMs were cut off from all GCP APIs, in many cases they could no longer determine basic information about their own state, making it difficult to implement many kinds of self-healing behaviors.

With the rise of VPC networking, Private Google Access (PGA) eliminates this need. PGA is disabled by default but can be enabled on a subnet-by-subnet basis, allowing VMs on that subnet to reach the external IP addresses of Google APIs and services even when the VM has no external IP address of its own. The service can be heavily customized, but in GDELT's use case it means GEN4 services can now maintain uninterrupted communication with GCP management layers as they transition from having external network connectivity to deaccessioning it. This has allowed us to remove a large and complex infrastructure fabric and make our individual VMs self-sufficient, making them far more resilient and flexible.
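As a rough illustration of the subnet-by-subnet switch, the sketch below builds (but does not send) a call to the Compute Engine REST method subnetworks.setPrivateIpGoogleAccess using only the Python standard library. The function name and the project, region, subnet and token values are placeholders invented for this example and are not taken from GDELT's infrastructure.

```python
import json
import urllib.request

COMPUTE_API = "https://compute.googleapis.com/compute/v1"


def build_enable_pga_request(project, region, subnet, access_token):
    """Build, without sending, a request that enables PGA on one subnet.

    All four arguments are hypothetical placeholders supplied by the caller.
    """
    url = (
        f"{COMPUTE_API}/projects/{project}/regions/{region}"
        f"/subnetworks/{subnet}/setPrivateIpGoogleAccess"
    )
    # The request body carries a single boolean flag.
    body = json.dumps({"privateIpGoogleAccess": True}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder values; nothing is sent over the network here.
    req = build_enable_pga_request(
        "example-project", "us-central1", "crawler-subnet", "TOKEN"
    )
    print(req.get_method(), req.full_url)
```

Passing the resulting object to urllib.request.urlopen would submit the change, assuming the token belongs to an identity permitted to update the subnet.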

Most importantly, it also means GEN4 services can now be largely self-healing. While centralized, regional and data center-level fleet management systems orchestrate macro-level behaviors, individual VMs hosting GEN4 services can now diagnose a wide range of issues and autonomously correct them via GCP management APIs.
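To make the idea of self-diagnosis concrete, here is a minimal sketch of how a VM might work out its own connectivity state from the Compute Engine metadata server. It is not GDELT's implementation: the function names and state labels are hypothetical, and the JSON shape assumed for the recursive network-interfaces listing (an "accessConfigs" list whose entries carry an "externalIp" field) should be checked against the metadata server documentation before being relied on.

```python
import json
import urllib.request

# Metadata server path for this VM's network interfaces.
METADATA_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/network-interfaces/?recursive=true"
)


def fetch_network_interfaces(timeout=2.0):
    """Query the metadata server; this only works from inside a GCE VM."""
    req = urllib.request.Request(METADATA_URL, headers={"Metadata-Flavor": "Google"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def has_external_ip(interfaces):
    """Return True if any interface reports a non-empty external IP.

    Assumes each interface dict may hold an "accessConfigs" list whose
    entries have an "externalIp" string (empty or absent when released).
    """
    for nic in interfaces:
        for cfg in nic.get("accessConfigs", []):
            if cfg.get("externalIp"):
                return True
    return False


def classify_connectivity(interfaces, apis_reachable):
    """Map observations onto hypothetical state labels for a health loop."""
    if has_external_ip(interfaces):
        return "external"
    # No external IP: Google APIs should still answer if PGA is on.
    return "internal_apis_ok" if apis_reachable else "isolated"


if __name__ == "__main__":
    # Sample data standing in for a VM that has released its external IP.
    sample = [{"ip": "10.0.0.2", "accessConfigs": [{"externalIp": ""}]}]
    print(classify_connectivity(sample, apis_reachable=True))
```

A VM that finds itself in the "isolated" state has no external IP and cannot reach Google APIs, which would suggest the subnet's Private Google Access setting or routing needs attention; the other two states leave the management APIs available for corrective calls.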