Earlier this month we previewed how we were moving our GCS and BigTable infrastructure connectivity to Go-based daemon proxies running locally on each VM. This transition has proved to be an even larger success than we could have imagined. Workflows that previously took 30 minutes and 100+ CPUs now complete in 30-40 seconds on a handful of CPUs on the same VM. Error rates from congestion and timeouts have plunged to effectively zero, and entire portions of our codebase devoted to complex error handling, once heavily exercised and constantly expanded, have seen zero traversals since the transition. We are now evaluating moving all connectivity to GCP services behind such proxies. Communicating with these proxies through files (files are written to disk and the proxy is given their paths and instructions), rather than streaming those connections over pipes or localhost, has allowed the daemons to avoid traditional bottlenecks and automatically adjust to overall system network pressure.
We are incredibly impressed by the sheer performance and minimal resource requirements of these Go utilities and have been actively rewriting other supporting tools in Go, using agentic Gemini to oversee code production, testing and optimization. In fact, wherever possible we are now transitioning compute-intensive workflows to Go using agentic Gemini and seeing immense cost savings – in some cases 10-50x more workflows can run on the same VM hardware.
We've also been hard at work upgrading the hardware infrastructure underpinning our VMs. GCP has added a wealth of new CPU families and hardware configurations in recent years, but benchmarking workflows on every possible permutation of hardware was too costly and time-consuming in the past. Gemini's ability to accurately estimate the hardware requirements of specific workflows and to agentically test those configurations has completely transformed how we approach hardware selection and benchmarking. It has yielded significant speedups on industry-standard utilities like ffmpeg and ImageMagick, using unexpected and often counterintuitive hardware configurations that are optimized for our specific workflows and their complex interactions.
Finally, we are beginning to explore Gemini-assisted agentic self-healing for our workflows. In the past we focused on generalized self-healing infrastructures that worked across all of our workflows. Our early experiments here instead use Gemini to construct bespoke agentic self-healing frameworks for individual workflows. These are proving exceptionally adept at helping us solve a number of longstanding stability issues, and we look forward to continuing to expand this work.