VGKG & Image Processing: Extreme Optimization Versus Generalized Pipelines

The Visual Global Knowledge Graph computes a range of characteristics about each image, from perceptual hashes to visual entropy, which it both returns as part of the image record's accessible metadata and uses as an internal prefilter to remove images that are not visually complex enough to warrant processing through GCP's Cloud Vision API. For example, an image in which just one or two shades of color account for 99% of the total image surface is unlikely to yield useful API results, as is a blurry, pixelated image without any discernible contents. Prefiltering allows us to avoid spending API calls on such imagery. Several of these computations involve file format conversions, image downsampling and advanced image adjustments.
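To make the color-dominance prefilter concrete, here is a minimal sketch of the kind of check described above. The 99% threshold comes from the text; the function name, the pixel representation (a flat list of RGB tuples), and the `top_n` parameter are illustrative assumptions, not VGKG's actual API.

```python
from collections import Counter

DOMINANCE_THRESHOLD = 0.99  # "one or two shades ... 99% of the surface"

def is_too_simple(pixels, top_n=2, threshold=DOMINANCE_THRESHOLD):
    """Return True when the top_n most common colors cover at least
    `threshold` of the image surface, i.e. the image is likely too
    visually simple to warrant a Cloud Vision API call."""
    counts = Counter(pixels)
    dominant = sum(count for _, count in counts.most_common(top_n))
    return dominant / len(pixels) >= threshold

# A 10x10 "image" that is 99 white pixels and 1 red pixel: filtered out.
flat = [(255, 255, 255)] * 99 + [(255, 0, 0)]
print(is_too_simple(flat))    # True — skip the API call

# A gradient-like row with 100 distinct gray values: kept.
varied = [(i, i, i) for i in range(100)]
print(is_too_simple(varied))  # False — send to Cloud Vision
```

A production prefilter would of course operate on decoded image buffers and combine several signals (entropy, blur estimates, perceptual hashes), but the gating logic follows this shape.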

Such image processing is extremely computationally expensive. We distribute this load across our global crawler fleet, performing it inline as part of the crawling process, so that large image data never needs to be shuttled globally until an image has been cleared for API processing. Given that our crawlers operate in relatively low-resource hardware environments, we have historically leveraged extreme optimization in their image processing paths. For example, image resizing was performed using specialty fixed-point libraries designed for embedded environments, with core routines written in assembly, yielding highly performant code whose results were nearly identical to those of traditional unoptimized libraries while requiring just a fraction of the hardware and execution time.
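As an illustration of the fixed-point technique mentioned above, the sketch below steps through source coordinates with 16.16 fixed-point integers rather than floats, the way an embedded-style nearest-neighbor resizer might. This is a hedged reconstruction of the general approach, not the actual library code, and the names are our own.

```python
# 16.16 fixed point: the high 16 bits hold the integer part,
# the low 16 bits the fraction, so all math stays in integers.
FIX_SHIFT = 16

def resize_nearest_fixed(src, src_w, dst_w):
    """Nearest-neighbor 1-D resize of a pixel row using 16.16 fixed point."""
    step = (src_w << FIX_SHIFT) // dst_w   # source step per destination pixel
    out, pos = [], 0
    for _ in range(dst_w):
        out.append(src[pos >> FIX_SHIFT])  # integer part selects the pixel
        pos += step                        # pure integer add, no float math
    return out

row = list(range(8))                       # a 1x8 "row" of pixel values 0..7
print(resize_nearest_fixed(row, 8, 4))    # → [0, 2, 4, 6]
```

In C or assembly this inner loop compiles to a handful of integer adds and shifts per pixel, which is what made such libraries so fast on constrained hardware; the trade-off, as the next paragraph describes, is brittleness in the face of evolving formats.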

At the same time, all of this optimization came at a stark cost: brittleness. As web developers have continued to push the boundaries of major image formats, adding new features and turning formerly rare edge cases into standard ones, these libraries have increasingly struggled. They also required elaborate pipelines to handle formats like WebP and to pass image data between libraries with divergent development paths.

Thus, as the Visual Global Knowledge Graph transitions to the new GEN4 architecture that underlies GDELT 3.0, we are excited to announce that we are moving to a generalized image architecture built on standard off-the-shelf image components optimized for format coverage and robustness rather than raw execution speed. We are offsetting the increased computational load through a new architecture, which we'll be talking more about soon, that makes more holistic use of the underlying hardware and host environment, largely mitigating the overhead through efficiencies and creative scheduling and kernel use. We're already seeing a tremendous increase in the new system's ability to handle edge cases, and we're excited about the new opportunities this will open up for handling ever more diverse still imagery across the web.