VGKG Adds EXIF Support – The GDELT Project

We're excited to announce that the new version of the Visual Global Knowledge Graph (VGKG), due for release in the next few weeks, will add full EXIF metadata support! All images will now be processed through the fantastic Image::ExifTool Perl module, which supports a massive array of image metadata formats, including EXIF, IPTC and XMP. Extensive testing over the past month shows that up to 10% of news imagery includes keywords, author information and extended textual image descriptions embedded in the image file. A majority of images also include at least basic information on the image processing and management pipeline used to create them, offering a powerful window into how the world's news media handle imagery.

Every extractable metadata field in every format supported by Image::ExifTool will be compiled into a new JSON block appended to the end of the current Cloud Vision-returned JSON. UTF-8-encoded fields are handled correctly in most cases and are properly JSON-escaped. The exception is where the original image was generated by software that is not UTF-8-aware and that corrupted the field, which appears to affect only a very small number of images each day.
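To make the idea concrete, here is a minimal Python sketch of appending a metadata block to an existing Cloud Vision JSON record. It is only an illustration and not the VGKG's actual code: the key name "ImageMetadata", the helper names to_text and append_metadata, and the sample tag names and values are all hypothetical, and replacing undecodable bytes with U+FFFD is just one possible way of dealing with fields that are not valid UTF-8.

```python
import json


def to_text(value):
    """Decode raw bytes as UTF-8, substituting U+FFFD for invalid sequences."""
    if isinstance(value, bytes):
        return value.decode("utf-8", errors="replace")
    return value


def append_metadata(vision_json, metadata, key="ImageMetadata"):
    """Return the record as JSON with the metadata block added as its last key.

    The key name is a placeholder; the announcement does not name the field.
    """
    record = json.loads(vision_json)
    # Dicts keep insertion order, so the new block is serialized last.
    record[key] = {to_text(name): to_text(val) for name, val in metadata.items()}
    # json.dumps escapes quotes, backslashes and control characters for us.
    return json.dumps(record, ensure_ascii=False)


# Invented sample data standing in for a Cloud Vision response and extracted tags.
vision = '{"labelAnnotations": [{"description": "crowd", "score": 0.91}]}'
tags = {
    "IPTC:Keywords": "protest, city hall",
    "XMP:Creator": "José Núñez".encode("utf-8"),  # valid UTF-8 bytes
    "EXIF:Software": b"Editor \xff 1.0",          # invalid byte, as from non-UTF-8-aware software
}

merged = append_metadata(vision, tags)
print(merged)
```

In this sketch the existing Cloud Vision fields are left untouched and the metadata block is simply added after them, so anything that already reads the original JSON should keep working.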

For a comprehensive list of the metadata fields that Image::ExifTool recognizes and extracts, all of which are encoded into the VGKG metadata field, please see the Image::ExifTool Tags Documentation.