Making Visual AI More Inclusive To The World's Diversity

Much as GDELT's 150-language Web NGrams 3.0 dataset offers the potential to make textual AI far more inclusive of the world's languages, GDELT's global image annotation datasets offer the same potential for visual AI. The Visual Global Knowledge Graph (VGKG) encompasses annotations from Google's Cloud Vision API, along with extracted EXIF metadata, for nearly three quarters of a billion news images from across the world from 2015 to the present. Each record includes the URL of the image, the API's annotations and the extracted EXIF metadata, and can be combined with the GKG to yield a complete list of the articles the image appeared in.
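As a minimal sketch of that last step, the snippet below joins VGKG-style image records against GKG-style article records by image URL to list the articles a given image appeared in. The field names used here ("imageurl", "labels", "documenturl", "images") and the sample URLs are illustrative assumptions, not the exact VGKG/GKG schema.

```python
# Hypothetical VGKG-style records: one per annotated image.
# Field names are assumptions for illustration, not the real schema.
vgkg = [
    {"imageurl": "https://example.com/img/a.jpg", "labels": ["wedding", "ceremony"]},
    {"imageurl": "https://example.com/img/b.jpg", "labels": ["feast"]},
]

# Hypothetical GKG-style records: one per article, listing its images.
gkg = [
    {"documenturl": "https://news.example.com/story1",
     "images": ["https://example.com/img/a.jpg"]},
    {"documenturl": "https://news.example.com/story2",
     "images": ["https://example.com/img/a.jpg", "https://example.com/img/b.jpg"]},
]

def articles_for_image(gkg_records, image_url):
    """Return every article URL whose record references the given image."""
    return [rec["documenturl"] for rec in gkg_records
            if image_url in rec["images"]]

# For each annotated image, recover the full list of articles it appeared in.
appearances = {rec["imageurl"]: articles_for_image(gkg, rec["imageurl"])
               for rec in vgkg}
```

In practice the join key would be the image URL shared between the two datasets; here the in-memory lists simply stand in for records loaded from the downloadable files.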

What makes the VGKG especially distinctive is that it spans news coverage from across the entire world, reflecting the incredible diversity of the world's societies. Most obviously, GDELT's datasets capture the rich tapestry of human life, from dress and architecture to culture, celebrations and ceremonies, such as the myriad differences in what a "wedding" or "feast" looks like around the world. Yet they also capture subtler geographic trends in visual portrayal and contextualization, such as whether images depict events directly or present them through the eyes of leaders at podiums, and which kinds of images are cropped, pixelated or shown from alternative angles or representations.

To date, we've produced a few specialized datasets, such as one covering PPE medical facial coverings during Covid-19 used for onscreen mask detection in television news. We'd love to hear from you about the kinds of new datasets you'd like to see, and we'd love to see the visual AI community use datasets like the VGKG to create vastly more inclusive training and testing datasets.