Last March we released a map showcasing the deep learning-powered "visual geocoder" capability of Google's Cloud Vision API applied to 20 million images. Today GDELT's VGKG has grown to more than a quarter-billion images and we've created a new map to see what new insights a year's worth of data offers. This map is part of a new Forbes piece titled "Visual Geocoding A Quarter Billion Global News Photographs Using Google's Deep Learning API."
Click the image below to launch the interactive zoomable/clickable map or download a high resolution static image.
TECHNICAL DETAILS
To create the map, we modified last year's instructions in two ways: the results are sorted by date so that the most recent images are returned first, and only 5 results are returned for each location to reduce the file size of the resulting map.
This is the final SQL query that was run on BigQuery to export all of the geolocation data to date from the "cloudvision" table.
SELECT DATE, DocumentIdentifier, ImageURL, GeoLandmarks FROM [gdelt-bq:gdeltv2.cloudvision] WHERE GeoLandmarks IS NOT NULL
Once the query has successfully completed, save the results as a CSV file called "EXPORT.csv" to your local computer (this may require doing an "Export Table" and saving to GCS as an intermediate step due to the size of the data being exported).
Once you've saved the export to your computer as "EXPORT.csv", run "sort -r -n EXPORT.csv > IN.csv" at the command prompt. Because DATE is the first column of each row, this sorts the file numerically in descending order, so the images run from newest to oldest and the most recent images are returned first.
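For readers without a Unix "sort" available, the same newest-to-oldest ordering can be sketched in Python. This is only an illustration, not part of the original workflow: the function name sort_newest_first is hypothetical, it assumes the DATE column holds plain integers, and it assumes the export fits in memory. Unlike "sort -r -n", which pushes a non-numeric header line to the bottom of the file, this sketch keeps the header as the first line.

```python
import csv
import io


def sort_newest_first(csv_text):
    """Return CSV text with the header kept first and rows sorted by DATE, newest first.

    ASSUMPTION: the first line is a header containing a "DATE" column,
    and every DATE value is an integer timestamp.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    date_col = header.index("DATE")

    # Skip blank lines, then sort numerically in descending order.
    rows = [row for row in reader if row]
    rows.sort(key=lambda row: int(row[date_col]), reverse=True)

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return out.getvalue()
```

To apply it to the export, read "EXPORT.csv" into a string, pass it to the function, and write the returned text to "IN.csv".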
Now download the "parsecloudvisionbqcsvtogeojson.pl" Perl script to the same directory and run it. After a few seconds it will output a file called "OUT.geojson" that you can then import directly into CartoDB to create the map above! Happy Image Mapping!
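To give a sense of what a CSV-to-GeoJSON conversion of this kind involves, here is a rough Python sketch. It is not the Perl script itself and may differ from it in every detail. In particular, it assumes that each GeoLandmarks value is a list of records separated by "<RECORD>", with fields separated by "<FIELD>", the landmark name first and a "latitude,longitude" pair last; check that layout against the VGKG documentation before relying on it. The function names, the property names in the output, and the choice to group images by exact coordinate pair are all hypothetical.

```python
import csv
import json
from collections import defaultdict

MAX_PER_LOCATION = 5  # the text keeps only 5 results per location
FIELDNAMES = ["DATE", "DocumentIdentifier", "ImageURL", "GeoLandmarks"]  # SELECT order


def parse_landmarks(field):
    """Yield (name, lat, lon) for each landmark record in one GeoLandmarks value.

    ASSUMPTION: "<RECORD>" separates records, "<FIELD>" separates fields,
    the first field is the name and the last field is "lat,lon".
    """
    for record in (field or "").split("<RECORD>"):
        parts = record.split("<FIELD>")
        if len(parts) < 2:
            continue  # also skips a header line such as "GeoLandmarks"
        try:
            lat_text, lon_text = parts[-1].split(",")
            yield parts[0], float(lat_text), float(lon_text)
        except ValueError:
            continue  # no parsable coordinate pair in this record


def rows_to_geojson(rows):
    """Build a FeatureCollection, keeping the first MAX_PER_LOCATION images per point."""
    seen = defaultdict(int)
    features = []
    for row in rows:
        for name, lat, lon in parse_landmarks(row.get("GeoLandmarks")):
            if seen[(lat, lon)] >= MAX_PER_LOCATION:
                continue
            seen[(lat, lon)] += 1
            features.append({
                "type": "Feature",
                # GeoJSON coordinates are ordered longitude, latitude.
                "geometry": {"type": "Point", "coordinates": [lon, lat]},
                "properties": {
                    "name": name,
                    "date": row.get("DATE"),
                    "url": row.get("DocumentIdentifier"),
                    "image": row.get("ImageURL"),
                },
            })
    return {"type": "FeatureCollection", "features": features}


def convert(in_path="IN.csv", out_path="OUT.geojson"):
    """Read the sorted CSV and write the GeoJSON file."""
    with open(in_path, newline="", encoding="utf-8") as src:
        collection = rows_to_geojson(csv.DictReader(src, fieldnames=FIELDNAMES))
    with open(out_path, "w", encoding="utf-8") as dst:
        json.dump(collection, dst)
```

Because the input is already sorted newest to oldest, keeping the first five images seen at each point is what keeps the five most recent. Passing the column names explicitly means a header line is treated as an ordinary row and dropped because it has no coordinates, wherever it ends up after sorting.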