The GDELT Project

Visual Global Entity Graph 2.0: Master Visual Entity List

Google's Cloud Video API recognizes an incredible range of objects and activities, allowing the Visual Global Entity Graph 2.0 to identify 11,253 distinct visual entities across a decade of evening television news broadcasts on ABC, CBS and NBC. This is not the complete list of entities that the Video API can recognize, only those it saw in the more limited domain of evening news broadcasts from 2009 to 2020. So which entities did the Video API actually recognize in these broadcasts? To help researchers find the entities most aligned with their research questions, we've compiled a master spreadsheet of all 11,253 distinct entities, listing for each one the total seconds of airtime in which it was seen and Google's unique "MID" identifier code. Below are the first 13 entries in alphabetical order.

Name                   Count (seconds)   MID
100 metres hurdles     1745              /m/079zcf
110 metres hurdles     1735              /m/0bb154
1937 ford              10                /m/0gcy3w
1949 ford              16                /m/0glyd8
1952 ford              16                /m/0gl_d9
1955 ford              28                /m/0gm02k
1957 chevrolet         128               /m/081tx_
1957 ford              16                /m/0gm1g3
3d modeling            101814            /m/02p30vv
3×3 basketball         673               /m/0ch1v_m
4 100 metres relay     2558              /m/067_4k
4 400 metres relay     1402              /m/06882p
800 metres             3656              /m/02slmm
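
For readers who want to search the list programmatically, here is a minimal Python sketch that parses rows in the "name, count, MID" layout shown above and filters them by keyword. The helper names (EntityRow, parse_row, find_entities) are hypothetical, and the sketch assumes whitespace-separated rows copied from the listing, not the spreadsheet's actual file format.

```python
from typing import List, NamedTuple


class EntityRow(NamedTuple):
    name: str
    seconds: int
    mid: str


def parse_row(line: str) -> EntityRow:
    # Entity names may contain spaces, so split from the right: the last
    # two whitespace-separated fields are always the count and the MID.
    name, count, mid = line.rsplit(None, 2)
    return EntityRow(name.strip(), int(count), mid)


def find_entities(rows: List[EntityRow], keyword: str) -> List[EntityRow]:
    # Case-insensitive substring match on the entity name.
    keyword = keyword.lower()
    return [row for row in rows if keyword in row.name.lower()]


# The 13 sample entries from the listing above.
SAMPLE = """\
100 metres hurdles 1745 /m/079zcf
110 metres hurdles 1735 /m/0bb154
1937 ford 10 /m/0gcy3w
1949 ford 16 /m/0glyd8
1952 ford 16 /m/0gl_d9
1955 ford 28 /m/0gm02k
1957 chevrolet 128 /m/081tx_
1957 ford 16 /m/0gm1g3
3d modeling 101814 /m/02p30vv
3×3 basketball 673 /m/0ch1v_m
4 100 metres relay 2558 /m/067_4k
4 400 metres relay 1402 /m/06882p
800 metres 3656 /m/02slmm
"""

rows = [parse_row(line) for line in SAMPLE.splitlines() if line.strip()]
fords = find_entities(rows, "ford")
ford_seconds = sum(row.seconds for row in fords)

for row in fords:
    print(row.name, row.seconds, row.mid)
```

Splitting from the right keeps multi-word names such as "100 metres hurdles" intact without needing a delimiter that the listing does not have.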

TECHNICAL DETAILS

Creating the spreadsheet above took just a single SQL query, which processed 485 million annotations in 4.6 seconds.

SELECT entity.name AS Entity, COUNT(1) AS Count, APPROX_TOP_COUNT(entity.mid, 1)[OFFSET(0)].value AS MID
FROM `gdelt-bq.gdeltv2.vgegv2_iatv`, UNNEST(entities) AS entity
WHERE LENGTH(entity.name) > 1
GROUP BY Entity
ORDER BY Entity ASC

The entities can also be ordered by count, most frequent first:

SELECT entity.name AS Entity, COUNT(1) AS Count
FROM `gdelt-bq.gdeltv2.vgegv2_iatv`, UNNEST(entities) AS entity
WHERE LENGTH(entity.name) > 1
GROUP BY Entity
ORDER BY Count DESC
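
To make the logic of the two queries above easier to follow, here is a rough Python sketch of the same steps on a few in-memory records: flatten each record's entity list, drop names of one character or fewer, count occurrences per name, and keep the most common MID for each name. The records are hypothetical toy data; only the field names (entities, name, mid) come from the queries. The sketch picks the exact most common MID, whereas APPROX_TOP_COUNT in BigQuery returns an approximate result.

```python
from collections import Counter, defaultdict

# Hypothetical toy records standing in for table rows. Only the field
# names (entities, name, mid) are taken from the queries above.
records = [
    {"entities": [{"name": "800 metres", "mid": "/m/02slmm"},
                  {"name": "3d modeling", "mid": "/m/02p30vv"}]},
    {"entities": [{"name": "800 metres", "mid": "/m/02slmm"}]},
    {"entities": [{"name": "800 metres", "mid": "/m/02slmm"},
                  {"name": "x", "mid": "/m/hypothetical"}]},
]


def aggregate(records):
    counts = Counter()
    mids = defaultdict(Counter)
    for record in records:                       # FROM the table
        for entity in record["entities"]:        # UNNEST(entities)
            name = entity["name"]
            if len(name) > 1:                    # WHERE LENGTH(entity.name) > 1
                counts[name] += 1                # COUNT(1) per GROUP BY name
                mids[name][entity["mid"]] += 1   # tally MIDs seen for this name
    # Exact most common MID per name (the SQL uses an approximate version).
    return [(name, n, mids[name].most_common(1)[0][0])
            for name, n in counts.items()]


rows = aggregate(records)
by_name = sorted(rows, key=lambda r: r[0])                 # ORDER BY Entity ASC
by_count = sorted(rows, key=lambda r: r[1], reverse=True)  # ORDER BY Count DESC

for name, count, mid in by_count:
    print(name, count, mid)
```

The single-character name "x" is filtered out by the length check, mirroring the WHERE clause.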