What if, with a single SQL query, we could surface the most common dyadic relationships defining each year of the past 40 years? In short, for each year, find the two actors that appear most commonly together. This may not necessarily be the most intense or meaningful event of the year, but rather reflects what the media chose to emphasize and thus the interactions on the global public consciousness. The final result is the table below:
Year | Actor1Name | Actor2Name |
1979 | CHINA | VIETNAM |
1980 | AFGHANISTAN | RUSSIA |
1981 | RUSSIA | UNITED STATES |
1982 | ISRAEL | LEBANON |
1983 | ISRAEL | LEBANON |
1984 | RUSSIA | UNITED STATES |
1985 | RUSSIA | UNITED STATES |
1986 | RUSSIA | UNITED STATES |
1987 | IRAN | IRAQ |
1988 | IRAN | IRAQ |
1989 | ISRAELI | PALESTINIAN |
1990 | IRAQ | KUWAIT |
1991 | IRAQ | KUWAIT |
1992 | ISRAEL | PALESTINIAN |
1993 | ISRAEL | PALESTINIAN |
1994 | ISRAEL | JORDAN |
1995 | ISRAEL | SYRIA |
1996 | ISRAEL | PALESTINIAN |
1997 | ISRAEL | PALESTINIAN |
1998 | ISRAEL | PALESTINIAN |
1999 | ISRAEL | PALESTINIAN |
2000 | ISRAEL | PALESTINIAN |
2001 | ISRAELI | PALESTINIAN |
2002 | ISRAELI | PALESTINIAN |
2003 | IRAQ | UNITED STATES |
2004 | IRAQ | UNITED STATES |
2005 | ISRAEL | PALESTINIAN |
2006 | IRAQ | UNITED STATES |
2007 | IRAQ | UNITED STATES |
2008 | IRAQ | UNITED STATES |
2009 | AFGHANISTAN | UNITED STATES |
2010 | CHINA | UNITED STATES |
2011 | PAKISTAN | UNITED STATES |
2012 | CHINA | UNITED STATES |
2013 | RUSSIA | UNITED STATES |
2014 | RUSSIA | UKRAINE |
2015 | IRAN | UNITED STATES |
2016 | RUSSIA | UNITED STATES |
2017 | RUSSIA | UNITED STATES |
2018 | RUSSIA | UNITED STATES |
2019 | RUSSIA | UNITED STATES |
2020 | IRAN | UNITED STATES |
TECHNICAL DETAILS
Constructing this table took just a single SQL query and 3 seconds of query time in BigQuery. The "UNION ALL" is used to combine two versions of the same base query, one that selects Actor1 and Actor2's names in correct order and one that reverses their order so that an event that encodes A doing something to B is counted the same as B doing something to A, to ensure that both directions of dyadic relationships are captured. The outer query wrapper converts these results to a ranking by count and teh final result filters these down to just the top by year.
Simply by changing "where rank=1" to where rank<=5" will yield the top 5 relationships per year and so on. You can also adjust the query terms to return just the top relationships for a particular country for each year and so on.
SELECT Year, Actor1Name, Actor2Name, Count FROM ( SELECT Actor1Name, Actor2Name, Year, COUNT(1) Count, RANK() OVER(PARTITION BY YEAR ORDER BY Count(1) DESC) rank FROM ( (SELECT Actor1Name, Actor2Name, Year FROM `gdelt-bq.full.events` WHERE Actor1Name < Actor2Name and Actor1CountryCode != '' and Actor2CountryCode != '' and Actor1CountryCode!=Actor2CountryCode) UNION ALL (SELECT Actor2Name Actor1Name, Actor1Name Actor2Name, Year FROM `gdelt-bq.full.events` WHERE Actor1Name > Actor2Name and Actor1CountryCode != '' and Actor2CountryCode != '' and Actor1CountryCode!=Actor2CountryCode) ) WHERE Actor1Name IS NOT null AND Actor2Name IS NOT null GROUP BY 1, 2, 3 HAVING Count > 100 ) WHERE rank=1 ORDER BY Year