Global Similarity Graph: Transitive Similarity Chaining

The Global Similarity Graph compares the articles published in each 15-minute interval against all other articles published in that interval and the previous one, across the 65 languages it live-translates. You can chain these links transitively to extend your search horizon back in time. For example, if article A, published in the current 15 minutes, is similar to article B, published 15 minutes prior, you can search for articles similar to B in the 15 minutes before its publication, which might yield C. In turn, you can search for articles similar to C published in the 15 minutes before it, and so on.
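To make the chaining idea concrete, here is a minimal Python sketch that works on similarity pairs already pulled into memory. It is an illustration only, not how the GSG itself is implemented. The tuple layout (url_a, time_a, url_b, time_b), the function name chain_backward, the max_hops parameter and the example.com URLs are all assumptions made for this example. It treats each pairing as undirected and only steps to strictly earlier articles, so every hop pushes the horizon further back.

```python
from collections import defaultdict, deque
from datetime import datetime


def chain_backward(pairs, seed_url, max_hops=10):
    """Return {url: hops} for articles reachable from seed_url by repeatedly
    following similarity links to earlier-published articles.

    `pairs` is a hypothetical in-memory list of similarity records shaped as
    (url_a, time_a, url_b, time_b); this layout is an assumption for the sketch.
    """
    published = {}
    neighbors = defaultdict(set)
    for url_a, time_a, url_b, time_b in pairs:
        published[url_a] = time_a
        published[url_b] = time_b
        # Treat similarity as symmetric: either article can lead to the other.
        neighbors[url_a].add(url_b)
        neighbors[url_b].add(url_a)

    if seed_url not in published:
        return {}

    hops = {seed_url: 0}
    queue = deque([seed_url])
    while queue:
        current = queue.popleft()
        if hops[current] >= max_hops:
            continue
        for other in neighbors[current]:
            # Only step to strictly earlier articles, so each hop moves back in time.
            if other not in hops and published[other] < published[current]:
                hops[other] = hops[current] + 1
                queue.append(other)
    return hops


# Hypothetical example: A is similar to B (15 minutes earlier), B to C (15 minutes
# before that), A to D (same interval), and a later article E is similar to A.
A, B, C, D, E = (f"https://example.com/{name}" for name in "abcde")
PAIRS = [
    (A, datetime(2021, 7, 3, 12, 30), B, datetime(2021, 7, 3, 12, 15)),
    (B, datetime(2021, 7, 3, 12, 15), C, datetime(2021, 7, 3, 12, 0)),
    (A, datetime(2021, 7, 3, 12, 30), D, datetime(2021, 7, 3, 12, 30)),
    (E, datetime(2021, 7, 3, 12, 45), A, datetime(2021, 7, 3, 12, 30)),
]

chain = chain_backward(PAIRS, A)
for url, hop_count in sorted(chain.items(), key=lambda item: item[1]):
    print(hop_count, url)
```

Starting from A, the sketch reaches B in one hop and C in two, while D (same interval) and E (later) are skipped because they do not extend the horizon backward. Lowering max_hops bounds how far back the chain is allowed to run.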

One simple approach is the query below, which finds all articles mentioning Florida in either the URL or title and then returns every pairing that involves those articles or the articles they were paired with:

-- Collect both sides of every pairing in which either article mentions Florida in its URL or title.
WITH data AS (
  SELECT fromUrl url FROM `gdelt-bq.gdeltv2.gsg` WHERE DATE(fromDate) = "2021-07-03" and ( LOWER(fromUrl) like '%florida%' OR LOWER(fromTitle) like '%florida%' OR LOWER(toUrl) like '%florida%' OR LOWER(toTitle) like '%florida%' )
    UNION ALL
  SELECT toUrl url FROM `gdelt-bq.gdeltv2.gsg` WHERE DATE(fromDate) = "2021-07-03" and ( LOWER(fromUrl) like '%florida%' OR LOWER(fromTitle) like '%florida%' OR LOWER(toUrl) like '%florida%' OR LOWER(toTitle) like '%florida%' )
)
-- Return every pairing from that day that touches one of those articles. The parentheses keep the date filter applied to both sides of the OR.
select * from `gdelt-bq.gdeltv2.gsg` where DATE(fromDate) = "2021-07-03" and ( fromUrl in (select url from data) OR toUrl in (select url from data) )

Because it searches only for the English word "Florida", this particular query will return primarily English coverage. It will also yield a high number of false positives about ransomware, given the large-scale ransomware attack on Friday that hijacked software from a Florida-based company. In a real-world application you would first compile a list of relevant coverage and work outward from there, but this shows how easy it is to work outward from a given query!