The GDELT Project

Visualizing A Year Of Trump's Television Tweets

What would it look like to take all 139,450 seconds of airtime from Jan. 1 of this year through the present in which Donald Trump's social media handle was visible onscreen across BBC News London, CNN, MSNBC and Fox News 24/7 and the ABC, CBS and NBC evening news broadcasts, and transform them into a collage movie showing a "Year Of Trump's Television Tweets"?

The end result can be seen below, played forward at high speed: every second of the movie represents 30 seconds of actual airtime, and frames are sorted by time, so a tweet aired on multiple stations will appear interspersed across them. Some appearances of his handle are tweets or Facebook or Instagram posts, others may be movies or still images credited to one of his social accounts, and others may be campaign events with his social handle displayed on signage.

 

View Movie.

TECHNICAL DETAILS

To create this movie, the Visual Global Entity Graph (VGEG) 2.0 was searched in BigQuery for all appearances of Donald Trump's Twitter handle this year, exporting a list of all of the thumbnail images for the corresponding seconds of airtime:

SELECT iaThumbnailUrl FROM `gdelt-bq.gdeltv2.vgegv2_iatv` WHERE DATE(date) >= "2020-01-01" and (LOWER(OCRText) like '%realdonald%' OR LOWER(OCRText) like '%donald j. trump retweeted%') order by date asc

The results were exported to a text file in GCS (the commands below read it as thumbnaillist.txt).

The thumbnail images were then downloaded via GNU Parallel and Wget. The filenames themselves are not important, only their position in the date-ordered thumbnaillist.txt file, so we name each image by its GNU Parallel job number using the "{#}" syntax:

mkdir CACHE
time cat thumbnaillist.txt | parallel -j 100 'wget -q {} -O ./CACHE/{#}.jpg'
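
Before moving on, it can help to check how many downloads actually succeeded. The sketch below is one way to do that under a few assumptions: count_bad_downloads is a hypothetical helper (not part of Parallel or Wget), it assumes one URL per line in the list file, and it treats a missing or zero-byte image as a failed download.

```shell
# count_bad_downloads LIST DIR
# Hypothetical helper: prints how many of the numbered images expected in DIR
# (1.jpg .. N.jpg, where N is the number of lines in LIST) are missing or empty.
# The body runs in a subshell so its variables do not leak into the caller.
count_bad_downloads() (
  list=$1
  dir=$2
  total=$(grep -c '' "$list")   # number of lines, i.e. number of download jobs
  bad=0
  i=1
  while [ "$i" -le "$total" ]; do
    # -s is true only when the file exists and is larger than zero bytes
    [ -s "$dir/$i.jpg" ] || bad=$((bad + 1))
    i=$((i + 1))
  done
  echo "$bad"
)

# Example call (assumes the files from the step above exist):
#   count_bad_downloads thumbnaillist.txt CACHE
```

A nonzero count is not necessarily a problem, since images that failed to download are dropped in the later steps, but it shows how many frames will be missing from the final movie.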

Broadcasts can come in slightly different resolutions, meaning the thumbnail dimensions vary across the collection. In theory FFMPEG should be able to handle images of different sizes and correctly resize them when converting into a movie, but in practice the differing image sizes caused problems, so we used ImageMagick's convert to resize them all to a common resolution (this is actually the target resolution of our thumbnails to begin with):

mkdir RESIZED
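# find lists files in arbitrary order (and would include the directory itself), so keep only the images and sort them numerically to preserve the date ordering
time find CACHE/ -type f -name '*.jpg' | sort -V | parallel 'convert {} -resize "320x240!" ./RESIZED/{#}.jpg'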

In a small number of cases a thumbnail image might not have downloaded properly or might have internal issues that cause problems with FFMPEG. The resizing above has the added benefit of failing for such images, so missing or corrupted images will be absent from the RESIZED directory. Unfortunately, FFMPEG has problems with non-sequential image series, meaning we must renumber the images to remove these gaps. This is easily accomplished using the command below, which reads in all of the images from the RESIZED directory and renumbers them by their Parallel job number, resulting in a CACHE2 directory of just the valid images, sequentially numbered:

mkdir CACHE2
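# sort numerically before renumbering so the frames stay in date order
time find RESIZED/ -type f -name '*.jpg' | sort -V | parallel 'cp {} ./CACHE2/{#}.jpg'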
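
The gap-closing step can also be written without Parallel, which makes the ordering logic explicit. This is a minimal sketch rather than the command used for the movie: renumber_frames is a hypothetical helper name, and it assumes every image in the source directory is named as an integer followed by .jpg.

```shell
# renumber_frames SRC DST
# Hypothetical helper: copies SRC/<n>.jpg into DST as 1.jpg, 2.jpg, ... with no gaps,
# keeping the numeric order of the original names (and therefore the date order).
# The body runs in a subshell so its variables do not leak into the caller.
renumber_frames() (
  src=$1
  dst=$2
  mkdir -p "$dst"
  for path in "$src"/*.jpg; do
    [ -e "$path" ] || continue        # skip the literal pattern if SRC has no .jpg files
    printf '%s\n' "${path##*/}"       # emit the bare filename, e.g. 10.jpg
  done | sort -n | {                  # numeric sort, so 10.jpg comes after 9.jpg
    n=0
    while IFS= read -r name; do
      n=$((n + 1))
      cp "$src/$name" "$dst/$n.jpg"
    done
  }
)

# Example call (assumes the RESIZED directory from the step above exists):
#   renumber_frames RESIZED CACHE2
```

The numeric sort is the important part: a plain lexical sort would place 10.jpg before 2.jpg and scramble the timeline.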

Finally, this image sequence is converted to a 30fps H.264 MP4 movie using FFMPEG:

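# -framerate 30 reads 30 images per second of video; without it FFMPEG assumes 25 images per second and -r 30 pads the output with duplicated frames
time ffmpeg -framerate 30 -i ./CACHE2/%d.jpg -vcodec libx264 -y -vf "scale=320x240" -r 30 -an video.mp4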
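
As a rough sanity check on the result, the expected running time follows directly from the frame count and the frame rate. The sketch below assumes one thumbnail per matching second of airtime and that every thumbnail survives the download and resize steps, so the real movie may be slightly shorter; movie_seconds is a hypothetical helper name.

```shell
# movie_seconds FRAMES FPS
# Hypothetical helper: whole seconds of video produced when FRAMES images
# are played back at FPS images per second (integer division).
movie_seconds() (
  frames=$1
  fps=$2
  echo $((frames / fps))
)

# 139,450 one-second thumbnails at 30 images per second of video
movie_seconds 139450 30   # prints 4648, i.e. a little over 77 minutes
```

This is also where the speed-up factor comes from: each image stands for one second of airtime, so 30 images per second of video means each second of the movie covers 30 seconds of broadcast.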

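We can verify that all of the frames were correctly included by having FFMPEG read the video stream back and comparing the final "frame=" count it reports against the number of images in CACHE2: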

ffmpeg -i video.mp4 -map 0:v:0 -c copy -f null -

That's it! Congratulations, you now have a collage movie of a year of Donald Trump's onscreen tweets!

If you were using a moviemaker other than FFMPEG that requires the image numbers to be zero-padded, you could use Parallel's "--rpl" option to zero-pad the job IDs:

time cat thumbnaillist.txt | parallel -j 100 --rpl '{0#} $f="%0".int(1+log(total_jobs()-1)/log(10))."d";$_=sprintf($f,$job->seq()-1)' 'wget -q {} -O ./CACHE/{0#}.jpg'
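
The padding logic in that replacement string can be hard to read, so here is the same idea as a small shell sketch. It is an illustration only, not something Parallel provides: padded_id is a hypothetical helper, and it counts the digits of the largest ID directly instead of using logarithms. Like the command above, it produces zero-based IDs.

```shell
# padded_id JOB TOTAL
# Hypothetical helper: prints the zero-based ID of job number JOB (1-based),
# left-padded with zeros to the width of the largest ID, TOTAL - 1.
padded_id() (
  job=$1
  total=$2
  last=$((total - 1))     # largest zero-based ID
  width=${#last}          # number of decimal digits in that ID
  printf "%0${width}d\n" "$((job - 1))"
)

padded_id 7 139450        # prints 000006
```

Fixed-width names sort the same way lexically and numerically, which is why tools that read frames in filename order tend to want them.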

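Similarly, if you wanted to size the movie automatically from the input images rather than hardcoding a resolution, you could use FFMPEG's "-vf" option to pad each frame to even dimensions and scale it relative to the input size (here doubling it):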

time ffmpeg -framerate 30 -i ./CACHE2/%d.jpg -vcodec libx264 -y -vf "pad=ceil(iw/2)*2:ceil(ih/2)*2,scale=iw*2:ih*2" -r 30 -an video.mp4

You can easily customize the resulting movie's speed, sampling rate and other properties by adjusting the FFMPEG options, and even composite multiple sequences together.