Cataloging Two Weeks Of Russia 1 Using Visual Explorer + YOLOv5 Object Detection

Yesterday we demonstrated applying the YOLOv5x pretrained COCO model to a single Russian television news broadcast. Today we demonstrate how to apply it at scale, performing object detection over the entire 24/7 coverage of Russia 1 for the last two weeks. The underlying workflow is simple enough that even advanced object detection like YOLO can be applied across 14 days (336 hours) of television news coverage with a few short scripts.

The instructions below rely on our previously configured quad-core V100 GCE VM, which has the necessary libraries and drivers installed.

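First, we install the YOLOv5 dependencies (the model code and weights themselves are fetched by torch.hub the first time the model is loaded):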

pip install -qr https://raw.githubusercontent.com/ultralytics/yolov5/master/requirements.txt

Given the high IO load of this workflow (parallel streaming downloads, rapid file creation through unzipping, and random file reads and writes), we recommend creating a GCE VM with ~50-100GB of RAM and working entirely in a RAM disk, though you can also run this workflow at reduced speed on an SSD:

mkdir /dev/shm/YOLO
cd /dev/shm/YOLO

We want to process all shows on Russia 1 from November 11 to 24 inclusive, so we'll use a simple one-liner to generate the list of dates and then download the Visual Explorer inventory files for those dates:

#make date list...
start=20221111; end=20221124; while [[ ! $start > $end ]]; do echo $start; start=$(date -d "$start + 1 day" "+%Y%m%d"); done > DATES

#get the inventory files
rm -rf JSON
mkdir JSON
time cat DATES | parallel --eta 'wget -q https://storage.googleapis.com/data.gdeltproject.org/gdeltv3/iatv/visualexplorer/RUSSIA1.{}.inventory.json -P ./JSON/'
rm -f IDS; find ./JSON/ -depth -name '*.json' | parallel --eta 'jq -r ".shows[].id" {} >> IDS'
wc -l IDS
rm -rf JSON

#and make the LABELS subdirectory where we will write all of the object detections to
mkdir LABELS
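
For readers who prefer Python to shell one-liners, the date list and ID extraction above can be sketched roughly as follows. This is a minimal sketch that assumes only what the jq filter implies: each inventory file has a top-level "shows" array whose entries carry an "id". The function names and the trimmed sample inventory are hypothetical and exist only for illustration.

```python
import json
from datetime import date, timedelta


def date_range(start, end):
    """Return YYYYMMDD strings for every day from start to end, inclusive."""
    days = (end - start).days
    return [(start + timedelta(days=i)).strftime("%Y%m%d") for i in range(days + 1)]


def show_ids(inventory_text):
    """Return the "id" of every entry in the inventory's "shows" array."""
    inventory = json.loads(inventory_text)
    return [show["id"] for show in inventory.get("shows", [])]


# The same November 11 to 24 window used by the shell loop.
dates = date_range(date(2022, 11, 11), date(2022, 11, 24))
print(len(dates), dates[0], dates[-1])

# Hypothetical, heavily trimmed inventory; real files carry more fields per show.
sample = '{"shows": [{"id": "RUSSIA1_20221123_220000_Sudba_cheloveka_s_Borisom_Korchevnikovim"}]}'
print(show_ids(sample))
```

Each date string slots into the inventory URL in place of the {} that parallel substitutes in the shell version.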

Now we'll create a modified version of yesterday's YOLO runner script. This one does not create the annotated images and writes all of the label annotations to a single shared "LABELS" subdirectory. It also allows the images directory to be passed as a command line option.

Save the following script to "run_yololabel.py":

import torch
import os
import sys

#load the model...
model = torch.hub.load('ultralytics/yolov5', 'yolov5x')

#run a directory of images...
directory = sys.argv[1]
for filename in os.listdir(directory):
    f = os.path.join(directory, filename)
    fout = os.path.join('./LABELS/', filename)
    if os.path.isfile(f):
        results = model(f)
        # write the detections table (one row per object) to LABELS/<image name>.labels
        with open(fout + '.labels', 'w') as out:
            print(results.pandas().xyxy[0], file=out)

Now we'll create a shell script that automates the workflow of downloading the Visual Explorer ZIP file for a given broadcast, unpacking it, running all of its images through the YOLO runner Python script above and cleaning up.

Save the following script to "run_downlabelshow.sh":

#!/bin/sh
rm -rf $1.IMAGES
mkdir $1.IMAGES
wget -q https://storage.googleapis.com/data.gdeltproject.org/gdeltv3/iatv/visualexplorer/$1.zip -P ./$1.IMAGES/
unzip -n -q -j -d ./$1.IMAGES/ ./$1.IMAGES/$1.zip
rm $1.IMAGES/$1.zip
python3 ./run_yololabel.py ./$1.IMAGES/
rm -rf $1.IMAGES

And make it executable:

chmod 755 run_downlabelshow.sh
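
The unpacking step in the shell script relies on two unzip flags: -j discards any folder structure inside the archive and -n never overwrites an existing file. As an illustration only, the same behavior could be sketched in Python with the standard zipfile module. The function name extract_flat is hypothetical, and this is not part of the workflow above.

```python
import os
import zipfile


def extract_flat(zip_path, dest_dir):
    """Extract every file in the archive directly into dest_dir.

    Mirrors `unzip -n -j`: archive folder paths are dropped (-j) and files
    that already exist in dest_dir are left untouched (-n).
    Returns the list of paths that were written.
    """
    os.makedirs(dest_dir, exist_ok=True)
    written = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.infolist():
            if member.is_dir():
                continue
            name = os.path.basename(member.filename)
            target = os.path.join(dest_dir, name)
            if not name or os.path.exists(target):
                continue
            with archive.open(member) as src, open(target, "wb") as dst:
                dst.write(src.read())
            written.append(target)
    return written
```

Flattening matters here because the runner script lists a single directory with os.listdir and does not descend into subfolders.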

You can run the script over a single show like:

mkdir -p LABELS
time ./run_downlabelshow.sh RUSSIA1_20221123_220000_Sudba_cheloveka_s_Borisom_Korchevnikovim

And see how well it utilizes the GPU via:

nvidia-smi -l 1

YOLOv5 reports the configuration it is running on when the model loads. On our V100 GCE VM it prints:

YOLOv5 🚀 2022-11-23 Python-3.7.3 torch-1.12.1+cu102 CUDA:0 (Tesla V100-SXM2-16GB, 16161MiB)

Finally, to run the entire 14-day collection of broadcasts (171 distinct broadcasts in all, totaling 587,529 images at 1280×720 resolution):

time cat IDS | parallel --eta -j 4 './run_downlabelshow.sh {}'

On this quad-core V100 GCE VM, we found that running 4 processes in parallel allowed us to achieve 75-85% GPU utilization and process the more than half a million images in just under 1 hour 55 minutes, or around 85 images per second, including the time taken to download and unzip each Visual Explorer ZIP file. Note that this is an entirely unoptimized pipeline designed for maximal simplicity for the sake of demonstration.
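
The throughput figure follows from simple arithmetic on the numbers above. The short sketch below treats the elapsed time as exactly 1 hour 55 minutes, which is an assumption (the run took slightly less), so the true rate is marginally higher than what it prints.

```python
images = 587_529                      # total frames across the 14 days
elapsed_seconds = 1 * 3600 + 55 * 60  # 1 hour 55 minutes, taken as an upper bound
parallel_jobs = 4                     # the -j 4 passed to parallel

rate = images / elapsed_seconds
print(f"{rate:.1f} images per second overall")
print(f"{rate / parallel_jobs:.1f} images per second per process")
```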

You can download the complete set of all annotations:

In all, 293,760 of the 587,529 images (49.99%) contained at least one recognized object, and there were 1,547,085 total recognized objects of 81 types. Just under 86% of airtime contained at least one person, capturing the centrality of people to television news. Ties appeared in 28% of airtime, reflecting the density of formal suit-and-tie dress. Chairs appeared in 15.5% of airtime, cups in 9.2% and potted plants in 3.9%, the last reflecting their ubiquity as decor.

To examine the labels in more detail, we'll create a Perl script to parse the results.

Save the following script to "parseyololabels.pl":

#!/usr/bin/perl

opendir(DIR, $ARGV[0]) or die "Cannot open directory $ARGV[0]: $!";
while(readdir(DIR)) {
    if ($_!~/\.labels/) { next; };
    ($date) = $_=~/_(\d\d\d\d\d\d\d\d)_/; $DATES{$date}++;    
    my $buf; open(FILE, "$ARGV[0]/$_"); read(FILE, $buf, (-s FILE)); close(FILE);
    
    my %hash;
    while($buf=~/\d+\s+([a-z].*)/g) { $label = $1; $label=~s/\s+$//; if ($label=~/\]/) { next;};  $LABELSBYDATE{$date}{$label}++; $LABELS{$label}++; $hash{$label}=1; $CNT_LABELSTOT++; };
    #there might be multiple "person" labels per image, so make a tally that is per-image/airtime as well.
    foreach (keys %hash) { $LABELSBYDATE_AIRTIME{$date}{$_}++; $LABELS_AIRTIME{$_}++; }; #counts each label once per image...
    $CNT_FRAMES++;
        
    if ($counter++ % 10000 == 0) { print "Processing Labels $counter...\n"; };
    #if ($counter > 10000) { last; };
}
closedir(DIR);

print "Writing Output...\n";
@DATES = sort {$a <=> $b} keys %DATES;
@LABELS = sort keys %LABELS;

open(OUT, ">./TOTALS_BYLABEL.csv");
print OUT "TOTAL,$CNT_LABELSTOT\n";
foreach (sort {$LABELS{$b} <=> $LABELS{$a}} keys %LABELS) {
    print OUT "$_,$LABELS{$_}\n";
}
close(OUT);

open(OUT, ">./TOTALS_BYAIRTIME.csv");
print OUT "TOTAL,$CNT_FRAMES\n";
foreach (sort {$LABELS_AIRTIME{$b} <=> $LABELS_AIRTIME{$a}} keys %LABELS_AIRTIME) {
    print OUT "$_,$LABELS_AIRTIME{$_}\n";
}
close(OUT);

open(OUT, ">./TIMELINE_BYLABEL.csv");
my $row; foreach (@DATES) { $row.=",$_"; }; print OUT $row . "\n";
foreach $label (@LABELS) {
    my $row = $label; foreach $date (@DATES) { $row.=",$LABELSBYDATE{$date}{$label}"; }; print OUT $row . "\n";
}
close(OUT);

open(OUT, ">./TIMELINE_BYAIRTIME.csv");
my $row; foreach (@DATES) { $row.=",$_"; }; print OUT $row . "\n";
foreach $label (@LABELS) {
    my $row = $label; foreach $date (@DATES) { $row.=",$LABELSBYDATE_AIRTIME{$date}{$label}"; }; print OUT $row . "\n";
}
close(OUT);

And run it:

chmod 755 ./parseyololabels.pl
time ./parseyololabels.pl ./LABELS/

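It generates four CSV files: label-level totals by object count (TOTALS_BYLABEL.csv) and by airtime (TOTALS_BYAIRTIME.csv), plus a daily timeline of each (TIMELINE_BYLABEL.csv and TIMELINE_BYAIRTIME.csv).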
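
As a cross-check on the Perl parser, its core logic can be sketched in Python. The sketch reuses the same regular expression to pull the label name that follows the integer class column in each printed row, then keeps two tallies: one per detected object and one per image (the "airtime" count, where a frame with three people counts "person" once). The function names and the sample label text are hypothetical, and the sample assumes the single-line-per-row layout the Perl regex expects.

```python
import re
from collections import Counter

# Same pattern as the Perl script: digits, whitespace, then a lowercase label.
LABEL_RE = re.compile(r"\d+\s+([a-z].*)")


def labels_in(text):
    """Return one label per detected object in a printed detections table."""
    labels = [match.strip() for match in LABEL_RE.findall(text)]
    return [label for label in labels if "]" not in label]


def tally(files):
    """Count labels per object and per image across {filename: text} pairs."""
    by_object, by_image = Counter(), Counter()
    for text in files.values():
        labels = labels_in(text)
        by_object.update(labels)      # every detection counts
        by_image.update(set(labels))  # each label counts once per image
    return by_object, by_image


# Hypothetical label files: two frames with detections, one without.
header = "    xmin   ymin   xmax   ymax  confidence  class          name\n"
sample = {
    "frame1.jpg.labels": header
    + "0  10.0  20.0  300.0  700.0  0.91  0  person\n"
    + "1  400.0  20.0  600.0  700.0  0.88  0  person\n"
    + "2  120.0  300.0  150.0  420.0  0.65  27  tie\n",
    "frame2.jpg.labels": header
    + "0  50.0  60.0  500.0  710.0  0.93  0  person\n"
    + "1  900.0  400.0  1000.0  600.0  0.52  58  potted plant\n",
    "frame3.jpg.labels": "Empty DataFrame\n"
    "Columns: [xmin, ymin, xmax, ymax, confidence, class, name]\nIndex: []\n",
}

by_object, by_image = tally(sample)
print(by_object.most_common())
print(by_image.most_common())
```

Dividing a per-image count by the number of label files gives the share of airtime in which that object appears.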

You can see a treemap breakdown of the labels by number of appearances:

Looking at each label as a percentage of all labels per day in a stacked area graph, the dominance of "person" remains relatively consistent over time, though "tie" does fade on certain days and "book" seems to have high- and low-density days.
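
The daily percentages behind such a graph can be derived from TIMELINE_BYLABEL.csv. The sketch below is one way to do it, assuming the layout the Perl script writes: a header row of dates with an empty first cell, then one row per label, with empty cells where a label did not occur that day. The function name and the three-label sample are hypothetical.

```python
import csv
import io


def daily_shares(timeline_csv):
    """Convert per-day label counts into per-day percentages of all labels."""
    rows = [row for row in csv.reader(io.StringIO(timeline_csv)) if row]
    dates = rows[0][1:]
    # Empty cells mean the label was not seen that day, so treat them as zero.
    counts = {row[0]: [int(cell) if cell else 0 for cell in row[1:]] for row in rows[1:]}
    totals = [sum(column[i] for column in counts.values()) for i in range(len(dates))]
    return {
        label: {
            day: (100.0 * n / total if total else 0.0)
            for day, n, total in zip(dates, column, totals)
        }
        for label, column in counts.items()
    }


# Hypothetical two-day timeline in the same layout as TIMELINE_BYLABEL.csv.
sample = ",20221111,20221112\nbook,10,\nperson,80,90\ntie,10,10\n"
shares = daily_shares(sample)
for label, by_day in shares.items():
    print(label, by_day)
```

Each day's values sum to 100, which is what a stacked area chart of label share needs as input.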

We hope this gives you an idea of just how easy it is to run highly advanced computer vision analyses at scale over television news using the Visual Explorer!