More Experiments Visually Describing TV News Using Google's Imagen Generative AI Image Captioning

Kalev Leetaru

3 years ago

Earlier this month we examined Imagen on Vertex AI, Google's new generative AI foundation model for images, applying it to two complete television news broadcasts. Today we'll expand those experiments to three new broadcasts to test its international capabilities: one each from the Republic of Congo's Tele Congo, South Sudan's SSBC and Taiwan's CTV.

We've made a few updates to the scripts used last time, including automatic adjustment for the resolution of the original broadcast.

Republic of Congo's Tele Congo (TELECONGO_20230825_143000)
- Original Broadcast In Visual Explorer.
- Captioned Display.
South Sudan's SSBC (SOUTHERNSUDAN_20230825_120000)
- Original Broadcast In Visual Explorer.
- Captioned Display.
Taiwan's CTV (CTV_20230829_090000)
- Original Broadcast In Visual Explorer.
- Captioned Display.

Creating those captioned displays was as simple as:

#install required utilities...
apt-get install -y parallel

#download the thumbnails...
wget https://storage.googleapis.com/data.gdeltproject.org/gdeltv3/iatv/visualexplorer/CTV_20230829_090000.zip
wget https://storage.googleapis.com/data.gdeltproject.org/gdeltv3/iatv/visualexplorer/SOUTHERNSUDAN_20230825_120000.zip
wget https://storage.googleapis.com/data.gdeltproject.org/gdeltv3/iatv/visualexplorer/TELECONGO_20230825_143000.zip

#unpack the ZIP files...
find *.zip | parallel --eta 'unzip {}'
rm *.zip

#download our API wrapper (NOTE: change the [YOURPROJECTID] in the script to your GCP Project ID):
wget https://storage.googleapis.com/data.gdeltproject.org/blog/2022-tv-news-visual-explorer/exp_captionimagevertexaiimagetext.pl
chmod 755 exp_captionimagevertexaiimagetext.pl

#now run the API over all of the images
#NOTE: Imagen quotas have increased dramatically this month and you can now safely run many more images in parallel...
time find ./CTV_20230829_090000/ -maxdepth 1 -name "*.jpg" | parallel --eta -j 40 './exp_captionimagevertexaiimagetext.pl {}'
time find ./SOUTHERNSUDAN_20230825_120000/ -maxdepth 1 -name "*.jpg" | parallel --eta -j 40 './exp_captionimagevertexaiimagetext.pl {}'
time find ./TELECONGO_20230825_143000/ -maxdepth 1 -name "*.jpg" | parallel --eta -j 40 './exp_captionimagevertexaiimagetext.pl {}'

#compile the captioning output into the thumbnail displays
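#NOTE: -O sets the local filename, since wget would otherwise keep the "?1" suffix in the saved name and the chmod below would fail
wget -O exp_captionimagevertexaiimagetext_compileresults.pl 'https://storage.googleapis.com/data.gdeltproject.org/blog/2022-tv-news-visual-explorer/exp_captionimagevertexaiimagetext_compileresults.pl?1'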
chmod 755 exp_captionimagevertexaiimagetext_compileresults.pl
time ./exp_captionimagevertexaiimagetext_compileresults.pl CTV_20230829_090000
time ./exp_captionimagevertexaiimagetext_compileresults.pl SOUTHERNSUDAN_20230825_120000
time ./exp_captionimagevertexaiimagetext_compileresults.pl TELECONGO_20230825_143000
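
Since the per-broadcast steps above are identical apart from the broadcast ID, they could be folded into a single shell function. The following is only a minimal sketch: the function names thumbnail_url and process_broadcast are hypothetical and not part of our scripts, and it assumes both Perl scripts have already been downloaded into the current directory and that each ZIP unpacks into a directory named after its broadcast ID, as the find commands above imply.

```shell
#base URL of the thumbnail ZIPs, copied from the wget commands above
BASE_URL="https://storage.googleapis.com/data.gdeltproject.org/gdeltv3/iatv/visualexplorer"

#hypothetical helper: build the thumbnail ZIP URL for a broadcast ID
thumbnail_url() {
  printf '%s/%s.zip\n' "$BASE_URL" "$1"
}

#hypothetical helper: download, caption and compile one broadcast
#(assumes the two Perl scripts are already in the current directory)
process_broadcast() {
  #download and unpack the thumbnails...
  wget -q "$(thumbnail_url "$1")" || return 1
  unzip -q "$1.zip" || return 1
  rm "$1.zip"

  #run the API over all of the images...
  find "./$1/" -maxdepth 1 -name "*.jpg" | parallel --eta -j 40 './exp_captionimagevertexaiimagetext.pl {}'

  #compile the captioning output into the thumbnail display...
  ./exp_captionimagevertexaiimagetext_compileresults.pl "$1"
}
```

With those definitions loaded, the three broadcasts could be processed with a loop such as: for ID in CTV_20230829_090000 SOUTHERNSUDAN_20230825_120000 TELECONGO_20230825_143000; do process_broadcast "$ID"; done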