Visual Explorer: New Channel Inventory JSON File For Programmatic Discovery

To make it easier to interact programmatically with the Visual Explorer and its new Visual Explorer Lenses metaphor, we have created a centralized JSON inventory file of all television news channels available in the Visual Explorer interface. The file contains the ID, name, location, start and stop dates and other relevant metadata for each channel, making it simple to identify relevant coverage in the Explorer.

For example, when performing at-scale analyses of Visual Explorer channels, such as visual analysis using the every-4-seconds image ZIP files, it is necessary to know the earliest available date for a given channel. This is now trivial using our two JSON endpoints:

The first returns a standard JSON object. The second wraps it in a callback, allowing seamless one-line integration into downstream applications.
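The callback form is meant to be loaded directly by a web page, but a script that only has the wrapped response can still recover the JSON. The following is a minimal Python sketch, assuming the wrapper has the usual `name(...)` shape. The callback name `handleChannels` and the sample payload are hypothetical placeholders, not the endpoint's actual output.

```python
import json
import re

def unwrap_jsonp(text: str) -> dict:
    """Parse a response that is either plain JSON or JSON wrapped as name(...);"""
    # Match an identifier-like callback name, then capture everything between
    # the first "(" and the last ")". DOTALL lets the payload span lines.
    match = re.fullmatch(r"\s*[\w$.]+\s*\((.*)\)\s*;?\s*", text, flags=re.DOTALL)
    if match is None:
        # No wrapper found: treat the text as plain JSON.
        return json.loads(text)
    return json.loads(match.group(1))

# Hypothetical wrapped response; the real callback name may differ.
wrapped = 'handleChannels({"channels": [{"id": "CNN", "label": "CNN"}]});'
plain = '{"channels": [{"id": "CNN", "label": "CNN"}]}'

print(unwrap_jsonp(wrapped)["channels"][0]["id"])
print(unwrap_jsonp(wrapped) == unwrap_jsonp(plain))
```

Both forms yield the same object, so downstream code does not need to care which endpoint it was given.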

The JSON object currently contains a single field, "channels", which is an array of objects, each documenting a single channel. For example, here is the entry for CNN:

{ "id": "CNN", "label": "CNN", "location": "United States", "startDate": 20090702, "endDate": 99999999, "hasSearch": 1, "hasAISearch": 1 },

Only some channels carry the "hasSearch" and "hasAISearch" fields, which mark channels that are available in the TV Explorer and TV AI Explorer, respectively.
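Because these two fields are absent from most entries, code that reads them should not assume the key exists. Here is a small Python sketch of one way to handle that. The CNN entry is the one shown above, while `EXAMPLE_TV` is a made-up entry used only to illustrate a channel without the flags.

```python
import json

# CNN is copied from the example entry; EXAMPLE_TV is hypothetical.
inventory = json.loads("""
{"channels": [
  {"id": "CNN", "label": "CNN", "location": "United States",
   "startDate": 20090702, "endDate": 99999999, "hasSearch": 1, "hasAISearch": 1},
  {"id": "EXAMPLE_TV", "label": "Example TV", "location": "Nowhere",
   "startDate": 20200101, "endDate": 99999999}
]}
""")

def has_flag(channel: dict, field: str) -> bool:
    """Return True only when the optional flag is present and set to 1."""
    # dict.get supplies 0 when the field is missing, so no KeyError is raised.
    return channel.get(field, 0) == 1

searchable = [c["id"] for c in inventory["channels"] if has_flag(c, "hasSearch")]
print(searchable)
```

Treating a missing field as 0 is an assumption of this sketch, chosen because the fields appear only on channels that support search.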

For most channels, you can read the key attributes simply by filtering for the object with that channel's ID. For example, this shell snippet uses "jq" to extract the start date of Russia 24 in YYYYMMDD format:

apt-get -y install jq
curl -s | jq -r '.channels[] | select(.id=="RUSSIA24") | .startDate'
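The same lookup can be done from Python when the result needs to be a real date rather than an integer. This is a sketch that uses only the CNN entry shown earlier as stand-in data; the helper names `parse_yyyymmdd` and `earliest_start` are illustrative, and a real script would load the full inventory file instead.

```python
from datetime import date, datetime
from typing import Optional

# Stand-in inventory containing only the CNN entry from the example above.
inventory = {"channels": [
    {"id": "CNN", "label": "CNN", "location": "United States",
     "startDate": 20090702, "endDate": 99999999, "hasSearch": 1, "hasAISearch": 1},
]}

def parse_yyyymmdd(value: int) -> date:
    """Convert an integer such as 20090702 into a datetime.date."""
    return datetime.strptime(str(value), "%Y%m%d").date()

def earliest_start(inventory: dict, channel_id: str) -> Optional[date]:
    """Return the earliest startDate across all entries with this ID, or None."""
    starts = [c["startDate"] for c in inventory["channels"] if c["id"] == channel_id]
    return parse_yyyymmdd(min(starts)) if starts else None

print(earliest_start(inventory, "CNN"))      # 2009-07-02
print(earliest_start(inventory, "MISSING"))  # None
```

Note that an endDate of 99999999 is not a calendar date, so `parse_yyyymmdd` should not be applied to it. Reading that value as "still being recorded" is an assumption on our part here, and comparing the raw integers sidesteps the question.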

Due to technical issues, the Internet Archive's TV News Archive contains a number of anomalies, most notably with the IRINN channel identifier. From 8/2/2011 to 10/19/2015, this identifier recorded IRINN. From 10/27/2022 to 1/28/2023 it was inadvertently reused to archive Iran International. From 1/29/2023 onwards it records IRINN again. Other channels have similar anomalies, especially satellite channels in earlier years, when channel numbers were reassigned to other channels. In the majority of these cases, we have set each channel's start and end dates to exclude the erroneous periods, but as we continue to refine the channel inventory, we may expand the entries for a channel identifier to document the different channels recorded under it.

This means that robust applications must consider the startDate and endDate fields of each entry rather than matching on ID alone. A production application should start from the channel(s) and date range of the broadcasts it is interested in and then iterate over all of the objects in the channels array to select the entries that cover those channels during those dates.

For example, to determine which channel a broadcast whose identifier begins with "IRINN_" represents, you can filter on the broadcast's date using the startDate and endDate fields:

apt-get -y install jq
curl -s | jq -r '.channels[] | select(.id=="IRINN" and 20140105 >= .startDate and 20140105 <= .endDate) | .label'
>IRINN (2011-2015 Archive)
curl -s | jq -r '.channels[] | select(.id=="IRINN" and 20221105 >= .startDate and 20221105 <= .endDate) | .label'
>Iran International
curl -s | jq -r '.channels[] | select(.id=="IRINN" and 20230205 >= .startDate and 20230205 <= .endDate) | .label'
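For applications that need more than a single-day lookup, the iteration described above can be sketched in Python as an overlap test between each entry's date range and the range of interest. The three IRINN entries below are assumptions reconstructed from the dates given in the text: the first two labels come from the jq output above, while the third label and the exact boundary values are guesses and may not match the live inventory. The function name `select_entries` is illustrative.

```python
from typing import Iterable, List

# Assumed entries based on the dates described in the text; the real
# inventory values and the third label may differ.
inventory = {"channels": [
    {"id": "IRINN", "label": "IRINN (2011-2015 Archive)",
     "startDate": 20110802, "endDate": 20151019},
    {"id": "IRINN", "label": "Iran International",
     "startDate": 20221027, "endDate": 20230128},
    {"id": "IRINN", "label": "IRINN",
     "startDate": 20230129, "endDate": 99999999},
]}

def select_entries(inventory: dict, channel_ids: Iterable[str],
                   first_day: int, last_day: int) -> List[dict]:
    """Return every entry for the given IDs whose range overlaps the query range."""
    wanted = set(channel_ids)
    return [
        c for c in inventory["channels"]
        # YYYYMMDD integers sort chronologically, so plain comparison works.
        if c["id"] in wanted
        and c["startDate"] <= last_day
        and c["endDate"] >= first_day
    ]

# Single-day lookups, mirroring the jq filters.
print([c["label"] for c in select_entries(inventory, ["IRINN"], 20140105, 20140105)])
print([c["label"] for c in select_entries(inventory, ["IRINN"], 20221105, 20221105)])
# A range that straddles the 1/28/2023 boundary returns both entries.
print([c["label"] for c in select_entries(inventory, ["IRINN"], 20230101, 20230301)])
```

A date that falls in a gap between entries returns an empty list, which a caller can treat as "no trusted coverage for this identifier on that date."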

We are excited about the new kinds of downstream applications this JSON inventory file makes possible and hope it makes it much easier to engage programmatically with the Visual Explorer!