Television Explorer: New Inventory Tables

With the forthcoming release of the Television Explorer 2.0 platform, we are excited to announce our new Television Explorer Inventory Tables, which detail the precise news programming from each station that the Internet Archive's Television Archive monitored on a given day (and that is thus searchable via the Television Explorer).

The Internet Archive monitors only news programming (purely entertainment shows are excluded), and we currently search only those shows that provide closed captioning streams. Inevitably, with an archive this massive, spanning back to 2009, there will be brief outages or other disruptions, and some stations may be monitored for only a specific period of time (such as when local stations in key markets are added during specific months of a national election cycle).

Most users will not need these inventory tables, but advanced users have requested the ability to see precisely which shows are being monitored at any given moment and to understand the complete profile of the searchable airtime monitored by the Internet Archive on a given day.

You can now access this complete inventory, from the very first day of the Archive, June 16, 2009, through the present. The inventory is updated every 15 minutes. Typically, once a broadcast finishes, it takes up to 24 hours for the Archive to make it available for searching, after which it becomes searchable in the Television Explorer within 15 minutes. Each day's inventory is stored in a separate CSV file, allowing you to request the inventory of just the specific days of interest. Note that on rare occasions a particularly long show or a transient technical issue may delay the processing of a show by up to 5 days, so the last 5 days of inventory files are updated every 15 minutes as needed.
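If you mirror these files locally, the refresh window described above suggests re-downloading only the most recent days. The following is a minimal sketch of that idea, not part of the Television Explorer itself. It assumes that "the last 5 days" means today plus the four preceding calendar days; the function names are hypothetical.

```python
from datetime import date, timedelta

# Assumption: "the last 5 days" covers today and the four days before it.
REFRESH_WINDOW_DAYS = 5


def may_still_change(day, today):
    """Return True if the inventory file for `day` is inside the refresh window."""
    age = (today - day).days
    return 0 <= age < REFRESH_WINDOW_DAYS


def dates_to_refresh(today):
    """List the days whose inventory files may still be updated, newest first."""
    return [today - timedelta(days=n) for n in range(REFRESH_WINDOW_DAYS)]
```

Files for days outside this window can be treated as stable and cached; files inside it are worth fetching again on each run.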

You can download all CSV files via the following URL (replace YYYYMMDD with the date of interest):

  • http://data.gdeltproject.org/gdeltv3/iatv/inventory/YYYYMMDD.inventory.csv

For example, to request the inventory of shows monitored on the very first official day of the Archive's existence, use the URL http://data.gdeltproject.org/gdeltv3/iatv/inventory/20090616.inventory.csv, or to request the inventory for February 2, 2018, use the URL http://data.gdeltproject.org/gdeltv3/iatv/inventory/20180202.inventory.csv.
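The URL pattern above can be built programmatically from a date. Here is a short sketch using only the Python standard library. The helper names `inventory_url` and `fetch_inventory` are hypothetical, and decoding the response as UTF-8 is an assumption, since the file encoding is not stated here.

```python
from datetime import date
import urllib.request

BASE_URL = "http://data.gdeltproject.org/gdeltv3/iatv/inventory/"


def inventory_url(day):
    """Build the inventory URL for one day by filling in YYYYMMDD."""
    return f"{BASE_URL}{day:%Y%m%d}.inventory.csv"


def fetch_inventory(day, timeout=30.0):
    """Download one day's inventory and return it as text.

    Assumption: the file is UTF-8 encoded.
    """
    with urllib.request.urlopen(inventory_url(day), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

For instance, `inventory_url(date(2018, 2, 2))` yields the second example URL given above.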

Each row of the CSV file reflects a distinct broadcast indexed on that day in the Television Explorer. NOTE that late evening broadcasts that span into the following morning are included in both days' inventory files. This is why you may see broadcasts of just a few seconds listed on a given day: only a few seconds of the show spanned into the following morning. Each row contains the following fields:

  • ShowID. The unique identifier assigned to the broadcast by the Internet Archive.
  • URL. The URL to view the broadcast through the Internet Archive's website.
  • StationID. The unique identifier of the station from which the broadcast was monitored.
  • Show. The official name of the show (note that some shows may include the date and/or broadcast time as part of the show name).
  • NumberClips. The number of 15 second clips the broadcast is divided into. Note that the number of distinct clips may be larger than the total duration of the broadcast suggests. For example, a broadcast from yesterday that spanned 20 seconds into this morning will be recorded as having a 20 second duration, but may be listed as broken into 3 distinct 15 second clips. This means that today's portion touched three clips: one that began yesterday and ended in the first second or two of today's portion, one full 15 second clip, and a third that starts in the final few seconds of the broadcast and is truncated. The final clip of a given broadcast will be truncated if there are fewer than 15 seconds remaining.
  • DurationSec. The total duration of the broadcast today in seconds. As noted above, a one hour late night broadcast yesterday that spans 20 seconds into this morning will be listed with a duration of 59 minutes and 40 seconds yesterday and listed again today with a duration of 20 seconds.
  • StartTime. The precise UTC time when the broadcast started today.
  • EndTime. The precise UTC time when the broadcast ended today.
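As an illustration of working with these fields, the sketch below parses an inventory file and totals the monitored airtime per station. It rests on several assumptions: that the columns appear in the order listed above, that a header line (if any) begins with `ShowID`, and that `NumberClips` and `DurationSec` are whole numbers. The sample rows, station names, and timestamp format are invented for demonstration and are not real inventory data.

```python
import csv
import io
from collections import defaultdict

# Assumption: columns appear in the order in which the fields are listed above.
FIELDS = ["ShowID", "URL", "StationID", "Show",
          "NumberClips", "DurationSec", "StartTime", "EndTime"]


def parse_inventory(text):
    """Parse inventory CSV text into a list of dicts, one per broadcast."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record or record[0] == FIELDS[0]:
            continue  # skip blank lines and a header line, if one is present
        row = dict(zip(FIELDS, record))
        row["NumberClips"] = int(row["NumberClips"])
        row["DurationSec"] = int(row["DurationSec"])
        rows.append(row)
    return rows


def seconds_by_station(rows):
    """Sum DurationSec for each StationID."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["StationID"]] += row["DurationSec"]
    return dict(totals)


# Hypothetical sample data; identifiers, URLs, and timestamps are made up.
SAMPLE = """ShowID,URL,StationID,Show,NumberClips,DurationSec,StartTime,EndTime
EXAMPLE_SHOW_1,https://example.org/1,STATION_A,Example Evening News,3,20,2018-02-02 00:00:00,2018-02-02 00:00:20
EXAMPLE_SHOW_2,https://example.org/2,STATION_A,Example Morning News,240,3600,2018-02-02 12:00:00,2018-02-02 13:00:00
EXAMPLE_SHOW_3,https://example.org/3,STATION_B,Example Midday Report,120,1800,2018-02-02 17:00:00,2018-02-02 17:30:00
"""

if __name__ == "__main__":
    for station, seconds in sorted(seconds_by_station(parse_inventory(SAMPLE)).items()):
        print(station, seconds)
```

Because broadcasts that cross midnight appear in both days' files with their duration split between them, summing `DurationSec` across several days should not double count airtime, although the same `ShowID` can appear twice.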