At-Scale OCR Of Television News Experiments: How OCR And Captioning Tell Different Stories About PBS In One Broadcast

Yesterday we offered the first statistics on just how much onscreen text there can be in a single hour-long American television news broadcast. We also demonstrated how many times more onscreen words than spoken words there can be (5-13x in the two sample broadcasts we examined). Word counts alone, however, don't tell us whether OCR tells a different story than the spoken text and thus offers an entirely new lens through which to understand and assess the topical focus of television news.

If we look at a CNN broadcast from earlier this week, it includes a lengthy exchange about potential funding cuts to PBS and NPR. In the human-generated closed captioning transcript of the broadcast, the words "PBS" and "NPR" each appear just 5 times in the entire hour-long broadcast, out of 9,867 total spoken words (0.05% each). Taken in this context, the PBS funding story appears to be a minor blip in the broadcast's coverage. With only five mentions of each name, it is also difficult to know where the exchange begins and ends, since the majority of the conversation about the threatened funding cuts doesn't actually mention the two organizations by name.

In contrast, OCR picks up the chyron "$1.1B IN FUNDING FOR NPR, PBS AT RISK AS WH TARGETS PUBLIC MEDIA", finding that "PBS" is mentioned in the onscreen text of 129 seconds of airtime, meaning it accounts for just over two minutes of the broadcast (3.53%).
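To make the two percentages concrete, here is a minimal sketch of how a spoken-word share and an onscreen-airtime share could be computed. It is an illustration only, not our production pipeline: the function names, the per-second OCR dictionary format, and the toy inputs are all assumptions made for this example.

```python
import re

def spoken_share(transcript: str, term: str) -> tuple[int, int, float]:
    """Count whole-word mentions of `term` in a captioning transcript.

    Returns (mentions, total_words, percent_of_words).
    """
    words = re.findall(r"[A-Za-z0-9$']+", transcript)
    hits = sum(1 for w in words if w.upper() == term.upper())
    pct = 100.0 * hits / len(words) if words else 0.0
    return hits, len(words), pct

def onscreen_share(ocr_by_second: dict[int, str], term: str,
                   duration_seconds: int) -> tuple[list[int], float]:
    """Find the seconds of airtime whose OCR text contains `term`.

    `ocr_by_second` is a hypothetical mapping from second offset to the
    text recognized in that second's frame. Returns the matching seconds
    and the percent of the broadcast they cover.
    """
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    seconds = sorted(s for s, text in ocr_by_second.items()
                     if pattern.search(text))
    return seconds, 100.0 * len(seconds) / duration_seconds

# Toy inputs (not taken from the actual broadcast).
toy_transcript = "PBS funding is at risk and NPR too"
toy_ocr = {
    0: "BREAKING NEWS",
    1: "NPR, PBS AT RISK",
    2: "NPR, PBS AT RISK",
    3: "WEATHER",
}

print(spoken_share(toy_transcript, "PBS"))
print(onscreen_share(toy_ocr, "PBS", duration_seconds=4))

# The captioning figure quoted above: 5 mentions out of 9,867 words.
print(round(100 * 5 / 9867, 2))
```

The key design difference is the unit of measurement: the captioning share counts words, while the onscreen share counts seconds of airtime during which the term is visible, which is why a chyron that stays on screen can dominate even when the name is rarely spoken.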

In this case, spoken and onscreen mentions of PBS tell two very different stories, from a brief blip to a significant story, showcasing the potential for OCR-based story assessment.

Moreover, by computing the start and end times of mentions of "PBS" in the onscreen text and determining that they all fall into one cluster (rather than being interspersed throughout the broadcast), we can estimate that the story segment begins around 42:56 and ends around 45:03, allowing us to group it into a single clip fairly accurately. Of course, in reality chyrons don't always refer to the story currently being discussed and, even when they do, they don't always bound the story so neatly at its start and end points. Still, this example demonstrates the immense power of OCR-powered analysis of onscreen text for scholarly assessment of existential media topics like agenda setting and story segmentation.
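The clustering step described above can be sketched as a simple gap-based grouping: consecutive seconds in which the term appears onscreen are merged into one segment unless the gap between them exceeds a tolerance. This is a hedged illustration rather than the exact method we used; the function names, the `max_gap` tolerance, and the toy timestamps are assumptions.

```python
def cluster_seconds(seconds: list[int], max_gap: int = 10) -> list[tuple[int, int]]:
    """Group second offsets into (start, end) segments.

    Two matches belong to the same segment when they are at most
    `max_gap` seconds apart, which tolerates brief OCR dropouts or
    momentary chyron changes.
    """
    segments: list[list[int]] = []
    for s in sorted(set(seconds)):
        if segments and s - segments[-1][1] <= max_gap:
            segments[-1][1] = s          # extend the current segment
        else:
            segments.append([s, s])      # start a new segment
    return [(start, end) for start, end in segments]

def mmss(total_seconds: int) -> str:
    """Format a second offset as MM:SS."""
    minutes, secs = divmod(total_seconds, 60)
    return f"{minutes:02d}:{secs:02d}"

# Toy offsets (not the real broadcast data): one run with a short
# dropout, then an isolated later pair of matches.
toy_matches = [10, 11, 12, 15, 16, 300, 301]
for start, end in cluster_seconds(toy_matches, max_gap=5):
    print(mmss(start), "-", mmss(end))
```

If all matches for a term collapse into a single segment, as they did for "PBS" here, the segment's start and end offer a first estimate of the story's boundaries. Multiple widely separated segments would instead suggest a recurring topic or a chyron unrelated to the story being discussed.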