A BBC News article earlier this year noted the power of the word "the" in the English language. This got us wondering: how much has usage of the word changed on television news over the past ten years?
The timeline below shows, by month, the percentage of all words uttered on BBC News (January 2017 to present) and on CNN, MSNBC and Fox News (July 2009 to present) that were the word "the", based on their closed captioning from the Internet Archive's Television News Archive via the Television News NGram Dataset. The timelines are smoothed using a six-month rolling average. We reused the query code from our gendered pronoun analysis earlier this year, so the analysis required just a single BigQuery query.
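To make the arithmetic behind the timeline concrete, here is a minimal Python sketch of the two steps described above: computing the monthly share of words that are "the", then smoothing with a six-month rolling average. It is not the BigQuery query used for the analysis. The function names, the input layout, the sample counts, and the choice of a trailing window that leaves the first five months empty are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple


def monthly_share(word_counts: Dict[str, Tuple[int, int]]) -> List[Tuple[str, float]]:
    """Return (month, percent) pairs in chronological order.

    word_counts maps a "YYYY-MM" string to a pair:
    (occurrences of "the", total words captioned that month).
    This input layout is hypothetical, not the dataset's actual schema.
    """
    return [
        (month, 100.0 * the_count / total)
        for month, (the_count, total) in sorted(word_counts.items())
    ]


def rolling_mean(values: List[float], window: int = 6) -> List[Optional[float]]:
    """Trailing rolling average; None until a full window is available."""
    smoothed: List[Optional[float]] = []
    for i in range(len(values)):
        if i + 1 < window:
            smoothed.append(None)  # not enough months yet
        else:
            smoothed.append(sum(values[i + 1 - window : i + 1]) / window)
    return smoothed


if __name__ == "__main__":
    # Made-up counts for a single hypothetical channel, for illustration only.
    sample = {
        "2020-01": (41_000, 1_000_000),
        "2020-02": (43_000, 1_000_000),
        "2020-03": (42_000, 1_000_000),
        "2020-04": (44_000, 1_000_000),
        "2020-05": (45_000, 1_000_000),
        "2020-06": (43_000, 1_000_000),
        "2020-07": (46_000, 1_000_000),
    }
    shares = monthly_share(sample)
    smoothed = rolling_mean([pct for _, pct in shares])
    for (month, pct), avg in zip(shares, smoothed):
        print(month, round(pct, 2), None if avg is None else round(avg, 2))
```

Running the same two steps once per channel would yield one smoothed series per channel, which is the shape of data a timeline like this one plots.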
Overall the trendlines are very similar, so the graph above is zoomed in to a range of just one and a half percentage points. Despite this narrow window of variability, the total volume of words covered in this analysis makes these differences meaningful.
Two fascinating trends are apparent.
The most obvious is that across its three and a half years of data, BBC News has consistently used the word "the" more than its American brethren, by more than three quarters of a percentage point.
The second is that from 2009 through late 2016, CNN's usage of the word consistently diverged from that of its peers, at first lower and later higher. Since the end of 2016 and Donald Trump's election, all three American channels have used it at fairly similar rates.
What's driving these trends? It's unclear, but the graph above reminds us just how much we can learn about the evolution of language from a simple, humble word like "the".