Are there words that appear more often on one television news station than another? To explore this question in more detail, the Television News Ngrams dataset was processed to identify the 63,784 words that appeared more than 100 times in the past decade on at least one of CNN, MSNBC or Fox News, and to construct a table showing how often each was mentioned on each of the three stations. Such a dataset can be used to compare word affinity, that is, to identify the words that are more common on one station than on the others.
The 15 most common words across the three stations, ranked by combined mentions, can be seen below.
Word | CNN | MSNBC | FOXNEWS |
the | 31033314 | 31791602 | 30302778 |
to | 18558092 | 19421445 | 17442140 |
and | 15270559 | 15423652 | 14699850 |
a | 13982126 | 14152248 | 13495076 |
of | 13074610 | 13456480 | 11833186 |
that | 10200926 | 10749417 | 9457552 |
in | 10179932 | 10512601 | 9396239 |
is | 9146218 | 9153095 | 10052914 |
you | 9308473 | 9232848 | 9588556 |
i | 7284656 | 8108468 | 7274558 |
it | 6873654 | 6978404 | 7086300 |
this | 6376440 | 6277189 | 5616418 |
for | 5627181 | 5989439 | 5404281 |
on | 5017853 | 5343884 | 5131636 |
we | 4865441 | 4875525 | 5339444 |
Using this dataset, it is possible to see that the word "Schiff" (likely referring to US Representative Adam Schiff) appeared 9,203 times on CNN, 10,798 times on MSNBC and 14,823 times on Fox News. Similarly, "Ahmadinejad" (former Iranian president Mahmoud Ahmadinejad) was mentioned 3,126 times on CNN, 1,732 times on MSNBC and 4,293 times on Fox News. In contrast, CNN mentioned "Beijing" the most, with 12,121 mentions compared with 4,069 on MSNBC and 5,246 on Fox News.
CNN favors the word "reporter" by a wide margin, mentioning it 735,003 times compared with 296,973 mentions on MSNBC and 253,208 on Fox News. It also mentions "bulletin" the most, at 4,712 times compared with 643 mentions on MSNBC and 1,256 on Fox News. The words "viewers" (52,117 CNN mentions, 19,160 MSNBC mentions and 31,311 Fox News mentions) and "correspondents" (7,086 CNN mentions, 2,827 MSNBC mentions and 2,618 Fox News mentions) are also clear CNN favorites.
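One simple way to express the affinities described above is each station's share of a word's combined three-station mentions. The Python sketch below uses only the counts quoted in the text; the helper names `station_shares` and `top_station` are illustrative and not part of the dataset. It works on raw counts and does not adjust for how many words each station airs overall, so the shares are a rough guide rather than a normalized measure.

```python
# Counts quoted in the text above: (CNN, MSNBC, Fox News) mentions per word.
COUNTS = {
    "schiff": (9203, 10798, 14823),
    "ahmadinejad": (3126, 1732, 4293),
    "beijing": (12121, 4069, 5246),
    "reporter": (735003, 296973, 253208),
    "bulletin": (4712, 643, 1256),
    "viewers": (52117, 19160, 31311),
    "correspondents": (7086, 2827, 2618),
}
STATIONS = ("CNN", "MSNBC", "FOXNEWS")


def station_shares(counts):
    """Return each station's fraction of a word's three-station total."""
    total = sum(counts)
    return tuple(c / total for c in counts)


def top_station(counts):
    """Return the station with the most raw mentions of a word."""
    return STATIONS[max(range(len(counts)), key=lambda i: counts[i])]


if __name__ == "__main__":
    for word, counts in COUNTS.items():
        shares = ", ".join(
            f"{name} {share:.1%}"
            for name, share in zip(STATIONS, station_shares(counts))
        )
        print(f"{word:15s} top={top_station(counts):8s} {shares}")
```

A share well above one third suggests a station leans on a word more than its peers; "reporter", for instance, has more than half of its combined mentions on CNN.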
The final CSV dataset can be downloaded for further analysis:
TECHNICAL DETAILS
Constructing this table required just a single SQL query.
SELECT WORD, CNN, MSNBC, FOXNEWS, CNN+MSNBC+FOXNEWS AS TOT
FROM (
  SELECT WORD, SUM(CNN) AS CNN, SUM(MSNBC) AS MSNBC, SUM(FOXNEWS) AS FOXNEWS
  FROM (
    (SELECT WORD, COUNT AS CNN, 0 AS MSNBC, 0 AS FOXNEWS FROM `gdelt-bq.gdeltv2.iatv_1grams` WHERE STATION='CNN')
    UNION ALL
    (SELECT WORD, 0 AS CNN, COUNT AS MSNBC, 0 AS FOXNEWS FROM `gdelt-bq.gdeltv2.iatv_1grams` WHERE STATION='MSNBC')
    UNION ALL
    (SELECT WORD, 0 AS CNN, 0 AS MSNBC, COUNT AS FOXNEWS FROM `gdelt-bq.gdeltv2.iatv_1grams` WHERE STATION='FOXNEWS')
  )
  GROUP BY WORD
)
WHERE (CNN>100 OR MSNBC>100 OR FOXNEWS>100)
ORDER BY TOT DESC
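For readers who want to follow the query's logic outside BigQuery, the Python sketch below mirrors its three steps: sum counts per word and station, keep words with more than 100 mentions on at least one station, and sort by the combined total. It assumes input rows shaped as (station, word, count); the function name `pivot_counts` and the sample rows are made up for illustration and are not values from the dataset.

```python
from collections import defaultdict

STATIONS = ("CNN", "MSNBC", "FOXNEWS")


def pivot_counts(rows, threshold=100):
    """Pivot (station, word, count) rows into one row per word.

    Mirrors the query: sum counts per word and station, keep words whose
    count exceeds `threshold` on at least one station, and sort by the
    combined total in descending order.
    """
    totals = defaultdict(lambda: dict.fromkeys(STATIONS, 0))
    for station, word, count in rows:
        if station in STATIONS:  # other stations are ignored, as in the WHERE clauses
            totals[word][station] += count

    table = []
    for word, per_station in totals.items():
        if any(per_station[s] > threshold for s in STATIONS):
            counts = [per_station[s] for s in STATIONS]
            table.append((word, *counts, sum(counts)))
    return sorted(table, key=lambda row: row[-1], reverse=True)


# Hypothetical sample rows for illustration only.
SAMPLE_ROWS = [
    ("CNN", "alpha", 150),
    ("CNN", "alpha", 50),
    ("MSNBC", "alpha", 20),
    ("FOXNEWS", "beta", 101),
    ("CNN", "gamma", 100),
    ("MSNBC", "gamma", 100),
    ("BBC", "alpha", 999),
]

if __name__ == "__main__":
    for row in pivot_counts(SAMPLE_ROWS):
        print(row)
```

In the sample, "gamma" is dropped because it never exceeds 100 on any single station, even though its combined total is 200. This matches the query's filter, which tests each station's count separately rather than the total.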