A Daily Timeline Of Key Vaccine Topics In 2021 Through A TF-IDF BigQuery Analysis Of The Global Relationship Graph

What are the most significant words and phrases associated with vaccines, day by day, so far this year? To explore this question, we scanned the Global Relationship Graph's (GRG) Realtime Verb-Centered NGram Pilot for all English-language online news records since the start of the year containing a word matching "vaccin*", yielding a total of 15.3M entries. We then aggregated the statements by day and performed a simple TF-IDF analysis entirely in BigQuery to produce a daily chronology of the most significant words and phrases found within 10 words of "vaccin*", tracing the macro-level evolution of vaccine coverage over 2021 to date.

We generated four versions to showcase different ways of filtering the data. The first two rank single words, while the second two rank two-word phrases. For each, we tested two cutoffs, requiring a matching word or phrase to appear either more than 20 times or more than 250 times on a given day, to show how a simple filter can be used to trade off relevance and reach. Each is available as a CSV file.

The results for two-word phrases with the >250 cutoff appear below. Reporting on efficacy, vaccination goals, and major milestones and issues is all captured, from the Jan 4th approval of the AstraZeneca vaccine to the plan to vaccinate the 40 Guantanamo Bay prisoners, to the vaccination of the UK's Queen and the Pope, to warnings of blood clots and myriad other stories.

day station wordlist
1/1/2021 WEB vaccination numbers, from refrigeration, of doses, vaccinate millions, thursday the, safety and, vaccinations would, vials of, pfizer/biontech vaccine, 1 million
1/2/2021 WEB dry run, delivery system, system with, tested its, health minister, 12 weeks, another vaccine, take place, rather than, oxford university
1/3/2021 WEB university and, oxford university, both vaccines, by oxford, bharat biotech, 530,000 doses, two coronavirus, two covid-19, — india, astrazeneca/oxford vaccine
1/4/2021 WEB vaccine approval, oxford vaccine, university and, the oxford, oxford university, both vaccines, britain 's, a million, administered at, by oxford
1/5/2021 WEB people by, people 's, on dec., false claims, u.s. government, historic vaccination, the historic, being rolled, more quickly, effort that
1/6/2021 WEB 27-nation bloc, second vaccine, a second, bloc a, on dec., 1.3 million, could soon, hospitals that, moderna coronavirus, a freezer
1/7/2021 WEB a second, nursing homes, out to, vaccine administration, 1.3 million, the oxford/astrazeneca, round of, once the, people and, first responders
1/8/2021 WEB pfizer study, that pfizer, 's ability, pfizer 's, once the, sure that, research suggests, new research, 1.5 million, from 20
1/9/2021 WEB queen and, vaccinations that, rollout even, the queen, that gov., vaccines so, release of, speed release, be free, so we
1/10/2021 WEB rollout will, reporting zero, the vatican, has given, of two, both vaccines, given the, on saturday, vaccination centres, be a
1/11/2021 WEB delivery plan, vaccines delivery, large-scale vaccination, seven mass, two million, government 's, in england, pfizer/biontech and, 15 million, nadhim zahawi
1/12/2021 WEB covishield vaccine, consignment of, first consignment, serum institute, of covishield, the covishield, list of, 75 and, institute of, ebola vaccine
1/13/2021 WEB willing to, mass covid-19, it for, by china, people 65, shot of, states to, been distributed, distributed , use and
1/14/2021 WEB more coronavirus, and local, should get, states and, 's sinovac, vaccine during, right now, california counties, highly effective, effective vaccines
1/15/2021 WEB rollout and, up mass, trump administration, the trump, financial help, and local, help to, two highly, centers and, 1.9 trillion
1/16/2021 WEB 's largest, narendra modi, world 's, minister narendra, week that, to educators, vaccines the, on saturday, biggest vaccination, reserve of
1/17/2021 WEB or that, were administered, 's largest, whether to, on saturday, clinical trials, million vaccinations, begin vaccinating, world 's, residents and
1/18/2021 WEB enough supply, vaccinate more, healthier adults, fraught with, offered to, the priority, supply and, 80 and, adults in, the effort
1/19/2021 WEB bharat biotech, reactions to, state is, allergic reactions, lot of, for more, mass vaccinations, shots in, been distributed, and provide
1/20/2021 WEB to anxious, anxious americans, run out, americans and, and pass, the distribution, of moderna, to run, neighboring countries, to its
1/21/2021 WEB up vaccination, available through, in vaccination, and testing, local pharmacies, administration 's, through local, strategy to, national covid-19, covid-19 strategy
1/22/2021 WEB provide the, available through, up vaccination, to meet, n't enough, to vaccinations, and there, government to, vaccination must, vaccination drives
1/23/2021 WEB six weeks, teachers would, to six, of either, vaccinating teachers, their own, vaccines on, delivery of, because it, 100 million
1/24/2021 WEB program that, britain is, an effort, to roll, vaccine but, vaccines may, vaccines on, effort to, 100 million, vaccination centre
1/25/2021 WEB vaccine rather, potential vaccines, booster vaccine, australia 's, on two, doubts about, hopes that, 's medical, of pfizer-biontech, medical regulator
1/26/2021 WEB availability of, how much, funding for, nadhim zahawi, 1.5 million, vaccinations as, vaccines minister, a passive, passive vaccine, widely available
1/27/2021 WEB his vaccine, how much, vaccinate 300, vaccine wo, quantities of, 300 million, much vaccine, limited supply, over vaccine, enough doses
1/28/2021 WEB vaccine maker, deliveries of, export controls, canada 's, vaccination event, sure that, own vaccine, the nhs, go to, vaccines should
1/29/2021 WEB novavax vaccine, vaccine appears, the novavax, 89 %, % effective, single-shot vaccine, 66 %, that its, was 66, said its
1/30/2021 WEB northern ireland, vaccine station, exports of, controls on, the export, a hospital, export of, the bloc, 66 %, funds for
1/31/2021 WEB the 40, shut down, 40 prisoners, largest vaccination, temporarily shut, some covid-19, at fenway, down saturday, the entrance, rest of
2/1/2021 WEB care home, been offered, vaccine maker, maker astrazeneca, care homes, doses from, that while, if vaccines, needed to, york city
2/2/2021 WEB the russian, the sputnik, distribution and, appointments to, russian vaccine, v vaccine, 's sputnik, sites will, nation 's, russia 's
2/3/2021 WEB oxford vaccine, transmission of, the russian, the oxford, not only, russian vaccine, vaccine roll-out, confidence that, new zealand, concerns that
2/4/2021 WEB vaccines when, that these, share of, show that, vaccines can, different coronavirus, of teachers, this will, nadhim zahawi, we get
2/5/2021 WEB sites are, that as, 7,500 vaccinated, their turn, give the, hopeful that, the u.k., vaccination at, as vaccinations, vaccines so
2/6/2021 WEB 7,500 vaccinated, vaccinated health, the offer, the nfl, workers to, offer on, fewer vaccines, the program, conjunction with, in conjunction
2/7/2021 WEB severe disease, against severe, efficacy against, on saturday, protection against, protect against, vaccinated health, vaccine against, on sunday, we do
2/8/2021 WEB severe disease, south africa, against severe, 7,500 vaccinated, south african, protect against, efficacy in, the oxford/astrazeneca, after a, that even
2/9/2021 WEB south africa, clear that, in oregon, about vaccines, up and, questions about, have to, severe disease, vaccine may, effective against
2/10/2021 WEB south africa, or probably, definitely or, essential workers, not get, begin administering, on whether, use the, in south, community health
2/11/2021 WEB well as, countries that, distribution plan, vaccinated persons, rate of, to protect, university of, the more, plan and, vaccines —
2/12/2021 WEB lack of, los angeles, five mass, of supply, is to, approval for, in england, the top, provisional approval, many other
2/13/2021 WEB vaccination teams, but not, children as, in children, benefit from, shortages and, of teachers, by vaccine, possible vaccinated, many oregonians
2/14/2021 WEB been offered, on sunday, offered a, its first, first covid-19, a first, did not, how the, after receiving, vaccine if
2/15/2021 WEB 15 million, programme is, work as, ebola vaccine, doses for, the who, how successfully, key data, its first, successfully vaccines
2/16/2021 WEB vaccination events, grocery workers, green light, to getting, vaccine shipments, the in-store, in-store pharmacy, pharmacy without, sites will, appointment and
2/17/2021 WEB shipments and, and deliveries, vaccine shipments, delays in, imported vaccines, 's administration, from coronavirus, to allow, president joe, sent to
2/18/2021 WEB vaccine shipments, north carolina, vaccine deliveries, hong kong, the variant, pope francis, a global, no vaccine, winter weather, information on
2/19/2021 WEB developing countries, to developing, supplies to, world health, week that, health organization, administration to, surplus vaccines, deliveries of, vaccines once
2/20/2021 WEB health ministry, the minister, and gonzález, gonzález garcía, ministry where, equitable access, appointments will, preferentially after, vaccination preferentially, to what
2/21/2021 WEB target to, several other, adults by, israel 's, britain is, to allow, vaccinated will, vaccinate all, offered a, and vaccinations
2/22/2021 WEB success of, in scotland, winter weather, some vaccination, and icy, icy weather, impact of, depend on, the success, vaccine reduced
2/23/2021 WEB that were, on vaccinations, of having, vaccinations were, tweaked vaccine, and then, the weekend, 32,540 vaccinations, recipes if, if regulators
2/24/2021 WEB middle-income countries, through covax, overall the, receive vaccines, 's single-dose, that overall, first vaccines, shots of, vaccines through, as part
2/25/2021 WEB version of, third vaccine, vaccine drive, original vaccine, flu vaccine, first-generation covid-19, a third, where vaccine, a more, in case
2/26/2021 WEB vaccination status, vaccine you, and immunisation, health canada, hong kong, had the, 've had, ( jcvi, jcvi ), on vaccination
2/27/2021 WEB third vaccine, the third, people the, pfizer-biontech vaccines, from johnson, the sinovac, 's shot, shot the, vaccines developed, the fda
2/28/2021 WEB a third, third vaccine, vaccine offers, because he, offers strong, sinovac vaccine, on sunday, j&j 's, strong protection, with 200
3/1/2021 WEB the philippines, to recommend, recommend the, johnson 's, single-dose johnson, a third, out that, and pfizer, & johnson, be made
3/2/2021 WEB vaccine diplomacy, the covax, one medical, china 's, with its, newly approved, vaccines it, ahead of, johnson 's, started vaccinating
3/3/2021 WEB every adult, for every, enough coronavirus, newly approved, have enough, into vials, the newly, vaccine since, to prioritize, produce the
3/4/2021 WEB flu vaccine, 40 %, giving the, waiting for, aside 40, cleared for, the newly, want the, germany 's, for every
3/5/2021 WEB numbers of, fourth vaccine, the vaccinated, 40 %, teachers for, whichever vaccine, school staff, other signs, and i, is n't
3/6/2021 WEB fourth vaccine, a fourth, signs of, oxford-astrazeneca vaccines, of oxford-astrazeneca, and signs, montana educators, educators even, numbers of, even as
3/7/2021 WEB but most, vaccinated but, most iraqis, the one-dose, one-dose vaccine, that our, members of, sri lanka, our vaccination, on saturday
3/8/2021 WEB vaccinated people, other vaccinated, people can, can gather, indoors without, that fully, with other, the unvaccinated, vaccinated individuals, risk for
3/9/2021 WEB indoors without, with other, vaccinated people, other vaccinated, people can, people indoors, with unvaccinated, can gather, unvaccinated people, vaccinated individuals
3/10/2021 WEB state to, federally supported, eligibility requirements, requirements for, the percentage, site that, of 100, restrictions on, doses for, first state
3/11/2021 WEB americans to, americans are, trials and, normal activities, for how, where they, guidance for, your child, how vaccinated, will expand
3/12/2021 WEB adults eligible, may 1, make all, by may, efforts and, all adults, troops to, vaccine manufacturing, vaccine diplomacy, an additional
3/13/2021 WEB by may, may 1, vaccine diplomacy, who wants, to use, be eligible, all adults, as vaccine, which vaccine, on saturday
3/14/2021 WEB everyone who, between the, astrazeneca vaccinations, link between, by may, that there, indication that, astrazeneca covid-19, blood clotting, on saturday
3/15/2021 WEB the data, 17 million, not suggest, no evidence, suggest the, than 17, are refusing, to suspend, increased risk, suspend use
3/16/2021 WEB no evidence, to suspend, suspended use, is no, said there, china has, blood clots, evidence the, four vaccines, suspend use
3/17/2021 WEB safe vaccine, in mississippi, a safe, no evidence, exports to, world health, vaccine certificates, to suspend, a great, higher vaccination
3/18/2021 WEB increase in, astrazeneca coronavirus, the benefits, benefits of, no evidence, evidence to, european countries, a year, a safe, outweigh any
3/19/2021 WEB vaccinations with, european medicines, medicines agency, been authorized, vaccine over, trust in, increase in, astrazeneca vaccinations, adult population, enough of
3/20/2021 WEB as he, two days, he was, his vaccination, the jab, vaccinating residents, these variants, the 100, million vaccine-threshold, and italy
3/21/2021 WEB 80 %, vaccine —, vaccine delivery, because the, the supply, vaccine shot, of astrazeneca, population has, been fully, using the
3/22/2021 WEB 79 %, an investigation, vaccine provided, provided strong, the rest, to block, safety concerns, vaccine exports, strong protection, was safe
3/23/2021 WEB 79 %, may have, monday that, vaccine trial, that its, was 79, a very, in europe, was n't, on camera
3/24/2021 WEB hong kong, produced in, the bloc, uk 's, vaccines produced, vaccine exports, vaccine export, would not, israel 's, open vaccinations
3/25/2021 WEB 76 %, our covid-19, that our, million residents, 2.5 million, compared to, vaccine exports, protective the, need for, countries that
3/26/2021 WEB 200 million, goal on, original goal, on covid-19, whether vaccinated, vaccine starting, become eligible, and students, eligibility to, new jersey
3/27/2021 WEB program to, because she, vaccines minister, vaccine starting, for coronavirus, now fully, the age, all adults, eligibility to, get their
3/28/2021 WEB surplus of, a surplus, resistant to, vaccines or, population has, been fully, fully vaccinated, all adults, vaccine eligibility, ) of
3/29/2021 WEB 90 %, dose and, be eligible, japan 's, when you, vaccinated if, vaccine starting, vaccine roll-out, opened up, eligibility to
3/30/2021 WEB 90 %, adults under, to adults, vaccine card, be eligible, — and, under 55, its vaccines, will open, to find
3/31/2021 WEB in kids, wednesday that, this age, expand the, is safe, the urgency, 100 %, age group, share the, we share
4/1/2021 WEB a batch, 100 %, to manufacture, of citations, by emergent, wednesday that, of j&j, vaccine ingredient, six months, was 100
4/2/2021 WEB the agency, vaccinated people, can travel, people can, its guidance, people could, vaccinated two, weeks after, not need, vaccinating inmates
4/3/2021 WEB require testing, a negative, vaccination or, astrazeneca coronavirus, had the, got vaccinated, vaccinated people, people can, proof of, or a
4/4/2021 WEB vaccination hours, transportation to, administration will, employees to, vaccine passports, your vaccine, sites and, vaccine administration, vaccination center, vaccine sites
4/5/2021 WEB vaccine passports, drive in, of americans, available vaccine, your vaccine, reporting zero, have n't, its available, according to, vaccination card
4/6/2021 WEB april 19, by april, in california, all residents, site in, vaccine passports, vaccinations of, flood of, opened up, a flood
4/7/2021 WEB benefits of, the benefits, by april, the oxford/astrazeneca, following vaccination, oxford/astrazeneca vaccine, april 19, vaccine passport, vaccine passports, levels of
4/8/2021 WEB the benefits, benefits of, under 30, vaccine passport, a single-dose, the governor, an alternative, of receiving, is among, the few
4/9/2021 WEB under 50, pnc arena, adverse reactions, at pnc, own vaccines, importance of, ramping up, with vaccinations, vaccination programs, australia 's
4/10/2021 WEB long the, how long, to include, european regulators, vaccine shortage, — european, geneva —, shortage of, people ages, adverse reactions
4/11/2021 WEB chinese vaccines, mrna vaccines, the use, the benefits, astrazeneca vaccines, of astrazeneca, proof of, been fully, population has, fully vaccinated
4/12/2021 WEB to require, chinese vaccines, the majority, students to, while vaccinations, to show, cautious while, vaccine mandates, about whether, vaccinations is
4/13/2021 WEB j&j vaccine, the j&j, 13 days, to 13, the johnson, vaccine who, & johnson, johnson &, days after, johnson covid-19
4/14/2021 WEB j&j vaccine, the j&j, the johnson, 13 days, johnson covid-19, to 13, & johnson, johnson &, vaccine while, johnson vaccine
4/15/2021 WEB j&j vaccine, the j&j, the pause, as vaccinations, americans who, six women, the johnson, blood clots, & johnson, pause in
4/16/2021 WEB vaccinated —, the spread, of californians, enough people, californians eligible, vaccines while, age 16, spread of, 12 months, vaccinate enough
4/17/2021 WEB pregnant women, blood clots, 's vaccines, that covid-19, j&j vaccine, after vaccination, linked to, the j&j, who had, & johnson
4/18/2021 WEB j&j vaccine, available vaccine, its available, the j&j, vaccinating people, the cdc, & johnson, johnson &, blood clots, johnson vaccine
4/19/2021 WEB have gotten, vaccination cards, available vaccine, its available, in vaccinations, j&j vaccine, the j&j, not get, register for, astrazeneca covid-19
4/20/2021 WEB j&j vaccine, lagging vaccination, 's lagging, the one-shot, the j&j, confidence and, from may, one-shot vaccine, students to, vaccine confidence
4/21/2021 WEB enough people, americans vaccinated, vaccinated enough, his administration, 200 million, vaccine producer, major vaccine, producer but, a major, j&j vaccine
4/22/2021 WEB start to, to see, protection from, population and, vaccinated population, time off, long as, and start, as long, on hold
4/23/2021 WEB malaria vaccine, j&j vaccine, so they, j&j 's, the j&j, about getting, demand for, j&j vaccinations, she was, out and
4/24/2021 WEB pause on, that j&j, j&j vaccine, the j&j, j&j 's, to resume, one-and-done vaccine, pregnant people, recipients who, 15 vaccine
4/25/2021 WEB pause on, 11-day pause, an 11-day, lifting an, providers they, on vaccinations, recipients who, 15 vaccine, than 1.7, ramped-up vaccinations
4/26/2021 WEB this summer, raw materials, to india, vaccinated americans, approved by, americans who, other states, the three, with vaccines, vaccine campaign
4/27/2021 WEB unvaccinated people, vaccinated americans, you are, outdoor gatherings, vaccination center, small outdoor, and unvaccinated, that fully, for fully, you 're
4/28/2021 WEB unvaccinated people, vaccinated americans, small outdoor, you are, or not, — a, with his,  saying, outdoor gatherings, vaccinated people
4/29/2021 WEB 1 %, of residents, of americans, here's what, all vaccines, for other, else is, what else, not vaccinated, rates and
4/30/2021 WEB their second, two-dose covid-19, from different, different manufacturers, manufacturers as, people were, not vaccinated, more vaccinations, less than, smaller packages
5/1/2021 WEB 95 %, in canada, high gear, into high, campaign into, gear by, the facility, site at, — the, 18 and
5/2/2021 WEB india has, a second, fully vaccinated, vaccines can, all adults, one vaccine, vaccines at, population has, been fully, vaccinated people
5/3/2021 WEB vaccination levels, free vaccination, firms to, to multiple, multiple chinese, chinese firms, a free, and access, chinese vaccine, an effort
5/4/2021 WEB to authorize, whether to, north korea, vaccinated sections, broader rollout, as demand, until their, their total, to teens, total capacity
5/5/2021 WEB vaccination goal, goal to, new vaccination, to children, 160 million, to authorize, as demand, that age, unvaccinated people, pfizer 's
5/6/2021 WEB intellectual property, property rights, rights for, protections on, protections for, waiving intellectual, on covid-19, production of, property protections, to waive
5/7/2021 WEB patent protections, on covid-19, intellectual property, ban on, protections on, property rights, sinopharm vaccine, the possibility, the sinopharm, said it

We hope this inspires you to use the GRG to compile these kinds of significant-phrase chronologies!


We first compiled all of the GRG entries matching "vaccin*" into a temporary table in BigQuery:

WITH data AS (select date, pre, verb, post, urls[offset(0)].url url, urls[offset(0)].title title, LOWER(CONCAT(pre, ' ', verb, ' ', post)) search from `gdelt-bq.gdeltv2.grg_vcn` WHERE DATE(date) >= "2021-01-01")
select date, search, url, title from data where ( search like '%vaccin%' )

Using this table, we can then convert the entries into word histogram tables using SPLIT() and BigQuery's built-in ML.NGRAMS() function. To split into single words:

select date, word, count(1) count from (
SELECT DATE(date) date, ML.NGRAMS(SPLIT(search, ' '), [1, 1], ' ') words FROM `[TEMPTABLE]` where DATE(date) = "2021-01-09"
), unnest(words) word group by date, word having length(word) > 2 and count>20 order by count desc

To split into phrases just change the "[1, 1]" in ML.NGRAMS() to "[2, 2]":

select date, word, count(1) count from (
SELECT DATE(date) date, ML.NGRAMS(SPLIT(search, ' '), [2, 2], ' ') words FROM `[TEMPTABLE]` where DATE(date) = "2021-01-09"
), unnest(words) word group by date, word having length(word) > 2 and count>20 order by count desc

To filter more aggressively, raise the threshold in "count>20" (for example, to "count>250").
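For readers who want to prototype the same histogram step outside BigQuery, the logic of the queries above can be sketched in Python. This is only an illustrative approximation, not the code used for the analysis: the function name `ngram_counts` and its parameters are hypothetical, and it assumes the statements have already been pulled into a list of strings.

```python
from collections import Counter


def ngram_counts(statements, n=2, min_count=20):
    """Count space-delimited word n-grams across statements.

    Mirrors the query above: split on single spaces, build n-grams of
    exactly n words, and keep entries longer than 2 characters that
    appear more than min_count times.
    """
    counts = Counter()
    for text in statements:
        # Lowercase to match the LOWER() applied when the temp table was built.
        tokens = text.lower().split(" ")
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return {
        gram: count
        for gram, count in counts.items()
        if len(gram) > 2 and count > min_count
    }
```

Calling it with n=1 corresponds to the "[1, 1]" single-word query and n=2 to the "[2, 2]" phrase query, while min_count plays the role of the "count>X" filter.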

And here is the final query that uses the initial temp table compiled from the GRG:

WITH nested AS (
WITH data AS (
select date, 'WEB' station, word, count(1) count from (
SELECT DATE(date) date, ML.NGRAMS(SPLIT(search, ' '), [2, 2], ' ') words FROM `[TEMPTABLE]` where DATE(date) >= "2021-01-01"
), unnest(words) word group by date, word having length(word) > 2 and count>250 order by count desc
)
, word_day_type AS (
# how many times a word is mentioned in each "document"
SELECT word, SUM(count) counts, date, station
FROM data
GROUP BY 1, 3, 4
)
, day_type AS (
# total # of words in each "document"
SELECT SUM(count) counts, date, station
FROM data
GROUP BY 2, 3
)
, tf AS (
# TF for a word in a "document"
SELECT word, date, station, a.counts/b.counts tf
FROM word_day_type a
JOIN day_type b
USING(date, station)
)
, word_in_docs AS (
# how many "documents" have a word
SELECT word, COUNT(DISTINCT FORMAT('%s %s', FORMAT_TIMESTAMP( "%m/%d/%E4Y", date), station)) indocs
FROM word_day_type
GROUP BY 1
)
, total_docs AS (
# total # of docs
SELECT COUNT(DISTINCT FORMAT('%s %s', FORMAT_TIMESTAMP( "%m/%d/%E4Y", date), station)) total_docs
FROM data
)
, idf AS (
# IDF for a word
SELECT word, LOG(total_docs.total_docs/indocs) idf
FROM word_in_docs
CROSS JOIN total_docs
)
# top 10 words per "document" by TF-IDF, joined into a single comma-separated list
SELECT date, station,
ARRAY_AGG(STRUCT(ARRAY_TO_STRING(words, ', ') AS wordlist)) top_words
FROM (
SELECT date, station, ARRAY_AGG(word ORDER BY tfidf DESC LIMIT 10) words
FROM (
SELECT word, date, station, tf.tf * idf.idf tfidf
FROM tf
JOIN idf
USING(word)
)
GROUP BY date, station
)
GROUP BY date, station
) select FORMAT_TIMESTAMP( "%m/%d/%E4Y", date) day, station, wordlist from nested, UNNEST(top_words) order by date asc, station asc
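To make the scoring in the query easier to follow, here is a small Python sketch of the same TF-IDF computation, treating each day as one "document". It is an approximation for illustration rather than the production method: the function name `top_terms_by_day` and the (day, term, count) input shape are assumptions, and ties are broken alphabetically here, whereas the SQL leaves tie order unspecified.

```python
import math
from collections import defaultdict


def top_terms_by_day(rows, k=10):
    """Rank terms per day by TF-IDF.

    rows: iterable of (day, term, count) tuples, already filtered by
    whatever minimum-count cutoff is desired.
    Returns {day: [up to k terms, highest TF-IDF first]}.
    """
    # day -> term -> count (the word_day_type step)
    term_counts = defaultdict(lambda: defaultdict(int))
    for day, term, count in rows:
        term_counts[day][term] += count

    # Number of days in which each term appears (the word_in_docs step)
    total_docs = len(term_counts)
    docs_with_term = defaultdict(int)
    for terms in term_counts.values():
        for term in terms:
            docs_with_term[term] += 1

    result = {}
    for day, terms in term_counts.items():
        day_total = sum(terms.values())  # the day_type step
        scores = {
            # TF * IDF, using the natural log as BigQuery's one-argument LOG() does
            term: (count / day_total) * math.log(total_docs / docs_with_term[term])
            for term, count in terms.items()
        }
        # Highest score first; alphabetical tie-break is a choice made for this sketch
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        result[day] = ranked[:k]
    return result
```

A term that appears on every day gets an IDF of zero, which is why ever-present phrases drop out of the daily lists in favor of phrases specific to that day's coverage.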