A TF-IDF Chronology Of The Global Geographic Graph Of The Top Terms Associated With Italy Jan-Apr 2020

How might we apply a TF-IDF analysis to the contextual field of the Global Geographic Graph of English-language online news coverage to compile a daily chronology of the top phrases associated with Italy from January 1 to April 30, 2020, tracing the rise of Covid-19 in the country? A single query yields the table below, which runs from Kobe Bryant's death, through the first glimmers of Covid-19 news relating to the country, to the complete pandemic saturation of global English-language news mentions of Italy in March.

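You can download the final spreadsheet or browse the results below. Each row lists a day, the source label (always WEB here) and that day's ten highest-scoring two-word phrases: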

day station wordlist
1/1/2020 WEB new year, in us, of all, all the, italy in, is a, of a, in italy, for the, on the
1/2/2020 WEB he has, well as, while the, to italy, as well, and italy, the united, for a, by the, the italian
1/3/2020 WEB of violence, troops to, at times, ac milan, if needed, alert to, embassy there, to beirut, interests in, official said
1/4/2020 WEB on alert, troops to, to protect, the u, u s, said the, united states, the united, of a, as a
1/5/2020 WEB german tourists, alto adige, of young, the driver, their bus, fire service, bolzano in, to board, the alto, board their
1/6/2020 WEB inter milan, serie a, on monday, year old, said the, as the, to a, the first, u s, the world
1/7/2020 WEB kisses from, from italy, the company, year old, the u, u s, united states, by the, with a, is a
1/8/2020 WEB in iraq, coalition in, led coalition, s led, report from, islamic state, state group, no report, group there, in baghdad
1/9/2020 WEB 6 2, di maio, with his, saudi arabia, in iran, p m, a statement, the u, u s, of italy
1/10/2020 WEB your pizza, when you, a free, that you, an excellent, of your, loads of, spots and, has prompted, italian spots
1/11/2020 WEB emporio armani, canadian twins, made in, sister sledge, milan fashion, anniversary of, fall winter, fashion week, the milan, of their
1/12/2020 WEB fashion houses, fashion chamber, milan fashion, capasa said, racially insensitive, and racially, culturally and, italian fashion, as culturally, carlo capasa
1/13/2020 WEB for best, nominated for, civil aviation, the film, the civil, flights from, on monday, last year, part of, he was
1/14/2020 WEB rome cooking, cooking workshops, romecookingworkshops com, of rome, he was, it is, and a, is the, in rome, italy and
1/15/2020 WEB your birthday, a free, on your, get a, and a, italy and, the first, that the, to a, will be
1/16/2020 WEB during the, as well, and italy, according to, on a, united kingdom, is a, to a, that the, u s
1/17/2020 WEB portrait of, a lady, the painting, art gallery, the gallery, the work, gallery walls, italian art, years ago, klimt portrait
1/18/2020 WEB of a, to be, the italian, from the, with the, in a, to the, for the, and the, at the
1/19/2020 WEB prime minister, to be, in a, to the, at the, on the, of the, in the, for the, and the
1/20/2020 WEB mr salvini, interior minister, alleged kidnapping, italy borders, league party, for alleged, for six, he refused, six days, when he
1/21/2020 WEB us million, million by, the island, in us, 2025 table, the two, last year, italy in, the city, on monday
1/22/2020 WEB di maio, 5 star, party leader, luigi di, star movement, the top, he would, foreign minister, that he, on wednesday
1/23/2020 WEB the more, on thursday, among the, a new, part of, more than, as well, italy and, of italy, the world
1/24/2020 WEB the letter, the biblioteca, biblioteca nazionale, emilia romagna, nazionale marciana, the right, 2017 table, 2025 table, in us, us million
1/25/2020 WEB matteo salvini, space station, was the, and italy, it was, in rome, that the, has been, as a, of a
1/26/2020 WEB salvini is, emilia romagna, left wing, in emilia, the league, matteo salvini, league party, the right, leader matteo, for his
1/27/2020 WEB his father, kobe bryant, emilia romagna, in emilia, italian basketball, cattani said, stefano bonaccini, reggio emilia, kobe was, bryant told
1/28/2020 WEB his daughter, his father, reggio emilia, bryant was, the game, father played, played for, his family, kobe bryant, team in
1/29/2020 WEB platto said, his friends, so he, he couldnt, a remote, remote birthday, birthday celebration, friends over, invite his, over we
1/30/2020 WEB costa smeralda, the costa, 54 year, the ship, costa crociere, of civitavecchia, cruise ship, port of, ship in, woman and
1/31/2020 WEB two chinese, chinese tourists, costa smeralda, air traffic, 6 000, between italy, cruise ship, two cases, a cruise, the costa
2/1/2020 WEB maio said, italian foreign, chinese tourists, chinese embassy, also plan, with chinese, ministry officials, embassy representatives, di maio, officials also
2/2/2020 WEB fly to, stranded in, left wing, in switzerland, center left, historic stronghold, stronghold of, loss in, a political, tourists and
2/3/2020 WEB a military, were all, wuhan on, sweden and, transferred to, chinese city, nations have, from wuhan, italian media, near rome
2/4/2020 WEB some are, chinese new, billion euros, first half, di maio, monday night, new year, us million, million by, chinese tourists
2/5/2020 WEB date night, your valentine, your evening, comes to, some italian, whether you, with your, your amore, it comes, when it
2/6/2020 WEB high speed, train derailed, the train, derailed in, speed train, whom he, of lodi, the engine, kisses from, near the
2/7/2020 WEB bridge collapse, dolce vita, the code, 56 italians, from wuhan, a bridge, a military, man was, she was, and her
2/8/2020 WEB which was, in rome, the world, by the, with a, the italian, to be, in italy, with the, from the
2/9/2020 WEB students returning, at age, from wuhan, up a, returning from, from china, to china, stay home, the virus, to italy
2/10/2020 WEB best picture, to win, film to, 4 2, toy story, award for, ac milan, the best, last year, on sunday
2/11/2020 WEB air italy, qatar airways, the court, the airline, the company, asia pacific, last year, to have, new york, said the
2/12/2020 WEB coast guard, mr salvini, his immunity, interior minister, a senate, senate commission, salvini for, of ministers, to lift, fellow senators
2/13/2020 WEB police said, rome police, a police, arrested in, remains were, an online, badly burned, police database, online system, guests in
2/14/2020 WEB market italy, 2015 2025, was arrested, valentine day, she was, saudi arabia, asia pacific, the city, had been, year old
2/15/2020 WEB a military, and will, year old, due to, it was, on a, to a, to be, the italian, in italy
2/16/2020 WEB diamond princess, the diamond, the cruise, 35 italians, to japan, send a, cruise ship, passengers and, and crew, bring back
2/17/2020 WEB the legion, a legion, wind of, to report, canada hong, planning similar, milan prosecutors, similar flights, were planning, cover up
2/18/2020 WEB diamond princess, rome4kidstours com, in japan, will take, the ship, the other, cruise ship, including the, in us, the vatican
2/19/2020 WEB americans who, quarantine in, the ship, milan fashion, people from, hong kong, cruise ship, north america, in us, middle east
2/20/2020 WEB milan fashion, winter 2020, fashion week, fall winter, 2020 milan, street style, the bana, cargo ship, to libya, the ship
2/21/2020 WEB lodi in, of codogno, the roman, near lodi, codogno near, walks in, new beds, woman walks, 38 year, the 38
2/22/2020 WEB veneto to, isolation pending, pending test, 78 year, the secondary, secondary contagions, a 78, confirmed infected, to order, order schools
2/23/2020 WEB nearly all, a dozen, two people, towns in, clustered in, are clustered, and who, dozen towns, luca zaia, 80s and
2/24/2020 WEB carnival events, a dozen, la scala, manned checkpoints, the virus, three deaths, called off, includes the, capital milan, 152 cases
2/25/2020 WEB the canary, should self, canary islands, mission impossible, self isolate, a dozen, lombardy and, flu like, italian doctor, their first
2/26/2020 WEB school in, high school, ski trips, to northern, from ski, austria croatia, northern italy, croatia and, and switzerland, self isolate
2/27/2020 WEB ash wednesday, northern italy, casalpusterlengo fombio, castelgerundo and, latin america, than 400, terranova dei, dei passerini, dadda casalpusterlengo, fombio maleo
2/28/2020 WEB sub saharan, saharan africa, italian citizen, citizen who, in sub, 650 cases, in nigeria, northern ireland, on thursday, least 650
2/29/2020 WEB respiratory syndrome, travel to, reconsider travel, hotel federation, final blow, in wales, of recession, 1 128, raised the, on friday
3/1/2020 WEB travel to, public offices, to 34, 29 deaths, 1 100, veneto and, travel advisory, many companies, five more, most public
3/2/2020 WEB 1 694, the louvre, rhode island, retirement and, bring doctors, american airlines, travel to, surged in, to 1, of retirement
3/3/2020 WEB south korea, g 7, the g, 2 036, travel to, the virus, covid 19, in south, and central, to 2
3/4/2020 WEB education minister, lombardy on, lucia azzolina, caseload explode, explode since, virus caseload, minister lucia, test was, feb 19, 2 500
3/5/2020 WEB the elderly, world oldest, absolutely necessary, and universities, unless absolutely, on visiting, elderly not, outside unless, south korea, relatives in
3/6/2020 WEB south korea, on thursday, with 148, the virus, community transmission, the vatican, 148 virus, risk areas, were quarantined, homes and
3/7/2020 WEB the wine, the girl, turned away, her family, the water, water pipes, 20 homes, liters of, be bottled, of ready
3/8/2020 WEB 5 883, conte signed, 16 million, quarter of, the governor, governor of, april 3, taking the, a quarter, to 233
3/9/2020 WEB 16 million, 7 375, quarter of, million people, april 3, a quarter, until april, 366 deaths, city in, at milan
3/10/2020 WEB 9 172, whole country, entire country, british airways, from italy, on tuesday, soldiers and, the entire, foreign and, and commonwealth
3/11/2020 WEB air canada, from italy, on tuesday, government announced, already extraordinary, anti virus, to toughen, british airways, consider requests, st peter
3/12/2020 WEB nationals who, foreign nationals, pharmacies and, most foreign, 12 000, austria belgium, schengen area, states the, homeland security, at any
3/13/2020 WEB i was, codogno which, coronavirus infection, 15 000, have slowed, for me, on thursday, dead the, red cross, thursday march
3/14/2020 WEB cities including, 1 266, unprecedented lockdown, close and, ordered an, public playgrounds, and restricting, lockdown ordering, ordering businesses, of many
3/15/2020 WEB italy coping, coping photo, everything will, be alright, photo gallery, slow coronavirus, still early, for much, already are, showing signs
3/16/2020 WEB the fiat, 1 809, critical inflection, we could, inflection point, 24 747, 1 800, there every, surgeon general, a critical
3/17/2020 WEB 2 158, 27 980, 158 deaths, now accounts, 980 with, to 27, with 2, covid 19, surgeon general, 1 800
3/18/2020 WEB 2 503, 31 506, to 2, 503 deaths, 2 500, covid 19, who travelled, second hardest, 31 000, the air
3/19/2020 WEB 2 978, large elderly, elderly population, italy dead, its large, high toll, toll key, variety of, a variety, 3 249
3/20/2020 WEB 3 405, nuns at, 19 of, 21 nuns, who disembark, of 21, the outskirts, outskirts of, banned all, daily il
3/21/2020 WEB 4 032, all parks, and playgrounds, giuseppe sala, or at, care beds, 4 000, nearly two, a great, what has
3/22/2020 WEB 4 825, mobile medical, medical teams, unless we, personnel and, ready to, the russian, all production, russian defense, except those
3/23/2020 WEB 5 476, 59 000, 476 deaths, new or, to announce, countries to, 59 138, covid 19, or extended, announce new
3/24/2020 WEB 6 000, 000 italians, than 6, 600 more, from 793, more lives, over 600, down from, 793 two, lives down
3/25/2020 WEB 6 800, 743 deaths, declines the, ice rink, an ice, of declines, game in, a jump, than 69, 69 000
3/26/2020 WEB covid 19, s army, 7 500, 7 503, spain death, on thursday, russian military, number of, diagnosed cases, biggest economy
3/27/2020 WEB five times, weeks into, seems to, than reported, reported perhaps, slowing at, least in, although two, increase seems, not being
3/28/2020 WEB 9 134, the grim, 86 000, patients from, second most, five countries, than 86, countries exceed, exceed its, victims to
3/29/2020 WEB record of, 10 000, to it, spain china, i will, 400 million, strong vigorous, food aid, coupons and, 30 000
3/30/2020 WEB de donatis, 97 689, 10 779, mr p, he saw, last day, number of, covid 19, cautious optimism, optimism that
3/31/2020 WEB 800 000, de donatis, of silence, 164 000, jeremiah tower, 38 000, than 164, a minute, than 800, minute of
4/1/2020 WEB the rate, dying at, 12 400, because hospitals, homes because, and brescia, called in, brescia as, dead so, like bergamo
4/2/2020 WEB 13 000, pence said, most comparable, area to, akin to, comparable area, show the, trajectory akin, states at, we think
4/3/2020 WEB sergio rossi, of about, 14 000, covid 19, nearly 14, the brand, 13 915, overall toll, a population, masks and
4/4/2020 WEB shopping in, care workers, emergency is, to 124, 15 362, 124 632, the emergency, 681 new, count to, with 681
4/5/2020 WEB shopping in, hit veneto, world highest, naples rome, figure since, genoa and, rome genoa, 124 632, palm sunday, 15 362
4/6/2020 WEB italy still, still has, 16 000, highest coronavirus, almost 16, world highest, toll almost, coronavirus death, has by, by far
4/7/2020 WEB conte promised, of daily, 16 500, soon reap, reap the, the fruit, fruit of, third straight, for fighting, beds occupied
4/8/2020 WEB bocelli will, the cathedral, to perform, universal music, music group, voice of, ave maria, italian tenor, maria and, organist emanuele
4/9/2020 WEB super old, to issue, italy may, bloc to, issue so, im well, the bloc, hospitalizations and, her doctor, which would
4/10/2020 WEB cross procession, good friday, the cross, way of, in st, friday ceremony, 18 000, of pilgrims, ceremony in, five from
4/11/2020 WEB 18 849, s death, while italy, stood at, at 18, most coronavirus, 849 and, spain had, university dashboard, had 16
4/12/2020 WEB easter weekend, 20 000, weekend with, an easter, k will, 19 500, 19 468, 160 000, to 19, of dead
4/13/2020 WEB in three, were down, the lowest, 19 899, past day, 431 people, lowest number, day to, three weeks, recorded the
4/14/2020 WEB to reopen, europe including, distancing and, shops selling, to open, were allowed, professor of, covid 19, their jobs, a professor
4/15/2020 WEB of restrictions, have begun, 2 million, date 2020, country en, 0000 00, 00 infected, months to, to reopen, then we
4/16/2020 WEB the beach, is empty, beach is, seaside town, capital city, of ostia, italy capital, empty in, ostia near, near italy
4/17/2020 WEB nursing homes, early may, covid 19, in nursing, 22 000, 5 3, in early, date 2020, country en, 00 infected
4/18/2020 WEB lifesavers photo, italy lifesavers, photo gallery, cristina settembrese, luca bruno, friday april, nurse cristina, 2020 nurse, san paolo, her work
4/19/2020 WEB the deliziosa, migrants from, rescue ship, 23 227, naval ship, the coast, disembark 168, 168 spanish, will disembark, aboard the
4/20/2020 WEB the boat, the deliziosa, were expected, 1 831, boat 1, rest were, boat in, get off, 831 passengers, on land
4/21/2020 WEB loosening of, start reopening, can start, reopening on, on may, may 4, any hopes, 4 but, would like, the efforts
4/22/2020 WEB stuck at, to explore, 25 000, a 1, may 4, start reopening, air pollution, near san, italy government, covid 19
4/23/2020 WEB the table, plexiglass divider, testing out, ron burgundy, of heaven, car care, care products, the restaurant, while still, and delivery
4/24/2020 WEB the table, testing out, plexiglass divider, the question, question is, and delivery, the restaurant, take out, has only, il ciak
4/25/2020 WEB europe highest, government commissioner, 20 000, liberation day, lockdown restrictions, when lockdown, 26 000, may 4, holiday the, resistance fighters
4/26/2020 WEB the alzano, positive case, average of, should have, nursing homes, hospital was, home care, italy first, case was, drive away
4/27/2020 WEB may 4, seven weeks, since mid, can resume, on may, laid out, mid march, construction sites, deaths since, starting may
4/28/2020 WEB meant to, companies are, ease up, on restrictions, are watching, italy companies, were meant, detail plans, as politicians, politicians detail
4/29/2020 WEB arcuri told, second wave, above junk, major economy, wave of, one level, level above, first downgrade, just one, to bbb
4/30/2020 WEB in care, 27 359, third highest, italy 27, 58 355, kingdom death, italy based, and any, is unclear, also include


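Creating the table above took just 10 seconds in BigQuery. The query splits the contextual text of every Italy-related mention into two-word phrases, keeps the phrases that appear more than 500 times on a given day, treats each day as a single "document" and ranks that day's phrases by TF-IDF, using the following query: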

WITH nested AS (
  WITH data AS (
    # daily counts of every two-word phrase found near mentions of Italy
    select date, 'WEB' station, word, count(1) count from (
      SELECT DATE(DateTime) date, ML.NGRAMS(SPLIT(ContextualText, ' '), [2, 2], ' ') words FROM `gdelt-bq.gdeltv2.ggg` WHERE DATE(DateTime) >= "2020-01-01" and DATE(DateTime) <= "2020-04-30" and Location like '%Italy%'
    ), unnest(words) word group by date, word having length(word) > 2 and count>500 order by count desc
  )
  , word_day_type AS (
    # how many times a word is mentioned in each "document"
    SELECT word, SUM(count) counts, date, station
    FROM data
    GROUP BY 1, 3, 4
  )
  , day_type AS (
    # total # of words in each "document"
    SELECT SUM(count) counts, date, station
    FROM data
    GROUP BY 2, 3
  )
  , tf AS (
    # TF for a word in a "document"
    SELECT word, date, station, a.counts/b.counts tf
    FROM word_day_type a
    JOIN day_type b
    USING(date, station)
  )
  , word_in_docs AS (
    # how many "documents" have a word
    SELECT word, COUNT(DISTINCT FORMAT('%s %s', FORMAT_TIMESTAMP( "%m/%d/%E4Y", date), station)) indocs
    FROM word_day_type
    GROUP BY 1
  )
  , total_docs AS (
    # total # of docs
    SELECT COUNT(DISTINCT FORMAT('%s %s', FORMAT_TIMESTAMP( "%m/%d/%E4Y", date), station)) total_docs
    FROM data
  )
  , idf AS (
    # IDF for a word
    SELECT word, LOG(total_docs.total_docs/indocs) idf
    FROM word_in_docs
    CROSS JOIN total_docs
  )
  # top 10 words per "document", joined into a single comma-separated list
  SELECT date, station,
  ARRAY_AGG(STRUCT(ARRAY_TO_STRING(words, ', ') AS wordlist)) top_words
  FROM (
    SELECT date, station, ARRAY_AGG(word ORDER BY tfidf DESC LIMIT 10) words
    FROM (
      SELECT word, date, station, tf.tf * idf.idf tfidf
      FROM tf
      JOIN idf
      USING(word)
    )
    GROUP BY date, station
  )
  GROUP BY date, station
) select FORMAT_TIMESTAMP( "%m/%d/%E4Y", date) day, station, wordlist from nested, UNNEST(top_words) order by date asc, station asc
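
For readers who want to experiment with the scoring logic outside BigQuery, the following is a minimal Python sketch of the same idea, not the code used to produce the table. It assumes the input is a list of (day, text) pairs. The names bigrams, top_terms_by_day, min_count and top_n are hypothetical, and ties are broken alphabetically here, whereas the query leaves tie order undefined. Each day is one "document", TF is a phrase's share of that day's phrase count, and IDF is the natural log of the number of days divided by the number of days containing the phrase.

```python
import math
from collections import Counter, defaultdict


def bigrams(text):
    """Adjacent token pairs, roughly what ML.NGRAMS(SPLIT(text, ' '), [2, 2], ' ') yields."""
    tokens = text.split(" ")
    return [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]


def top_terms_by_day(snippets, min_count=0, top_n=10):
    """Rank each day's bigrams by TF-IDF. `snippets` is an iterable of (day, text)."""
    # Count each bigram per day; every day acts as one "document".
    per_day = defaultdict(Counter)
    for day, text in snippets:
        per_day[day].update(bigrams(text))

    # Mirror the HAVING clause: keep bigrams longer than 2 characters
    # that occur more than min_count times on that day (the query uses 500).
    data = {}
    for day, counter in per_day.items():
        kept = {w: c for w, c in counter.items() if len(w) > 2 and c > min_count}
        if kept:
            data[day] = kept

    total_docs = len(data)
    # Number of days on which each bigram survives the filter.
    docs_with_word = Counter(w for kept in data.values() for w in kept)

    ranked = {}
    for day, kept in data.items():
        day_total = sum(kept.values())
        scores = {
            w: (c / day_total) * math.log(total_docs / docs_with_word[w])
            for w, c in kept.items()
        }
        # Highest score first; ties broken alphabetically for repeatable output.
        ordered = sorted(scores, key=lambda w: (-scores[w], w))
        ranked[day] = ordered[:top_n]
    return ranked


# Toy input, invented purely for illustration.
snippets = [
    ("2020-01-27", "kobe bryant in italy"),
    ("2020-01-27", "kobe bryant played in italy"),
    ("2020-03-09", "lockdown in italy"),
    ("2020-03-09", "national lockdown in italy"),
]
for day, terms in top_terms_by_day(snippets).items():
    print(day, ", ".join(terms))
```

A phrase that survives the filter on every day has an IDF of log(1) = 0, so it sinks to the bottom of each day's ranking, while phrases concentrated on a few days rise to the top.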

We hope this inspires your own creative uses of the Global Geographic Graph!