Tracing Global Media Tone & Anxiety Towards China & Ukraine 2020-Present

Using the GKG, what might we be able to learn about global media tone towards China and Ukraine?

Let's first look at media tone towards China since the start of 2020, giving us a nearly three-year window that includes the onset of the pandemic, the recovery and worsening Chinese-US relations. Given that Chinese media produce a high volume of content that is fairly uniformly positive about the government, we'll also calculate the same graph excluding coverage that either comes from a ".cn" domain or is published in Chinese.

We'll plot the average positive/negative "tone" (in blue), by day, of all coverage in the GKG's 65 monitored languages that mentioned China at least twice in the article since January 1, 2020. In orange, we'll plot the same graph, but excluding Chinese press. To make the trends more apparent we'll use a 5-day rolling average. The two graphs are highly similar in shape, but the one that includes Chinese coverage is much more positive overall than the one that excludes it. The impact of the pandemic is clear, but so is the rapid and linear recovery in tone towards China through September 2020, at which point tone had largely recovered in both media portfolios. Interestingly, over the past two months media tone has been trending positive, continuing a general trend that began earlier this year.
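The 5-day rolling average mentioned above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the graphs: the function name rolling_mean, the choice of a trailing (rather than centered) window, and the sample tone values are all assumptions made for the example.

```python
def rolling_mean(values, window=5):
    """Trailing rolling mean over `values`.

    Positions that do not yet have a full window behind them are
    returned as None (an assumption; other tools pad differently).
    """
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            out.append(sum(chunk) / window)
    return out


# Made-up daily average tone values, one per day.
daily_tone = [-1.0, -2.0, -3.0, -2.0, -2.0, -7.0]
smoothed = rolling_mean(daily_tone)
print(smoothed)
```

A single sharply negative day (the last value here) moves the smoothed series far less than it moves the raw one, which is why the smoothed graph shows trends more clearly.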

Looking just at coverage outside of China, the graph below compares tone (blue) with anxiety (orange) on the same vertical axis by using Z-Scores (standard deviations from the mean) to normalize them. Here we can see that as global news tone outside of China plunged towards high negativity at the start of the pandemic, anxiety soared and lingered even as tone began to recover, finally beginning to drop sharply in April 2020. Interestingly, over the past two months, global media anxiety towards China has linearly decreased even as tone has linearly increased towards positivity. This suggests that global portrayals of China are shifting from uncertainty and concern to presenting it as a global power increasingly expressing its interests and exerting its will on the global stage.
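Converting both series to Z-Scores is what allows tone and anxiety to share one vertical axis, since each value becomes "standard deviations from that series' own mean". A minimal sketch follows; the function name z_scores, the use of the population standard deviation, and the sample numbers are assumptions for illustration, not details taken from the analysis.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant series has no spread; return zeros rather than divide by 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Made-up daily values on very different scales.
tone = [-2.0, -4.0, -3.0, -1.0]
anxiety = [0.10, 0.30, 0.25, 0.15]

tone_z = z_scores(tone)
anxiety_z = z_scores(anxiety)
print(tone_z)
print(anxiety_z)
```

After normalization both series are centered on zero with comparable spread, so a rise in one and a fall in the other can be read directly off the same axis.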

What about the same graph for global media coverage of Ukraine over the same time period? Here we see a much delayed increase in anxiety in April 2020 and a period in mid-2021 where tone was relatively positive and anxiety low. The long slow buildup to the invasion can be seen in global media coverage from early September 2021, with anxiety increasing steadily and tone becoming steadily more negative through a turning point in May 2022, as the invasion failed in its objective of rapidly replacing the Ukrainian government and entered its stalemate phase. The first few months of this year exhibited relatively stable tone and anxiety, suggesting a steady stalemate, with tone becoming more negative since June. Starting in August, however, anxiety has decreased and tone has become more positive, though it has leveled off in the past few weeks.

For those interested in how we computed these scores, here are the BigQuery queries we used. Note that each of these consumes around 3.75TB of query quota.

China:

SELECT
  substr(CAST(DATE AS STRING),0,8) as day, 
  avg(CAST(REGEXP_REPLACE(V2Tone, r',.*', "") AS FLOAT64)) tone,
  avg(CAST(REGEXP_EXTRACT(GCAM, r'v19.1:([-\d.]+)') AS FLOAT64)) anew,
  sum(CAST(REGEXP_EXTRACT(GCAM, r'c8.3:([-\d.]+)') AS INT64)) / sum(CAST(REGEXP_EXTRACT(GCAM, r'wc:([\d]+)') AS INT64)) *100 ridanxietyperc
  FROM `gdelt-bq.gdeltv2.gkg_partitioned` WHERE
 V2Locations like '%China%China%' 
 AND TIMESTAMP_TRUNC(_PARTITIONTIME, DAY) >= TIMESTAMP("2020-01-01")
 group by day

China, excluding Chinese press:

SELECT
  substr(CAST(DATE AS STRING),0,8) as day, 
  avg(CAST(REGEXP_REPLACE(V2Tone, r',.*', "") AS FLOAT64)) tone,
  avg(CAST(REGEXP_EXTRACT(GCAM, r'v19.1:([-\d.]+)') AS FLOAT64)) anew,
  sum(CAST(REGEXP_EXTRACT(GCAM, r'c8.3:([-\d.]+)') AS INT64)) / sum(CAST(REGEXP_EXTRACT(GCAM, r'wc:([\d]+)') AS INT64)) *100 ridanxietyperc
  FROM `gdelt-bq.gdeltv2.gkg_partitioned` WHERE
 V2Locations like '%China%China%' 
 and DocumentIdentifier not like '%.cn/%'
 and TranslationInfo not like '%srclc:zho%'
 AND TIMESTAMP_TRUNC(_PARTITIONTIME, DAY) >= TIMESTAMP("2020-01-01")
 group by day

Ukraine:

SELECT
  substr(CAST(DATE AS STRING),0,8) as day, 
  avg(CAST(REGEXP_REPLACE(V2Tone, r',.*', "") AS FLOAT64)) tone,
  avg(CAST(REGEXP_EXTRACT(GCAM, r'v19.1:([-\d.]+)') AS FLOAT64)) anew,
  sum(CAST(REGEXP_EXTRACT(GCAM, r'c8.3:([-\d.]+)') AS INT64)) / sum(CAST(REGEXP_EXTRACT(GCAM, r'wc:([\d]+)') AS INT64)) *100 ridanxietyperc
  FROM `gdelt-bq.gdeltv2.gkg_partitioned` WHERE
 V2Locations like '%Ukraine%Ukraine%' 
 AND TIMESTAMP_TRUNC(_PARTITIONTIME, DAY) >= TIMESTAMP("2020-01-01")
 group by day
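
To make the "ridanxietyperc" column in the queries above more concrete, the sketch below reproduces the same arithmetic in Python on two made-up GCAM strings: sum the "c8.3" values, sum the "wc" word counts, divide, and multiply by 100. The helper names gcam_value and anxiety_percent and the sample strings are hypothetical. Unlike the regular expressions in the queries, this version escapes the key and anchors it to the start of a comma-separated entry, which is a small assumption-driven tightening rather than what the queries do.

```python
import re


def gcam_value(gcam, key):
    """Return the number stored under one GCAM key, or None if the key is absent."""
    pattern = r'(?:^|,)' + re.escape(key) + r':([-\d.]+)'
    match = re.search(pattern, gcam)
    return float(match.group(1)) if match else None


def anxiety_percent(gcam_fields):
    """Anxiety-word count as a percentage of all words across the articles.

    Mirrors SUM(c8.3) / SUM(wc) * 100: an article with no c8.3 entry adds
    nothing to the numerator but still adds its words to the denominator.
    """
    anxiety_words = 0.0
    total_words = 0.0
    for gcam in gcam_fields:
        anxiety_words += gcam_value(gcam, "c8.3") or 0.0
        total_words += gcam_value(gcam, "wc") or 0.0
    if total_words == 0:
        return None
    return anxiety_words / total_words * 100


# Two made-up articles for one day; the second has no c8.3 entry.
day_articles = [
    "wc:200,c8.3:4,v19.1:5.5",
    "wc:300,c1.2:7",
]
day_percent = anxiety_percent(day_articles)
print(day_percent)
```

Because the sums are taken before dividing, long articles weigh more heavily than short ones, which differs from averaging a per-article percentage.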