Visualizing Entity Co-Occurrences In Covid-19 Coverage Using The Natural Language API

What might the Global Entity Graph tell us about the co-occurrence network of entities mentioned in coverage of Covid-19, as extracted by the Cloud Natural Language API? To explore this, we selected all articles dated on or after January 1, 2020 in which at least one extracted entity name contains "virus", "infect" or "covid". We then compiled the 1,500 most frequently co-occurring entity pairs from those articles and rendered the resulting network in Gephi, as seen in the final image below. In this case we limited the analysis to entities with assigned MID codes.

TECHNICAL DETAILS

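The final BigQuery query is shown below. Its output columns (Source, Target, Type and Weight) use the names Gephi expects when importing an edge list.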

-- Entity mentions (MID-coded only) from articles that mention a virus-related entity.
-- LIKE is case-sensitive in BigQuery, so the lowercase patterns match lowercase text only.
WITH mentions AS (
  SELECT url, entities.name entity
  FROM `gdelt-bq.gdeltv2.geg_gcnlapi`, UNNEST(entities) entities
  WHERE entities.mid IS NOT NULL AND url IN (
    SELECT url FROM `gdelt-bq.gdeltv2.geg_gcnlapi`, UNNEST(entities) entity
    WHERE (entity.name LIKE '%virus%' OR entity.name LIKE '%infect%' OR entity.name LIKE '%covid%')
      AND DATE(date) >= "2020-01-01"
  )
)
-- Edge list: raw pair counts plus each pair's share of the total across the retained edges.
SELECT Source, Target, Count RawCount, "Undirected" Type, ( Count/SUM(Count) OVER () ) Weight FROM (
  -- The self-join on url pairs up entities from the same article;
  -- a.entity<b.entity keeps each unordered pair once and drops self-pairs.
  SELECT a.entity Source, b.entity Target, COUNT(*) AS Count
  FROM mentions a
  JOIN mentions b
  ON a.url=b.url
  WHERE a.entity<b.entity
  GROUP BY 1,2
  ORDER BY 3 DESC
  LIMIT 1500
)
ORDER BY Count DESC
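
To make the pairing logic of the query easier to follow, here is a minimal Python sketch of the same steps run on a tiny in-memory sample instead of BigQuery. The sample articles and the function name cooccurrence_edges are hypothetical and purely illustrative; this is not the code used to produce the graph. It assumes each article has already been reduced to its list of MID-coded entity names.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy input: article URL -> entity names extracted from it.
articles = {
    "url1": ["China", "Wuhan", "WHO"],
    "url2": ["China", "WHO"],
    "url3": ["Wuhan", "China"],
}


def cooccurrence_edges(articles, limit=1500):
    """Build an undirected, weighted edge list of entity co-occurrences."""
    counts = Counter()
    for entities in articles.values():
        # Sorting first means every pair comes out as (smaller, larger),
        # which plays the role of the a.entity < b.entity filter.
        for source, target in combinations(sorted(entities), 2):
            if source != target:  # skip self-pairs from repeated names
                counts[(source, target)] += 1

    # Keep the most frequent pairs (ORDER BY count DESC LIMIT n).
    top = counts.most_common(limit)

    # Weight is each pair's share of the total count over the retained
    # pairs, like Count / SUM(Count) OVER () in the outer query.
    total = sum(count for _, count in top)
    return [
        {
            "Source": source,
            "Target": target,
            "RawCount": count,
            "Type": "Undirected",
            "Weight": count / total,
        }
        for (source, target), count in top
    ]


for edge in cooccurrence_edges(articles):
    print(edge)
```

Because the weights are normalized after the limit is applied, they always sum to 1 across the edges that are kept, no matter how many pairs were cut off.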