The GDELT Project

Mapping 212 Years of History Through Books

Kalev's latest Forbes piece includes an incredible new interactive animated zoomable map of 212 years of world history as seen through the lens of books.

Click on the animated GIF below to view the live interactive zoomable map based on Internet Archive books.  Only those locations mentioned more than 30 times and in more than 15 distinct books in a given year are displayed in that year's map, and all coordinates are rounded to one decimal place.  Rounding collapses tight clusters of locations, which reduces the number of points enough to render the entire 212-year period on a single map, but it can produce a grid-like pattern when zooming too far into the map.

Alternatively, you can view the same map generated from the HathiTrust book collection for comparison.  Given the much larger number of books in that collection, the cutoffs for inclusion were raised to more than 80 mentions per year and more than 40 books per year.

Read the original piece in Forbes for more detail.

TECHNICAL DETAILS

For those interested in making their own version of the maps above, access the BigQuery datasets for either the Internet Archive or HathiTrust collections.  Then run one of the queries below, both of which are written in BigQuery legacy SQL:

Internet Archive

select concat(string(DATE),'-01-01') as date, lat,long, cnt, numbooks from (
SELECT DATE, lat,long, COUNT(*) as cnt, count(distinct(BookMeta_Identifier)) as numbooks
FROM (
select DATE, BookMeta_Identifier, ROUND(FLOAT(REGEXP_EXTRACT(SPLIT(V2Locations,';'),r'^[2-5]#.*?#.*?#.*?#.*?#(.*?)#.*?#')),1) as lat, ROUND(FLOAT(REGEXP_EXTRACT(SPLIT(V2Locations,';'),r'^[2-5]#.*?#.*?#.*?#.*?#.*?#(.*?)#')),1) AS long
FROM (TABLE_QUERY([gdelt-bq:internetarchivebooks], 'REGEXP_EXTRACT(table_id, r"(\d{4})") BETWEEN "1800" AND "2015"'))
)
where lat is not null and long is not null and abs(lat) < 80 and (abs(lat) > 0 or abs(long) > 0)
group by lat,long, DATE
) where cnt > 30 and numbooks > 15
ORDER BY cnt DESC
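
To make the innermost SELECT of the query above easier to follow, here is a minimal Python sketch of what its two regular expressions appear to do to a single V2Locations value. The field layout (location type first, latitude in the sixth '#'-delimited field, longitude in the seventh) is inferred from the regular expressions themselves, and the sample string is made up for illustration. The function names are hypothetical, and Python's round() may differ from BigQuery's ROUND() on exact ties.

```python
import re

# Same patterns as the query: keep only location types 2-5, skip four
# '#'-delimited fields, then capture latitude (6th field) or longitude (7th).
LAT_RE = re.compile(r'^[2-5]#.*?#.*?#.*?#.*?#(.*?)#.*?#')
LONG_RE = re.compile(r'^[2-5]#.*?#.*?#.*?#.*?#.*?#(.*?)#')


def parse_location(block):
    """Return (lat, long) rounded to one decimal, or None if the block is skipped."""
    lat_match = LAT_RE.match(block)
    long_match = LONG_RE.match(block)
    if not lat_match or not long_match:
        return None  # wrong location type or too few fields
    try:
        lat = round(float(lat_match.group(1)), 1)
        lon = round(float(long_match.group(1)), 1)
    except ValueError:
        return None  # empty or non-numeric coordinate
    # Mirror the WHERE clause: drop |lat| >= 80 and the (0, 0) placeholder.
    if abs(lat) >= 80 or (lat == 0 and lon == 0):
        return None
    return lat, lon


def parse_v2locations(field):
    """Split a V2Locations-style string on ';' and parse each block."""
    parsed = (parse_location(block) for block in field.split(';') if block)
    return [p for p in parsed if p is not None]


# Hypothetical sample value, made up for illustration only.
sample = (
    "4#Paris, Ile-De-France, France#FR#FRA8#16282#48.8667#2.33333#-1456928#120;"
    "1#France#FR#FR##46#2#FR#300"
)
print(parse_v2locations(sample))  # [(48.9, 2.3)]
```

The second block in the sample is dropped because its type is 1, which the `^[2-5]` prefix excludes.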

HathiTrust

select concat(string(DATE),'-01-01') as date, lat,long, cnt, numbooks from (
SELECT DATE, lat,long, COUNT(*) as cnt, count(distinct(BookMeta_Identifier)) as numbooks
FROM (
select DATE, BookMeta_Identifier, ROUND(FLOAT(REGEXP_EXTRACT(SPLIT(V2Locations,';'),r'^[2-5]#.*?#.*?#.*?#.*?#(.*?)#.*?#')),1) as lat, ROUND(FLOAT(REGEXP_EXTRACT(SPLIT(V2Locations,';'),r'^[2-5]#.*?#.*?#.*?#.*?#.*?#(.*?)#')),1) AS long
FROM (TABLE_QUERY([gdelt-bq:hathitrustbooks], 'REGEXP_EXTRACT(table_id, r"(\d{4})") BETWEEN "1800" AND "2015"'))
)
where lat is not null and long is not null and abs(lat) < 80 and (abs(lat) > 0 or abs(long) > 0)
group by lat,long, DATE
) where cnt > 80 and numbooks > 40
ORDER BY cnt DESC
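
The grouping and cutoff logic shared by both queries can be sketched locally in Python for anyone who wants to experiment with different thresholds on exported data. This is only an illustration under stated assumptions: the function name, the tuple layout of the input rows, and the toy data are all hypothetical, and the real maps were produced by the BigQuery queries above.

```python
from collections import defaultdict


def aggregate_mentions(mentions, min_mentions, min_books):
    """Group (year, book_id, lat, long) rows by year and rounded coordinate.

    Keeps only groups with MORE than min_mentions mentions and MORE than
    min_books distinct books, sorted by mention count, highest first.
    """
    counts = defaultdict(int)
    books = defaultdict(set)
    for year, book_id, lat, lon in mentions:
        # Rounding to one decimal collapses nearby points into one map cell.
        key = (year, round(lat, 1), round(lon, 1))
        counts[key] += 1
        books[key].add(book_id)

    rows = [
        {"date": f"{year}-01-01", "lat": lat, "long": lon,
         "cnt": cnt, "numbooks": len(books[(year, lat, lon)])}
        for (year, lat, lon), cnt in counts.items()
        if cnt > min_mentions and len(books[(year, lat, lon)]) > min_books
    ]
    return sorted(rows, key=lambda r: r["cnt"], reverse=True)


# Made-up toy data: three books mention nearly the same point twice each in
# 1850, and a fourth book mentions a different point once.
toy = [(1850, f"book{i % 3}", 48.86 + 0.001 * i, 2.33) for i in range(6)]
toy.append((1850, "book9", 10.0, 10.0))

# Tiny thresholds so the toy data yields a row; the queries use 30/15 and 80/40.
print(aggregate_mentions(toy, min_mentions=2, min_books=2))
```

With the toy data, the six nearby mentions collapse into a single cell at (48.9, 2.3) with a count of 6 across 3 books, while the lone mention at (10.0, 10.0) falls below the cutoffs and is dropped.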