Using the new Web News Ngram (WEB-NGRAM) dataset, it becomes possible to explore the evolution and use of the world's languages at unprecedented resolution.
A single SQL query, running in 7.6 seconds, is all it takes to search the 30 million words of Estonian news coverage monitored by GDELT from January to September 2019 and list the most common Estonian words that begin with "dollar", ranked by how often they appear in the news.
Rank | Word | Count |
1 | dollari | 2427 |
2 | dollarit | 2157 |
3 | dollarini | 824 |
4 | dollar | 152 |
5 | dollareid | 145 |
6 | dollariga | 102 |
7 | dollarile | 73 |
8 | dollarite | 52 |
9 | dollarist | 42 |
10 | dollarilise | 37 |
11 | dollaril | 27 |
12 | dollarilt | 25 |
13 | dollarites | 24 |
14 | dollarise | 22 |
15 | dollariline | 15 |
16 | dollarisse | 10 |
17 | dollaritesse | 9 |
18 | dollariindeks | 7 |
19 | dollarid | 6 |
20 | dollarisendini | 6 |
TECHNICAL DETAILS
Here is the SQL query used to generate the table above.
SELECT NGRAM, SUM(COUNT) TOT
FROM `gdelt-bq.gdeltv2.web_1grams`
WHERE LANG = 'ESTONIAN' AND NGRAM LIKE 'dollar%'
GROUP BY NGRAM
ORDER BY TOT DESC
LIMIT 100
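To see what the query does without touching BigQuery, here is a minimal sketch that runs the same filter, group, sum, and sort steps against an in-memory SQLite table using Python's standard library. The helper name top_words_with_prefix and the sample rows are hypothetical and made up purely for illustration; they are not GDELT data. The sketch also assumes a word can appear in more than one row, which is what the SUM in the query suggests.

```python
import sqlite3


def top_words_with_prefix(rows, lang, prefix, limit=100):
    """Return (ngram, total) pairs for one language, most frequent first.

    rows: iterable of (lang, ngram, count) tuples, standing in for the
    LANG, NGRAM and COUNT columns referenced by the query (assumed layout).
    """
    conn = sqlite3.connect(":memory:")
    try:
        # COUNT is quoted so it is read as a column name, not the function.
        conn.execute(
            'CREATE TABLE web_1grams (LANG TEXT, NGRAM TEXT, "COUNT" INTEGER)'
        )
        conn.executemany("INSERT INTO web_1grams VALUES (?, ?, ?)", rows)
        cur = conn.execute(
            'SELECT NGRAM, SUM("COUNT") AS TOT FROM web_1grams '
            "WHERE LANG = ? AND NGRAM LIKE ? "
            "GROUP BY NGRAM ORDER BY TOT DESC LIMIT ?",
            (lang, prefix + "%", limit),
        )
        return cur.fetchall()
    finally:
        conn.close()


# Toy rows, invented for this example only (not real GDELT counts).
sample_rows = [
    ("ESTONIAN", "dollari", 3),
    ("ESTONIAN", "dollarit", 2),
    ("ESTONIAN", "dollari", 4),   # same word again: the counts are summed
    ("ESTONIAN", "euro", 9),      # wrong prefix: filtered out
    ("FINNISH", "dollari", 5),    # wrong language: filtered out
]

for ngram, total in top_words_with_prefix(sample_rows, "ESTONIAN", "dollar"):
    print(ngram, total)
```

One difference to keep in mind: SQLite's LIKE is case-insensitive for ASCII letters by default, whereas BigQuery's LIKE is case-sensitive, so results can differ if the data mixes upper and lower case.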