What are the most common attributes seen in <META> tags on the open web? Earlier today, with roughly 12 hours of data in the Global Embedded Metadata Graph, we used the query below to compile a histogram of every key/type combination that appeared in <META> tags in at least two articles during that period:
SELECT tag.key, tag.type, COUNT(1) numtags, COUNT(DISTINCT url) numpages,
  COUNT(DISTINCT url) / (SELECT COUNT(1) FROM `gdelt-bq.gdeltv2.gemg`) * 100 percpages
FROM `gdelt-bq.gdeltv2.gemg`, UNNEST(metatags) tag
GROUP BY key, type
HAVING numpages > 1
ORDER BY numpages DESC
You can download the complete list of tags below:
The results below show each unique key/type combination, the total number of times it appeared (numtags), the number of unique URLs it appeared in (numpages) and the percentage of all URLs it appeared in (percpages). Note especially the "fb:pages" tag, which ranks only 19th by page count, appearing in 32.1% of all pages, yet has the highest total count of any tag shown here: 997,718 occurrences across 369,159 pages, or about 2.7 per page. This indicates that the tag is less common but appears multiple times when it is used, often connecting an article to multiple Facebook properties owned by its publisher. By contrast, "fb:app_id", the 10th most common tag at 48.4% of pages, averages close to one occurrence per page. Comparing the numtags and numpages fields can be used to identify tags that are often used multiple times on a page versus those that typically appear just once.
rank | key | type | numtags | numpages | percpages (%)
---|---|---|---|---|---
1 | og:title | property | 967199 | 932074 | 81.14898536996472
2 | viewport | name | 981560 | 918362 | 79.95518006331208
3 | og:image | property | 986680 | 913519 | 79.53353485472698
4 | og:url | property | 942430 | 907916 | 79.04572190744177
5 | description | name | 925426 | 895687 | 77.98103075406844
6 | og:type | property | 919280 | 884355 | 76.99443494492407
7 | og:description | property | 914946 | 877743 | 76.41877561823304
8 | og:site_name | property | 787314 | 759980 | 66.16599744383578
9 | twitter:card | name | 666604 | 641937 | 55.88884168149637
10 | fb:app_id | property | 590514 | 556001 | 48.407011690794675
11 | keywords | name | 567404 | 546334 | 47.565375467092
12 | robots | name | 577689 | 521901 | 45.438169730697304
13 | twitter:site | name | 534826 | 519045 | 45.18951833368739
14 | twitter:title | name | 522238 | 505436 | 44.00468049688489
15 | twitter:description | name | 501609 | 486178 | 42.32802482334955
16 | og:image:width | property | 448497 | 431381 | 37.55724380025701
17 | twitter:image | name | 440069 | 424926 | 36.99525333537641
18 | og:image:height | property | 440744 | 424464 | 36.95503031527186
19 | fb:pages | property | 997718 | 369159 | 32.14002138262714
20 | og:locale | property | 353845 | 341809 | 29.758853417563703
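The numtags versus numpages comparison described above can be sketched in a few lines of Python. This is only an illustration, not part of the original analysis: the `rows` list is a hand-copied subset of the table, and the `tags_per_page` helper is a hypothetical name introduced here.

```python
# (key, numtags, numpages) copied by hand from a few rows of the table above.
rows = [
    ("og:title", 967199, 932074),
    ("viewport", 981560, 918362),
    ("og:image", 986680, 913519),
    ("fb:app_id", 590514, 556001),
    ("robots", 577689, 521901),
    ("fb:pages", 997718, 369159),
]


def tags_per_page(rows):
    """Return (key, ratio) pairs sorted by average occurrences per page, highest first.

    Hypothetical helper: the ratio is simply numtags / numpages.
    """
    ratios = [(key, numtags / numpages) for key, numtags, numpages in rows]
    return sorted(ratios, key=lambda pair: pair[1], reverse=True)


ranked = tags_per_page(rows)
for key, ratio in ranked:
    print(f"{key:12s} {ratio:.2f}")
```

A ratio near 1 suggests a tag that typically appears once per page, while a noticeably higher ratio, such as the roughly 2.7 for "fb:pages", points to a tag that is usually repeated when it is present.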
We hope this offers a first glimpse into the world of <META> tags!