Schema.org defines a property called "isAccessibleForFree" that publishers can include in a page's JSON-LD block, which is captured in the Global Embedded Metadata Graph (GEMG). This property explicitly indicates whether a given piece of content is paywalled or openly accessible.
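To make the property concrete, here is a minimal Python sketch of how it might be read out of a single JSON-LD block. The sample block, the helper names find_accessible_for_free and is_paywalled, and the decision to accept both a JSON boolean and the string "False" are assumptions for illustration; they are not taken from the GEMG itself.

```python
import json

def find_accessible_for_free(node):
    """Recursively collect every isAccessibleForFree value in parsed JSON-LD."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "isAccessibleForFree":
                found.append(value)
            else:
                found.extend(find_accessible_for_free(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_accessible_for_free(item))
    return found

def is_paywalled(value):
    """Assumption: boolean false or the string 'false' (any case) means paywalled."""
    if isinstance(value, bool):
        return not value
    if isinstance(value, str):
        return value.strip().lower() == "false"
    return False

# Hypothetical JSON-LD block, written here only for illustration.
sample_block = """
{
  "@context": "https://schema.org",
  "@type": "NewsArticle",
  "headline": "Example headline",
  "isAccessibleForFree": "False",
  "hasPart": {
    "@type": "WebPageElement",
    "isAccessibleForFree": false,
    "cssSelector": ".paywall"
  }
}
"""

values = find_accessible_for_free(json.loads(sample_block))
print(values)
print(any(is_paywalled(v) for v in values))
```

Walking the parsed structure, rather than matching on the raw text, finds the property wherever it is nested and tolerates differences in spacing and quoting between publishers.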
We can examine the number of articles in the GEMG that contain this property somewhere in their JSON-LD blocks:
SELECT count(distinct(url)) FROM `gdelt-bq.gdeltv2.gemg`, unnest(jsonld) block WHERE (block like '%"isAccessibleForFree"%') and DATE(date) >= '2021-11-01' and DATE(date) <= '2021-11-30'
This yields 996,860 pages out of the 12,520,264 in the GEMG over that period (7.96%).
Of the pages explicitly listing this property, a total of 242,660 define it as false, meaning the article is paywalled. Thus, of the pages providing this property, 24.3% state that their full contents are paywalled, while as a percentage of all articles in the GEMG over that period, the figure is 1.94%. It is important to remember, however, that this property is optional, so these figures count only those articles that explicitly provided it.
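The three percentages follow directly from the counts reported above. As a quick check, the arithmetic can be reproduced in a few lines of Python (the variable names are illustrative only):

```python
# Counts reported in the text for November 2021.
total_pages = 12_520_264    # all pages in the GEMG over the period
with_property = 996_860     # pages whose JSON-LD mentions isAccessibleForFree
marked_false = 242_660      # pages that set the property to false

share_with_property = 100 * with_property / total_pages
share_false_of_property = 100 * marked_false / with_property
share_false_of_total = 100 * marked_false / total_pages

print(f"{share_with_property:.2f}%")      # 7.96%
print(f"{share_false_of_property:.1f}%")  # 24.3%
print(f"{share_false_of_total:.2f}%")     # 1.94%
```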