Why We Need To Verify Our Big Data Results

Kalev's latest piece for Forbes explores how triangulating word patterns from the Google Books Ngram Viewer against those from the New York Times ngram viewer makes it possible to distinguish patterns that are artifacts of the book collection from those that reflect general changes in the English language. It suggests that recent studies arguing that Google Books is skewed towards scientific literature, based on the rise of scientific language in books, are less conclusive than originally thought, because identical trends are observed in newspapers as well.

