Global Similarity Graph: 440 Million 512-Dimension Embeddings Now Available Covering Online & Television News

On top of the 190 million sentence-level embeddings in the Global Similarity Graph Television News Sentence Embeddings dataset spanning 7 stations over more than a decade, there are now 250 million document-level embeddings in the Global Similarity Graph (GSG) Document Embeddings dataset computed across worldwide online news coverage in 65 languages back to January 1, 2020. In all, there are now more than 440 million 512-dimension Universal Sentence Encoder v4 embeddings available for researchers interested in exploring embedding-powered semantic similarity and search at scale!
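
For readers new to embedding-based search, here is a minimal sketch of how similarity between 512-dimension vectors like these is commonly scored. It is an illustration only and makes several assumptions: it uses cosine similarity (a common choice, not one the datasets prescribe), it substitutes random vectors for real embeddings, and the names `cosine_similarity`, `top_k`, and the `doc0`...`doc99` keys are hypothetical. It does not read the actual GSG files.

```python
import math
import random

DIM = 512  # dimensionality stated in the post


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query, corpus, k=3):
    """Return the k (key, score) pairs most similar to the query vector."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in corpus.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [(key, score) for score, key in scored[:k]]


# Stand-in data: random vectors in place of real embeddings (assumption).
rng = random.Random(0)
corpus = {f"doc{i}": [rng.gauss(0, 1) for _ in range(DIM)] for i in range(100)}

# A query built as a slightly perturbed copy of doc42, so doc42 should rank first.
query = [x + 0.1 * rng.gauss(0, 1) for x in corpus["doc42"]]

results = top_k(query, corpus, k=3)
for key, score in results:
    print(key, round(score, 3))
```

A brute-force scan like this only suits small samples. At the scale of hundreds of millions of vectors, an approximate nearest neighbor index is the usual approach.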