We are excited to announce today that the Global Similarity Graph (GSG) Document Embeddings dataset has been extended back to January 1, 2020 and now covers more than a quarter-billion articles, each represented as a 512-dimensional Universal Sentence Encoder v4 document-level embedding!
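To give a sense of how document-level embeddings like these are typically used, here is a minimal sketch that compares two 512-dimensional vectors with cosine similarity. It does not read the actual dataset: the vectors are made-up placeholders, and the names `EMBEDDING_DIM`, `cosine_similarity`, `doc_a`, and `doc_b` are hypothetical ones chosen for illustration. Cosine similarity is a common choice for sentence-encoder embeddings, but treat it here as an assumption rather than the prescribed way to work with the GSG data.

```python
import math
from typing import Sequence

# Dimensionality stated in the announcement (512 values per article).
EMBEDDING_DIM = 512


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two embedding vectors."""
    if len(a) != EMBEDDING_DIM or len(b) != EMBEDDING_DIM:
        raise ValueError(f"expected {EMBEDDING_DIM}-dimensional vectors")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Placeholder vectors standing in for two article embeddings (not real data).
doc_a = [1.0] * EMBEDDING_DIM
doc_b = [1.0] * 256 + [0.0] * 256

similarity = cosine_similarity(doc_a, doc_b)
print(f"cosine similarity: {similarity:.4f}")
```

Scores closer to 1.0 indicate vectors pointing in nearly the same direction, which for sentence-encoder embeddings is usually read as the two articles being more alike in content.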