The GDELT Project

Embedding Models: The Impact Of Textual Length On Embedding Similarity Part 2

As embeddings play an increasingly central role in semantic search and as LLM external memory, one challenge is that even though their outputs are normalized unit vectors, the length of the input text can directly affect the resulting embedding and thus the similarity score. In other words, when comparing a short passage of text to two equally relevant texts, one long and one short, an embedding model may score the shorter one as more similar simply because its length is closer to that of the input text. Such behavior would be especially problematic when searching across systematically different corpora, such as short social media posts like tweets alongside long-form text like news articles and books. Synthetic repeated data shows extreme sensitivity to length, but to what degree do embeddings based on real-world text exhibit such behavior?

To explore this question, we'll use our embedding visualization template to cluster a set of pandemic-era sentences representing a mixture of short and longer sentences, some presented as search queries a user might enter into a system and the others as example texts that a system might wish to surface in response to those queries.

We'll use the same set of models as before: the English-only USEv4, the larger English-only USEv5-Large, the USEv3-Multilingual and larger USEv3-Multilingual-Large models (both supporting 16 languages: Arabic, Chinese-simplified, Chinese-traditional, English, French, German, Italian, Japanese, Korean, Dutch, Polish, Portuguese, Spanish, Thai, Turkish, Russian), the 100-language LaBSEv2 model optimized for translation-pair scoring and the Vertex AI Embeddings for Text API.

Unlike our synthetic benchmark, longer texts do not necessarily result in automatically disjoint embeddings, but rather longer texts tend to cover more topics, which causes them to correctly have embeddings that represent their broader, more diffuse topical range, but which has the downside of yielding wea

sentences = [
    "COVID cases continue to rise.",
    "Hospitals are being overrun by COVID.",
    "The number of diagnosed COVID infections is doubling every few days.",
    "The number of COVID cases in the country is soaring, with the number of sick patients increasing every single day. In fact, the number of infected is growing so fast that it is effectively doubling every three days, completely overwhelming overburdened hospitals that simply can't cope with the influx. In Chicago alone there were 10 hospitals that declared states of emergencies.",
    "Already, 10 Chicago hospitals have emergencies.",
    "COVID",
    "How many Illinois hospitals have declared emergencies?",
    "Chicago hospital emergencies?",
    "How many hospitals have states of emergency?",
    "Are covid infections increasing?",
    "Are covid infections doubling or tripling?",
    "covid growth rate",
    "covid infection growth rate",
    "given that covid is spreading, with more and more people infected every day, and people are very scared about being infected without social distancing and widespread availability of masks, with people increasingly being hospitalized and many are dying, what is the case count growth rate?",
    "social distancing",
    "which cities are socially distanced?",
    "what are typical social distance lengths?",
    "Chicago has introduced social distancing at its hospitals",
    "many cities are instituting social distancing procedures",
    "social distancing has become a major topic of discussion these days, with many cities instituting such procedures and regulations. required distances range from 6 to 30 feet and often come with a range of other requirements alongside them.",
]
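The tables in the sections that follow are 20x20 with diagonals of roughly 1.0, which is consistent with a pairwise cosine similarity matrix over the 20 sentences above. The following is a minimal sketch of that computation, not the template's actual code. The function name `cosine_similarity_matrix` is hypothetical, and random vectors stand in for real model output so that the example runs without any model installed.

```python
import numpy as np

def cosine_similarity_matrix(embeddings):
    """Return the pairwise cosine similarity of the row vectors."""
    vectors = np.asarray(embeddings, dtype=float)
    # Rescale each row to unit length so the dot product equals the cosine.
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / norms
    return unit @ unit.T

# Placeholder embeddings: 20 random 8-dimensional vectors, one per sentence.
# Real embeddings would come from whichever model is being evaluated.
rng = np.random.default_rng(0)
fake_embeddings = rng.normal(size=(20, 8))

similarity = cosine_similarity_matrix(fake_embeddings)
print(similarity.shape)
```

Because model outputs that are already unit vectors have a norm of 1, the normalization step is a no-op for them and is kept only as a safeguard.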

Vertex AI

Vertex clusters the state-of-emergency sentences together, scatters the social distancing sentences while still separating them from the others, and creates two separate clusters for infection growth. Overall, queries are generally grouped alongside relevant sentences, making it possible to use as an LLM external memory. For example, take sentence #9: "How many hospitals have states of emergency?". While the 2D graph groups this with several related sentences, it separates it from the lengthy sentence #4, which is grouped closer to other topics. This is because while sentence #4 does include a precise count of states of emergency, it also covers many other topics that broaden its reach. In contrast, in an LLM external memory semantic search, we would simply be looking for the sentences that are the closest match to the query, even if those sentences are a closer match to other sentences. Through this lens, the top 5 most similar matches to sentence #9, with scores taken from its row of the complete table, are:

0.907: #7 "How many Illinois hospitals have declared emergencies?"
0.784: #5 "Already, 10 Chicago hospitals have emergencies."
0.754: #8 "Chicago hospital emergencies?"
0.709: #4 "The number of COVID cases in the country is soaring, ..." (the lengthy sentence)
0.660: #2 "Hospitals are being overrun by COVID."

In this case, the longer sentence appears in the top 5, but has a fairly weak similarity score, meaning that a thresholded search might not return it, while a Top-K search would return it.
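To make the difference between the two retrieval strategies concrete, the sketch below applies both to the similarity scores for sentence #9, copied from the ninth row of the Vertex AI table. The 0.75 cutoff is a hypothetical value chosen purely for illustration, and the helper names `top_k` and `above_threshold` are likewise assumptions rather than part of any library.

```python
import numpy as np

# Similarity of sentence #9 to all 20 sentences (ninth row of the Vertex AI table).
row = np.array([
    0.49608734, 0.66032707, 0.58612556, 0.70857402, 0.78378997,
    0.51735267, 0.90741411, 0.75361472, 0.99999974, 0.56022633,
    0.58970172, 0.49912635, 0.53127581, 0.59645717, 0.43488945,
    0.60149136, 0.50192054, 0.63047972, 0.57802035, 0.52246511,
])
QUERY = 8  # zero-based index of sentence #9

def top_k(scores, query_index, k):
    """Return the 1-based numbers of the k best matches, excluding the query."""
    order = np.argsort(-scores)  # highest score first
    return [int(i) + 1 for i in order if i != query_index][:k]

def above_threshold(scores, query_index, threshold):
    """Return the 1-based numbers of all matches scoring at least threshold."""
    order = np.argsort(-scores)
    return [int(i) + 1 for i in order
            if i != query_index and scores[i] >= threshold]

top5 = top_k(row, QUERY, 5)
hits = above_threshold(row, QUERY, 0.75)  # hypothetical cutoff
print(top5)  # [7, 5, 8, 4, 2]
print(hits)  # [7, 5, 8]
```

With this particular cutoff, the lengthy sentence #4 (score 0.709) is returned by the Top-K search but dropped by the thresholded one. A lower cutoff would of course admit it, along with more loosely related sentences.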

The complete table is below:

[[0.99999984 0.72736205 0.77200013 0.78324141 0.53711666 0.6756287 0.49328292 0.52150585 0.49608734 0.78580896 0.72483717 0.69339202 0.71137617 0.77523274 0.57996991 0.48492917 0.47842466 0.57563873 0.59828169 0.60109369]
[0.72736205 0.99999956 0.75257175 0.8166596 0.66780156 0.6292146 0.64135105 0.61618325 0.66032707 0.71054909 0.71394849 0.64782878 0.67932341 0.71541695 0.59027142 0.52073223 0.48001077 0.71463448 0.67281429 0.60149365]
[0.77200013 0.75257175 0.99999928 0.84319539 0.57250295 0.62102474 0.59351084 0.51725972 0.58612556 0.77001799 0.82289738 0.72459434 0.76847842 0.8053474 0.56551645 0.48686786 0.51406875 0.60333027 0.61703129 0.61086262]
[0.78324141 0.8166596 0.84319539 0.99999622 0.76860284 0.64599928 0.72002987 0.67350092 0.70857402 0.72772658 0.75172519 0.69504995 0.73844996 0.82064317 0.5572397 0.56195082 0.50074646 0.72573896 0.67651294 0.6576914 ]
[0.53711666 0.66780156 0.57250295 0.76860284 0.99999979 0.49745698 0.82006008 0.82225515 0.78378997 0.54198955 0.54869318 0.50029315 0.53023445 0.57428441 0.42007167 0.544507 0.40726767 0.71386862 0.58807115 0.53687076]
[0.6756287 0.6292146 0.62102474 0.64599928 0.49745698 0.99999979 0.50720156 0.52349528 0.51735267 0.65076007 0.66234023 0.71201028 0.69566177 0.63570852 0.72259616 0.58742143 0.57002761 0.60183208 0.56375043 0.56021109]
[0.49328292 0.64135105 0.59351084 0.72002987 0.82006008 0.50720156 0.99999973 0.79899821 0.90741411 0.55172176 0.58655701 0.50129997 0.53801885 0.57843769 0.4531669 0.59190827 0.486049 0.69482192 0.6162482 0.55250159]
[0.52150585 0.61618325 0.51725972 0.67350092 0.82225515 0.52349528 0.79899821 0.99999999 0.75361472 0.53529993 0.5524203 0.50867123 0.53425172 0.55767565 0.46091011 0.56125914 0.50679809 0.73719733 0.56830656 0.53603103]
[0.49608734 0.66032707 0.58612556 0.70857402 0.78378997 0.51735267 0.90741411 0.75361472 0.99999974 0.56022633 0.58970172 0.49912635 0.53127581 0.59645717 0.43488945 0.60149136 0.50192054 0.63047972 0.57802035 0.52246511]
[0.78580896 0.71054909 0.77001799 0.72772658 0.54198955 0.65076007 0.55172176 0.53529993 0.56022633 0.99999985 0.87238183 0.72055096 0.77057361 0.75478391 0.59878468 0.53609103 0.53557758 0.56921945 0.58749862 0.57457995]
[0.72483717 0.71394849 0.82289738 0.75172519 0.54869318 0.66234023 0.58655701 0.5524203 0.58970172 0.87238183 0.99999972 0.74177019 0.77082009 0.73714865 0.60340195 0.55359866 0.55354179 0.57659331 0.61168833 0.59494434]
[0.69339202 0.64782878 0.72459434 0.69504995 0.50029315 0.71201028 0.50129997 0.50867123 0.49912635 0.72055096 0.74177019 0.99999987 0.96307796 0.79191875 0.61771411 0.525019 0.54686761 0.58330154 0.57428435 0.56654866]
[0.71137617 0.67932341 0.76847842 0.73844996 0.53023445 0.69566177 0.53801885 0.53425172 0.53127581 0.77057361 0.77082009 0.96307796 0.99999972 0.81676725 0.61791465 0.50523472 0.53718212 0.60146072 0.58042281 0.56298463]
[0.77523274 0.71541695 0.8053474 0.82064317 0.57428441 0.63570852 0.57843769 0.55767565 0.59645717 0.75478391 0.73714865 0.79191875 0.81676725 0.99999753 0.64217555 0.55126978 0.58245757 0.6310464 0.64861359 0.66266197]
[0.57996991 0.59027142 0.56551645 0.5572397 0.42007167 0.72259616 0.4531669 0.46091011 0.43488945 0.59878468 0.60340195 0.61771411 0.61791465 0.64217555 0.99999988 0.69502453 0.71203669 0.72456337 0.7523904 0.71746599]
[0.48492917 0.52073223 0.48686786 0.56195082 0.544507 0.58742143 0.59190827 0.56125914 0.60149136 0.53609103 0.55359866 0.525019 0.50523472 0.55126978 0.69502453 0.9999997 0.71536049 0.68494541 0.73098447 0.68472039]
[0.47842466 0.48001077 0.51406875 0.50074646 0.40726767 0.57002761 0.486049 0.50679809 0.50192054 0.53557758 0.55354179 0.54686761 0.53718212 0.58245757 0.71203669 0.71536049 0.99999954 0.58232576 0.63797941 0.69722158]
[0.57563873 0.71463448 0.60333027 0.72573896 0.71386862 0.60183208 0.69482192 0.73719733 0.63047972 0.56921945 0.57659331 0.58330154 0.60146072 0.6310464 0.72456337 0.68494541 0.58232576 0.99999943 0.81233765 0.71714079]
[0.59828169 0.67281429 0.61703129 0.67651294 0.58807115 0.56375043 0.6162482 0.56830656 0.57802035 0.58749862 0.61168833 0.57428435 0.58042281 0.64861359 0.7523904 0.73098447 0.63797941 0.81233765 0.99999942 0.83845733]
[0.60109369 0.60149365 0.61086262 0.6576914 0.53687076 0.56021109 0.55250159 0.53603103 0.52246511 0.57457995 0.59494434 0.56654866 0.56298463 0.66266197 0.71746599 0.68472039 0.69722158 0.71714079 0.83845733 0.99999791]]
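For readers who want to work with these tables directly, the following sketch shows one way to load a matrix printed in this bracketed, space-separated style back into a NumPy array. The parser `parse_printed_matrix` is a hypothetical helper written for this illustration, and the excerpt is the top-left 3x3 corner of the Vertex AI table above.

```python
import re
import numpy as np

def parse_printed_matrix(text):
    """Parse a matrix printed in NumPy's bracketed, space-separated style."""
    # Each innermost [...] group holds one row of whitespace-separated numbers.
    rows = re.findall(r"\[([^\[\]]+)\]", text)
    return np.array([[float(value) for value in row.split()] for row in rows])

# Top-left 3x3 corner of the Vertex AI table (sentences #1 to #3).
excerpt = """
[[0.99999984 0.72736205 0.77200013]
[0.72736205 0.99999956 0.75257175]
[0.77200013 0.75257175 0.99999928]]
"""

matrix = parse_printed_matrix(excerpt)
is_symmetric = np.allclose(matrix, matrix.T)
unit_diagonal = np.allclose(np.diag(matrix), 1.0, atol=1e-5)
print(matrix.shape, is_symmetric, unit_diagonal)
```

Symmetry and a near-unit diagonal are quick sanity checks that a pasted table was parsed in full and that no row was truncated.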

Universal Sentence Encoder

USE results in a more diffuse arrangement, with less structural clustering, likely due to its knowledge cutoff:

[[ 1.00000012e+00 3.06168318e-01 1.28678858e-01 1.24936432e-01 -3.89568601e-03 2.42109671e-01 -2.21119653e-02 -4.50888276e-03 5.78011870e-02 2.94109881e-01 1.30023345e-01 1.81028455e-01 1.22572735e-01 1.47263646e-01 9.04754996e-02 1.66274130e-01 -7.19926227e-03 1.05109408e-01 1.71222165e-01 6.02617934e-02]
[ 3.06168318e-01 9.99999762e-01 2.32912108e-01 3.61477554e-01 4.11730915e-01 1.46052316e-01 4.13478822e-01 3.54360521e-01 5.11071503e-01 4.16761816e-01 2.48169094e-01 8.65757912e-02 1.88879758e-01 2.97977269e-01 9.85034257e-02 2.26205826e-01 1.10805131e-01 5.11530995e-01 3.25108886e-01 6.90550730e-02]
[ 1.28678858e-01 2.32912108e-01 9.99999881e-01 4.17716742e-01 1.59531981e-01 -9.20936000e-04 2.21216232e-01 9.90267247e-02 2.24615380e-01 4.66845304e-01 5.64487636e-01 7.15555847e-02 3.37885231e-01 3.43647361e-01 2.82417964e-02 1.08296007e-01 1.40547112e-01 2.29161620e-01 1.68727934e-01 7.47844204e-02]
[ 1.24936432e-01 3.61477554e-01 4.17716742e-01 9.99999762e-01 4.59566951e-01 -1.47983544e-02 4.61518526e-01 3.74622643e-01 4.70166713e-01 2.07688957e-01 1.65434480e-01 1.68551102e-01 2.48585925e-01 5.74094415e-01 7.37171620e-02 2.07856938e-01 -7.67392665e-03 4.21262652e-01 2.71114796e-01 2.48268992e-01]
[-3.89568601e-03 4.11730915e-01 1.59531981e-01 4.59566951e-01 9.99999940e-01 -2.60680635e-02 6.70314550e-01 7.43288636e-01 5.94396234e-01 2.02415526e-01 1.27915636e-01 3.67305540e-02 1.33317664e-01 1.79591730e-01 -4.50193472e-02 1.33735135e-01 -6.49345368e-02 5.48285246e-01 2.52201289e-01 1.40918255e-01]
[ 2.42109671e-01 1.46052316e-01 -9.20936000e-04 -1.47983544e-02 -2.60680635e-02 1.00000012e+00 -2.42763031e-02 1.46947615e-02 2.11508758e-02 1.05507895e-01 4.16590124e-02 3.35502267e-01 1.96875453e-01 2.74482798e-02 1.91898167e-01 3.91336232e-02 1.70569513e-02 -4.73701395e-02 6.34803548e-02 6.11497462e-02]
[-2.21119653e-02 4.13478822e-01 2.21216232e-01 4.61518526e-01 6.70314550e-01 -2.42763031e-02 9.99999762e-01 6.31055593e-01 7.95537591e-01 1.66428059e-01 1.59268200e-01 3.04371584e-03 7.98556432e-02 1.90071076e-01 -7.96119273e-02 1.93980381e-01 8.02851543e-02 4.45508361e-01 2.15886533e-01 8.62102360e-02]
[-4.50888276e-03 3.54360521e-01 9.90267247e-02 3.74622643e-01 7.43288636e-01 1.46947615e-02 6.31055593e-01 1.00000000e+00 5.28058112e-01 1.61250800e-01 9.83791202e-02 1.09519422e-01 2.14526877e-01 1.25018284e-01 7.82608986e-05 9.78996456e-02 -5.09976894e-02 5.14559269e-01 1.59334719e-01 1.12695836e-01]
[ 5.78011870e-02 5.11071503e-01 2.24615380e-01 4.70166713e-01 5.94396234e-01 2.11508758e-02 7.95537591e-01 5.28058112e-01 9.99999762e-01 2.15466797e-01 1.61071122e-01 -1.28444722e-02 6.75408468e-02 2.45867014e-01 -4.50293953e-03 2.61125892e-01 7.28594065e-02 3.39101851e-01 3.08021426e-01 1.05629601e-01]
[ 2.94109881e-01 4.16761816e-01 4.66845304e-01 2.07688957e-01 2.02415526e-01 1.05507895e-01 1.66428059e-01 1.61250800e-01 2.15466797e-01 9.99999881e-01 6.86020672e-01 2.50364959e-01 4.90377218e-01 3.20480227e-01 1.25016510e-01 1.75487161e-01 1.06687494e-01 2.47396111e-01 2.03991756e-01 -7.43240938e-02]
[ 1.30023345e-01 2.48169094e-01 5.64487636e-01 1.65434480e-01 1.27915636e-01 4.16590124e-02 1.59268200e-01 9.83791202e-02 1.61071122e-01 6.86020672e-01 1.00000000e+00 1.96283698e-01 4.14514065e-01 2.37848729e-01 1.57441348e-02 1.81064159e-01 1.19359031e-01 1.25446782e-01 5.88846616e-02 -1.04456529e-01]
[ 1.81028455e-01 8.65757912e-02 7.15555847e-02 1.68551102e-01 3.67305540e-02 3.35502267e-01 3.04371584e-03 1.09519422e-01 -1.28444722e-02 2.50364959e-01 1.96283698e-01 9.99999881e-01 7.89444327e-01 2.64019817e-01 1.45759583e-01 7.31197298e-02 9.24438089e-02 7.52207115e-02 7.97929093e-02 2.57385373e-02]
[ 1.22572735e-01 1.88879758e-01 3.37885231e-01 2.48585925e-01 1.33317664e-01 1.96875453e-01 7.98556432e-02 2.14526877e-01 6.75408468e-02 4.90377218e-01 4.14514065e-01 7.89444327e-01 9.99999821e-01 3.76597494e-01 1.11455470e-01 8.28244165e-02 9.48166400e-02 1.74780518e-01 1.35157824e-01 2.56794859e-02]
[ 1.47263646e-01 2.97977269e-01 3.43647361e-01 5.74094415e-01 1.79591730e-01 2.74482798e-02 1.90071076e-01 1.25018284e-01 2.45867014e-01 3.20480227e-01 2.37848729e-01 2.64019817e-01 3.76597494e-01 1.00000012e+00 2.34291852e-01 2.91479498e-01 1.02311909e-01 3.17129314e-01 2.93870509e-01 2.55126417e-01]
[ 9.04754996e-02 9.85034257e-02 2.82417964e-02 7.37171620e-02 -4.50193472e-02 1.91898167e-01 -7.96119273e-02 7.82608986e-05 -4.50293953e-03 1.25016510e-01 1.57441348e-02 1.45759583e-01 1.11455470e-01 2.34291852e-01 9.99999642e-01 3.18202406e-01 3.06147963e-01 2.94110656e-01 3.21678996e-01 1.19044378e-01]
[ 1.66274130e-01 2.26205826e-01 1.08296007e-01 2.07856938e-01 1.33735135e-01 3.91336232e-02 1.93980381e-01 9.78996456e-02 2.61125892e-01 1.75487161e-01 1.81064159e-01 7.31197298e-02 8.28244165e-02 2.91479498e-01 3.18202406e-01 9.99999762e-01 3.79286826e-01 3.76724601e-01 6.07666016e-01 2.02831984e-01]
[-7.19926227e-03 1.10805131e-01 1.40547112e-01 -7.67392665e-03 -6.49345368e-02 1.70569513e-02 8.02851543e-02 -5.09976894e-02 7.28594065e-02 1.06687494e-01 1.19359031e-01 9.24438089e-02 9.48166400e-02 1.02311909e-01 3.06147963e-01 3.79286826e-01 9.99999881e-01 1.80458367e-01 2.99034536e-01 3.42592508e-01]
[ 1.05109408e-01 5.11530995e-01 2.29161620e-01 4.21262652e-01 5.48285246e-01 -4.73701395e-02 4.45508361e-01 5.14559269e-01 3.39101851e-01 2.47396111e-01 1.25446782e-01 7.52207115e-02 1.74780518e-01 3.17129314e-01 2.94110656e-01 3.76724601e-01 1.80458367e-01 9.99999940e-01 4.82268393e-01 1.69476241e-01]
[ 1.71222165e-01 3.25108886e-01 1.68727934e-01 2.71114796e-01 2.52201289e-01 6.34803548e-02 2.15886533e-01 1.59334719e-01 3.08021426e-01 2.03991756e-01 5.88846616e-02 7.97929093e-02 1.35157824e-01 2.93870509e-01 3.21678996e-01 6.07666016e-01 2.99034536e-01 4.82268393e-01 1.00000000e+00 3.72767270e-01]
[ 6.02617934e-02 6.90550730e-02 7.47844204e-02 2.48268992e-01 1.40918255e-01 6.11497462e-02 8.62102360e-02 1.12695836e-01 1.05629601e-01 -7.43240938e-02 -1.04456529e-01 2.57385373e-02 2.56794859e-02 2.55126417e-01 1.19044378e-01 2.02831984e-01 3.42592508e-01 1.69476241e-01 3.72767270e-01 9.99999940e-01]]

Universal Sentence Encoder Large

USE Large yields stronger clustering:

[[ 0.99999946 0.4687184 0.27935714 0.33419573 0.07855088 0.32810676 0.05738001 0.06410298 0.04163175 0.4742181 0.31782466 0.3579839 0.3233214 0.33271205 0.08586451 0.12630224 0.03406784 0.10062585 0.20192775 0.06561338]
[ 0.4687184 1. 0.32862818 0.53158605 0.36906633 0.36440152 0.33859152 0.33615828 0.3506515 0.53434765 0.41203713 0.27195913 0.39221737 0.39124435 0.13289882 0.14004889 0.01076385 0.43721956 0.29349175 0.00785148]
[ 0.27935714 0.32862818 0.9999999 0.46125913 0.07446364 0.25681096 0.14757423 0.00831248 0.18105395 0.50264335 0.61050886 0.22271436 0.46297693 0.39237264 -0.0614661 0.02843161 0.00109814 0.04600433 0.14726728 0.00221974]
[ 0.33419573 0.53158605 0.46125913 0.99999976 0.47325516 0.2719706 0.4451041 0.4648155 0.3776698 0.3863385 0.3816029 0.21331769 0.32091781 0.618667 0.0101081 0.08931073 -0.03746324 0.42613947 0.17333958 0.1396812 ]
[ 0.07855088 0.36906633 0.07446364 0.47325516 1. -0.1143196 0.6476519 0.73466116 0.5494226 0.1414201 0.12614821 -0.07637329 0.03879337 0.08661892 -0.01154459 0.143155 0.00234884 0.5847902 0.24824083 0.11296961]
[ 0.32810676 0.36440152 0.25681096 0.2719706 -0.1143196 0.9999999 -0.10179853 -0.03066026 -0.08967339 0.34143806 0.29808927 0.4150352 0.29441145 0.32752424 0.17715779 0.00134799 -0.03815699 -0.0658346 -0.02634784 -0.01052977]
[ 0.05738001 0.33859152 0.14757423 0.4451041 0.6476519 -0.10179853 0.9999997 0.6449292 0.8114 0.13820693 0.18808955 0.01503249 0.11885239 0.22710875 -0.04397846 0.20505969 0.10468296 0.46220803 0.24334314 0.06831467]
[ 0.06410298 0.33615828 0.00831248 0.4648155 0.73466116 -0.03066026 0.6449292 1. 0.50490177 0.09049612 0.06651579 -0.00159065 0.10968944 0.04717136 0.0687369 0.1086092 0.02127307 0.58579284 0.12870885 0.08078978]
[ 0.04163175 0.3506515 0.18105395 0.3776698 0.5494226 -0.08967339 0.8114 0.50490177 1.0000002 0.16790506 0.23526691 0.02800062 0.1255431 0.22227079 -0.04119195 0.24243422 0.16724257 0.2771064 0.25024056 0.06934864]
[ 0.4742181 0.53434765 0.50264335 0.3863385 0.1414201 0.34143806 0.13820693 0.09049612 0.16790506 1. 0.7077933 0.39031672 0.63765544 0.46076256 0.08482644 0.12576282 0.06605925 0.13008386 0.12199621 -0.01515815]
[ 0.31782466 0.41203713 0.61050886 0.3816029 0.12614821 0.29808927 0.18808955 0.06651579 0.23526691 0.7077933 0.9999999 0.28847268 0.5156866 0.38208157 -0.03438292 0.09714253 0.06736529 0.04027107 0.08174749 -0.03157323]
[ 0.3579839 0.27195913 0.22271436 0.21331769 -0.07637329 0.4150352 0.01503249 -0.00159065 0.02800062 0.39031672 0.28847268 0.99999964 0.7861607 0.4468103 0.21076927 0.05622634 0.1306605 -0.02419923 0.01921743 -0.02134873]
[ 0.3233214 0.39221737 0.46297693 0.32091781 0.03879337 0.29441145 0.11885239 0.10968944 0.1255431 0.63765544 0.5156866 0.7861607 0.99999946 0.48167694 0.13274166 0.05877588 0.09965976 0.07441901 0.03523926 -0.06781664]
[ 0.33271205 0.39124435 0.39237264 0.618667 0.08661892 0.32752424 0.22710875 0.04717136 0.22227079 0.46076256 0.38208157 0.4468103 0.48167694 1. 0.13299167 0.17676681 0.10741509 0.15770972 0.17250034 0.10705069]
[ 0.08586451 0.13289882 -0.0614661 0.0101081 -0.01154459 0.17715779 -0.04397846 0.0687369 -0.04119195 0.08482644 -0.03438292 0.21076927 0.13274166 0.13299167 0.99999994 0.41361827 0.28464606 0.39911896 0.41147223 0.29610953]
[ 0.12630224 0.14004889 0.02843161 0.08931073 0.143155 0.00134799 0.20505969 0.1086092 0.24243422 0.12576282 0.09714253 0.05622634 0.05877588 0.17676681 0.41361827 1. 0.359249 0.3419376 0.5454853 0.3847053 ]
[ 0.03406784 0.01076385 0.00109814 -0.03746324 0.00234884 -0.03815699 0.10468296 0.02127307 0.16724257 0.06605925 0.06736529 0.1306605 0.09965976 0.10741509 0.28464606 0.359249 1. 0.14062445 0.18919969 0.3579756 ]
[ 0.10062585 0.43721956 0.04600433 0.42613947 0.5847902 -0.0658346 0.46220803 0.58579284 0.2771064 0.13008386 0.04027107 -0.02419923 0.07441901 0.15770972 0.39911896 0.3419376 0.14062445 0.9999999 0.4800431 0.23286074]
[ 0.20192775 0.29349175 0.14726728 0.17333958 0.24824083 -0.02634784 0.24334314 0.12870885 0.25024056 0.12199621 0.08174749 0.01921743 0.03523926 0.17250034 0.41147223 0.5454853 0.18919969 0.4800431 1.0000002 0.5234207 ]
[ 0.06561338 0.00785148 0.00221974 0.1396812 0.11296961 -0.01052977 0.06831467 0.08078978 0.06934864 -0.01515815 -0.03157323 -0.02134873 -0.06781664 0.10705069 0.29610953 0.3847053 0.3579756 0.23286074 0.5234207 0.9999998 ]]

Universal Sentence Encoder Multilingual

USE Multilingual strongly clusters the social distancing sentences, but yields a more diffuse view of the others:

[[ 1.0000002 0.51608133 0.21014166 0.1563007 0.25136518 0.24915573 0.11333609 0.14050993 0.12362448 0.28587928 0.12106231 0.22704475 0.2049002 0.18872704 0.03168609 0.07028947 -0.01835078 0.18401349 0.1366872 0.00217592]
[ 0.51608133 0.99999976 0.28955212 0.39494243 0.52527547 0.2067611 0.41612715 0.41649252 0.45472538 0.26007402 0.2111499 0.08535415 0.16657826 0.2447232 0.01407334 0.09773101 0.04491335 0.43760902 0.17339107 -0.01096349]
[ 0.21014166 0.28955212 1.0000002 0.39818513 0.16977409 0.15342665 0.25830835 0.10414216 0.26504725 0.36414826 0.36602372 0.18328673 0.38457507 0.32568103 -0.02035077 0.01647286 0.0111865 0.11229244 0.11558458 -0.01806599]
[ 0.1563007 0.39494243 0.39818513 0.99999976 0.5373311 0.08135615 0.55868447 0.4481116 0.5184478 0.2665618 0.18804166 0.19631547 0.2871124 0.6033769 0.02541481 0.12942794 0.06218909 0.3640567 0.2420692 0.18933237]
[ 0.25136518 0.52527547 0.16977409 0.5373311 1. -0.0382889 0.6574589 0.75634193 0.5706837 0.17453857 0.1561203 -0.04290634 0.06652248 0.25253814 0.00672778 0.11166943 0.02077706 0.60971326 0.24310416 0.09906109]
[ 0.24915573 0.2067611 0.15342665 0.08135615 -0.0382889 0.9999999 -0.02614414 0.00239299 -0.0335152 0.06885836 0.04871358 0.28410703 0.1646254 0.00333439 0.12295115 0.01877239 -0.02318461 -0.02327905 -0.01488914 -0.03515672]
[ 0.11333609 0.41612715 0.25830835 0.55868447 0.6574589 -0.02614414 0.99999994 0.73769224 0.86689806 0.32628554 0.37421337 -0.05902408 0.07143942 0.32740518 -0.02593778 0.22326778 0.15771297 0.42267647 0.24179538 0.09274808]
[ 0.14050993 0.41649252 0.10414216 0.4481116 0.75634193 0.00239299 0.73769224 0.9999999 0.6687136 0.28410104 0.33656412 -0.04955217 0.07730739 0.23383641 0.06686851 0.17185017 0.14416264 0.50476474 0.15511076 0.06848888]
[ 0.12362448 0.45472538 0.26504725 0.5184478 0.5706837 -0.0335152 0.86689806 0.6687136 0.9999997 0.3039065 0.34272265 -0.07308942 0.02539044 0.33900523 0.04397969 0.33258703 0.273138 0.32487315 0.28067723 0.15107988]
[ 0.28587928 0.26007402 0.36414826 0.2665618 0.17453857 0.06885836 0.32628554 0.28410104 0.3039065 1.0000002 0.78831214 0.29582763 0.53000796 0.44747794 -0.02402702 0.20279917 0.12608528 0.17568174 0.07759584 -0.02613236]
[ 0.12106231 0.2111499 0.36602372 0.18804166 0.1561203 0.04871358 0.37421337 0.33656412 0.34272265 0.78831214 0.9999998 0.10295357 0.35208246 0.26991153 -0.10133878 0.19900963 0.18339431 0.09747186 0.01169215 -0.0683264 ]
[ 0.22704475 0.08535415 0.18328673 0.19631547 -0.04290634 0.28410703 -0.05902408 -0.04955217 -0.07308942 0.29582763 0.10295357 0.99999976 0.7217294 0.20755121 0.16791397 0.05433134 0.07050405 0.01458928 0.03153889 -0.00482013]
[ 0.2049002 0.16657826 0.38457507 0.2871124 0.06652248 0.1646254 0.07143942 0.07730739 0.02539044 0.53000796 0.35208246 0.7217294 1. 0.35127372 0.10705271 0.02897312 0.00610893 0.14851406 0.07994392 -0.02094023]
[ 0.18872704 0.2447232 0.32568103 0.6033769 0.25253814 0.00333439 0.32740518 0.23383641 0.33900523 0.44747794 0.26991153 0.20755121 0.35127372 0.9999998 0.16410874 0.19079515 0.16062936 0.25941545 0.24637018 0.18430056]
[ 0.03168609 0.01407334 -0.02035077 0.02541481 0.00672778 0.12295115 -0.02593778 0.06686851 0.04397969 -0.02402702 -0.10133878 0.16791397 0.10705271 0.16410874 0.9999997 0.3120936 0.23033568 0.29371566 0.41725332 0.21591803]
[ 0.07028947 0.09773101 0.01647286 0.12942794 0.11166943 0.01877239 0.22326778 0.17185017 0.33258703 0.20279917 0.19900963 0.05433134 0.02897312 0.19079515 0.3120936 0.9999999 0.62282324 0.28538528 0.54815894 0.319191 ]
[-0.01835078 0.04491335 0.0111865 0.06218909 0.02077706 -0.02318461 0.15771297 0.14416264 0.273138 0.12608528 0.18339431 0.07050405 0.00610893 0.16062936 0.23033568 0.62282324 1. 0.11960582 0.25087872 0.33179605]
[ 0.18401349 0.43760902 0.11229244 0.3640567 0.60971326 -0.02327905 0.42267647 0.50476474 0.32487315 0.17568174 0.09747186 0.01458928 0.14851406 0.25941545 0.29371566 0.28538528 0.11960582 0.9999998 0.45966995 0.11707503]
[ 0.1366872 0.17339107 0.11558458 0.2420692 0.24310416 -0.01488914 0.24179538 0.15511076 0.28067723 0.07759584 0.01169215 0.03153889 0.07994392 0.24637018 0.41725332 0.54815894 0.25087872 0.45966995 1.0000001 0.452159 ]
[ 0.00217592 -0.01096349 -0.01806599 0.18933237 0.09906109 -0.03515672 0.09274808 0.06848888 0.15107988 -0.02613236 -0.0683264 -0.00482013 -0.02094023 0.18430056 0.21591803 0.319191 0.33179605 0.11707503 0.452159 1.0000002 ]]

Universal Sentence Encoder Multilingual Large

The Large edition is nearly identical to its smaller counterpart:

[[ 9.99999702e-01 5.33886313e-01 2.96391428e-01 3.07442486e-01 1.81777582e-01 3.33654642e-01 8.61134529e-02 5.89422360e-02 1.22138381e-01 3.85184467e-01 1.66147292e-01 2.98140347e-01 2.67109603e-01 1.36734545e-01 6.74836561e-02 5.97912818e-03 -2.22859383e-02 1.36714742e-01 7.89780468e-02 -2.91799791e-02]
[ 5.33886313e-01 9.99999642e-01 3.61683190e-01 5.39723456e-01 4.76095766e-01 3.51174414e-01 3.69971722e-01 4.29203272e-01 3.99513513e-01 3.61527920e-01 2.76504934e-01 1.57621086e-01 2.29885101e-01 1.99777529e-01 -6.06341660e-02 5.30658811e-02 -1.42248776e-02 3.46306980e-01 7.59341866e-02 -2.19661258e-02]
[ 2.96391428e-01 3.61683190e-01 1.00000012e+00 5.05351186e-01 1.46879867e-01 2.54988998e-01 2.24176884e-01 8.46205801e-02 2.73797005e-01 3.87565017e-01 3.97140801e-01 1.43078223e-01 3.47496092e-01 3.54557693e-01 -2.36742049e-02 3.02211754e-02 1.94693860e-02 1.08249746e-01 1.64799303e-01 6.13552257e-02]
[ 3.07442486e-01 5.39723456e-01 5.05351186e-01 9.99999762e-01 4.95597780e-01 2.15468258e-01 4.82865095e-01 4.84645754e-01 4.80713367e-01 3.97824943e-01 3.33722591e-01 2.03930870e-01 3.49955022e-01 5.59452236e-01 4.27776948e-04 9.77308303e-02 4.73497808e-02 3.92912745e-01 1.20591685e-01 1.02454916e-01]
[ 1.81777582e-01 4.76095766e-01 1.46879867e-01 4.95597780e-01 9.99999523e-01 -3.51964124e-03 6.42694354e-01 7.20554531e-01 5.48255682e-01 1.49614781e-01 1.05213374e-01 -4.83859815e-02 4.53185737e-02 1.51542962e-01 -3.83545794e-02 9.50933993e-02 1.38385817e-02 5.59626162e-01 1.55060098e-01 9.73528698e-02]
[ 3.33654642e-01 3.51174414e-01 2.54988998e-01 2.15468258e-01 -3.51964124e-03 9.99999821e-01 -1.34298978e-02 9.75372568e-02 3.08605917e-02 1.81385159e-01 1.15857758e-01 3.18770647e-01 2.29536146e-01 1.21627390e-01 2.04637066e-01 6.50581643e-02 -1.31206941e-02 5.96358487e-03 5.61962202e-02 2.15573236e-02]
[ 8.61134529e-02 3.69971722e-01 2.24176884e-01 4.82865095e-01 6.42694354e-01 -1.34298978e-02 9.99999821e-01 7.08008885e-01 8.84255528e-01 2.30791450e-01 2.37283200e-01 2.42809579e-03 8.72570574e-02 2.68361419e-01 -3.17205973e-02 2.39435583e-01 1.74745619e-01 3.96572590e-01 2.72154748e-01 8.20114240e-02]
[ 5.89422360e-02 4.29203272e-01 8.46205801e-02 4.84645754e-01 7.20554531e-01 9.75372568e-02 7.08008885e-01 9.99999642e-01 6.08087182e-01 2.14738369e-01 1.74794883e-01 2.62221321e-02 8.99664015e-02 1.56826168e-01 5.20732924e-02 1.74553081e-01 1.15907423e-01 5.46061993e-01 1.03315562e-01 7.23062456e-02]
[ 1.22138381e-01 3.99513513e-01 2.73797005e-01 4.80713367e-01 5.48255682e-01 3.08605917e-02 8.84255528e-01 6.08087182e-01 9.99999762e-01 2.62304276e-01 2.84828186e-01 3.54753397e-02 1.14952207e-01 2.99538374e-01 -1.05046108e-02 3.13295484e-01 2.33595312e-01 2.72130251e-01 2.76811272e-01 9.26231444e-02]
[ 3.85184467e-01 3.61527920e-01 3.87565017e-01 3.97824943e-01 1.49614781e-01 1.81385159e-01 2.30791450e-01 2.14738369e-01 2.62304276e-01 9.99999881e-01 7.30537653e-01 4.50588644e-01 6.12129927e-01 4.61619616e-01 -6.68970216e-03 1.99685156e-01 1.44494191e-01 9.13357809e-02 4.82044891e-02 -9.07829031e-04]
[ 1.66147292e-01 2.76504934e-01 3.97140801e-01 3.33722591e-01 1.05213374e-01 1.15857758e-01 2.37283200e-01 1.74794883e-01 2.84828186e-01 7.30537653e-01 1.00000012e+00 2.33856171e-01 3.80743682e-01 3.09758395e-01 -6.02705330e-02 2.16064811e-01 1.70389220e-01 5.65076992e-02 2.65673213e-02 -2.86880191e-02]
[ 2.98140347e-01 1.57621086e-01 1.43078223e-01 2.03930870e-01 -4.83859815e-02 3.18770647e-01 2.42809579e-03 2.62221321e-02 3.54753397e-02 4.50588644e-01 2.33856171e-01 9.99999762e-01 8.36353123e-01 3.75985414e-01 2.09882319e-01 6.18371367e-02 8.25410187e-02 1.66936889e-02 4.58528250e-02 -8.89424905e-02]
[ 2.67109603e-01 2.29885101e-01 3.47496092e-01 3.49955022e-01 4.53185737e-02 2.29536146e-01 8.72570574e-02 8.99664015e-02 1.14952207e-01 6.12129927e-01 3.80743682e-01 8.36353123e-01 9.99999821e-01 4.59546268e-01 1.60174787e-01 7.04554021e-02 8.26777071e-02 8.61062855e-02 6.02281280e-02 -6.73194006e-02]
[ 1.36734545e-01 1.99777529e-01 3.54557693e-01 5.59452236e-01 1.51542962e-01 1.21627390e-01 2.68361419e-01 1.56826168e-01 2.99538374e-01 4.61619616e-01 3.09758395e-01 3.75985414e-01 4.59546268e-01 1.00000024e+00 1.09285906e-01 2.08454698e-01 2.03303665e-01 1.37738347e-01 1.24564663e-01 1.33279890e-01]
[ 6.74836561e-02 -6.06341660e-02 -2.36742049e-02 4.27776948e-04 -3.83545794e-02 2.04637066e-01 -3.17205973e-02 5.20732924e-02 -1.05046108e-02 -6.68970216e-03 -6.02705330e-02 2.09882319e-01 1.60174787e-01 1.09285906e-01 1.00000000e+00 3.20602596e-01 3.00565720e-01 3.71706635e-01 4.44395781e-01 1.31968319e-01]
[ 5.97912818e-03 5.30658811e-02 3.02211754e-02 9.77308303e-02 9.50933993e-02 6.50581643e-02 2.39435583e-01 1.74553081e-01 3.13295484e-01 1.99685156e-01 2.16064811e-01 6.18371367e-02 7.04554021e-02 2.08454698e-01 3.20602596e-01 1.00000012e+00 6.87767863e-01 2.46799335e-01 4.67067122e-01 3.57276201e-01]
[-2.22859383e-02 -1.42248776e-02 1.94693860e-02 4.73497808e-02 1.38385817e-02 -1.31206941e-02 1.74745619e-01 1.15907423e-01 2.33595312e-01 1.44494191e-01 1.70389220e-01 8.25410187e-02 8.26777071e-02 2.03303665e-01 3.00565720e-01 6.87767863e-01 1.00000024e+00 1.50053769e-01 2.50763178e-01 3.24940234e-01]
[ 1.36714742e-01 3.46306980e-01 1.08249746e-01 3.92912745e-01 5.59626162e-01 5.96358487e-03 3.96572590e-01 5.46061993e-01 2.72130251e-01 9.13357809e-02 5.65076992e-02 1.66936889e-02 8.61062855e-02 1.37738347e-01 3.71706635e-01 2.46799335e-01 1.50053769e-01 1.00000012e+00 3.81909072e-01 1.32469237e-01]
[ 7.89780468e-02 7.59341866e-02 1.64799303e-01 1.20591685e-01 1.55060098e-01 5.61962202e-02 2.72154748e-01 1.03315562e-01 2.76811272e-01 4.82044891e-02 2.65673213e-02 4.58528250e-02 6.02281280e-02 1.24564663e-01 4.44395781e-01 4.67067122e-01 2.50763178e-01 3.81909072e-01 1.00000000e+00 4.39613789e-01]
[-2.91799791e-02 -2.19661258e-02 6.13552257e-02 1.02454916e-01 9.73528698e-02 2.15573236e-02 8.20114240e-02 7.23062456e-02 9.26231444e-02 -9.07829031e-04 -2.86880191e-02 -8.89424905e-02 -6.73194006e-02 1.33279890e-01 1.31968319e-01 3.57276201e-01 3.24940234e-01 1.32469237e-01 4.39613789e-01 9.99999821e-01]]

LaBSE

LaBSE results in several differentiated, diffuse clusters:

[[ 1.0000002 0.6937454 0.6725899 0.61143243 0.37660468 0.5384503 0.22686651 0.26979166 0.27324134 0.67231524 0.44578648 0.5350619 0.6009529 0.53463423 0.03652978 0.14565846 0.17447296 0.16348803 0.16040495 0.12967603]
[ 0.6937454 1. 0.5722101 0.57146126 0.50595534 0.5442071 0.36106777 0.38443962 0.4361176 0.54845095 0.41583318 0.34621418 0.4469711 0.4817719 0.0629304 0.23198628 0.12041149 0.3710325 0.22498494 0.11570741]
[ 0.6725899 0.5722101 1.0000002 0.65350264 0.3688473 0.4726634 0.31822532 0.180826 0.29076537 0.6114829 0.5697601 0.48295265 0.6003841 0.5399828 0.1091314 0.19326915 0.20581102 0.23708618 0.2050755 0.10063438]
[ 0.61143243 0.57146126 0.65350264 1. 0.50298685 0.36870307 0.34908992 0.39486045 0.39324433 0.45564032 0.380995 0.37731624 0.46745187 0.61882114 0.02794786 0.12872063 0.08084987 0.30662468 0.13498901 0.2097429 ]
[ 0.37660468 0.50595534 0.3688473 0.50298685 1. 0.23876706 0.56644434 0.5434874 0.5669353 0.2702528 0.27037323 0.08013841 0.14640936 0.24639615 0.1194326 0.23823357 0.1837152 0.4549244 0.31533223 0.1648199 ]
[ 0.5384503 0.5442071 0.4726634 0.36870307 0.23876706 0.9999999 0.10030793 0.20529746 0.11506934 0.44352317 0.32032984 0.56460166 0.59104294 0.3500458 0.1703825 0.02432899 0.0094877 0.17336154 0.07075278 -0.06378658]
[ 0.22686651 0.36106777 0.31822532 0.34908992 0.56644434 0.10030793 0.9999999 0.5255148 0.76731473 0.31915453 0.3159027 0.09273866 0.14267 0.28933287 0.05575826 0.38496858 0.3311357 0.34456044 0.35221982 0.09985513]
[ 0.26979166 0.38443962 0.180826 0.39486045 0.5434874 0.20529746 0.5255148 1.0000001 0.5659918 0.39051485 0.35409835 0.14796627 0.19058514 0.3397368 0.16365708 0.36167055 0.33776808 0.5371459 0.21488516 0.15478538]
[ 0.27324134 0.4361176 0.29076537 0.39324433 0.5669353 0.11506934 0.76731473 0.5659918 1.0000001 0.35191172 0.33626115 0.11995895 0.16555734 0.38328767 0.07713168 0.46736455 0.41548812 0.27517647 0.36830103 0.1764433 ]
[ 0.67231524 0.54845095 0.6114829 0.45564032 0.2702528 0.44352317 0.31915453 0.39051485 0.35191172 0.9999999 0.77829474 0.60110056 0.7416543 0.6487883 0.095173 0.34175655 0.30967045 0.22086842 0.2244495 0.10891487]
[ 0.44578648 0.41583318 0.5697601 0.380995 0.27037323 0.32032984 0.3159027 0.35409835 0.33626115 0.77829474 1.0000001 0.40623748 0.53567815 0.47249612 0.08387688 0.30733353 0.28539297 0.20952526 0.18326366 0.06503142]
[ 0.5350619 0.34621418 0.48295265 0.37731624 0.08013841 0.56460166 0.09273866 0.14796627 0.11995895 0.60110056 0.40623748 1.0000002 0.9169899 0.45871 0.14254858 0.07133189 0.11010391 0.11464225 0.12576404 -0.0172227 ]
[ 0.6009529 0.4469711 0.6003841 0.46745187 0.14640936 0.59104294 0.14267 0.19058514 0.16555734 0.7416543 0.53567815 0.9169899 0.99999994 0.5563265 0.11903001 0.0903923 0.12203714 0.18377161 0.16832113 -0.00150557]
[ 0.53463423 0.4817719 0.5399828 0.61882114 0.24639615 0.3500458 0.28933287 0.3397368 0.38328767 0.6487883 0.47249612 0.45871 0.5563265 0.99999964 0.23757485 0.37599552 0.35582662 0.23613054 0.2568812 0.32353824]
[ 0.03652978 0.0629304 0.1091314 0.02794786 0.1194326 0.1703825 0.05575826 0.16365708 0.07713168 0.095173 0.08387688 0.14254858 0.11903001 0.23757485 1. 0.4713841 0.45525134 0.41785482 0.47319016 0.36266914]
[ 0.14565846 0.23198628 0.19326915 0.12872063 0.23823357 0.02432899 0.38496858 0.36167055 0.46736455 0.34175655 0.30733353 0.07133189 0.0903923 0.37599552 0.4713841 0.99999976 0.6738947 0.3435516 0.5562639 0.39328194]
[ 0.17447296 0.12041149 0.20581102 0.08084987 0.1837152 0.0094877 0.3311357 0.33776808 0.41548812 0.30967045 0.28539297 0.11010391 0.12203714 0.35582662 0.45525134 0.6738947 0.9999999 0.2898152 0.4134141 0.3341536 ]
[ 0.16348803 0.3710325 0.23708618 0.30662468 0.4549244 0.17336154 0.34456044 0.5371459 0.27517647 0.22086842 0.20952526 0.11464225 0.18377161 0.23613054 0.41785482 0.3435516 0.2898152 1. 0.52794904 0.26295912]
[ 0.16040495 0.22498494 0.2050755 0.13498901 0.31533223 0.07075278 0.35221982 0.21488516 0.36830103 0.2244495 0.18326366 0.12576404 0.16832113 0.2568812 0.47319016 0.5562639 0.4134141 0.52794904 1. 0.5370462 ]
[ 0.12967603 0.11570741 0.10063438 0.2097429 0.1648199 -0.06378658 0.09985513 0.15478538 0.1764433 0.10891487 0.06503142 -0.0172227 -0.00150557 0.32353824 0.36266914 0.39328194 0.3341536 0.26295912 0.5370462 1.0000002 ]]
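As a rough way to compare the models on the length question, the sketch below takes the query in sentence #9 and looks at its score against a short relevant text (sentence #5) and the lengthy relevant text (sentence #4) under each model. The values are copied from the ninth row of each table above, and the model labels are shortened from the section headings. This is a single query and a single pair of texts, so it is illustrative only and should not be read as a general ranking of the models.

```python
# (short_score, long_score): similarity of query #9 to sentence #5 and to
# sentence #4, copied from the ninth row of each model's table.
scores = {
    "Vertex AI": (0.78378997, 0.70857402),
    "USE": (0.594396234, 0.470166713),
    "USE Large": (0.5494226, 0.3776698),
    "USE Multilingual": (0.5706837, 0.5184478),
    "USE Multilingual Large": (0.548255682, 0.480713367),
    "LaBSE": (0.5669353, 0.39324433),
}

# Gap between the short and long text scores for each model.
gaps = {
    model: short_score - long_score
    for model, (short_score, long_score) in scores.items()
}

# Print models from the smallest gap to the largest.
for model, gap in sorted(gaps.items(), key=lambda item: item[1]):
    short_score, long_score = scores[model]
    print(f"{model:24s} short={short_score:.3f} long={long_score:.3f} gap={gap:.3f}")
```

For this one query, every model scores the short text above the lengthy one, with the gap ranging from roughly 0.05 to 0.17.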