Large Language Models (LLMs) like OpenAI's ChatGPT and Google's Bard interpret language not as discrete words, but as word parts known as "tokens." Given a passage of text, a model tokenizes it by converting the sequence of characters into a sequence of token IDs from its internal lookup table. That lookup is built from the training data: a massive dictionary of the unique words and word parts seen there, each assigned a unique ID. For example, "i am here" might become "[1, 2, 3]". Since LLMs operate on numeric representations of text, rather than the text itself, all textual input must be translated into a sequence of token IDs for processing. While it is theoretically possible to build an LLM that encodes every word in every language of the world, larger dictionaries carry very substantial costs, so in practice LLMs tend to be designed with the smallest possible dictionary that covers all of the required languages. For languages that dominate the training data, this typically leads to a nearly one-to-one mapping between words and tokens, while for less-common languages, words are often split into chunks of characters. Languages that use non-Latin scripts are frequently tokenized below the level of the individual Unicode code point, with a single character split into several byte-level pieces, often yielding more tokens than characters. This means that the same text in different languages can yield very different numbers of tokens.
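To make the "more tokens than characters" effect concrete, here is a minimal sketch that compares the number of characters (Unicode code points) in a word with the number of UTF-8 bytes needed to store it. It assumes a byte-level tokenizer, in which every token covers at least one byte, so the byte count is an upper bound on the token count. It does not reproduce any real tokenizer, and the function name and sample words are illustrative only.

```python
# Illustrative only: compares characters (code points) with UTF-8 bytes.
# Assumption: a byte-level tokenizer emits at most one token per byte, and
# merges learned from its training data bring the count down from there.

def utf8_profile(text: str) -> dict:
    """Return code point count, UTF-8 byte count, and bytes per code point."""
    n_chars = len(text)                  # Python's len() counts code points
    n_bytes = len(text.encode("utf-8"))  # worst-case token count
    return {
        "chars": n_chars,
        "bytes": n_bytes,
        "bytes_per_char": round(n_bytes / n_chars, 2),
    }

# Sample words meaning roughly "summer"; chosen for illustration.
samples = {
    "English": "summer",
    "Arabic": "الصيف",
    "Chinese": "盛夏",
    "Khmer": "រដូវក្តៅ",
}

for language, word in samples.items():
    print(language, utf8_profile(word))
```

Latin letters take one byte each, Arabic letters two, and Chinese and Khmer characters three, so a script with few learned merges can easily produce more tokens than characters.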
Why does it matter how many "tokens" a given text is translated into? Cost, performance and accuracy. Many commercial models charge per token of text processed, rather than per character, so a text that is 10 tokens in English and 100 in another language will yield a 10x increase in cost for the same results. Higher token counts also mean the text is being finely subdivided, which can substantially slow processing and reduce accuracy, since less of the text fits within the model's limited context.
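The cost and context arithmetic can be sketched in a few lines. The per-token price and the context window size below are hypothetical values chosen only to make the numbers concrete; they are not quoted rates or limits for any particular model. The token counts are the 10 and 100 from the example above.

```python
# Both constants are hypothetical, chosen only to make the arithmetic concrete.
PRICE_PER_1K_TOKENS_USD = 0.002  # assumed flat price, not a quoted rate
CONTEXT_WINDOW_TOKENS = 4096     # assumed context limit

def cost_usd(tokens: int, price_per_1k: float = PRICE_PER_1K_TOKENS_USD) -> float:
    """Cost of processing `tokens` tokens at a flat per-token price."""
    return tokens / 1000 * price_per_1k

def copies_per_context(tokens: int, window: int = CONTEXT_WINDOW_TOKENS) -> int:
    """How many passages of this size fit in one context window."""
    return window // tokens

english_tokens, other_tokens = 10, 100  # the example figures from the text

# Cost scales linearly with tokens, so the multiplier is just the token ratio.
print("cost multiplier:", cost_usd(other_tokens) / cost_usd(english_tokens))

# The same window holds roughly ten times fewer passages in the other language.
print("passages per window:",
      copies_per_context(english_tokens), "vs", copies_per_context(other_tokens))
```

Because the price cancels out of the ratio, the cost multiplier depends only on the token counts, whatever rate a provider actually charges.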
OpenAI makes available a free website where you can paste in a passage of text and see how many tokens it resolves to for GPT-3 family models. To explore just how much tokenization counts vary across languages, we took the following English passage and translated it via Google Translate to all 132 languages currently supported by the GCP Translation API:
This is May 2023, it is the middle of summer and three years after the start of the decade.
The resulting passages were then run through OpenAI's GPT-3 tokenizer to count the number of characters and tokens each yielded. While the use of machine translation introduces potential error, the overall token counts offer a powerful look at OpenAI's effective definition of long-tail languages. The final graph of the number of GPT-3 tokens for the phrase above in each language can be seen below. From just 22 tokens in English to 287 tokens in Myanmar (13 times more), there is a very strong linguistic divide in the results. More than 40 million people speak Myanmar, 70 million speak Thai and 300 million speak Bengali, yet these languages yield high token-to-word ratios that vastly increase their cost. Meanwhile Latin, a dead language for more than a millennium, yields tokenization effectively identical to English, and Estonian, with 1.2 million speakers, sits firmly in the flat core tier occupying the right half of the graph below.
Looking at the ratio of tokens-to-characters (rounded to one decimal), we can see even stronger stratification, with Khmer having a ratio that is 11.35 times higher than English:
Below are a few examples of what tokenization looks like, showing how each passage was tokenized.
English
Tokenization: This is May 2023, it is the middle of summer and three years after the start of the decade.
Token Stats: 91 Characters / 22 Tokens
In the original English, essentially every word is its own token, yielding a nearly one-to-one ratio of words to tokens.
Arabic
Google Translate: هذا هو مايو 2023 ، منتصف الصيف وبعد ثلاث سنوات من بداية العقد.
Tokenization: ه��ا ��و مايو 2023 �� منت���� ال��ي�� وبعد ��لا�� ��نوا�� من ��داية الع��د.
Token Stats: 62 Characters / 57 Tokens
The Arabic translation of the text via Google Translate actually yields a roughly 32% reduction in characters, but a 159% increase in tokens. In other words, fewer actual characters, but 2.6 times as many tokens. Note that the diamond question mark characters mark tokens that cover only part of a Unicode character, or as OpenAI's warning puts it: "Your input contained one or more unicode characters that map to multiple tokens. The output visualization may display the bytes in each token in a non-standard way."
Chinese
Google Translate: 现在是 2023 年 5 月,正值盛夏,也是十年开始后的第三年。
Tokenization: �����是 2023 ��� 5 �������������������是����������的���三��。
Token Stats: 32 Characters / 50 Tokens
Despite the high semantic density of the Chinese ideographic script, the total token count is more than double that of English.
Vietnamese
Google Translate: Bây giờ là tháng 5 năm 2023, tức là giữa mùa hè và ba năm sau khi bắt đầu thập kỷ.
Tokenization: Bây gi��� là tháng 5 n��m 2023, t���c là gi���a m��a hè và ba n��m sau khi b���t �����u th���p k���.
Token Stats: 82 Characters / 72 Tokens
In Vietnamese, almost every character maps to its own token, apart from a handful of letter sequences like "ng" and "sa" that are merged into single tokens.
Myanmar
Google Translate: ၎င်းသည် 2023 ခုနှစ်၊ မေလဖြစ်ပြီး၊ ၎င်းသည် နွေရာသီ၏အလယ်ဖြစ်ပြီး ဆယ်စုနှစ်တစ်ခု၏အစပြီးနောက် သုံးနှစ်ဖြစ်သည်။
Tokenization: ��������������������� 2023 ��������������������� ������������������������������������ ��������������������� ������������������������������������������������������������ ������������������������������������������������������������������������������ ������������������������������������������������
Token Stats: 106 Characters / 287 Tokens
The Myanmar language yields the worst tokenization performance of all the languages tested here, resulting in a whopping 287 tokens, a 1,204% increase (13 times more) over the same English prompt.
The complete table can be seen below (you may need to install specific fonts to see some of the languages):
Language | Tokens | Characters | Text | %Token Diff | %Character Diff | Token Ratio | Character Ratio | Tokens To Chars |
Khmer | 236 | 86 | នេះគឺជាខែឧសភា ឆ្នាំ 2023 វាជាពាក់កណ្តាលរដូវក្តៅ ហើយបីឆ្នាំក្រោយការចាប់ផ្តើមនៃទសវត្សរ៍។ | 972.73 | -5.49 | 10.73 | 0.95 | 2.74 |
Myanmar | 287 | 106 | ၎င်းသည် 2023 ခုနှစ်၊ မေလဖြစ်ပြီး၊ ၎င်းသည် နွေရာသီ၏အလယ်ဖြစ်ပြီး ဆယ်စုနှစ်တစ်ခု၏အစပြီးနောက် သုံးနှစ်ဖြစ်သည်။ | 1204.55 | 16.48 | 13.05 | 1.16 | 2.71 |
Meiteilon | 270 | 102 | ꯃꯁꯤ ꯲꯰꯲꯳ꯒꯤ ꯃꯦ ꯊꯥꯅꯤ, ꯃꯁꯤ ꯑꯌꯨꯛꯀꯤ ꯃꯌꯥꯏ ꯆꯜꯂꯀꯄꯅꯤ ꯑꯃꯁꯨꯡ ꯆꯍꯤ ꯇꯔꯥꯒꯤ ꯈꯨꯖꯤꯡ ꯑꯁꯤ ꯍꯧꯔꯀꯄꯒꯤ ꯆꯍꯤ ꯑꯍꯨꯃꯒꯤ ꯃꯇꯨꯡꯗꯥ ꯑꯣꯏꯔꯤ꯫ | 1127.27 | 12.09 | 12.27 | 1.12 | 2.65 |
Malayalam | 248 | 95 | ഇത് 2023 മെയ് മാസമാണ്, ഇത് വേനൽക്കാലത്തിന്റെ മധ്യവും ദശകം ആരംഭിച്ച് മൂന്ന് വർഷത്തിന് ശേഷവുമാണ്. | 1027.27 | 4.40 | 11.27 | 1.04 | 2.61 |
Tamil | 233 | 90 | இது மே 2023, இது கோடையின் நடுப்பகுதி மற்றும் தசாப்தம் தொடங்கி மூன்று ஆண்டுகளுக்குப் பிறகு. | 959.09 | -1.10 | 10.59 | 0.99 | 2.59 |
Telugu | 194 | 77 | ఇది మే 2023, ఇది వేసవి మధ్యలో మరియు దశాబ్దం ప్రారంభమైన మూడు సంవత్సరాల తర్వాత. | 781.82 | -15.38 | 8.82 | 0.85 | 2.52 |
Kannada | 163 | 67 | ಇದು ಮೇ 2023, ಇದು ಬೇಸಿಗೆಯ ಮಧ್ಯಭಾಗ ಮತ್ತು ದಶಕದ ಆರಂಭದ ಮೂರು ವರ್ಷಗಳ ನಂತರ. | 640.91 | -26.37 | 7.41 | 0.74 | 2.43 |
Gujarati | 176 | 73 | આ મે 2023 છે, તે ઉનાળાની મધ્યમાં છે અને દાયકાની શરૂઆતના ત્રણ વર્ષ પછી છે. | 700.00 | -19.78 | 8.00 | 0.80 | 2.41 |
Sinhala | 178 | 74 | මෙය 2023 මැයි, එය ගිම්හානයේ මැද භාගය වන අතර දශකය ආරම්භ වී වසර තුනකට පසුවය. | 709.09 | -18.68 | 8.09 | 0.81 | 2.41 |
Odia | 161 | 67 | ଏହା ମେ 2023, ଏହା ଗ୍ରୀଷ୍ମର ମଧ୍ୟଭାଗ ଏବଂ ଦଶନ୍ଧି ଆରମ୍ଭର ତିନି ବର୍ଷ ପରେ । | 631.82 | -26.37 | 7.32 | 0.74 | 2.40 |
Lao | 240 | 101 | ນີ້ແມ່ນເດືອນພຶດສະພາ 2023, ມັນເປັນກາງຮ້ອນແລະສາມປີຫຼັງຈາກການເລີ່ມຕົ້ນຂອງທົດສະວັດ. | 990.91 | 10.99 | 10.91 | 1.11 | 2.38 |
Georgian | 211 | 89 | ეს არის 2023 წლის მაისი, ეს არის ზაფხულის შუა რიცხვები და ათწლეულის დაწყებიდან სამი წელი. | 859.09 | -2.20 | 9.59 | 0.98 | 2.37 |
Amharic | 149 | 67 | ይህ ግንቦት 2023 ነው፣ የበጋው አጋማሽ እና ከአስር አመታት መጀመሪያ በኋላ ከሶስት አመታት በኋላ ነው። | 577.27 | -26.37 | 6.77 | 0.74 | 2.22 |
Tigrinya | 131 | 59 | እዚ ግንቦት 2023 ኮይኑ፡ መፋርቕ ሓጋይን ድሕሪ ሰለስተ ዓመት ምጅማር ዓሰርተ ዓመትን እዩ። | 495.45 | -35.16 | 5.95 | 0.65 | 2.22 |
Assamese | 157 | 75 | এয়া ২০২৩ চনৰ মে’ মাহৰ, গ্ৰীষ্মৰ মাজভাগ আৰু দশক আৰম্ভ হোৱাৰ তিনি বছৰৰ পাছত। | 613.64 | -17.58 | 7.14 | 0.82 | 2.09 |
Korean | 91 | 48 | 지금은 2023년 5월, 여름의 한가운데, 십년의 시작으로부터 3년이 지난 시점입니다. | 313.64 | -47.25 | 4.14 | 0.53 | 1.90 |
Thai | 135 | 72 | นี่คือเดือนพฤษภาคม 2023 ซึ่งเป็นกลางฤดูร้อนและสามปีหลังจากเริ่มต้นทศวรรษ | 513.64 | -20.88 | 6.14 | 0.79 | 1.88 |
Bengali | 133 | 73 | এটি 2023 সালের মে, এটি গ্রীষ্মের মাঝামাঝি এবং দশক শুরু হওয়ার তিন বছর পর। | 504.55 | -19.78 | 6.05 | 0.80 | 1.82 |
Dhivehi | 209 | 116 | މިއީ 2023 ވަނަ އަހަރުގެ މެއި މަހުގެ ތެރޭގައި ހޫނު މޫސުމުގެ މެދުތެރެއާއި ދިހަ އަހަރު ފެށުނުތާ ތިން އަހަރު ފަހުންނެވެ. | 850.00 | 27.47 | 9.50 | 1.27 | 1.80 |
Armenian | 134 | 78 | Սա 2023 թվականի մայիսին է, ամառվա կեսն է և տասնամյակի մեկնարկից երեք տարի անց: | 509.09 | -14.29 | 6.09 | 0.86 | 1.72 |
Sanskrit | 154 | 91 | एषः २०२३ तमस्य वर्षस्य मे-मासः, ग्रीष्मकालस्य मध्यभागः, दशकस्य आरम्भात् वर्षत्रयानन्तरं च । | 600.00 | 0.00 | 7.00 | 1.00 | 1.69 |
Nepali | 98 | 62 | यो मे २०२३ हो, यो गर्मीको मध्य र दशक सुरु भएको तीन वर्षपछि हो। | 345.45 | -31.87 | 4.45 | 0.68 | 1.58 |
Chinese | 50 | 32 | 现在是 2023 年 5 月,正值盛夏,也是十年开始后的第三年。 | 127.27 | -64.84 | 2.27 | 0.35 | 1.56 |
Konkani | 95 | 61 | हो मे २०२३, उमाशेचो मध्य आनी दशक सुरू जाले उपरांत तीन वर्सां. | 331.82 | -32.97 | 4.32 | 0.67 | 1.56 |
Punjabi | 113 | 77 | ਇਹ ਮਈ 2023 ਹੈ, ਇਹ ਗਰਮੀਆਂ ਦਾ ਮੱਧ ਹੈ ਅਤੇ ਦਹਾਕੇ ਦੀ ਸ਼ੁਰੂਆਤ ਤੋਂ ਤਿੰਨ ਸਾਲ ਬਾਅਦ ਹੈ। | 413.64 | -15.38 | 5.14 | 0.85 | 1.47 |
Maithili | 108 | 74 | ई मई २०२३ के बात छै, गर्मी के मध्य छै आरू दशक शुरू होय के तीन साल बाद छै । | 390.91 | -18.68 | 4.91 | 0.81 | 1.46 |
Marathi | 102 | 71 | हा मे 2023 आहे, हा उन्हाळ्याचा मध्य आणि दशक सुरू होऊन तीन वर्षांनी आहे. | 363.64 | -21.98 | 4.64 | 0.78 | 1.44 |
Kurdish (Sorani) | 126 | 89 | ئەمە مانگی ئایاری ٢٠٢٣یە، ناوەڕاستی هاوینە و سێ ساڵ بەسەر دەستپێکردنی دەیەیەکدا تێپەڕیوە. | 472.73 | -2.20 | 5.73 | 0.98 | 1.42 |
Hindi | 102 | 73 | यह मई 2023 है, यह गर्मियों का मध्य है और दशक की शुरुआत के तीन साल बाद है। | 363.64 | -19.78 | 4.64 | 0.80 | 1.40 |
Dogri | 108 | 78 | एह् मई 2023 दा ऐ, गर्मियें दा मझाटले दौर ऐ ते दशक शुरू होने दे त्रै साल बाद ऐ। | 390.91 | -14.29 | 4.91 | 0.86 | 1.38 |
Bhojpuri | 83 | 62 | ई मई 2023 के ह, गर्मी के बीच ह आ दशक के शुरुआत के तीन साल बाद। | 277.27 | -31.87 | 3.77 | 0.68 | 1.34 |
Urdu | 93 | 75 | یہ مئی 2023 ہے، یہ موسم گرما کا وسط ہے اور دہائی کے آغاز کے تین سال بعد ہے۔ | 322.73 | -17.58 | 4.23 | 0.82 | 1.24 |
Uyghur | 83 | 69 | بۇ 2023-يىلى ماي ، يازنىڭ ئوتتۇرىسى ، ئون يىل ئۆتۈپ ئۈچ يىلدىن كېيىن. | 277.27 | -24.18 | 3.77 | 0.76 | 1.20 |
Yiddish | 114 | 96 | דאָס איז מאי 2023, דאָס איז די מיטן פון זומער און דריי יאָר נאָך די אָנהייב פון די יאָרצענדלינג. | 418.18 | 5.49 | 5.18 | 1.05 | 1.19 |
Tatar | 73 | 64 | Бу 2023 елның мае, җәй уртасы һәм декада башланганнан соң өч ел. | 231.82 | -29.67 | 3.32 | 0.70 | 1.14 |
Kazakh | 93 | 82 | Бұл 2023 жылдың мамыры, бұл жаздың ортасы және онжылдық басталғаннан кейін үш жыл. | 322.73 | -9.89 | 4.23 | 0.90 | 1.13 |
Kyrgyz | 102 | 92 | Бул 2023-жылдын май айы, жайдын ортосу жана он жылдыктын башталышынан үч жыл өткөндөн кийин. | 363.64 | 1.10 | 4.64 | 1.01 | 1.11 |
Hebrew | 61 | 56 | זהו מאי 2023, זהו אמצע הקיץ ושלוש שנים לאחר תחילת העשור. | 177.27 | -38.46 | 2.77 | 0.62 | 1.09 |
Mongolian | 100 | 92 | Энэ бол 2023 оны тавдугаар сар, зуны дунд сар, арван жил эхэлснээс хойш гурван жилийн дараа. | 354.55 | 1.10 | 4.55 | 1.01 | 1.09 |
Pashto | 86 | 80 | دا د 2023 می میاشت ده، دا د اوړي منځنۍ او د لسیزې له پیل څخه درې کاله وروسته ده. | 290.91 | -12.09 | 3.91 | 0.88 | 1.08 |
Belarusian | 76 | 71 | Гэта травень 2023 года, сярэдзіна лета і тры гады пасля пачатку дэкады. | 245.45 | -21.98 | 3.45 | 0.78 | 1.07 |
Greek | 113 | 106 | Αυτός είναι ο Μάιος του 2023, είναι τα μέσα του καλοκαιριού και τρία χρόνια μετά την έναρξη της δεκαετίας. | 413.64 | 16.48 | 5.14 | 1.16 | 1.07 |
Tajik | 86 | 81 | Ин моҳи майи соли 2023 аст, миёнаҳои тобистон ва се сол пас аз оғози даҳсола аст. | 290.91 | -10.99 | 3.91 | 0.89 | 1.06 |
Serbian | 73 | 69 | Ово је мај 2023, средина је лета и три године након почетка деценије. | 231.82 | -24.18 | 3.32 | 0.76 | 1.06 |
Ukrainian | 76 | 73 | Це травень 2023 року, середина літа і три роки після початку десятиліття. | 245.45 | -19.78 | 3.45 | 0.80 | 1.04 |
Sindhi | 78 | 75 | هي مئي 2023 آهي، اهو اونهاري جي وچ ۾ آهي ۽ ڏهاڪي جي شروعات کان ٽي سال پوءِ. | 254.55 | -17.58 | 3.55 | 0.82 | 1.04 |
Japanese | 40 | 39 | これは 2023 年 5 月、夏の真ん中、10 年の始まりから 3 年後です。 | 81.82 | -57.14 | 1.82 | 0.43 | 1.03 |
Macedonian | 85 | 83 | Ова е мај 2023 година, средината на летото и три години по почетокот на деценијата. | 286.36 | -8.79 | 3.86 | 0.91 | 1.02 |
Russian | 64 | 64 | Это май 2023 года, середина лета и три года после начала декады. | 190.91 | -29.67 | 2.91 | 0.70 | 1.00 |
Bulgarian | 83 | 84 | Това е май 2023 г., средата на лятото е и три години след началото на десетилетието. | 277.27 | -7.69 | 3.77 | 0.92 | 0.99 |
Persian | 53 | 55 | این می 2023 است، اواسط تابستان و سه سال پس از شروع دهه. | 140.91 | -39.56 | 2.41 | 0.60 | 0.96 |
Arabic | 57 | 62 | هذا هو مايو 2023 ، منتصف الصيف وبعد ثلاث سنوات من بداية العقد. | 159.09 | -31.87 | 2.59 | 0.68 | 0.92 |
Vietnamese | 72 | 82 | Bây giờ là tháng 5 năm 2023, tức là giữa mùa hè và ba năm sau khi bắt đầu thập kỷ. | 227.27 | -9.89 | 3.27 | 0.90 | 0.88 |
Yoruba | 57 | 74 | Eyi jẹ May 2023, o jẹ aarin igba ooru ati ọdun mẹta lẹhin ibẹrẹ ọdun mẹwa. | 159.09 | -18.68 | 2.59 | 0.81 | 0.77 |
Igbo | 59 | 83 | Nke a bụ Mee 2023, ọ bụ etiti oge ọkọchị na afọ atọ ka mmalite nke afọ iri gachara. | 168.18 | -8.79 | 2.68 | 0.91 | 0.71 |
Turkmen | 49 | 72 | Bu 2023-nji ýylyň maýy, tomsuň ortasy we onýyllygyň başyndan üç ýyl soň. | 122.73 | -20.88 | 2.23 | 0.79 | 0.68 |
Ewe | 57 | 91 | Esia nye May 2023, enye dzomeŋɔli ƒe domedome eye ƒe etɔ̃ le ƒe ewoawo ƒe gɔmedzedze megbe. | 159.09 | 0.00 | 2.59 | 1.00 | 0.63 |
Azerbaijani | 48 | 84 | Bu, 2023-cü ilin may ayıdır, yayın ortasıdır və onilliyin başlamasından üç il sonra. | 118.18 | -7.69 | 2.18 | 0.92 | 0.57 |
Bambara | 43 | 76 | Nin ye mɛkalo san 2023 ye, samiɲɛ cɛmancɛ don, san tan daminɛ kɔfɛ san saba. | 95.45 | -16.48 | 1.95 | 0.84 | 0.57 |
Twi | 44 | 78 | Eyi yɛ May 2023, ɛyɛ awɔw bere mfinimfini na mfe du no mfiase akyi mfe abiɛsa. | 100.00 | -14.29 | 2.00 | 0.86 | 0.56 |
Turkish | 37 | 66 | Bu Mayıs 2023, yaz ortası ve on yılın başlangıcından üç yıl sonra. | 68.18 | -27.47 | 1.68 | 0.73 | 0.56 |
Hawaiian | 64 | 117 | ʻO Mei 2023 kēia, ʻo ia ka waena o ke kauwela a ʻekolu mau makahiki ma hope o ka hoʻomaka ʻana o nā makahiki he ʻumi. | 190.91 | 28.57 | 2.91 | 1.29 | 0.55 |
Kurdish (Kurmanji) | 37 | 68 | Ev Gulana 2023-an e, nîvê havînê û sê sal piştî destpêka dehsalê ye. | 68.18 | -25.27 | 1.68 | 0.75 | 0.54 |
Hungarian | 36 | 67 | Ez 2023 májusa, nyár közepe és három évvel az évtized kezdete után. | 63.64 | -26.37 | 1.64 | 0.74 | 0.54 |
Krio | 36 | 67 | Dis na May 2023, na di midul fɔ sɔm ɛn tri ia afta di tɛn ia bigin. | 63.64 | -26.37 | 1.64 | 0.74 | 0.54 |
Lithuanian | 46 | 86 | Tai 2023 m. gegužės mėn., yra vasaros vidurys ir treji metai nuo dešimtmečio pradžios. | 109.09 | -5.49 | 2.09 | 0.95 | 0.53 |
Czech | 35 | 66 | Je květen 2023, je polovina léta a tři roky po začátku desetiletí. | 59.09 | -27.47 | 1.59 | 0.73 | 0.53 |
Guarani | 39 | 75 | Kóva jasypokõi 2023, ha'e arahaku mbyte ha mbohapy ary oñepyrû rire década. | 77.27 | -17.58 | 1.77 | 0.82 | 0.52 |
Welsh | 39 | 75 | Mai 2023 yw hwn, mae'n ganol yr haf a thair blynedd ar ôl dechrau'r degawd. | 77.27 | -17.58 | 1.77 | 0.82 | 0.52 |
Maltese | 40 | 77 | Dan huwa Mejju 2023, huwa nofs is-sajf u tliet snin wara l-bidu tad-deċennju. | 81.82 | -15.38 | 1.82 | 0.85 | 0.52 |
Albanian | 39 | 76 | Ky është maji 2023, është mesi i verës dhe tre vjet pas fillimit të dekadës. | 77.27 | -16.48 | 1.77 | 0.84 | 0.51 |
Latvian | 39 | 78 | Šis ir 2023. gada maijs, ir vasaras vidus un trīs gadi pēc desmitgades sākuma. | 77.27 | -14.29 | 1.77 | 0.86 | 0.50 |
Polish | 33 | 66 | Jest maj 2023 roku, środek lata i trzy lata po rozpoczęciu dekady. | 50.00 | -27.47 | 1.50 | 0.73 | 0.50 |
Icelandic | 38 | 77 | Þetta er maí 2023, það er mitt sumar og þremur árum eftir upphaf áratugarins. | 72.73 | -15.38 | 1.73 | 0.85 | 0.49 |
Finnish | 41 | 84 | Tämä on toukokuu 2023, on keskikesä ja kolme vuotta vuosikymmenen alkamisen jälkeen. | 86.36 | -7.69 | 1.86 | 0.92 | 0.49 |
Oromo | 47 | 98 | Kun Caamsaa 2023 yoo ta'u, walakkeessa gannaa yoo ta'u, jalqaba kurmaana booda waggaa sadii booda. | 113.64 | 7.69 | 2.14 | 1.08 | 0.48 |
Hmong | 57 | 119 | Qhov no yog lub Tsib Hlis 2023, nws yog nruab nrab ntawm lub caij ntuj sov thiab peb xyoos tom qab pib ntawm kaum xyoo. | 159.09 | 30.77 | 2.59 | 1.31 | 0.48 |
Slovak | 33 | 69 | Toto je máj 2023, je polovica leta a tri roky po začiatku desaťročia. | 50.00 | -24.18 | 1.50 | 0.76 | 0.48 |
Corsican | 41 | 86 | Questu hè di maghju 2023, hè a mità di l'estiu è trè anni dopu à l'iniziu di a dicada. | 86.36 | -5.49 | 1.86 | 0.95 | 0.48 |
Scots Gaelic | 50 | 109 | Is e seo Cèitean 2023, is e meadhan an t-samhraidh a th’ ann agus trì bliadhna às deidh toiseach na deichead. | 127.27 | 19.78 | 2.27 | 1.20 | 0.46 |
Estonian | 33 | 72 | See on mai 2023, on suve keskpaik ja kolm aastat pärast kümnendi algust. | 50.00 | -20.88 | 1.50 | 0.79 | 0.46 |
Swedish | 37 | 81 | Det här är maj 2023, det är mitt i sommaren och tre år efter början av decenniet. | 68.18 | -10.99 | 1.68 | 0.89 | 0.46 |
Romanian | 35 | 77 | Este mai 2023, este mijlocul verii și la trei ani de la începutul deceniului. | 59.09 | -15.38 | 1.59 | 0.85 | 0.45 |
Shona | 43 | 95 | Uyu ndiChivabvu 2023, ndipo pakati pezhizha uye makore matatu mushure mekutanga kwemakore gumi. | 95.45 | 4.40 | 1.95 | 1.04 | 0.45 |
Chichewa | 38 | 84 | Izi ndi Meyi 2023, ndipakati pachilimwe komanso zaka zitatu chiyambireni zaka khumi. | 72.73 | -7.69 | 1.73 | 0.92 | 0.45 |
Uzbek | 36 | 80 | Bu 2023 yil may oyi, yozning o'rtasi va o'n yillik boshlanganidan uch yil o'tib. | 63.64 | -12.09 | 1.64 | 0.88 | 0.45 |
Catalan | 40 | 89 | És el maig del 2023, és a mitjans de l'estiu i tres anys després de l'inici de la dècada. | 81.82 | -2.20 | 1.82 | 0.98 | 0.45 |
Kinyarwanda | 35 | 78 | Ni Gicurasi 2023, ni hagati yizuba nimyaka itatu nyuma yimyaka icumi itangiye. | 59.09 | -14.29 | 1.59 | 0.86 | 0.45 |
Portuguese | 35 | 78 | Estamos em maio de 2023, no meio do verão e três anos após o início da década. | 59.09 | -14.29 | 1.59 | 0.86 | 0.45 |
Xhosa | 44 | 99 | Lo nguMeyi ka-2023, kuphakathi kwehlobo kunye neminyaka emithathu emva kokuqala kweshumi leminyaka. | 100.00 | 8.79 | 2.00 | 1.09 | 0.44 |
Croatian | 34 | 77 | Ovo je svibanj 2023., sredina je ljeta i tri godine nakon početka desetljeća. | 54.55 | -15.38 | 1.55 | 0.85 | 0.44 |
Mizo | 36 | 82 | Hei hi May 2023 a ni a, nipui lai a ni a, kum sawm tan atanga kum thum hnuah a ni. | 63.64 | -9.89 | 1.64 | 0.90 | 0.44 |
Zulu | 42 | 96 | Lona nguMeyi 2023, kumaphakathi nehlobo neminyaka emithathu ngemuva kokuqala kweshumi leminyaka. | 90.91 | 5.49 | 1.91 | 1.05 | 0.44 |
Ilocano | 38 | 87 | Mayo 2023 daytoy, tengnga ti kalgaw ken tallo a tawen kalpasan ti panangrugi ti dekada. | 72.73 | -4.40 | 1.73 | 0.96 | 0.44 |
Malagasy | 38 | 87 | Mey 2023 izao, afovoan'ny fahavaratra ary telo taona aorian'ny fiandohan'ny folo taona. | 72.73 | -4.40 | 1.73 | 0.96 | 0.44 |
Basque | 30 | 69 | Hau 2023ko maiatza da, uda erdian eta hamarkada hasi eta hiru urtera. | 36.36 | -24.18 | 1.36 | 0.76 | 0.43 |
Luganda | 43 | 99 | Guno May 2023, mu makkati g’omusana ate nga wayise emyaka esatu bukya emyaka ekkumi gitandikiddewo. | 95.45 | 8.79 | 1.95 | 1.09 | 0.43 |
Aymara | 46 | 106 | Akax mayo phaxsin 2023 maranwa, chika jallupachankiwa ukatx tunka mara qalltatapatx kimsa maraw sarawayxi. | 109.09 | 16.48 | 2.09 | 1.16 | 0.43 |
Luxembourgish | 35 | 82 | Dëst ass Mee 2023, et ass d'Mëtt vum Summer an dräi Joer nom Ufank vun der Dekade. | 59.09 | -9.89 | 1.59 | 0.90 | 0.43 |
Irish | 40 | 94 | Is é seo Bealtaine 2023, is é lár an tsamhraidh agus trí bliana tar éis thús na deich mbliana. | 81.82 | 3.30 | 1.82 | 1.03 | 0.43 |
Bosnian | 33 | 78 | Ovo je maj 2023. godine, sredina je ljeta i tri godine nakon početka decenije. | 50.00 | -14.29 | 1.50 | 0.86 | 0.42 |
Norwegian | 34 | 81 | Dette er mai 2023, det er midt på sommeren og tre år etter begynnelsen av tiåret. | 54.55 | -10.99 | 1.55 | 0.89 | 0.42 |
Sundanese | 31 | 74 | Ieu Méi 2023, éta tengah usum panas sareng tilu taun saatos mimiti dékade. | 40.91 | -18.68 | 1.41 | 0.81 | 0.42 |
Slovenian | 28 | 67 | To je maj 2023, je sredi poletja in tri leta po začetku desetletja. | 27.27 | -26.37 | 1.27 | 0.74 | 0.42 |
Danish | 33 | 79 | Det er maj 2023, det er midt på sommeren og tre år efter begyndelsen af årtiet. | 50.00 | -13.19 | 1.50 | 0.87 | 0.42 |
Sesotho | 45 | 108 | Mona ke Mots'eanong 2023, ke bohareng ba lehlabula le lilemo tse tharo kamora ho qala ha lilemo tse leshome. | 104.55 | 18.68 | 2.05 | 1.19 | 0.42 |
Cebuano | 44 | 106 | Kini mao ang Mayo 2023, kini mao ang tunga-tunga sa ting-init ug tulo ka tuig human sa pagsugod sa dekada. | 100.00 | 16.48 | 2.00 | 1.16 | 0.42 |
French | 41 | 99 | Nous sommes en mai 2023, nous sommes au milieu de l'été et trois ans après le début de la décennie. | 86.36 | 8.79 | 1.86 | 1.09 | 0.41 |
Haitian Creole | 31 | 75 | Sa a se me 2023, li se mitan an nan ete ak twa ane apre kòmansman deseni a. | 40.91 | -17.58 | 1.41 | 0.82 | 0.41 |
Esperanto | 38 | 92 | Ĉi tio estas majo 2023, estas la mezo de somero kaj tri jaroj post la komenco de la jardeko. | 72.73 | 1.10 | 1.73 | 1.01 | 0.41 |
Italian | 34 | 83 | Siamo nel maggio 2023, siamo in piena estate e tre anni dopo l'inizio del decennio. | 54.55 | -8.79 | 1.55 | 0.91 | 0.41 |
Filipino | 43 | 105 | Ito ay Mayo 2023, ito ay ang kalagitnaan ng tag-araw at tatlong taon pagkatapos ng pagsisimula ng dekada. | 95.45 | 15.38 | 1.95 | 1.15 | 0.41 |
Samoan | 42 | 103 | O Me 2023 lenei, o le ogatotonu o le taumafanafana ma le tolu tausaga talu ona amata le sefulu tausaga. | 90.91 | 13.19 | 1.91 | 1.13 | 0.41 |
Lingala | 47 | 116 | Oyo ezali sanza ya mitano 2023, ezali katikati ya eleko ya molunge mpe mbula misato nsima ya ebandeli ya mbula zomi. | 113.64 | 27.47 | 2.14 | 1.27 | 0.41 |
Malay | 38 | 94 | Ini adalah Mei 2023, ia adalah pertengahan musim panas dan tiga tahun selepas permulaan dekad. | 72.73 | 3.30 | 1.73 | 1.03 | 0.40 |
Somali | 35 | 87 | Tani waa Maajo 2023, waa bartamihii xagaaga iyo saddex sano kadib bilawga tobanka sano. | 59.09 | -4.40 | 1.59 | 0.96 | 0.40 |
Indonesian | 37 | 92 | Ini Mei 2023, saat itu pertengahan musim panas dan tiga tahun setelah dimulainya dekade ini. | 68.18 | 1.10 | 1.68 | 1.01 | 0.40 |
Maori | 40 | 100 | Ko Mei 2023 tenei, ko te waenganui o te raumati me te toru tau i muri i te timatanga o te tekau tau. | 81.82 | 9.89 | 1.82 | 1.10 | 0.40 |
Tsonga | 40 | 101 | Leri i May 2023, i xikarhi ka ximumu naswona endzhaku ka malembe manharhu kusungule khume ra malembe. | 81.82 | 10.99 | 1.82 | 1.11 | 0.40 |
Galician | 32 | 81 | Este é maio de 2023, é a metade do verán e tres anos despois do inicio da década. | 45.45 | -10.99 | 1.45 | 0.89 | 0.40 |
German | 32 | 81 | Wir sind im Mai 2023, mitten im Sommer und drei Jahre nach Beginn des Jahrzehnts. | 45.45 | -10.99 | 1.45 | 0.89 | 0.40 |
Hausa | 35 | 89 | Wannan shine Mayu 2023, shine tsakiyar bazara kuma shekaru uku bayan farkon shekaru goma. | 59.09 | -2.20 | 1.59 | 0.98 | 0.39 |
Javanese | 31 | 79 | Iki Mei 2023, iku tengah mangsa panas lan telung taun sawise wiwitan dasawarsa. | 40.91 | -13.19 | 1.41 | 0.87 | 0.39 |
Quechua | 39 | 100 | Kayqa aymuray killapi 2023 watapi, chawpi chiri killa, chunka wata qallarisqanmanta kimsa watamanta. | 77.27 | 9.89 | 1.77 | 1.10 | 0.39 |
Spanish | 36 | 93 | Estamos en mayo de 2023, estamos en pleno verano y tres años después del inicio de la década. | 63.64 | 2.20 | 1.64 | 1.02 | 0.39 |
Sepedi | 41 | 106 | Ye ke Motsheganong 2023, ke bogareng bja selemo le mengwaga ye meraro ka morago ga go thoma ga ngwagasome. | 86.36 | 16.48 | 1.86 | 1.16 | 0.39 |
Dutch | 33 | 87 | Dit is mei 2023, het is midden in de zomer en drie jaar na het begin van het decennium. | 50.00 | -4.40 | 1.50 | 0.96 | 0.38 |
Frisian | 34 | 90 | Dit is maaie 2023, it is midden yn 'e simmer en trije jier nei it begjin fan it desennium. | 54.55 | -1.10 | 1.55 | 0.99 | 0.38 |
Swahili | 34 | 90 | Hii ni Mei 2023, ni katikati ya majira ya joto na miaka mitatu baada ya kuanza kwa muongo. | 54.55 | -1.10 | 1.55 | 0.99 | 0.38 |
Afrikaans | 32 | 90 | Dit is Mei 2023, dit is die middel van die somer en drie jaar na die begin van die dekade. | 45.45 | -1.10 | 1.45 | 0.99 | 0.36 |
Latin | 23 | 72 | Hoc est Maii 2023, media aestate tribus annis post initium decennii est. | 4.55 | -20.88 | 1.05 | 0.79 | 0.32 |
English | 22 | 91 | This is May 2023, it is the middle of summer and three years after the start of the decade. | 0.00 | 0.00 | 1.00 | 1.00 | 0.24 |
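The derived columns in the table follow directly from the token and character counts, with the English row (22 tokens, 91 characters) as the baseline. The sketch below shows one way to reproduce them; the function and key names are illustrative, and rounding to two decimals is an assumption inferred from the values shown in the table.

```python
# Baseline values taken from the English row of the table.
BASELINE_TOKENS = 22
BASELINE_CHARS = 91

def derived_columns(tokens: int, chars: int) -> dict:
    """Recompute the table's derived columns from raw token and character counts."""
    return {
        "pct_token_diff": round((tokens - BASELINE_TOKENS) / BASELINE_TOKENS * 100, 2),
        "pct_char_diff": round((chars - BASELINE_CHARS) / BASELINE_CHARS * 100, 2),
        "token_ratio": round(tokens / BASELINE_TOKENS, 2),
        "char_ratio": round(chars / BASELINE_CHARS, 2),
        "tokens_to_chars": round(tokens / chars, 2),
    }

# Raw counts copied from the Khmer and Arabic rows.
print("Khmer ", derived_columns(236, 86))
print("Arabic", derived_columns(57, 62))

# How much higher Khmer's tokens-to-characters ratio is than English's,
# computed from the unrounded ratios.
khmer_vs_english = (236 / 86) / (BASELINE_TOKENS / BASELINE_CHARS)
print("Khmer vs English:", round(khmer_vs_english, 2))
```

Computing the Khmer-to-English comparison from the unrounded ratios gives 11.35, the figure quoted earlier; dividing the rounded table values (2.74 and 0.24) gives a slightly different number.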