
As we continue to scale up our experiments cataloging the underlying stories covered in the world's television news each day, one of the most important considerations is how we can help journalists and scholars look across the more than 150 languages and dialects represented in this vast archive. To date, we've used Google Translate to translate the last few years of Belarusian, Iranian, Russian and Ukrainian coverage for journalists and scholars. However, the combination of automatic speech recognition (ASR) errors, the disfluent stream-of-consciousness speech of live television news and the immense complexity of real-world discussion of breaking events from around the world means those translations are not always easy to understand. Would switching to Gemini 2.5 for our translations yield better results? How do Gemini 2.5 Pro, Flash Thinking and Flash Non-Thinking compare? Do we still see the severe hallucination challenges we encountered in our older LLM translation evaluations?
The end result is that Gemini 2.5 yields vastly more understandable and fluent translations. It is particularly good at smoothing over ASR transcription errors and the disfluencies of realtime speech, yielding readily understandable transcripts. In a fascinating and unexpected turn, the Gemini translation is actually more accurate in at least one case, correcting "Malaysian Aviation Institute" to "Maersk". We do find that Gemini 2.5 Flash Non-Thinking hallucinates at least twice, fabricating the name and class of a vessel. However, enabling reasoning with Gemini 2.5 Flash Thinking and Pro eliminates this hallucination. We observed hallucination with Flash Non-Thinking on several occasions during our more extensive tests, but did not identify a single instance of such hallucination with either of the Gemini thinking models, demonstrating the critical importance of reasoning models to improving translation accuracy. Overall, the results presented here and in our ongoing more extensive testing suggest that Gemini 2.5 Flash Thinking yields results that not only vastly exceed Google Translate's fluency and understandability, but also correct a number of ASR-induced translation errors.
For all Gemini tests below we use the following prompt:
Translate the following TEXT into English. Provide only the final translation.
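For readers who want to reproduce this kind of comparison, here is a minimal sketch of how the prompt above might be sent to the three configurations using the google-genai Python SDK. Several details are assumptions rather than a description of our pipeline: the way the transcript is attached under a "TEXT:" label, the model identifiers "gemini-2.5-pro" and "gemini-2.5-flash", and the mapping of "Flash Non-Thinking" to a thinking budget of zero. The helper names build_prompt and translate are hypothetical.

```python
PROMPT = (
    "Translate the following TEXT into English. "
    "Provide only the final translation."
)


def build_prompt(text: str) -> str:
    """Combine the fixed instruction with a transcript.

    Assumption: the transcript follows a "TEXT:" label; the exact layout
    used in the original tests is not documented here.
    """
    return f"{PROMPT}\n\nTEXT:\n{text}"


def translate(text: str, model: str = "gemini-2.5-flash", thinking: bool = True) -> str:
    """Send one transcript to a Gemini model and return the translation.

    Requires the google-genai package and an API key in the environment.
    """
    # Imported lazily so build_prompt() can be used without the SDK installed.
    from google import genai
    from google.genai import types

    client = genai.Client()
    config = None
    if not thinking:
        # Assumption: "Non-Thinking" corresponds to a zero thinking budget,
        # which applies to Flash; Pro does not allow thinking to be disabled.
        config = types.GenerateContentConfig(
            thinking_config=types.ThinkingConfig(thinking_budget=0)
        )
    response = client.models.generate_content(
        model=model,
        contents=build_prompt(text),
        config=config,
    )
    return response.text
```

Under these assumptions, translate(transcript, model="gemini-2.5-pro") would correspond to Pro, translate(transcript) to Flash Thinking and translate(transcript, thinking=False) to Flash Non-Thinking.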
Let's start with this story from a Chinese news broadcast from the start of last year, transcribed by Chirp:
2023 年12月31号单马航院发表名于该团的一艘船在红海遭到机了调查事的细节并进一步评式马石决定在未来48小时过红海的传播 根据美国中部发表的声明时间2023年12月30号晚8点30分左右公司运挂国的 美 海 军 派 出 多 架 直 升 机 击 沉 三 艘 湖 赛 武 装 快 艇 并 打 死 多 名 武 装 人 员 , 剩 下 的 一 艘 快 廷 快 速 离 开 现 场 。
Google Translate (note the mistranslation to "Malaysian Aviation Institute"):
On December 31, 2023, the Malaysian Aviation Institute announced that a ship of the regiment was attacked by a machine gun in the Red Sea. The Institute will investigate the details of the incident and further comment on the spread of the Red Sea in the next 48 hours. According to a statement issued by the Central United States, at around 8:30 pm on December 30, 2023, the U.S. Navy dispatched multiple helicopters to sink three Lake Sai armed speedboats and killed several armed personnel. The remaining speedboat quickly left the scene.
Gemini 2.5 Pro:
On December 31, 2023, Maersk announced that one of its ships was attacked in the Red Sea. To investigate the details of the incident and further assess the situation, the company has decided to pause all transits through the Red Sea for the next 48 hours. According to a statement from U.S. Central Command, at approximately 8:30 PM on December 30, 2023, U.S. Navy helicopters sank three Houthi armed speedboats and killed multiple armed personnel. The remaining speedboat fled the scene.
Gemini 2.5 Flash Thinking:
On December 31, 2023, Maersk issued a statement providing details of an incident where one of its vessels was attacked in the Red Sea, and further announced Maersk's decision to suspend all Red Sea transits for the next 48 hours. According to a statement issued by US Central Command (CENTCOM), at approximately 8:30 PM on December 30, 2023, the US Navy dispatched multiple helicopters in response to an attack on a US-flagged vessel operated by the company. They sank three Houthi armed speedboats and killed several armed personnel, while the remaining speedboat quickly departed the scene.
Gemini 2.5 Flash Non-Thinking. Note that it hallucinates a vessel name and states that the vessel is a container ship (neither of which is present in the source transcript):
On December 31, 2023, the MV United III, a vessel belonging to Maersk, reported a detailed investigation of an incident in the Red Sea and further commented that Maersk has decided to suspend all its vessels from transiting the Red Sea for the next 48 hours. According to a statement released by U.S. Central Command, at approximately 8:30 PM on December 30, 2023, a company container ship was attacked, and the U.S. Navy dispatched multiple helicopters, sinking three Houthi armed speedboats and killing several militants, while the remaining speedboat quickly departed the scene.
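One crude way to surface candidates like "MV United III" for manual review is to look for capitalized tokens that appear in one model's output but in none of the others. The sketch below is an illustrative heuristic only, not the method used for the evaluations in this post, and the function name is hypothetical. It would not catch the fabricated "container ship", which is lowercase, and it would also flag a case where one model correctly recovers a name the others miss, so every hit still needs a human check against the source.

```python
import re

# A capitalized token: an uppercase letter followed by letters or digits.
CAP_TOKEN = re.compile(r"\b[A-Z][A-Za-z0-9]*\b")


def unsupported_capitalized_tokens(candidate: str, references: list[str]) -> list[str]:
    """Return capitalized tokens in `candidate` found in none of `references`."""
    seen = set()
    for ref in references:
        seen.update(CAP_TOKEN.findall(ref))
    return sorted(set(CAP_TOKEN.findall(candidate)) - seen)


# Opening clauses of the three Gemini translations above.
pro = (
    "On December 31, 2023, Maersk announced that one of its ships "
    "was attacked in the Red Sea."
)
flash_thinking = (
    "On December 31, 2023, Maersk issued a statement providing details of an "
    "incident where one of its vessels was attacked in the Red Sea"
)
non_thinking = (
    "On December 31, 2023, the MV United III, a vessel belonging to Maersk, "
    "reported a detailed investigation of an incident in the Red Sea"
)

flagged = unsupported_capitalized_tokens(non_thinking, [pro, flash_thinking])
print(flagged)  # ['III', 'MV', 'United']
```

Comparing against other models' outputs rather than against the Chinese source sidesteps the need for cross-lingual name matching, at the cost of missing any hallucination that all models share.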
Similarly, this Chirp-transcribed story:
。 那 么 2024 年 , 新 的 一 年 各 地 呢 也 都 有 了 自 己 新 的 城 市 建 设 规 划 , 而 这 些 基 础 设 施 建 设 呢 离 不 开 一 种 装 备 叫 做 高 空 座 业 平 台 , 他 帮 助 建 设 人 员 更 加 安 全 清 变 的 穿 缩 在 高 处 作 业 。 这 几 年 我 们 在 大 星 机 场 亚 运 会 厂 馆 的 建 设 中 都 见 过 他 们 的 身 影 。 可 以 说 这 些 高 空 作 业 平 台 一 方 面 建 证 着 我 国 装 备 能 力 的 提 升 , 同 时 他 们 也 是 基 础 设 施 建 设 不 断 提 诉 的 一 个 缩 影 。 我 们 了 解 到 呢 在 好 的 众 鲁 , 我 们 现 在 呢 就 是 在 江 苏 徐 州 的 徐 公 集 团 , 可 以 看 到 虽 然 是 难 架 期 啊 , 但 是 整 个 总 装 车 间 特 别 的 热 闹 。 我 旁 边 的 这 台 生 产 线 已 经 排 满 了 。 呃 高 空 作 业 平 台 从 此 刻 开 始 所 有 的 运 型 数 据 , 而 一 个 个 这 样 的 数 据 呢 会 总 成 了 我 手 上 的 这 样 的 一 张 热 力 图 。 我 们 可 以 看 到 这 些 设 备 分 步 在 哪 ? 现 在 运 行 的 情 况 怎 么 样 ? 都 能 够 一 目 了 然 的 呈 现 。 高 空 作 业 平 台 呢 和 我 们 这 个 机 一 样 啊 它 其 实 是 基 础 设 施 建 设 中 必 不 可 少 的 一 环 。 那 么 新 年 开 始 了 各 地 都 会 加 大 这 个 基 础 设 置 建 设 的 布 局 , 所 以 我 们 整 个 场 区 。
Google Translate (note disfluencies and unclear wording like "the difficult period" and "increase the layout of infrastructure construction"):
In 2024, the new year, each region has its own new urban construction plan. The construction of these infrastructures is inseparable from a type of equipment called an aerial work platform. It helps construction workers work more safely and efficiently at heights. In recent years, we have seen these platforms in the construction of the Asian Games Pavilion at Daxing Airport. These aerial work platforms not only demonstrate the improvement of China's equipment capabilities, but also epitomize the continuous improvement of infrastructure construction. We understand that in Zhonglu, we are currently at Xugong Group in Xuzhou, Jiangsu. Despite the difficult period, the entire assembly shop is bustling with activity. The production line next to me is already fully booked. From this point on, the aerial work platform collects all operational data, and each piece of this data is compiled into a heat map like the one I have in my hand. We can see the location of each device and its current operating status at a glance. The aerial work platform, like this machine, is an essential part of infrastructure construction. Well, at the beginning of the new year, all places will increase the layout of infrastructure construction, so our entire venue.
Gemini 2.5 Pro (note the far more instructive and understandable "holiday period" and "expanding their infrastructure construction projects"):
So, in the new year of 2024, various regions have their own new urban construction plans, and this infrastructure work is inseparable from a type of equipment known as an aerial work platform. It helps construction personnel move and work more safely and conveniently at heights. In recent years, we have seen them during the construction of Daxing Airport and the Asian Games venues. You could say that on one hand, these aerial work platforms bear witness to the improvement of our country's equipment capabilities, while on the other, they are a microcosm of the accelerating pace of infrastructure construction. We are now at the XCMG Group in Xuzhou, Jiangsu, and as you can see, even though it's the holiday period, the entire final assembly workshop is exceptionally bustling. The production line next to me is already full. All the operational data from the aerial work platforms is aggregated into a heat map like the one I'm holding. We can see where this equipment is distributed and its current operational status, all presented clearly at a glance. Aerial work platforms are an indispensable part of infrastructure construction. Now that the new year has begun, various regions will be expanding their infrastructure construction projects, so our entire factory area…
Gemini 2.5 Flash Thinking:
In 2024, the new year has begun, and various regions have announced their new urban construction plans. This infrastructure development is inseparable from a type of equipment called aerial work platforms, which help construction personnel operate more safely and efficiently at heights. In recent years, we've seen them widely used in the construction of Daxing Airport and Asian Games venues. It can be said that these aerial work platforms, on the one hand, bear witness to the improvement of our country's equipment manufacturing capabilities, and at the same time, they are a microcosm of the continuous acceleration of infrastructure construction.
We understand that we are currently at XCMG Group in Xuzhou, Jiangsu. We can see that even though it's typically a slow period, the entire assembly workshop is exceptionally bustling. The production line next to me is already fully loaded with aerial work platforms. From this moment on, all operational data from these platforms, piece by piece, will be compiled into a heat map like the one I'm holding. We can see where these pieces of equipment are distributed and how they are currently operating, all presented clearly at a glance.
Aerial work platforms, like other essential machinery, are an indispensable part of infrastructure construction. As the new year begins, all regions will increase their investment in infrastructure development, and so our entire facility…
Gemini 2.5 Flash Non-Thinking:
So, with 2024, the new year, various regions have their new urban construction plans, and these infrastructure constructions are inseparable from a piece of equipment called an aerial work platform. It helps construction workers more safely and easily move and work at heights. In recent years, we've seen them in the construction of large airports and Asian Games venues. It can be said that these aerial work platforms, on one hand, bear witness to the improvement of our country's equipment capabilities, and at the same time, they are a microcosm of the continuous acceleration of infrastructure construction. We learned that in a good many places, we are now at XCMG Group in Xuzhou, Jiangsu. You can see that even though it's a holiday period, the entire assembly workshop is particularly bustling. The production line next to me is already full. Uh, aerial work platforms, from this moment on, all operational data, and such individual data will be aggregated into a heat map like the one in my hand. We can see where these devices are distributed and how they are currently operating, all presented at a glance. Aerial work platforms, like our machines, are actually an indispensable part of infrastructure construction. So, as the new year begins, various regions will increase their infrastructure construction layouts, which is why our entire site…
And this Chirp-transcribed story:
是 这 个 马 头 上 运 输 货 物 的 一 个 小 平 台 车 , 它 实 际 上 是 一 个 自 动 化 的 机 器 人 , 那 么 通 过 他 来 这 个 运 行 货 物 啊 , 比 那 种 传 统 , 从 企 业 家 看 来 , 那 么 它 是 一 个 利 润 的 增 长 点 , 从 投 资 商 看 来 呢 , 它 是 一 个 高 回 报 的 一 个 突 破 点 , 从 当 地 政 府 来 看 , 它 是 一 个 提 升 经 济 发 展 的 一 个 着 力 点
Google Translate (note the nonsensical "horse's head" and overall stilted wording of the translation):
This small platform vehicle on a horse's head is actually an automated robot. From the perspective of entrepreneurs, transporting goods is a profit growth point compared to traditional methods. From the perspective of investors, it is a breakthrough point with high returns. From the perspective of local governments, it is a focus for promoting economic development.
Gemini 2.5 Pro (note how the nonsensical "horse's head" is correctly rendered as "terminal"):
This is a small platform vehicle for transporting goods on the terminal; it's actually an automated robot. Using it to transport goods, compared to traditional methods… From an entrepreneur's perspective, it is a point of profit growth. From an investor's perspective, it is a breakthrough for high returns. From the local government's perspective, it is a focal point for advancing economic development.
Gemini 2.5 Flash Thinking:
This is a small platform vehicle for transporting goods on the dock; it is actually an automated robot. Using it to transport goods, from the perspective of entrepreneurs, it represents a profit growth point. From the perspective of investors, it is a breakthrough point with high returns. From the perspective of local government, it is a key focus for boosting economic development.
Gemini 2.5 Flash Non-Thinking (even Flash Non-Thinking yields a more accurate translation than Google Translate in this case):
It's a small flatbed truck for transporting goods on this terminal; it's actually an automated robot. So, through it, the transportation of goods, compared to traditional methods, from an entrepreneur's perspective, it's a new profit growth point; from an investor's perspective, it's a breakthrough point with high returns; from the local government's perspective, it's a key leverage point for boosting economic development.