The GDELT Project

Experiments With Speech Transcription & Translation: Meta's SeamlessM4T Vs OpenAI's Whisper Vs GCP's STT+GT Vs GCP's USM/Chirp+GT

As the final piece in our series evaluating Meta's new SeamlessM4T multimodal translation model, let's evaluate its speech transcription and translation capability. Similar to OpenAI's Whisper, Seamless promises to take an audio clip in any of 100+ languages and yield a translated textual transcript in any other language. We'll compare its performance against OpenAI's Whisper, Google's Speech-to-Text V1 API, and Google's Universal Speech Model (USM)/Chirp via Speech-to-Text V2, with the Google services paired with the Google Translate API. The end result is that Seamless requires audio to be chunked into 5-second clips, which interrupts utterances mid-word and yields highly stilted text that is difficult to use. It is therefore not at all competitive with Whisper or commercial services like STT+GT. For a 5-minute chunk of audio, Whisper takes 43 seconds, including model load, on a V100, while USM/Chirp takes 10 seconds flat and, as a hosted API, does not require GPU access.

For our experiments below, we'll use this Russia 1 news broadcast from earlier today.

Since Seamless does not support MP4 audio extraction, we'll convert to MP3 first using ffmpeg. It also returns an error if the audio input exceeds 60 seconds, so we'll chop to the first 60 seconds:

rm RUSSIA1_20230823_110000_Vesti-crop.mp3; time ffmpeg -i ./RUSSIA1_20230823_110000_Vesti.mp4 -t 60 -ac 1 RUSSIA1_20230823_110000_Vesti-crop.mp3
time m4t_predict ./RUSSIA1_20230823_110000_Vesti-crop.mp3 s2tt eng --model_name seamlessM4T_large
time m4t_predict ./RUSSIA1_20230823_110000_Vesti-crop.mp3 s2tt eng --model_name seamlessM4T_medium

This yields the results below, which are repetitive gibberish:

Large: I'm going to tell you a little bit more about it, and I'm going to tell you a little bit more about it, and I'm going to tell you a little bit more about it.
Medium: (Laughter) (Applause) (Applause) (Applause) (Applause) (Applause) (Applause) (Applause) (Applause)

A closer inspection of the Hugging Face Seamless demo shows that their MP3 files are 32kbps at 24kHz mono audio, whereas ours is 56kbps at 44.1kHz mono. What might be the issue? Buried on the more detailed documentation page is that Seamless requires 16kHz audio.
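A quick check of a file's sample rate before running Seamless can catch this mismatch up front instead of discovering it from garbled output. The sketch below is one way to do that, assuming ffprobe (which ships with ffmpeg) is installed; the helper names audio_sample_rate and needs_resample are our own illustrative choices and are not part of Seamless:

```shell
# Print the sample rate (in Hz) of the first audio stream in a file.
# Assumes ffprobe is on the PATH.
audio_sample_rate() {
  ffprobe -v error -select_streams a:0 \
    -show_entries stream=sample_rate \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Succeeds (exit 0) when the given rate differs from the 16kHz
# that the Seamless documentation asks for.
needs_resample() {
  [ "$1" != "16000" ]
}
```

For example, `needs_resample "$(audio_sample_rate ./RUSSIA1_20230823_110000_Vesti-crop.mp3)" && echo "resample to 16kHz first"` would flag our original 44.1kHz file.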

So, let's repeat our process but convert to 16kHz while we're at it:

rm RUSSIA1_20230823_110000_Vesti-crop.mp3; time ffmpeg -i ./RUSSIA1_20230823_110000_Vesti.mp4 -t 60 -ac 1 -ar 16000 RUSSIA1_20230823_110000_Vesti-crop.mp3
time m4t_predict ./RUSSIA1_20230823_110000_Vesti-crop.mp3 s2tt eng --model_name seamlessM4T_large
time m4t_predict ./RUSSIA1_20230823_110000_Vesti-crop.mp3 s2tt eng --model_name seamlessM4T_medium

This time we get what appear to be reasonable, if a bit odd and stilted, results:

Large: Today, almost all the leaders have spoken out in favor of the expansion of the United States, so we will soon see a new abbreviation of the United States.
Medium: Today, in fact, all the leaders of the United States of America have affirmed the new status of the United States as an authoritarian state, so we may soon see the influence of the new members of the United States of America.

Let's proceed then and transcribe and translate the first 5 minutes in 60 second chunks:

rm RUSSIA1_20230823_110000_Vesti-crop.mp3; time ffmpeg -i ./RUSSIA1_20230823_110000_Vesti.mp4 -t 60 -ac 1 -ar 16000 RUSSIA1_20230823_110000_Vesti-crop.mp3
time m4t_predict ./RUSSIA1_20230823_110000_Vesti-crop.mp3 s2tt eng --model_name seamlessM4T_large
time m4t_predict ./RUSSIA1_20230823_110000_Vesti-crop.mp3 s2tt eng --model_name seamlessM4T_medium

Large:  Today, almost all the leaders have spoken out in favor of the expansion of the United States, so we will soon see a new abbreviation of the United States.
Medium: Today, in fact, all the leaders of the United States of America have affirmed the new status of the United States as an authoritarian state, so we may soon see the influence of the new members of the United States of America.

rm RUSSIA1_20230823_110000_Vesti-crop.mp3; time ffmpeg -i ./RUSSIA1_20230823_110000_Vesti.mp4 -ss 60 -t 60 -ac 1 -ar 16000 RUSSIA1_20230823_110000_Vesti-crop.mp3
time m4t_predict ./RUSSIA1_20230823_110000_Vesti-crop.mp3 s2tt eng --model_name seamlessM4T_large
time m4t_predict ./RUSSIA1_20230823_110000_Vesti-crop.mp3 s2tt eng --model_name seamlessM4T_medium

Large:  The desire of the Western countries to preserve their supremacy in the world has led to a severe crisis in Ukraine, and the defense of the Ukrainian military is being criticized by the Pentagon, which is preparing to attack the new regions.
Medium: "The desire of the Western countries to preserve their superiority in the world has led to a severe crisis in Ukraine," President Vladimir Putin said in a press release, criticizing the Ukrainian military's offensive in the Pentagon.

rm RUSSIA1_20230823_110000_Vesti-crop.mp3; time ffmpeg -i ./RUSSIA1_20230823_110000_Vesti.mp4 -ss 120 -t 60 -ac 1 -ar 16000 RUSSIA1_20230823_110000_Vesti-crop.mp3
time m4t_predict ./RUSSIA1_20230823_110000_Vesti-crop.mp3 s2tt eng --model_name seamlessM4T_large
time m4t_predict ./RUSSIA1_20230823_110000_Vesti-crop.mp3 s2tt eng --model_name seamlessM4T_medium

Large: It was obvious that the leader was preparing for the meeting in a very tight format, and in the middle of the first full-fledged meeting of the Heads of State and Government, we saw that the leaders of the five countries were coming here today.
Medium:  "Welcome Irina, what other statements were made on the road in Johannesburg, because the leaders of the delegation had been taking pictures for a long time, and it was clear that the leaders of the BRICS countries were preparing for the summit, and the leaders of the BRICS countries were preparing for the summit.

rm RUSSIA1_20230823_110000_Vesti-crop.mp3; time ffmpeg -i ./RUSSIA1_20230823_110000_Vesti.mp4 -ss 180 -t 60 -ac 1 -ar 16000 RUSSIA1_20230823_110000_Vesti-crop.mp3
time m4t_predict ./RUSSIA1_20230823_110000_Vesti-crop.mp3 s2tt eng --model_name seamlessM4T_large
time m4t_predict ./RUSSIA1_20230823_110000_Vesti-crop.mp3 s2tt eng --model_name seamlessM4T_medium

Large: The main thing is that we are all working together to solve the most pressing issues of global politics and the global economy, and our strategic alliance with the so-called global community has been strengthened.
Medium: As I said about the role of BRICS in the modern world, we are all unanimously advocating the solution of the most pressing issues of the global and regional order, the so-called global order, the so-called global order, and the so-called global order.

rm RUSSIA1_20230823_110000_Vesti-crop.mp3; time ffmpeg -i ./RUSSIA1_20230823_110000_Vesti.mp4 -ss 240 -t 60 -ac 1 -ar 16000 RUSSIA1_20230823_110000_Vesti-crop.mp3
time m4t_predict ./RUSSIA1_20230823_110000_Vesti-crop.mp3 s2tt eng --model_name seamlessM4T_large
time m4t_predict ./RUSSIA1_20230823_110000_Vesti-crop.mp3 s2tt eng --model_name seamlessM4T_medium

Large: I would also like to point out that this is a policy based on the continuation of colonialism and the respect of the rights of each people to their own model of development.
Medium:  I would also like to point out that it is the desire of some countries to preserve their sovereignty and respect for the rights of other nations, such as the G20 or the G7 that should be opposed.

It looks like we've got a reasonable workflow going! It takes around 20-30 seconds per 60 second chunk of audio above, including both model loading and transcription/translation.
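The five command groups above differ only in their -ss offset, so they can be folded into a loop. A minimal sketch, assuming the same ffmpeg and m4t_predict invocations used above; the function names and the chunk_N.mp3 filenames are our own illustrative choices:

```shell
# Print the start offset (in seconds) of each chunk, one per line.
# Arguments: total seconds to cover, chunk length in seconds.
chunk_starts() {
  local total="$1" chunk="$2"
  seq 0 "$chunk" "$((total - 1))"
}

# Cut each chunk to 16kHz mono MP3 and translate it to English with Seamless.
# Arguments: input video, total seconds, chunk seconds, model name.
translate_in_chunks() {
  local input="$1" total="$2" chunk="$3" model="$4" start
  for start in $(chunk_starts "$total" "$chunk"); do
    ffmpeg -y -loglevel error -i "$input" -ss "$start" -t "$chunk" \
      -ac 1 -ar 16000 "chunk_${start}.mp3"
    m4t_predict "chunk_${start}.mp3" s2tt eng --model_name "$model"
  done
}
```

Calling `translate_in_chunks ./RUSSIA1_20230823_110000_Vesti.mp4 300 60 seamlessM4T_large` would cover the same five minutes as the commands above. Changing the third argument changes the chunk length.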

Here are the final results concatenated together. At first glance these look quite decent. However, there's a problem: compared with the Visual Explorer version of the broadcast, these transcripts don't match the actual broadcast at all. They've been heavily fabricated (hallucinated), with elements of the source material repurposed to guide them, and massively truncated. The Medium model at least integrates BRICS, but both transcripts are severely truncated and largely hallucinated:

Could this be the same length issue that Seamless has for text? Let's try 30, 15 and 5 second clips, below. The 30 and 15 second clips result in the exact same translation, while the 5 second clip yields something very different, reinforcing that this is a length limitation. Strangely, when trying larger clip sizes up to 100 seconds, we see that Seamless reports that it supports up to a maximum sequence length of around 4096, which works out to around 75 seconds or so.

rm RUSSIA1_20230823_110000_Vesti-crop.mp3; time ffmpeg -i ./RUSSIA1_20230823_110000_Vesti.mp4 -t 30 -ac 1 -ar 16000 RUSSIA1_20230823_110000_Vesti-crop.mp3
time m4t_predict ./RUSSIA1_20230823_110000_Vesti-crop.mp3 s2tt eng --model_name seamlessM4T_large
time m4t_predict ./RUSSIA1_20230823_110000_Vesti-crop.mp3 s2tt eng --model_name seamlessM4T_medium

Large: Today, virtually all the leaders spoke in favor of the expansion of the BRICS, so we may soon see a new abbreviation of the new members of the association.
Medium: Today, in fact, all the leaders have spoken out for the expansion of the BRICS, so we may soon see a new abbreviation of the new members of the union.

rm RUSSIA1_20230823_110000_Vesti-crop.mp3; time ffmpeg -i ./RUSSIA1_20230823_110000_Vesti.mp4 -t 15 -ac 1 -ar 16000 RUSSIA1_20230823_110000_Vesti-crop.mp3
time m4t_predict ./RUSSIA1_20230823_110000_Vesti-crop.mp3 s2tt eng --model_name seamlessM4T_large
time m4t_predict ./RUSSIA1_20230823_110000_Vesti-crop.mp3 s2tt eng --model_name seamlessM4T_medium

Large:  Today, virtually all leaders have spoken in favor of expanding the BRICS, so we may soon see a new abbreviation and new members of the union.
Medium: "Today, in fact, all the leaders have spoken out for the expansion of the BRICS, so we may soon see a new abbreviation and new members of the United Nations.

rm RUSSIA1_20230823_110000_Vesti-crop.mp3; time ffmpeg -i ./RUSSIA1_20230823_110000_Vesti.mp4 -t 5 -ac 1 -ar 16000 RUSSIA1_20230823_110000_Vesti-crop.mp3
time m4t_predict ./RUSSIA1_20230823_110000_Vesti-crop.mp3 s2tt eng --model_name seamlessM4T_large
time m4t_predict ./RUSSIA1_20230823_110000_Vesti-crop.mp3 s2tt eng --model_name seamlessM4T_medium

Large:  The BRICS countries would confirm the new status of an independent state.
Medium: The BRICS countries would confirm the new status of an independent state.

And continuing in 5 second chunks:

rm RUSSIA1_20230823_110000_Vesti-crop.mp3; time ffmpeg -i ./RUSSIA1_20230823_110000_Vesti.mp4 -ss 5 -t 5 -ac 1 -ar 16000 RUSSIA1_20230823_110000_Vesti-crop.mp3
time m4t_predict ./RUSSIA1_20230823_110000_Vesti-crop.mp3 s2tt eng --model_name seamlessM4T_large
time m4t_predict ./RUSSIA1_20230823_110000_Vesti-crop.mp3 s2tt eng --model_name seamlessM4T_medium

Large:  Today, virtually all leaders have spoken out in favor of expanding BRICS, so it's possible
Medium:  Today, in fact, all the leaders have spoken out for the expansion of the BRICS.

rm RUSSIA1_20230823_110000_Vesti-crop.mp3; time ffmpeg -i ./RUSSIA1_20230823_110000_Vesti.mp4 -ss 10 -t 5 -ac 1 -ar 16000 RUSSIA1_20230823_110000_Vesti-crop.mp3
time m4t_predict ./RUSSIA1_20230823_110000_Vesti-crop.mp3 s2tt eng --model_name seamlessM4T_large
time m4t_predict ./RUSSIA1_20230823_110000_Vesti-crop.mp3 s2tt eng --model_name seamlessM4T_medium

Large: Soon we will see a new abbreviation and new members of the association.
Medium: Soon we will see a new abbreviation and new members of the United Nations.

rm RUSSIA1_20230823_110000_Vesti-crop.mp3; time ffmpeg -i ./RUSSIA1_20230823_110000_Vesti.mp4 -ss 15 -t 5 -ac 1 -ar 16000 RUSSIA1_20230823_110000_Vesti-crop.mp3
time m4t_predict ./RUSSIA1_20230823_110000_Vesti-crop.mp3 s2tt eng --model_name seamlessM4T_large
time m4t_predict ./RUSSIA1_20230823_110000_Vesti-crop.mp3 s2tt eng --model_name seamlessM4T_medium

Large: News and look, it was 60 minutes of goodbye and goodbye.
Medium: News and viewers, it was more than 60 minutes of all good and good news.

Putting these together, we get the following as the first 15 seconds of the broadcast. Unfortunately, processing the audio as 5-second chunks yields poor performance:

In contrast, let's process under Whisper. This time we can process all 5 minutes in a single go, taking 43 seconds including model loading and processing for the 5 minute clip.

pip install -U openai-whisper
rm RUSSIA1_20230823_110000_Vesti-crop.mp3; time ffmpeg -i ./RUSSIA1_20230823_110000_Vesti.mp4 -ss 0 -t 300 RUSSIA1_20230823_110000_Vesti-crop.mp3
time whisper ./RUSSIA1_20230823_110000_Vesti-crop.mp3 --language Russian --task translate

This yields the following. Here we can see that the 5-second chunking for Seamless at least matches the gist this time, but at far worse performance:

[00:00.000 --> 00:03.500] The country in the BRICS would confirm the new status of the independent state.
[00:04.800 --> 00:08.800] Today, in fact, all the leaders have expressed their support for the expansion of the BRICS,
[00:08.800 --> 00:15.400] therefore, it is possible that very soon we will see a new abbreviation of the new members of the United Nations.
[00:16.100 --> 00:19.400] All you have to do is watch, it was only 60 minutes, all the best and goodbye.
[00:31.000 --> 00:33.000] The Russian TV Channel
[00:38.000 --> 00:42.000] Hello, on the Russian TV channel Vesti, in the studio of Irina Rosyus and the main topics of this time.
[00:47.000 --> 00:53.000] Our five of us, on the right, have confirmed the global arena as an authority structure,
[00:54.000 --> 00:58.000] an influence that is subsequently strengthened in the life of the business.
[00:59.000 --> 01:02.000] Vladimir Putin has performed on the plenary session of the leaders of the BRICS.
[01:04.000 --> 01:06.000] The American strikers do not stand out without a way out.
[01:07.000 --> 01:11.000] The defense of the admin of the disgusting NATO technique criticizes the Ukrainian military.
[01:11.000 --> 01:14.000] In the Pentagon, it teaches Kiev to fight and not to be considered lost.
[01:17.000 --> 01:21.000] The Kievan regime pushes its soldiers to the mine fields and under artillery strikes.
[01:21.000 --> 01:24.000] The situation in the new regions of Putin was discussed with the Baltic ambassador.
[01:25.000 --> 01:34.000] And in the Primorya, again, the flow that brought the necessary cyclone to the separate regions is preparing for the evacuation of the population.
[01:45.000 --> 01:50.000] The aspiration of the Western countries to preserve its superiority in the world led to a severe crisis in Ukraine.
[01:50.000 --> 01:54.000] The president, Vladimir Putin, has stated this in the plenary session of the BRICS summit.
[01:55.000 --> 01:59.000] And now, on the direct connection from Johannesburg, my colleague Alexey Golovko-Lexey comes out.
[01:59.000 --> 02:00.000] Hello, Alexey Golovko-Lexey.
[02:00.000 --> 02:05.000] What other statements have been made on the forum and what solutions, as expected, will the leaders of the five take?
[02:09.000 --> 02:10.000] Hello, Irina.
[02:10.000 --> 02:15.000] In Johannesburg, in the first full-fledged working day of the BRICS summit, this is already the 15th summit of the union,
[02:15.000 --> 02:19.000] and we have been watching all day, like here, to the center of the center.
[02:19.000 --> 02:26.000] Carthage is approaching the head of the state and those who represent the BRICS countries today.
[02:26.000 --> 02:29.000] I will remind you that Russia represents Sergey Lavrov.
[02:29.000 --> 02:31.000] Vladimir Putin is speaking on video communication.
[02:31.000 --> 02:35.000] So, the leaders of the delegation have been walking along the Kovrov road for a long time.
[02:35.000 --> 02:37.000] They photographed for a long time.
[02:37.000 --> 02:39.000] The view that the mood is good for everyone.
[02:39.000 --> 02:41.000] They posed for a long time, the photographers did not want to let them go.
[02:41.000 --> 02:49.000] And then, when they were already sitting in the hall for a meeting, so to speak, in a narrow format, it was clear that the leader was very carefully preparing for a speech.
[02:49.000 --> 02:51.000] Sidenpino helped the Minister of Foreign Affairs Vany.
[02:51.000 --> 02:53.000] He hinted at something there.
[02:53.000 --> 02:58.000] When Vladimir Putin was speaking, it was clear that a part of the speech was written by him by hand.
[02:58.000 --> 03:03.000] The Russian president has talked a lot about the role of the BRICS in the modern world today.
[03:03.000 --> 03:09.000] And what is the influence of this union on global politics and global economy?
[03:11.000 --> 03:17.000] Our five of us, on the right, have proven to be a global arena as an authority structure,
[03:17.000 --> 03:22.000] an influence that is subsequently strengthening in the state affairs.
[03:22.000 --> 03:30.000] The strategic course of the union, established in the future, is responsible for the tea of the main part of the international community,
[03:30.000 --> 03:33.000] the so-called world majority.
[03:33.000 --> 03:41.000] We are working on the principles of equal rights, partner support and taking into account the interests of each other.
[03:41.000 --> 03:48.000] We are solving the most fundamental issues of global and regional travel.
[03:48.000 --> 03:59.000] The main thing is that we are all united in the use of the formation of a multi-polar world order that is really fair and based on international law,
[03:59.000 --> 04:11.000] under the supervision of the key principles of the UN constitution, including the sovereign right and respect for each people on their own model of development.
[04:15.000 --> 04:22.000] Also, the Russian president has repeatedly emphasized that the BRICS is such a union that does not oppose itself,
[04:22.000 --> 04:32.000] but other international organizations, such as G20 or G7, that this is an economic and influential bloc, but to the point at which it is necessary to listen.
[04:32.000 --> 04:46.000] We are against any kind of digimony, which is propagandized by some countries by its exclusivity and based on this postulate of new politics,
[04:46.000 --> 04:50.000] which is not colonialism.
[04:50.000 --> 05:00.000] I would like to note that the aim is to preserve its hegemony in the world, the aim of some countries by its exclusivity.

What about Google's Speech to Text V1 API paired with Google Translate?

We'll use the STT wrapper toolkit we released yesterday:

wget https://storage.googleapis.com/data.gdeltproject.org/blog/2023-tvnewsstreammonitoring/vidcap_asrvideo.pl; chmod 755 vidcap_asrvideo.pl
time ./vidcap_asrvideo.pl --proj=[YOURPROJECTID] --filename=./RUSSIA1_20230823_110000_Vesti-crop.mp3 --gcs=gs://[YOURBUCKET]/ --model=latest_long --lang=ru-RU

Then we extract the transcript (asking for just the first of each set of alternative transcripts):

gsutil cp gs://[YOURBUCKET]/RUSSIA1_20230823_110000_Vesti-crop.mp3.asr.json .
cat RUSSIA1_20230823_110000_Vesti-crop.mp3.asr.json | jq -r '.results[].alternatives[0].transcript'

This yields:

Страны Брикс подтвердила бы новый статус независимой державы.
Сегодня фактически все лидеры высказались за расширение бритесь, поэтому возможно, Уже совсем скоро мы увидим новую аббревиатуру новых членов объединения вести. Смотрите, это, блядь, 60 минут всего доброго и до свидания.
Здравствуйте на телеканале Россия Вести в студии Ирина росиус и главная тема к этому часу
Утвердилась на глобальной арене. В качестве авторитетной структуры влияние, которые мировых делах последовательно укрепляется Владимир Путин выступил на пленарной сессии.
Лидеров американские страйкеры не выдерживают бездорожья защищённость, отвратительная натовскую технику критикуют украинские военные в пентагоне же учат Киев воевать и не считаться спасениями.
Киевский режим толкает своих солдат на минные поля и под артиллерийские удары ситуацию в новых регионах Путин обсудил с балетским и пасечником.
И в Приморье снова потопы, которые принёс Южный циклон в отдельных районах готовится к эвакуации населения.
западных стран сохранить своё превосходство в мире привело к тяжёлому кризису на Украине об этом заявил президент Владимир Путин выступая на пленарной сессии саммита Брикс
Ну а сейчас напрямую связь из йохана с бурга выходит Мой коллега Алексей Головко Алексей Здравствуйте Какие ещё заявления прозвучали на форуме и какие решения как ожидается примут лидеры пятёрки?
Ирина бьюханнесбург, я в разгаре первой полноценный рабочий день саммита Брикс – это уже пятнадцатый саммит объединение. Мы сегодня весь день наблюдали, как сюда акцентом к центру, а подъезжают к кортежи, глав государств и э тех, кто представляет сегодня в ЮАР страны Брикс напомню, что Россию здесь представляет, А Сергей Лавров Владимир Путин выступает по видеосвязи так вот, э, лидеры делегации долго проходили по ковровой дорожке, потом очень долго фотографировались видно, что настроение у всех хорошее и долго позировали фотограф. А те не хотели их отпускать. Ну и затем, Когда уже сели, а в зал для э заседания так называемого в узком формате было видно, что очень тщательно лидеры готовится к речину помогал министра иностранного дела войны, а что-то там подсказывал По бумагам. А когда выступал Владимир Путин было видно, что часть речи у неё написано от руки российский президент. Сегодня много говорил о роли Брикс в современном мире и о том, какой объединение Какое влияние это объединение оказывает на Глоба
политику и на глобальную экономику
наши Пятёрка по праву утвердилась на глобальной арене в качестве авторитетной структуры влияния, которые в мировых делах последовательно укрепляется.
Стратегический курс объединения, устремлённых в будущее отвечает чаянием основной части международного сообщества так называемого мирового большинства действуя слаженно.
На принципах равноправия партнёрской поддержки и учёта интересов друг друга. Мы занимаемся решением самых насущных вопросов глобальной и региональной поездки. Главное. Мы все единодушные выступаем в пользу формирования многополярного миропорядка, по-настоящему, справедливого и основанного на международном праве при соблюдении ключевых принципов ООН включая суверенное право и уважение право каждого народа на собственную модель для развития.
Также российские президент неоднократно подчёркивал, что Брикс – это такой объединение, которое не противо не противопоставляет себя а другим международным организациям таким как G20 или G7 что это экономические и влиятельный блок, Но к мнению, которого надо прислушиваться.
Гегемонии пропагандируемый некоторые страны своей исключительностью и основанный на этом постулате новой политики политики продолжающегося колониализма не около низко. Я хочу отметить, что именно стремление сохранить свою гегемонию в мире стремление некоторых стран своей

Which Google Translate translates as:

The Brix countries would confirm the new status of an independent power.
Today, virtually all leaders have spoken out in favor of expanding the shave, so it is possible that very soon we will see a new abbreviation for the new members of the association to lead. Look, it's a fucking 60 minutes of goodbye and goodbye.
Hello on the Russia Vesti TV channel in the Irina Rosius studio and the main topic for this hour
Established on the global stage. Vladimir Putin addressed the plenary session as an authoritative structure of influence that consistently strengthens world affairs.
The American strikers cannot withstand the off-road security of the leaders, the Ukrainian military in the Pentagon criticizes the disgusting NATO equipment, while teaching Kyiv to fight and not be considered salvation.
The Kiev regime is pushing its soldiers into minefields and under artillery strikes. Putin discussed the situation in new regions with a ballet dancer and a beekeeper.
And in Primorye again the floods brought by the Southern Cyclone in some areas are preparing for the evacuation of the population.
Western countries to maintain their superiority in the world led to a severe crisis in Ukraine, President Vladimir Putin said this at the plenary session of the Brix summit
Well, now there is a direct connection from Johann to Burg My colleague Alexey Golovko Alexey Hello What other statements were made at the forum and what decisions are expected to be taken by the leaders of the five?
Irina Beuhannesburg, I'm in the midst of the first full day of the Brix summit – this is already the fifteenth bundling summit. We have been watching all day today how the emphasis is on the center here, and the motorcades are approaching, the heads of state and those who represent the countries of Brix in South Africa today, let me remind you that he represents Russia here, And Sergei Lavrov, Vladimir Putin speaks via video link, so, uh, the leaders of the delegation walked along the carpet for a long time, then they took pictures for a very long time, it is clear that everyone is in a good mood and the photographer posed for a long time. And they didn't want to let them go. Well, then, when they had already sat down, and in the hall for the so-called meeting in a narrow format, it was clear that the leaders were very carefully preparing for the speech, helping the Minister of Foreign Affairs of the war, and suggesting something according to the papers. And when Vladimir Putin spoke, it was clear that part of her speech was handwritten by the Russian president. Today I talked a lot about the role of Brix in the modern world and about what kind of association What impact does this association have on Global politics and the global economy Our Five has rightfully established itself on the global stage as an authoritative structure of influence, which is consistently being strengthened in world affairs.
The strategic course of unification, aspiring to the future, meets the aspirations of the main part of the international community, the so-called world majority, acting in a coordinated manner.
Based on the principles of equal partnership support and consideration of each other's interests. We deal with the most pressing issues of global and regional travel. Main. We are all unanimous in favor of the formation of a multipolar world order, truly, just and based on international law, while respecting the key principles of the UN, including sovereign right and respect for the right of every people to their own model for development.
Also, the Russian president has repeatedly stressed that Brix is such an association that does not oppose itself to other international organizations such as the G20 or G7, that it is an economic and influential bloc, but the opinion that must be heeded.
The hegemony propagated by some countries by their exclusiveness and based on this postulate of the new policy of the policy of continuing colonialism is not near low. I want to note that it is precisely the desire to maintain their hegemony in the world that the desire of some countries to

What about Google's far more powerful Universal Speech Model (USM), known as "Chirp"? Given its relative newness, there aren't many simple code samples showing how to use the RESTful interface, so below we'll show how to use it for local audio files of less than a minute and for GCS-hosted audio of any length.

For audio files of less than 1 minute, you can use the following code. NOTE the non-standard "https://us-central1-speech.googleapis.com" URL for the STT API and the use of "us-central1" instead of "global" as the region. At this time Chirp requires "us-central1" to be set in both places – if you attempt to use the main "https://speech.googleapis.com" endpoint you'll get generic errors about invalid JSON fields in your request. Note that you'll see a lot of information in the documentation about creating a "recognizer" – the request below sidesteps all of that and allows you to just run Chirp in similar fashion to how STT V1 worked:

#chop the audio file to 60s or less...
rm RUSSIA1_20230823_110000_Vesti-crop.mp3; time ffmpeg -i ./RUSSIA1_20230823_110000_Vesti.mp4 -ss 0 -t 60 -ac 1 -ar 16000 RUSSIA1_20230823_110000_Vesti-crop.mp3

#create the config file...
echo "{
  \"config\": {
    \"auto_decoding_config\": {},
    \"language_codes\": [\"ru-RU\"],
    \"model\": \"chirp\",
    \"features\": { \"enable_automatic_punctuation\": true, \"enable_word_time_offsets\": true }
  },
  \"content\": \"$(base64 -w 0 ./RUSSIA1_20230823_110000_Vesti-crop.mp3 | sed 's/+/-/g; s/\//_/g')\"
}" > /tmp/data.txt

#and make the request...
curl -X POST -H "Content-Type: application/json; charset=utf-8" \
    -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
    -d @/tmp/data.txt \
    https://us-central1-speech.googleapis.com/v2/projects/[YOURPROJECTID]/locations/us-central1/recognizers/_:recognize | jq -r '.results[].alternatives[0].transcript'
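The sed step in the request above swaps "+" for "-" and "/" for "_", which converts standard base64 into the URL-safe base64 alphabet. The same conversion can be written with tr, as in the sketch below. The function name urlsafe_base64 is our own, and like the command above it assumes GNU base64 (for the -w 0 flag); whether the API strictly requires the URL-safe variant is not something we have verified here:

```shell
# Encode a file as single-line, URL-safe base64 ("+" -> "-", "/" -> "_").
# Assumes GNU coreutils base64, which supports -w 0 to disable line wrapping.
urlsafe_base64() {
  base64 -w 0 "$1" | tr '+/' '-_'
}
```

`urlsafe_base64 ./RUSSIA1_20230823_110000_Vesti-crop.mp3` produces the same string as the base64 and sed pipeline in the config file above.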

Audio files longer than a minute must be uploaded to GCS first. Following the documentation for STT V2 and batchRecognize, the code below adapts our example above to batch recognition that works identically to our STT V1 example earlier. Note that unlike STT V1, you can't specify the exact output file, since a batch can contain multiple output files. Instead, you specify a root GCS path under which the results are written:

#chop the audio file to the first 5 minutes to match our other examples...
rm RUSSIA1_20230823_110000_Vesti-crop.mp3; time ffmpeg -i ./RUSSIA1_20230823_110000_Vesti.mp4 -ss 0 -t 300 -ac 1 -ar 16000 RUSSIA1_20230823_110000_Vesti-crop.mp3
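The batch request expects the clip to already be in the bucket under the URI it names. A minimal sketch of that upload step, assuming gsutil is installed and authenticated; the helper names gcs_uri and upload_clip are hypothetical, and the bucket is whatever you substitute for [YOURBUCKET]:

```shell
# Build the gs:// URI a local file will have at the root of a bucket.
# Arguments: bucket name, local file path.
gcs_uri() {
  local bucket="$1" file="$2"
  printf 'gs://%s/%s\n' "$bucket" "$(basename "$file")"
}

# Copy the local file to the root of the bucket with gsutil.
upload_clip() {
  local bucket="$1" file="$2"
  gsutil cp "$file" "$(gcs_uri "$bucket" "$file")"
}
```

Running `upload_clip [YOURBUCKET] ./RUSSIA1_20230823_110000_Vesti-crop.mp3` places the clip at the URI used in the "files" field of the request.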

echo "{
  \"config\": {
    \"auto_decoding_config\": {},
    \"language_codes\": [\"ru-RU\"],
    \"model\": \"chirp\",
    \"features\": { \"enable_automatic_punctuation\": true, \"enable_word_time_offsets\": true }
  },
  \"files\": [ {\"uri\": \"gs://[YOURBUCKET]/RUSSIA1_20230823_110000_Vesti-crop.mp3\"}],
  \"recognitionOutputConfig\": { \"gcsOutputConfig\": { \"uri\": \"gs://[YOURBUCKET]/RUSSIA1_20230823_110000_Vesti-crop/asr/\" } }
}" > /tmp/data.txt

curl -X POST -H "Content-Type: application/json; charset=utf-8" \
    -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
    -d @/tmp/data.txt \
    https://us-central1-speech.googleapis.com/v2/projects/[YOURPROJECTID]/locations/us-central1/recognizers/_:batchRecognize

This returns a response that looks like the following. NOTE the "name" field, which gives you the JOBID. You'll need that in a moment.

{
  "name": "projects/[YOURPROJECTID]/locations/us-central1/operations/[JOBID]",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.speech.v2.OperationMetadata",
    "createTime": "2023-08-23T23:13:10.052151Z",
    "updateTime": "2023-08-23T23:13:10.052151Z",
    "batchRecognizeRequest": {
      "recognizer": "projects/[YOURPROJECTID]/locations/us-central1/recognizers/_",
      "files": [
        {
          "uri": "gs://[YOURBUCKET]/RUSSIA1_20230823_110000_Vesti-crop.mp3"
        }
      ],
      "config": {
        "features": {
          "enableWordTimeOffsets": true,
          "enableAutomaticPunctuation": true
        },
        "autoDecodingConfig": {},
        "model": "chirp",
        "languageCodes": [
          "ru-RU"
        ]
      },
      "recognitionOutputConfig": {
        "gcsOutputConfig": {
          "uri": "gs://[YOURBUCKET]/RUSSIA1_20230823_110000_Vesti-crop/asr/"
        }
      }
    }
  }
}

To find out when the ASR has completed and the GCS filename the results were written into, use the following, copying the JOBID from the "name" field of the output above:

curl -X GET \
    -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
    "https://us-central1-speech.googleapis.com/v2/projects/[YOURPROJECTID]/locations/us-central1/operations/[JOBID]"

Once the job has finished, this returns a response like the following:

{
  "name": "projects/[YOURPROJECTID]/locations/us-central1/operations/[JOBID]",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.speech.v2.OperationMetadata",
    "createTime": "2023-08-23T23:13:10.052151Z",
    "updateTime": "2023-08-23T23:13:19.520519Z",
    "resource": "projects/[YOURPROJECTID]/locations/us-central1/recognizers/_",
    "method": "google.cloud.speech.v2.Speech.BatchRecognize",
    "batchRecognizeRequest": {
      "recognizer": "projects/[YOURPROJECTID]/locations/us-central1/recognizers/_",
      "files": [
        {
          "uri": "gs://[YOURBUCKET]/RUSSIA1_20230823_110000_Vesti-crop.mp3"
        }
      ],
      "config": {
        "features": {
          "enableWordTimeOffsets": true,
          "enableAutomaticPunctuation": true
        },
        "autoDecodingConfig": {},
        "model": "chirp",
        "languageCodes": [
          "ru-RU"
        ]
      },
      "recognitionOutputConfig": {
        "gcsOutputConfig": {
          "uri": "gs://[YOURBUCKET]/RUSSIA1_20230823_110000_Vesti-crop/asr/"
        }
      }
    },
    "progressPercent": 100
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.speech.v2.BatchRecognizeResponse",
    "results": {
      "gs://[YOURBUCKET]/RUSSIA1_20230823_110000_Vesti-crop.mp3": {
        "uri": "gs://[YOURBUCKET]/RUSSIA1_20230823_110000_Vesti-crop/asr/RUSSIA1_20230823_110000_Vesti-crop_transcript_64e48e2f-0000-24f4-bc6b-089e082862b4.json"
      }
    },
    "totalBilledDuration": "150s"
  }
}

Note the "results" field towards the bottom of the JSON that tells us the precise filename of the output in GCS. We'll fetch that:

gsutil cp gs://[YOURBUCKET]/RUSSIA1_20230823_110000_Vesti-crop/asr/RUSSIA1_20230823_110000_Vesti-crop_transcript_64e48e2f-0000-24f4-bc6b-089e082862b4.json .
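
Rather than re-running the status request by hand, the check can be looped until the operation reports "done": true. This is a rough sketch, not the workflow used above: wait_for_operation and operation_done are hypothetical helper names, and the 5 second interval and 60 attempt limit are arbitrary assumptions.

```shell
# Sketch only: poll a Speech-to-Text V2 operation until it reports "done": true.

# Succeeds if the operation JSON on stdin contains "done": true.
# Uses jq when it is installed, otherwise falls back to a cruder grep.
operation_done() {
  if command -v jq >/dev/null 2>&1; then
    jq -e '.done == true' >/dev/null
  else
    grep -Eq '"done"[[:space:]]*:[[:space:]]*true'
  fi
}

# Hypothetical helper: wait_for_operation URL [INTERVAL_SECONDS] [MAX_ATTEMPTS]
# Prints the final operation JSON and returns 0 once the job is done;
# returns 1 if it is still running after MAX_ATTEMPTS checks.
wait_for_operation() {
  local url="$1"
  local interval="${2:-5}"       # seconds between checks (arbitrary default)
  local max_attempts="${3:-60}"  # give up after this many checks (arbitrary default)
  local attempt=1 token body
  while [ "$attempt" -le "$max_attempts" ]; do
    token=$(gcloud auth application-default print-access-token)
    body=$(curl -s -X GET -H "Authorization: Bearer ${token}" "$url")
    if printf '%s' "$body" | operation_done; then
      printf '%s\n' "$body"
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  return 1
}
```

Called with the operation URL from the status request above, its output can be piped through jq -r '.response.results[].uri' to print the gs:// path to hand to gsutil cp.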

The fetched results file looks like this (truncated here after the first few words):

{
 "results": [{
    "alternatives": [{
       "transcript": " страны в Брикс подтвердила бы новый статус независимой державы. Сегодня фактически все лидеры высказались за расширение брикс, поэтому возможно уже совсем скоро мы увидим новую аббревиатуру и новых членов объединения. Вести смотрите, это были 60 минут. Всего доброго и до свидания!",
       "words": [{
          "startOffset": "0.080s",
          "endOffset": "0.400s",
          "word": "страны"
         }, {
          "startOffset": "0.400s",
          "endOffset": "0.440s",
          "word": "в"
         }, {
          "startOffset": "0.440s",
          "endOffset": "0.800s",
          "word": "Брикс"
         }, {
          "startOffset": "0.800s",
          "endOffset": "1.400s",
          "word": "подтвердила"
         }, {
          "startOffset": "1.400s",
          "endOffset": "1.560s",
          "word": "бы"
         }, {
          "startOffset": "1.560s",
          "endOffset": "1.880s",
          "word": "новый"
         }, {
          "startOffset": "1.880s",
          "endOffset": "2.360s",
          "word": "статус"
         },

Let's extract the final transcript:

jq -r '.results[].alternatives[0].transcript' RUSSIA1_20230823_110000_Vesti-crop_transcript_64e48e2f-0000-24f4-bc6b-089e082862b4.json

Yielding:

страны в Брикс подтвердила бы новый статус независимой державы. Сегодня фактически все лидеры высказались за расширение брикс, поэтому возможно уже совсем скоро мы увидим новую аббревиатуру и новых членов объединения. Вести смотрите, это были 60 минут. Всего доброго и до свидания!
Здравствуйте на телеканале Россия Вести в студии Ирина Росиус и главные темы к этому часу: наша пятёрка по праву утвердилась на глобальной арене в качестве авторитетной структуры, влияние, которое в мировых делах последовательно укрепляется. Владимир
на пленарной сессии лидеров стран Брикс. Американские страйкеры не выдерживают бездорожье. Защищённость от мин отвратительная натовскую технику критикуют украинские военные. В Пентагоне же учат Киев воевать и не считаться с потерями. Киевский режим толкает своих солдат на минные поля и под артиллерийские удары. Ситуацию в новых регионах Путин обсудил с Балетским и Пасечником. И в Приморье снова потопы как
который принёс Южный циклон, в отдельных районах готовится к эвакуации населения. Стребление западных стран сохранить своё превосходство в мире, привело к тяжёлому кризису на Украине, об этом заявил президент Владимир Путин, выступая на пленарной сессии саммита Брикс. Ну а сейчас на прямую связь из Йоханнесбурга выходит мой коллега Алексей Головколик
Здравствуйте, какие ещё заявления прозвучали на форуме и какие решения, как ожидается, примут лидеры пятёрки? Здравствуйте, Ирина, в Йоханнесбурге, в разгаре первый полноценный рабочий день саммита Брикс, это уже пятнадцатый саммит объединения, мы сегодня весь день наблюдали, как сюда к сентан-центру а подъезжают кортежи а глав государства и а тех, кто представляет сегодня в Юр страны Брикс, напомню, что Россию здесь представляет Сергей Лавров, Владимир
выступает по видеосвязи, так вот лидеры делегации долго проходили по ковровой дорожке, потом очень долго фотографировались, видно, что настроение у всех хорошие, они долго позировали фотографом, те не хотели их отпускать, ну и затем когда уже сели в зал для заседаний так называем в узком формате, было видно, что очень тщательно лидер готовится к рече, пину помогал министр иностранных дел Ваны, что-то там подсказывал по бумагам, а когда выступал Владимир Путин, было видно, что часть речи у него написана от руки, российский президент сегодня
говорил о роли Брикс в современном мире и о том, какое объединение это, какое влияние это объединение оказывает на глобальную политику и на глобальную экономику? наша пятёрка по праву утвердилась на глобальной арене в качестве авторитетной структуры, влияние, которое в мировых делах последовательно укрепляется. Стратегический курс объдинения, устремлён в будущее, отвечает чаянием основной части международного сообщества,
называемого мирового большинства, действуя слаженно на принципах равноправия, партнёрской поддержки и учёта интересов друг друга, мы занимаемся решением самых насущных вопросов глобальной и региональной поездки. Главное, мы все единодушно выступаем в пользу формирования многополярного миропорядка, по-настоящему справедливого и основанного на международном праве (
ключевых принципов устава ООН, включая суверенное право и уважение, права каждого народа на собственную модель развития. Также Российский президент неоднократно подчеркивал, что Брикс – это такое объдинение, которое не не противопоставляет себя другим международным организациям, таким как G20 или G7, что это экономический влиятельный блок, но к мнению которого надо прислушива
мы против какой бы то ни было гегемонии, пропагандируемой некоторыми странами своей исключительности и основаны на этом постулате новой политики политики продолжающегося колониализма, неоколониализма. Я хочу отметить, что именно стремление сохранить свою гегемонию в мире, стремление некоторых стран свои

Which Google Translate translates as:

countries in the Brix would confirm the new status of an independent power. Today, virtually all leaders have spoken out in favor of expanding the brix, so perhaps very soon we will see a new abbreviation and new members of the association. Watch the news, it was 60 minutes. All the best and goodbye!
Hello on the Russia Vesti TV channel in the studio of Irina Rosius and the main topics for this hour: our five have rightfully established themselves on the global stage as an authoritative structure, an influence that is consistently strengthening in world affairs. Vladimir
at the plenary session of the leaders of the Brix countries. American strikers can't handle off-road. Mine protection is disgusting, NATO equipment is criticized by the Ukrainian military. The Pentagon is teaching Kyiv to fight and not to reckon with losses. The Kiev regime is pushing its soldiers into minefields and under artillery strikes. Putin discussed the situation in the new regions with Baletsky and Pasechnik. And in Primorye again floods like
which brought the Southern cyclone, in some areas is preparing for the evacuation of the population. The extermination of Western countries to maintain their superiority in the world has led to a severe crisis in Ukraine, President Vladimir Putin said this, speaking at the plenary session of the Brix summit. Well, now my colleague Alexey Golovkolik is on direct communication from Johannesburg
Hello, what other statements were made at the forum and what decisions are expected to be taken by the leaders of the five? Hello, Irina, in Johannesburg, in the midst of the first full working day of the Brix summit, this is already the fifteenth summit of the association, today we have been watching all day how motorcades of the heads of state and those who represent the country today in Yur are approaching the sentan center Brix, let me remind you that Russia is represented here by Sergey Lavrov, Vladimir
speaks via video link, and so the leaders of the delegation walked along the red carpet for a long time, then took pictures for a very long time, it is clear that everyone is in a good mood, they posed for a photographer for a long time, they did not want to let them go, and then when they sat down in the meeting room, the so-called in a narrow format, it was clear that the leader was preparing very carefully for the speech, the Minister of Foreign Affairs of Vany helped pin, he suggested something on the papers, and when Vladimir Putin spoke, it was clear that part of his speech was written by hand, the Russian president Today
talked about the role of Brix in the modern world and what kind of association it is, what impact does this association have on global politics and the global economy? our five have rightfully established themselves on the global arena as an authoritative structure, an influence that is consistently strengthening in world affairs. The strategic course of unification, looking to the future, meets the aspirations of the main part of the international community,
called the world majority, acting in a coordinated manner on the principles of equality, partnership support and taking into account the interests of each other, we are dealing with the most pressing issues of the global and regional trip. Most importantly, we are all unanimous in favor of the formation of a multipolar world order that is truly just and based on international law (
key principles of the UN Charter, including sovereignty and respect, the right of every people to their own model of development. Also, the Russian President has repeatedly stressed that Brix is such an association that does not oppose itself to other international organizations, such as the G20 or G7, that it is an economic influential bloc, but whose opinion should be listened to
we are against any kind of hegemony promoted by some countries of their exclusivity and are based on this postulate of the new policy of the policy of continuing colonialism, neo-colonialism. I want to note that it is precisely the desire to maintain their hegemony in the world, the desire of some countries to
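
The translation step itself is not shown above. One way to script it is the Cloud Translation API. The sketch below assumes the v2 REST endpoint, Application Default Credentials, and a transcript saved with one result per line. The helper names build_translate_body and translate_file are hypothetical, and the translation above may well have been produced a different way (for example through the Google Translate web interface).

```shell
# Sketch only: translate a transcript file with the Cloud Translation v2 REST API.

# Hypothetical helper: read text lines on stdin and print a JSON request body
# with one "q" entry per non-empty line.
# Usage: build_translate_body SOURCE_LANG TARGET_LANG < transcript.txt
build_translate_body() {
  jq -Rn --arg source "$1" --arg target "$2" \
    '{q: [inputs | select(length > 0)], source: $source, target: $target, format: "text"}'
}

# Hypothetical helper: translate_file TRANSCRIPT_FILE
# Sends the Russian transcript for translation into English and prints
# one translated line per input line.
translate_file() {
  build_translate_body ru en < "$1" |
    curl -s -X POST \
      -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
      -H "x-goog-user-project: [YOURPROJECTID]" \
      -H "Content-Type: application/json; charset=utf-8" \
      -d @- \
      "https://translation.googleapis.com/language/translate/v2" |
    jq -r '.data.translations[].translatedText'
}
```

The x-goog-user-project header is an assumption that may only be needed with user credentials. Sending each result as its own "q" entry keeps the output aligned line for line with the transcript.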