The GDELT Project

Google's Chirp, Large Speech Models And The Future Of Globalized Speech Transcription For Video – A Brief Example

Google's Chirp speech transcription system is an example of the new generation of Large Speech Models (LSMs) that are transforming globalized speech transcription. Unlike earlier automatic speech recognition (ASR) models, which could handle only a single language at a time and performed extremely poorly across dialects and speech patterns, LSMs offer truly multilingual globalized transcription, transparently code-switching across languages, dialects and speech patterns.

As an example of just how vastly improved LSM transcriptions are compared with legacy ASR systems, take this clip from a DW segment on Armenia and Azerbaijan from this past September. Here is the transcript as produced by a major commercial legacy ASR system:

i mean an inclusive within the internationally recognized supported of us at by john. in 1994 did get under the control of it's nick armenian forces, backed by the army in the military, azerbaijan launched of what in 2020, and recovered most of the dream lost in the or get conflict. and finally, last week, i sent it by john, consolidated his control on duty to prime minister and the corporation. nissan of armenia fears for the lights of armenians. and the author page on control will be coming soon. cofar minnesota is on the way. so that's happening just now and it's very unfortunate fact because we were trying to, to change the national community to that. but the other by john street is promising the 50 of armenians. it almost, it is clear that

Notice how Azerbaijan becomes "us at by john" and how difficult it is to follow the text or get any sense of what is being said. Contrast this with Chirp's transcription:

it's a majority ethnic Armenian enclave within the internationally recognized border of Azerbaijan. In 1994 it came under the control of ethnic Armenian forces backed by the Armenian military. Azerbaijan launched a war in 2020 and recovered most of the territory lost in the earlier conflict and finally last week Azerbaijan consolidated its control on the region. Minister Nicol Pashinyan of Armenia fears for the lives of Armenians under Azerbaijan control. Now the ethnic cleansing of Armenians of Nagorno Karabah is underway so that's happening just now and that is very unfortunate fact because we were trying to urge international community on that, but Azerbaijan's leader is promising the safety of Armenians. It is clear that
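One rough way to put a number on the gap between the two transcripts is a word-level edit distance, the same calculation that underlies word error rate. The sketch below is only illustrative: it treats a Chirp excerpt as a stand-in reference (it is not a human-verified ground truth), and the function names are hypothetical helpers rather than part of any Chirp or GDELT tooling.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_edit_distance(ref, hyp):
    """Levenshtein distance computed over lists of word tokens."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from hypothesis
                            curr[j - 1] + 1,      # extra word in hypothesis
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """Edit distance divided by the number of words in the reference."""
    ref = tokenize(reference)
    hyp = tokenize(hypothesis)
    if not ref:
        raise ValueError("reference transcript is empty")
    return word_edit_distance(ref, hyp) / len(ref)


# Matching excerpts copied from the two transcripts above.
# Assumption: the Chirp excerpt serves as the reference here.
chirp_excerpt = ("Azerbaijan launched a war in 2020 and recovered most of "
                 "the territory lost in the earlier conflict")
legacy_excerpt = ("azerbaijan launched of what in 2020, and recovered most of "
                  "the dream lost in the or get conflict")

rate = word_error_rate(chirp_excerpt, legacy_excerpt)
print(f"Legacy excerpt differs from Chirp excerpt at a rate of {rate:.2f}")
```

A score like this only measures how far the two outputs diverge from each other. Judging absolute accuracy would require a human-verified transcript of the clip as the reference.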

You can listen to the original clip, with the Chirp transcript overlaid, to compare the two transcripts yourself: