The Rich Multilingual World Of Global Television News: New Possibilities For Understanding Codeswitching In News

For those accustomed to the monolingual English-only broadcasts of the United States, it can be nearly impossible to imagine the layered multilingual television news that serves the many societies across the world where multilingualism is the norm.

Take, for example, this South Sudan broadcast, which features primarily Arabic speech with two minutes of English at the end. Notice how, even during the Arabic portion of the broadcast, the chyron scrolling along the bottom is in English. Broadcasts across the world can freely mix two, three or even more languages: presenters, reporters, newsreaders and interviewees may each speak in different languages and dialects, and clips from other countries may air in still other languages. Highly intermixed intra-sentential codeswitching is not uncommon in highly multilingual societies.

To date, it has been nearly impossible to study multilingual news presentation through at-scale machine-assisted means, because traditional ASR and OCR systems could not accurately recognize an unbounded mix of languages within a single broadcast. With the availability today of robust OCR that supports more than 300 languages, including multiple languages mixed within a single frame, and the emergent (though not yet fully developed or officially supported) multilingual recognition capabilities of large-model ASR tools like Chirp, we can for the first time begin to ask key questions about the language and representation of news in richly multilingual societies.