The GDELT Project

When Machine Translation Begins To Encode "Values": LLMs, Editorialization & Guardrails

For the long history of machine translation, from rules-based to SMT to NMT, translation systems were designed to be neutral, transparent conversion systems: accept input text in one language and produce output in the desired language that preserves the original meaning as closely as possible. The neutrality and transparency of traditional MT systems means they can be given any news article from anywhere in the world and will make a best-effort attempt to translate it. While the translation may be highly imperfect, sometimes bordering on gibberish, the system will attempt to translate anything it is given.

The emerging world of LLM-based translation systems poses a fundamental challenge to enterprise translation workflows: the incorporation of "values" and editorialization, in which an LLM can simply refuse to translate a given news article on the grounds that the underlying story or narrative conflicts with its maker's values. LLM-driven translation workflows often project a surreal Western utopia, refusing to translate any coverage that undermines or questions that perfection and excluding underrepresented voices and lived experiences in favor of dominant narratives and voices.

As we have been ramping up our experiments with LLM-based machine translation, we have observed three major trends that undermine their use in real-world workflows: