One of the most striking aspects of the IC's OSINT efforts over the years, from a languages standpoint, is the degree to which they capture the immense complexities and challenges of translating from one language to another, even when using professional human translators. A decade ago I wrote a data-driven history of FBIS (now OSC) and BBCM's SWB for the CIA's in-house journal Studies in Intelligence, analyzing their complete public digital backfiles and how they have prioritized the world over the preceding decades.
Starting from the bottom right of page 30, under "Editorial Process," is a fascinating in-depth glimpse of the two services' human-driven translation workflow of the era as captured in their public data files, showcasing their iterative refinement methodology. When it comes to the high-stakes world of the IC, fidelity becomes central to the entire translation mindset. The entire significance of a statement might rest on the difference between the word "no" and the word "never." The tone, structure, grammar and specific word choices all convey enormous meaning that must be faithfully reflected in the translation. Fidelity and faithfulness to the original source material matter more than fluency in such use cases. Moreover, in the stilted and non-fluent world of "diplomatic speak," translations must find a way of capturing the "deliberateness" and "measuredness" of the source material as a gauge of which statements the speaker views as most sensitive.
In contrast, machine translation is increasingly prioritizing fluency over fidelity, smoothing over the nuances and choices of the source language in favor of the most understandable and accessible translation. Yet, in the process, this new generation of tools suffers from repetition, drop-outs, hallucination and existential non-determinism, in which the tools fundamentally change the meaning of the translation each time they run, from "grain" to "gas" to "weapons."
The end result is a growing divergence between professional and consumer needs for translation, with the AI community focusing on consumer needs without accommodating the specific needs of professional translation. What might a future of machine translation look like in which systems can simultaneously translate a document into both consumer-friendly prose and an annotated professional version, complete with confidence assessments, alternative translations, connotations, fluency accommodation notation and other indicators?
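To make that question a little more concrete, here is a purely illustrative sketch in Python of what such a dual output might look like as a data structure. Everything in it is an assumption made for the sake of illustration: the names `Alternative`, `AnnotatedSegment`, `consumer_view` and `professional_view`, the 0-to-1 confidence scale and the 0.8 review threshold are all hypothetical and do not describe any existing translation system or API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Alternative:
    """A competing rendering of the same source segment (hypothetical)."""
    text: str
    confidence: float   # assumed 0.0-1.0 score supplied by the system
    note: str = ""      # connotation or register note for the analyst


@dataclass
class AnnotatedSegment:
    """One source segment with both a fluent and a faithful rendering."""
    source: str
    fluent: str         # consumer-friendly prose
    faithful: str       # rendering that preserves the source's wording
    confidence: float   # assumed 0.0-1.0 score for the faithful rendering
    alternatives: List[Alternative] = field(default_factory=list)
    fluency_accommodations: List[str] = field(default_factory=list)


def consumer_view(segments: List[AnnotatedSegment]) -> str:
    """Join only the fluent renderings into readable prose."""
    return " ".join(s.fluent for s in segments)


def professional_view(segments: List[AnnotatedSegment],
                      threshold: float = 0.8) -> str:
    """List faithful renderings with confidence, alternatives and notes."""
    lines = []
    for s in segments:
        # Flag segments whose confidence falls below the review threshold.
        flag = " [LOW CONFIDENCE]" if s.confidence < threshold else ""
        lines.append(f"{s.faithful} (confidence {s.confidence:.2f}){flag}")
        for alt in s.alternatives:
            lines.append(f"  alt: {alt.text} ({alt.confidence:.2f}) {alt.note}".rstrip())
        for note in s.fluency_accommodations:
            lines.append(f"  note: {note}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Invented example echoing the "no" versus "never" distinction.
    seg = AnnotatedSegment(
        source="(source-language sentence)",
        fluent="We won't accept this.",
        faithful="We will never accept this.",
        confidence=0.62,
        alternatives=[Alternative("We will not accept this.", 0.31,
                                  "weaker: 'no' rather than 'never'")],
        fluency_accommodations=["contraction added for readability"],
    )
    print(consumer_view([seg]))
    print(professional_view([seg]))
```

The point of keeping both renderings in a single record is that the consumer and professional versions come from the same pass, so an analyst can see exactly where fluency was bought at the cost of fidelity rather than having to compare two independent translations.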