The GDELT Project

When Ukraine Praised Russia For "Delivering" Weapons: LLMs & The Severe Risks Of Geopolitical Hallucination & Conflation

One of the more informative findings from yesterday's ambiguous image captioning experiment was the degree to which, when asked to interpret a passage of text, LLMs regurgitate their training data rather than actually analyze the passage they are given. The apparently remarkable ability of LLMs to pierce through highly ambiguous language all but vanishes when names are changed and the LLM must examine the actual text it is asked to analyze, rather than merely parroting similar text from its training data.

Today we'll look at a test designed specifically to tease out just how much world knowledge LLMs bring to bear on Q&A tasks. In the process, we illuminate some of the severe hallucination and conflation dangers that arise when LLMs are asked to examine geopolitical events. For those worried about the so-called "glimmers of AGI" cited by LLM proponents, the assertion that Zelensky "also praised Russian President Vladimir Putin. Russia has been the biggest supporter of Ukraine, delivering more weapons to Ukraine faster than any other nation" should both set those fears to rest and suggest severe caution to the stratcom and diplomatic communities that are rushing to adopt LLMs into their workflows.

To test how heavily LLMs rely on background knowledge from their training data to interpret text, we constructed a novel sentence that does not exist as-is anywhere on the web and posed the following prompt built around it to three major commercial LLMs:

Which president is being criticized in this sentence: "Zelensky criticized the president for not delivering the weapons faster."

The sentence as it stands does not actually answer the question. The Ukrainian president has lamented slow weapons deliveries from several countries (though some of their leaders are "prime ministers" rather than "presidents"), so the only correct response is for the LLM to state that the president being criticized cannot be determined from the text alone. Optionally, the LLM could offer a selection of possible answers, so long as it makes clear that the text itself does not identify which one is meant. The degree to which the LLM reaches beyond the confines of the passage to name specific heads of state captures how heavily it is leaning on its training data rather than treating the passage as-is, and the names it mentions offer clues to the biases of that training data.
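One way to make this criterion concrete is to score each response mechanically. The following is a minimal Python sketch, not the procedure used in this experiment: the name list, the abstention phrases, and the `classify_response` helper are all illustrative assumptions. Simple keyword matching of this kind will miss paraphrases, and because Zelensky is named in the prompt itself, it cannot detect a response claiming that he is criticizing himself.

```python
import re

# The prompt from the experiment, reproduced verbatim.
PROMPT = (
    "Which president is being criticized in this sentence: "
    '"Zelensky criticized the president for not delivering the weapons faster."'
)

# Assumption: an illustrative, incomplete list of leaders a response might name.
# Zelensky is deliberately excluded because he appears in the prompt itself.
LEADER_NAMES = ["Biden", "Putin", "Macron", "Scholz", "Trump"]

# Assumption: illustrative phrases suggesting the model declined to pick a name.
ABSTAIN_PATTERNS = [
    r"cannot be determined",
    r"can(?:not|'t) (?:determine|tell|say)",
    r"(?:does not|doesn't) (?:specify|say|identify|name)",
    r"not (?:clear|specified|possible to)",
    r"unclear",
    r"ambiguous",
]


def classify_response(response: str) -> dict:
    """Label a response by whether it names a leader and whether it abstains."""
    named = [
        name
        for name in LEADER_NAMES
        if re.search(rf"\b{re.escape(name)}\b", response, re.IGNORECASE)
    ]
    abstains = any(
        re.search(pattern, response, re.IGNORECASE) for pattern in ABSTAIN_PATTERNS
    )

    if abstains and not named:
        label = "abstains"      # the response the text alone supports
    elif abstains and named:
        label = "hedged_list"   # acceptable: candidates offered, ambiguity stated
    elif named:
        label = "asserts"       # reaches beyond the passage to name a leader
    else:
        label = "other"         # needs manual review
    return {"label": label, "named": named, "abstains": abstains}


if __name__ == "__main__":
    # Made-up sample strings for demonstration only, not actual LLM output.
    samples = [
        "The sentence does not specify which president is being criticized.",
        "Zelensky is criticizing President Joe Biden.",
    ]
    for sample in samples:
        print(classify_response(sample)["label"], "<-", sample)
```

The split between "abstains", "hedged_list", and "asserts" mirrors the three outcomes described above. Anything labeled "other" would still need a human reader, as would any check on whether quoted statements and dates in a response are real.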

Most of the responses below confidently assert that it is Joe Biden being criticized, showing the strong US biases in their training data. A few offer the illogical response that Zelensky is criticizing himself. Several offer highly realistic hallucinated quoted statements and cite hallucinated dates and locations of speeches to justify their arguments. Interestingly, the hallucinated quotes are all similar to actual quotes, but are themselves fabricated paraphrases whose convincing nature, together with their similarly fabricated attributions to specific dates and events, should serve as a warning to communications professionals. Most dangerous of all, however, is the way in which all three LLMs in some responses conflate the different senses in which Russia and other nations are "delivering" weapons to Ukraine, equating Russia's use of weapons to attack Ukraine with the United States' provisioning of weapons to Ukraine's armed forces, with the end result that the LLM attributes to Ukraine praise for Russia.

LLM 1

LLM 2

LLM 3