The GDELT Project

Experiments In Meme Tracking: Cataloging & Classifying Memes By Conflict Enhancing Bias

Long before conflict becomes kinetic, narratives drive divisions within and between societies. Some narratives, especially those revolving around gender, racial, national, religious and other forms of bias, can foment especially strong divides that lead to physical conflict. For this reason, when performing meme tracking, it can be especially useful to catalog extracted memes according to various definitions of toxicity, bias and risk of enhancing conflict. One of the promises of LLMs is that their rich training datasets make it possible to simply ask the model whether a given passage contains a certain form of bias, rather than laboriously constructing a methodologically sound definition and manually compiling an archive of matching and counter-matching documents to train a traditional ML classifier. In other words, in theory an LLM allows a user to simply ask "Does this article contain bias?" and let the LLM decide. Let's explore this through a handful of examples.

The end result is that LLM guardrail false positives can become especially problematic when asking about bias issues: PaLM's false positives prevent it from generating output for a number of the tests below, while ChatGPT also yields a guardrail warning (though it still produces output). Interestingly, bias definitions and results can differ substantially by prompt and temperature. Asked directly whether an anti-Taiwan passage contains racial bias, Bison responds that it does not, and goes further to explain that it considers "negative or derogatory comments about Taiwan or its people" to constitute racial bias (which the text actually does contain). Yet, when asked whether the text contains "any" form of bias, it changes its answer and, using the same definition of racial bias, states that the text does contain racial bias due to it being "biased against Taiwan and its people" – this time correctly flagging its stance. While PaLM's guardrail false positives prevent it from generating output for our Ukraine example, the outputs of ChatGPT and Claude demonstrate both the kinds of bias issues that models can be used to identify and the potential of applying Western conceptions of bias to global content.

 

Let's try a Russia Today article with the prompt "Does this contain gender bias?" and a temperature of 0.0. Interestingly, PaLM appears to define "gender bias" as bias "against women", rather than expanding it to the full universe of LGBTQ+ identities commonly recognized today:

No, the article does not contain gender bias. The article is about a visit to the United States by Taiwan's Vice President William Lai. The article does not mention Lai's gender, and there is no indication that the article is biased against women.
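For reference, queries like this one can be scripted in a few lines. The following is a minimal sketch, assuming the Vertex AI Python SDK and its text-bison (PaLM 2 Bison) model; the project placeholder, prompt wrapping and max_output_tokens value are illustrative assumptions rather than the exact pipeline used here:

import vertexai
from vertexai.language_models import TextGenerationModel

# Placeholder project/region - substitute your own.
vertexai.init(project="my-project", location="us-central1")

ARTICLE_TEXT = "..."  # full text of the article under test

# Load the PaLM 2 Bison text model and ask the bias question directly,
# appending the article text after the question.
model = TextGenerationModel.from_pretrained("text-bison")
response = model.predict(
    "Does this contain gender bias?\n\n" + ARTICLE_TEXT,
    temperature=0.0,  # the deterministic run shown above
    max_output_tokens=1024,
)
print(response.text)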

Using a temperature of 0.99, we see a shift from "women" to a more generalized statement that the text does not discuss gender at all:

No, this article does not contain gender bias. The only mention of gender in the article is in the sentence "Lai, who is considered a strong contender to win Taiwan's presidential election in January," which is referring to Vice President William Lai. The article does not make any negative or positive comments about Lai's gender, and it does not mention any other people's gender.

Interestingly, when asked "Does this contain racial bias?" at a temperature of 0.0, it uses a globalized definition of race, treating Taiwanese as a racial group:

No, the article does not contain racial bias. The article is about a visit to the United States by Taiwan's Vice President William Lai. The article does not contain any negative or derogatory comments about Taiwan or its people.

Using a temperature of 0.99, we see it revert to simply using the catchall term "racial bias":

Let's try a different prompt that requests reasoning – this is a common tactic to discourage models from hallucinating and to constrain them further. We'll use the prompt "Does this contain gender bias? Explain your reasoning." with different temperatures. Note how for one of the 0.99 runs it hallucinates that the article explicitly specifies his gender:

Let's try racial bias again with the same explainer request ("Does this contain racial bias? Explain your reasoning."). Note how it no longer invokes the concept of Taiwanese as a race:

Let's combine gender, race and other bias forms together with "Does this contain racial, gender or other forms of bias? Explain your reasoning.". Note how suddenly it restores the concept of Taiwanese as a race. Specifically, while the prompt asks for any form of bias, PaLM explicitly defines anti-Taiwanese sentiment as racial bias in several of the responses below. Most fascinating of all, at temperatures of 0.0 and 0.2 it flags the passage as containing racial bias in each case, but when jumping to a temperature of 0.99 it consistently flags it as bias-free.
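A sweep like this can be scripted by simply looping over the temperatures of interest. Here is a minimal sketch under the same Vertex AI SDK assumptions as the earlier example:

from vertexai.language_models import TextGenerationModel

PROMPT = ("Does this contain racial, gender or other forms of bias? "
          "Explain your reasoning.")
ARTICLE_TEXT = "..."  # full text of the article under test

model = TextGenerationModel.from_pretrained("text-bison")
for temperature in (0.0, 0.2, 0.99):
    # Re-run the identical prompt at each temperature to compare how the
    # model's bias determination shifts.
    response = model.predict(
        PROMPT + "\n\n" + ARTICLE_TEXT,
        temperature=temperature,
        max_output_tokens=1024,
    )
    print("--- temperature=" + str(temperature) + " ---")
    print(response.text)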

What about a CNN article with the prompt "Does this contain racial, gender or other forms of bias? Explain your reasoning.":

And a different CNN article at temperature 0.0?

No, the article does not contain racial, gender, or other forms of bias. The article is about a Russian warship firing warning shots and boarding a cargo ship in the Black Sea. The article does not mention any specific race or gender, and it does not make any generalizations about any group of people.

What about our CEO passages from last week, in which ChatGPT systematically generated stereotypically biased passages that an embedding model then ranked, in descending order of relevance to "CEO", as white male, African American male and white / African American female?

We'll test several of those with our prompt "Does this contain racial, gender or other forms of bias? Explain your reasoning." and a temperature of 0.0. All four of our white male CEO passages with that prompt are blocked due to a guardrail false positive. Increasing the temperature up to 0.99 does not help; the responses are consistently blocked.

All yield the dreaded:

{
  "safetyAttributes": {
    "blocked": true
  },
  "content": ""
}
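Programmatically, these guardrail blocks are easy to detect: the safetyAttributes.blocked field is set to true and the content field comes back empty. A minimal sketch (the JSONL file of saved responses is a hypothetical scaffold; only the JSON shape comes from the responses shown in this post):

import json

def is_blocked(prediction):
    # A prediction blocked by the guardrail returns no content.
    return prediction.get("safetyAttributes", {}).get("blocked", False)

with open("responses.jsonl") as f:  # hypothetical file of saved raw responses
    for line in f:
        resp = json.loads(line)
        # Handle both full responses ({"predictions": [...]}) and bare
        # prediction objects like the one above.
        for prediction in resp.get("predictions", [resp]):
            if is_blocked(prediction):
                print("BLOCKED: guardrail returned no output")
            else:
                print(prediction["content"])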

What about our female passages? In fact, they all trigger the guardrail false positive as well:

What about our male African American passages? One of the passages is a false positive again, but the other three yield results. Note how the model focuses exclusively on surface-level bias, failing to pick up on the more existential issues our inclusivity reviewers flagged.

What about African American female CEOs? Once again PaLM focuses on surface-level bias, failing to flag the issues our human reviewers identified. Interestingly, one of the three passages was blocked as well:

What about a more complicated case of wartime messaging? We'll use this article from The New Voice Of Ukraine and a temperature of 0.0. Once again we are blocked. Raising the temperature all the way to 0.99 does not allow us to escape the guardrail.

{
  "predictions": [
    {
      "content": "",
      "safetyAttributes": {
        "blocked": true
      }
    }
  ],
  "metadata": {
    "tokenMetadata": {
      "inputTokenCount": {
        "totalBillableCharacters": 1822,
        "totalTokens": 472
      },
      "outputTokenCount": {
        "totalBillableCharacters": 0,
        "totalTokens": 120
      }
    }
  }
}

What about Bard? It likewise consistently refuses to provide output:

I'm a language model and don't have the capacity to help with that.

In this case, ChatGPT's guardrail is not triggered and we are able to get output:

The provided text contains various elements that could potentially be interpreted as having biases, especially in relation to nationality and geopolitical context. Here's a breakdown of the potential biases present:

1. **Nationality Bias:** The text portrays a conflict between Russia and Ukraine. While the text does include statements from both sides, there are instances where the language used might be seen as biased. For example, the statement "the representatives of Russia, a state of serial killers" could be interpreted as painting an entire country with a negative label, which is a form of nationality bias.

2. **Geopolitical Bias:** The text includes quotes from both Russian and Ukrainian officials that reflect their respective viewpoints. However, the language used in some statements, such as "Russia will persecute you. Always and everywhere," and "Ukraine will get each and every one of you, whether legally or physically," can be seen as emotionally charged and potentially reflecting geopolitical bias, portraying one side as aggressive or threatening.

3. **Selective Representation:** The text primarily presents quotes from Ukrainian officials expressing strong language and intentions to pursue justice against individuals they consider war criminals, while the Russian complaints about perceived bias or provocations are portrayed in a more matter-of-fact manner. This selective representation could contribute to a bias in how the situation is portrayed.

4. **Media Representation:** The text references specific news agencies and interviews that may be perceived as adding to the narrative of bias. For example, the reference to Budanov's comment about killing Russians "anywhere in the world" and his clarification being removed could potentially be seen as highlighting a narrative without providing the full context.

5. **Emotionally Charged Language:** There is the use of emotionally charged language in some statements, such as "a state of serial killers" and "undeniable war criminals." These terms can contribute to a negative perception of the parties involved.

It's important to note that bias can be subtle and nuanced, and its detection often requires a careful examination of context, language, and framing. In this case, while the text does contain elements that could be seen as biased, further analysis and understanding of the broader context would be necessary to form a comprehensive judgment.

Running a second time we get similar results:

The provided text contains several elements that could be interpreted as displaying bias, particularly in the way different individuals are portrayed and the language used to describe them. Here's a breakdown of the potential biases:

1. **National Bias**: The text includes statements that could be seen as perpetuating a negative view of Russia and Ukraine. For instance, the statement by Ukrainian presidential adviser Mykhailo Podolyak, referring to Russia as a "state of serial killers," could be considered biased and inflammatory language.

2. **Demonization**: The text includes strong language that portrays certain individuals or groups in a negative light. For example, referring to "undoubted war criminals" and discussing the killing of Russians "anywhere in the world" contributes to a negative perception of those being discussed.

3. **Attribution of Negative Traits**: The text attributes negative traits to certain individuals. The characterization of Budanov's response in the interview as advocating the killing of Russians and his clarification about targeting specific groups of Russians based on their criminal actions could be seen as portraying him in a negative light.

4. **Selective Quoting**: The text points out that Budanov's clarification about targeting specific groups of Russians was removed from his quote, which could imply an intention to manipulate or present information in a biased manner.

5. **Subjective Language**: Some of the language used in the text, such as "persecute," "serial killers," "war criminals," and "undeniable war criminals," carries a negative and potentially prejudiced tone.

6. **Lack of Context**: The text doesn't provide a comprehensive context for the statements made by the various individuals, which might impact how readers perceive their intent and motivations.

It's important to consider that bias can be subjective and influenced by individual interpretations. While the text does contain elements that could be perceived as biased, a more comprehensive analysis might involve considering the broader context of the situation, the intentions of the individuals involved, and additional information that might provide a more balanced perspective.

A third time we get results, but we also get a warning that "This content may violate our content policy. If you believe this to be in error, please submit your feedback — your input will aid our research in this area."

The provided text contains language that touches on sensitive topics such as nationality and violence, and it portrays certain individuals and groups in a negative light. Specifically, the text discusses interactions between Russian and Ukrainian representatives at the United Nations and includes statements that could be seen as biased or provocative. Let's break down the potential biases present in the text:

1. **National Bias**: The text discusses tensions between Russia and Ukraine, with individuals from both sides making strong statements. The language used, such as referring to Russia as a "state of serial killers" and discussing the assassination of Kremlin propagandist Daria Dugina, may be seen as inflammatory and biased against Russia.

2. **Language and Tone**: The text uses strong and aggressive language when discussing the interactions between the Russian and Ukrainian representatives. The statements made by both parties are confrontational and contain threats, which can contribute to an overall negative and biased tone.

3. **Attribution of Intent**: The text attributes certain negative intentions to the Russian and Ukrainian representatives. For example, the Ukrainian adviser's statement that "representatives of Russia, a state of serial killers" implies that all Russians are somehow associated with serial killing, which is a sweeping generalization and can be considered biased.

4. **Selective Quoting and Misinterpretation**: The text mentions a quote by the Ukrainian representative Budanov, where he discusses killing Russians "anywhere in the world." The subsequent clarification about targeting Russians who are war criminals is presented later, which might suggest a potential misinterpretation or bias by not including the clarification initially.

5. **Emotionally Charged Language**: The text uses emotionally charged language, such as references to "gang rapes, rapes of minors, murders," which can evoke strong negative emotions and contribute to bias.

In summary, the text appears to contain bias in the form of inflammatory language, selective quoting, and attributions of negative characteristics to certain groups. The portrayal of both sides of the conflict lacks nuance and objectivity, which could contribute to furthering existing biases and misunderstandings. It's important to approach such topics with a balanced and critical perspective, considering multiple viewpoints and avoiding the use of emotionally charged language.

And a fourth time:

The provided text contains various elements that could be interpreted as biased or sensitive in terms of nationality, geopolitical conflicts, and potential hate speech. Here's a breakdown of the potential biases present in the text:

1. **Nationality and Geopolitical Bias:** The text discusses tensions between Russia and Ukraine, specifically in relation to the United Nations. The language used, such as "state of serial killers" and "undeniable war criminals," portrays Russia in a negative light, using strong language that goes beyond simple factual reporting.

2. **Hate Speech and Stereotyping:** The text includes statements such as "Russia, a state of serial killers" and "we have killed Russians and will continue to kill Russians," which involve potentially offensive stereotypes and violent rhetoric. These statements could be seen as promoting hatred and violence against a specific nationality or group.

3. **Selective Quoting and Context:** The text includes quotes that might lack full context, potentially leading to misinterpretations. For instance, the clarification that a quote was taken out of context, but it doesn't provide the full context for the original statement, which could impact the overall message conveyed.

4. **Allegations and Accusations:** The text mentions allegations of Russophobia, hate speech, and possible involvement in assassinations. These allegations are presented without providing a broader context or evidence to support or refute them, which can contribute to a biased representation of the situation.

5. **Attribution and Tone:** The text attributes statements to specific individuals but doesn't provide a counterbalance of viewpoints. This can give the impression of a one-sided perspective, potentially contributing to bias.

Overall, while the text provides information about interactions between Russia and Ukraine within the context of the United Nations, it presents certain statements and events in a way that may be perceived as biased, inflammatory, or sensitive. It's important to approach such content with critical thinking and seek out multiple sources of information to form a more balanced and nuanced understanding of the situation.

And a fifth time:

The provided text contains several instances that could be seen as containing biases, particularly related to nationality, ethnicity, and political stance. Let's break down the potential biases:

1. **Nationality and Ethnicity Bias**: The text includes statements that target both Russians and Ukrainians in a negative manner. For example, the Ukrainian presidential adviser's comment about "the representatives of Russia, a state of serial killers" makes a sweeping and derogatory generalization about Russians. Additionally, the Ukrainian intelligence head's initial statement about "killing Russians anywhere in the world" can be interpreted as promoting violence and hostility towards Russians as a whole.

2. **Political Bias**: The text includes various political statements and accusations, which might reflect a political bias. For instance, the complaint about "Russophobia" and accusations of "hate speech" against Ukrainian intelligence officials could be seen as politically motivated.

3. **Selective Quoting and Misinterpretation**: The Ukrainian intelligence head's claim that his quote about killing Russians was misinterpreted could potentially be indicative of bias if context was intentionally removed or manipulated to present a specific viewpoint.

4. **Propaganda and Stereotyping**: The text includes references to Kremlin propagandist Daria Dugina and accusations of war crimes, which could be seen as attempts to demonize or stereotype individuals based on their affiliations.

5. **Language and Tone**: The language used in some statements is strong and confrontational, which can contribute to a biased perception of the events and individuals involved.

Overall, the text appears to present a complex and potentially biased portrayal of the interactions between Russia and Ukraine, involving accusations, counter-accusations, and strong language. To fully assess bias, it would be important to consider additional sources and perspectives on the matter.
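Repeated runs like the five above can also be scripted rather than performed one at a time. Here is a minimal sketch assuming the pre-1.0 openai Python library and the gpt-3.5-turbo model (both are assumptions, since the exact interface and model behind the runs above are not specified), with the API key read from the OPENAI_API_KEY environment variable:

import openai  # reads OPENAI_API_KEY from the environment

PROMPT = ("Does this contain racial, gender or other forms of bias? "
          "Explain your reasoning.")
ARTICLE_TEXT = "..."  # full text of the New Voice Of Ukraine article

for run in range(5):
    # Default sampling settings, so each run can differ, mirroring the
    # run-to-run variation in the responses above.
    completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user",
                   "content": PROMPT + "\n\n" + ARTICLE_TEXT}],
    )
    print("--- run " + str(run + 1) + " ---")
    print(completion.choices[0].message.content)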

Claude yields similar results:

There are a few aspects of this text that could potentially contain racial, gender or other biases:

– The language used by some of the individuals quoted comes across as aggressive, violent and dehumanizing. For example, the quotes about "killing Russians anywhere in the world" and that "Ukraine will get each and every one of you, whether legally or physically." This type of rhetoric promotes harm against an entire nationality.

– There are no quotes or perspectives included from ordinary Russian citizens, only political/military leaders. This frames the conflict as Russia vs Ukraine in broad, nationalistic terms rather than examining the nuances.

– The text focuses solely on male perspectives and quotes only men. This excludes women's voices and frames war/conflict as a male domain.

– There is very charged, emotional language used like "serial killers", "war criminals", "persecute you", etc. This type of language, while potentially accurate in certain cases, also risks demonizing all Russians when the actions are those of specific individuals/leaders.

Overall, the text frames the conflict in quite black-and-white, us vs them terms, using aggressive rhetoric and little nuance. While understandable given the horrors of war, such framing also risks promoting further dehumanization, prejudice and violence. More balance and inclusive perspectives could help reduce potential bias.