Why Gemini 1.5 Pro Describes President Biden As "Confused" & "Disoriented" And What It Tells Us About Training Data

One of the more fascinating findings from our experiments with Gemini 1.5 Pro's video capabilities is how strongly it associates imagery of President Biden looking around him with declining mental capacity. It consistently describes a clip of him speaking with audience members after his State of the Union Address with text such as "Following his speech, Biden appears confused and disoriented, asking where to go and needing assistance from staff." Far from a one-off hallucination, Gemini has fairly consistently labeled Biden as "confused" and "ask[ing] where to go" across multiple runs using multiple prompts over multiple days in our tests.

Why might this be?

The answer is that discussion of Biden's mental acuity has become so pervasive on the web and in news coverage that generative AI models trained today absorb enough of it for the association to statistically dominate depictions of him. The result is a strong and robust link between imagery of Biden looking around and descriptions of him as confused, disoriented, not knowing where to go and requiring assistance from his staff.

Interestingly, Gemini Ultra, which is more consumer-facing and thus has stronger RLHF tuning and guardrails to alter its responses, answers the question "Does President Biden have memory problems?" with "I'm still learning how to answer this question. In the meantime, try Google Search." Unlike its other responses, whose wording varies from run to run, this answer was given verbatim across all of our runs, suggesting it is a special kind of sensitive-topic guardrail. Such guardrails tend to be tactically erected in response to key topics and generate highly templated responses outside the typical output patterns of the underlying model, suggesting they are applied, or operate, differently from other forms of guardrails. (Some vendors implement them as a literal catchall blocking filter applied to output before it reaches the user.)

In contrast, Gemini 1.5 Pro answers "President Biden has occasionally made gaffes and verbal stumbles throughout his career, which" or "President Biden's age and occasional verbal stumbles have led some to question his" and similar formulations, with the response truncated each time because the latent "other" safety guardrail terminates the output in the subsequent output chunk. That consumer-focused Gemini Ultra refuses outright to answer the question is indicative of a targeted guardrail, while Gemini 1.5 Pro's vision and text responses reflect what the model has actually learned from its training data before its own guardrails prevent it from answering further.
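The verbatim-versus-varied distinction described above can be checked mechanically. The following is only a minimal sketch, not the procedure used in our tests: the function names `normalize` and `summarize_runs` are hypothetical, the run count of three is a placeholder rather than the actual number of runs, and the responses are assumed to have already been collected as plain strings.

```python
from collections import Counter


def normalize(text: str) -> str:
    """Collapse runs of whitespace so trivial formatting differences are ignored."""
    return " ".join(text.split())


def summarize_runs(responses: list[str]) -> dict:
    """Summarize how much a set of repeated responses to one prompt varies."""
    if not responses:
        raise ValueError("need at least one response")
    counts = Counter(normalize(r) for r in responses)
    top_text, top_count = counts.most_common(1)[0]
    return {
        "runs": len(responses),
        "distinct": len(counts),
        # True when every run returned the same text, hinting at a template
        "all_verbatim": len(counts) == 1,
        "most_common": top_text,
        "most_common_share": top_count / len(responses),
    }


# Illustrative inputs built from the responses quoted above.
# The number of repetitions (3) is an assumption for the example only.
ultra_runs = [
    "I'm still learning how to answer this question. "
    "In the meantime, try Google Search."
] * 3
pro_runs = [
    "President Biden has occasionally made gaffes and verbal stumbles "
    "throughout his career, which",
    "President Biden's age and occasional verbal stumbles have led some "
    "to question his",
]

ultra_summary = summarize_runs(ultra_runs)
pro_summary = summarize_runs(pro_runs)
```

A set of runs where `all_verbatim` is true is consistent with a templated guardrail response, though it does not prove one: a model sampled deterministically would also repeat itself, so the comparison is most informative when other prompts to the same model produce varied wording.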

In other words, Gemini 1.5 Pro's responses offer a glimpse into the actual associations being learned by modern foundational models trained across datasets ranging from the open web to highly curated collections: in this case, the statistically strong association between Biden looking around and descriptions of him suffering from mental decline. Foundational model developers can play manual whack-a-mole with individual associations, as has been done with Ultra, but this only reduces or masks individual associations. It does not address the broader question of how model developers should handle societal-scale narratives that may reflect negatively upon elected officials and public figures, especially under circumstances in which those narratives might undermine trust in democratically elected leaders. Moreover, as societies increasingly turn to GenAI models for answers to daily questions, national governments will become more attuned to how those models portray them. It is inevitable that national governments will increasingly call for GenAI models to be tuned to present those governments in a more positive light and to suppress negative portrayals and coverage shown to their citizenry. Indeed, it is almost a given that a combination of legal and other pressures will compel vendors to suppress even accurate negative coverage, such as reporting on corruption or other malfeasance in governments across the world.

Beyond assessments of leaders, however, this simple Biden example offers a reminder of just what the vast landscape of generative AI models is really learning from its training data, and of what those models output as-is before layer upon layer of tuning and guardrails nudge and censor their outputs to align better with societal expectations.

Interestingly, ChatGPT initially refuses to answer the question:

As an AI, I cannot diagnose or assess the mental health or cognitive abilities of any individual, including President Biden. Public figures, like all individuals, have a right to medical privacy. The assessment of someone's cognitive health is a complex and sensitive issue that should be made by medical professionals and not based on public appearances or media speculation.

It's important to rely on credible sources and expert opinions when discussing such matters. If there are official statements or verified reports from reliable sources about President Biden's health, they would provide the most accurate information. However, without such sources, it would be speculative and inappropriate to make any claims about his memory or cognitive abilities.

However, it does eventually respond, with the more nuanced handling of sensitive topics that has become OpenAI's trademark as a result of its relentless tuning:

The question of whether President Biden has memory problems has been a topic of discussion and speculation, particularly in political circles and the media. A report by Special Counsel Robert Hur has been cited as indicating potential memory issues【9†source】. However, it's important to note that these discussions often stem from political analysis and media commentary rather than medical assessments.

Vice President Kamala Harris criticized the report for its comments on Biden's memory, implying political motivations behind these claims【10†source】. The National Review noted that the special counsel report painted a potentially damaging picture of President Biden, particularly concerning his age and memory【11†source】.

Amidst these discussions, there is a broader conversation about President Biden's possible candidacy in the next presidential election. This includes speculations on his decision-making process and whether his delay in announcing a decision might stem from various considerations, including his age【12†source】.

The media has also engaged brain experts to discuss the topic, reflecting the widespread interest and varying opinions on the matter【13†source】.

In summary, while there are reports and speculations about President Biden's memory and cognitive abilities, it is crucial to differentiate between medical assessments and political or media-driven narratives. The discussion is ongoing and has become a part of the larger conversation about his potential candidacy in the upcoming election.