The GDELT Project

Why We Care So Much About Infographics: They Force AI Models To Structure Information Into Narrative Arcs By Pushing Their Reasoning Capabilities

Over the past two weeks we've used Google's new Gemini 3 and Nano Banana Pro to create literally hundreds of rich, beautiful infographics covering everything from legislation to government reports to speeches, resumes, academic papers, news articles, television broadcasts, Wikipedia pages, blogs, and just about every other kind of information imaginable. A number of you have asked why we've been so intently interested in infographics beyond their beautiful visuals, and why infographics specifically rather than the more mediagenic cartoon and hyperrealistic visuals most of the social media landscape has been focusing on. The reason lies at the heart of GDELT's mission: infographics push the very boundaries of advanced AI models' reasoning by forcing them to structure complex non-linear streams of information into cohesive linear narrative arcs.

Summarizing a long textual document into a shorter one requires only statistical modeling to identify the passages and information most and least important to the document's meaning. At its simplest, summarization can just drop unimportant sentences and reword others to preserve key details. At its most advanced, it still amounts to a "translation" task: distilling the original text into a shorter, more condensed form. Summarization can also "cheat" by distilling locally using global cues, rewriting in place, removing the equivalent of low TF-IDF information, and other shortcuts. Even when applied as a "true" summarization task with global-in, global-out distillation, summarization typically preserves the ordering of the original document, condensing the word count but not restructuring the narrative into something fundamentally new. In other words, it leans most heavily on the model's statistical representation of word usage rather than deeply probing its reasoning abilities.
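To make the contrast concrete, the "cheap" end of this spectrum can be sketched in a few lines: score each sentence by the TF-IDF weight of its words and keep only the highest-scoring sentences, in their original order. This is a minimal illustrative sketch, not GDELT's or any model's actual summarization pipeline, and the function name and scoring details are our own assumptions:

```python
import math
import re
from collections import Counter

def tfidf_extractive_summary(text, keep_ratio=0.5):
    """Toy extractive summarizer: treat each sentence as a 'document',
    score it by the summed TF-IDF weight of its words, then keep the
    top-scoring sentences *in their original order* -- condensing the
    word count without restructuring the narrative."""
    sentences = [s.strip() for s in re.split(r'(?<=[.!?])\s+', text) if s.strip()]
    tokenized = [re.findall(r'\w+', s.lower()) for s in sentences]
    n = len(sentences)
    # Document frequency: how many sentences contain each word?
    df = Counter(w for toks in tokenized for w in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = (sum((cnt / len(toks)) * math.log(n / df[w])
                     for w, cnt in tf.items()) if toks else 0.0)
        scores.append(score)
    keep = max(1, int(n * keep_ratio))
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:keep]
    # Re-sorting the kept indices preserves the source ordering --
    # the key property that distinguishes this from true restructuring.
    return ' '.join(sentences[i] for i in sorted(top))
```

Note that nothing in this sketch understands the document: it deletes low-information sentences and emits the survivors in source order, which is exactly why this family of techniques probes statistical word usage rather than reasoning.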

In contrast, creating infographics requires something fundamentally different: deeply reasoning over a document to distill not only its core "essence" and points, but to then restructure those into a single overarching branching narrative tree that allows the entire document to be represented in a single image. The model must extract the "meaning" of the document and then organize that meaning into an entirely new structure, reordering and reimagining how it is told. The visual dimension additionally requires the model to escape the text-only realm and connect to the visual modality, which brings with it a parallel representational landscape that pushes the model's ability to reason in a truly multimodal space.
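The structural difference between the two tasks can be sketched in miniature: a summary is an ordered list that mirrors the source, while an infographic layout is a new hierarchy that may lead with the conclusion and group supporting detail beneath it. The data model and example facts below are purely hypothetical, chosen only to illustrate the shape of the output:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class NarrativeNode:
    """One panel or callout in a hypothetical infographic layout."""
    title: str
    points: List[str] = field(default_factory=list)
    children: List["NarrativeNode"] = field(default_factory=list)

    def panel_count(self) -> int:
        """Total panels in this subtree."""
        return 1 + sum(c.panel_count() for c in self.children)

# A linear summary preserves the source document's ordering:
summary = ["Bill introduced in March",
           "Committee debate in May",
           "Final vote in June"]

# An infographic reorders the same facts into a branching tree,
# leading with the outcome and nesting supporting detail under it:
layout = NarrativeNode(
    title="Final vote in June",
    children=[
        NarrativeNode("Path to passage", points=["Bill introduced in March"]),
        NarrativeNode("Key turning point", points=["Committee debate in May"]),
    ],
)
```

Producing the second structure requires deciding what the story's root is, which facts support which branch, and what to omit entirely, and those are reasoning decisions rather than condensation decisions.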

It is this deeply multimodal process of restructuring non-linear information into a linear, branching narrative tree that infographic creation stresses so heavily, and that helps us probe the true reasoning capabilities of the latest SOTA models with an eye toward how they can help us begin to truly understand the planet around us.