Generative Image AI: Branding & Representation Lessons From Creating Nearly 500 DALL-E Images For Autonomous Storytelling & Promotional Campaigns

Over the past week and a half we've been exploring the potential of generative AI image creation tools like DALL-E for visual storytelling as part of fully autonomous national, political and organizational branding campaigns, as well as their potential for citizen-led promotional campaigns during times of crisis. In just the past 11 days we've created nearly 500 images, at an effective cost of just a single penny, across a vast range of topics and artistic styles. Each required nothing more than a simple textual description of the desired topic and style, from which the AI model created the resulting image from whole cloth.

Some of the most important lessons we've learned:

  • The Future Is Here. The nearly infinite range of topics and styles and the ability to draw an image from only a simple textual description mean that autonomous visual storytelling and promotional campaigns are here whether we are ready for them or not. Whether used for fully autonomous assembly-line creation and publication of vast image archives or as a manually driven artist's tool, generative image tools like DALL-E can produce visually arresting and impactful imagery at scale. Critically, the tools can transform vague instructions like "an inspirational magazine cover about a victorious Ukraine" or "Estonia as a good place to invest" into compelling imagery that incorporates and adapts the underlying symbolism and visual language without any user intervention. At the same time, users can iteratively steer their images by adding successive levels of detail, from content to style to positioning and focus, to produce any image they can imagine.
  • Automated Workflows. Textual tools like GPT-3.5, GPT-4 and other LLMs can be connected to image generators in a fully unsupervised and autonomous workflow: the LLM analyzes textual documents and other content and produces prompts suitable for generating associated imagery that presents the topic in a given light or for a given audience. For example, it is possible to create a workflow that monitors a social media platform for any post that criticizes a given political candidate and autonomously generates a point-by-point textual rebuttal and an associated image promoting that rebuttal or criticizing the original author. The implications of such at-scale autonomous diplomacy and campaigning are yet to be fully understood.
  • Empowering Citizenry. Governments, political campaigns, non-profit organizations, issue advocacy groups and even individuals now have the tools to create rich, visually stunning imagery that is highly personalized and customized to their precise needs. Nations in crisis can empower their citizens and supporters across the world to produce imagery telling their stories and promoting their nation and its needs at a scale of millions to tens of millions of personalized images pouring across global social media platforms and local venues. Non-profits and issue advocacy groups can similarly enable their supporters to tell their individual stories through personalized imagery in a way never before possible.
  • Crafting Not Creating & Limited Ideation. At the same time, these tools do not truly "create" in the imaginative and interpretive way that human artists do. They merely mimic what they've seen in the past and recombine it in random and diffuse ways. This means that for topics with wide-ranging visual representation over time, they can construct an endless stream of diverse imagery, while for topics with more limited past visual symbolism, they are unable to reach beyond what human artists have done and instead yield a consistent stream of nearly identical imagery. In other words, if you search for a given topic on Google Images, the diversity of the first few pages of search results will give you a good idea of how diverse the imagery produced by DALL-E and other generative AI tools will be. For all the talk of them being "ideation" tools, they are still limited to remixing what has been created in the past, rather than coming up with a fundamentally new visual representation of a topic.
  • Representation & Bias. Models like DALL-E encode strongly biased representations of gender, race, nationality and culture. As these tools become more accessible and are increasingly used to produce imagery at scale, these biases will shape ever more deeply how we as societies "see" these traits, likely entrenching harmful stereotypes permanently.
  • Artifacts, Context & Respect. Models like DALL-E don't truly "understand" the images they create – they merely remix collections of pixels in the ways most statistically associated with the input tokens in their textual prompts. This is one of the reasons they do so poorly at producing text: letters hold no meaning to them. It also means that, upon closer inspection, imagery will contain not only mundane deformities and bizarre mashups, but often highly offensive combinations or contexts of symbols, or culturally meaningful symbols presented in demeaning, vulgar and outrageously unacceptable forms. Organizations and governments must be especially vigilant for such offensive imagery, which may be buried in the background or touch upon a fault line not widely known to those unfamiliar with a given culture or context. A casually decapitated Ukrainian civilian or an elevated Russian flag takes on unique significance in imagery about Ukraine during the current crisis, while presenting a deformed, cosplay-like flag hijab as promoting Turkey might seem unremarkable to an American used to the caricature of women for advertising and unfamiliar with Islamic tradition. The portrayal of a recognizable city burning, deceased soldiers or destroyed national landmarks might be less than "motivating" and "inspirational" even if presented in a visually arresting style – there is a massive difference between a "motivating" presentation and "motivating" content. Imagery produced by generative AI systems will require careful review by subject matter experts familiar with both the presented topics and the symbolism of the relevant nations and cultures to assess whether changes to the image are required.
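
To make the workflow lessons above more concrete, here is a minimal Python sketch of a document-to-image pipeline. It is an illustration under stated assumptions, not the setup used for these experiments: the names ImageRequest, Draft and run_pipeline are hypothetical, and the summarize and generate_image callables are placeholders for whatever LLM and image model a team actually uses. Every draft starts out unapproved, reflecting the point above that generated imagery needs expert review before publication.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class ImageRequest:
    """Hypothetical container for a prompt built from successive levels of detail."""
    topic: str
    style: str = ""
    audience: str = ""
    details: List[str] = field(default_factory=list)

    def to_prompt(self) -> str:
        # Start from the bare topic, then append each optional refinement.
        parts = [self.topic]
        if self.audience:
            parts.append(f"for {self.audience}")
        if self.style:
            parts.append(f"in the style of {self.style}")
        parts.extend(self.details)
        return ", ".join(parts)


@dataclass
class Draft:
    """A generated image awaiting human review."""
    prompt: str
    image_ref: str
    approved: bool = False  # stays False until a reviewer signs off


def run_pipeline(
    documents: Iterable[str],
    summarize: Callable[[str], str],       # assumed: wraps an LLM call
    generate_image: Callable[[str], str],  # assumed: wraps an image model call
    style: str = "",
    audience: str = "",
) -> List[Draft]:
    """Turn each document into a prompt and an unapproved image draft."""
    drafts = []
    for doc in documents:
        topic = summarize(doc).strip()
        if not topic:
            continue  # nothing usable to illustrate
        prompt = ImageRequest(topic=topic, style=style, audience=audience).to_prompt()
        drafts.append(Draft(prompt=prompt, image_ref=generate_image(prompt)))
    return drafts
```

In use, summarize and generate_image would be thin wrappers around real model calls. With simple stand-ins such as summarize=lambda d: d and generate_image=lambda p: "img:" + p, the pipeline can be exercised without any external service, and nothing is marked approved until a reviewer changes the flag.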

Below is a thematic recap of our generative AI image creation experiments over the past week and a half:

Citizen-Led National Campaigns

National Campaigns

Political Campaigns

Organizational Campaigns

Issue Advocacy Campaigns

Personal Campaigns

Bias & Representation