How LLM Guardrails Defeat RAG-Based Generative Search Applications For Global Events

Retrieval Augmented Generation (RAG) has become perhaps the dominant mechanism for expanding and updating the knowledge of LLMs. The user's query/prompt is converted into an embedding; an external knowledge store (such as paragraphs from news coverage, academic papers, Wikipedia, internal document archives, etc.) is queried via embedding-based search for the top X most-similar results; and those top X results are then provided to the LLM, along with the original prompt, for answering. For example, if a user asks "What is the latest news about Ukraine's counteroffensive?", the system constructs an embedding of that query, identifies the 20 most-similar paragraphs from the last 24 hours and then provides those paragraphs and the prompt to the final LLM for summarization. Unfortunately, as commercial LLM vendors increasingly update their model guardrails in realtime around major global events, the RAG model is breaking down, with models refusing to process any prompts relating to those events.
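To make that flow concrete, here is a minimal sketch of the retrieval loop. The embed() and llm_complete() stubs are hypothetical placeholders standing in for a real embedding model and a commercial LLM API; only the embed-retrieve-prompt plumbing is meant literally.

```python
"""Minimal RAG sketch. embed() and llm_complete() are placeholder stubs
(assumptions, not any vendor's API); swap in real model calls in practice."""
import numpy as np

def embed(text: str) -> np.ndarray:
    # Placeholder: a real system would call an embedding model here.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    vec = rng.standard_normal(384)
    return vec / np.linalg.norm(vec)

def llm_complete(prompt: str) -> str:
    # Placeholder: a real system would call a commercial LLM API here.
    return f"[LLM answer for prompt of {len(prompt)} chars]"

class VectorStore:
    """In-memory store searched by cosine similarity (embeddings are unit-norm)."""
    def __init__(self, passages: list[str]):
        self.passages = passages
        self.matrix = np.stack([embed(p) for p in passages])

    def search(self, query_vec: np.ndarray, limit: int) -> list[str]:
        scores = self.matrix @ query_vec            # cosine similarity
        top = np.argsort(scores)[::-1][:limit]      # indices of top X matches
        return [self.passages[i] for i in top]

def answer_with_rag(store: VectorStore, query: str, top_x: int = 20) -> str:
    query_vec = embed(query)                                 # 1. embed the prompt
    context = "\n\n".join(store.search(query_vec, top_x))    # 2. retrieve top X
    prompt = (f"Using only the context below, answer the question.\n\n"
              f"Context:\n{context}\n\nQuestion: {query}")
    return llm_complete(prompt)                              # 3. answer/summarize
```

In production the in-memory store would be a proper vector database and the stubs would be real API calls, but the shape of the pipeline – embed, retrieve the top X, prompt the LLM – is the same, which is exactly why a guardrail refusal at the final step breaks the entire workflow.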

For example, our RAG-based headline summarization workflow built on top of the DOC 2.0 API broke down two weeks ago when asked to summarize the latest events in Gaza, with OpenAI's ChatGPT models refusing to answer anything relating to Gaza due to an overzealous guardrail that banned the topic broadly. Two weeks later, ChatGPT's guardrails have been corrected to allow it to summarize content relating to Gaza, but Google's Bard still refuses to answer any questions relating to Gaza nearly two months after the Hamas attack.

This means that RAG-based news summarization workflows will often fail precisely when they are most needed: to summarize the vast narrative landscapes of highly conflicting and contested events. Such events often lead LLM vendors to institute temporary or permanent enhanced guardrails that block their models from processing content relating to the event, meaning that RAG-based workflows often fail shortly after events begin to attract substantial attention.

Worse, over the past year we've observed this process accelerate: an LLM-based summarization workflow will catch the early glimmers of a major story and then fail as the story gains steam, due to guardrails being implemented in realtime. In short, just as the volume of information explodes and LLM-based summarization is most helpful, the LLMs are cut off.
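One defensive pattern we can sketch (a heuristic of our own, not any vendor-provided signal) is to scan model output for refusal boilerplate, so the pipeline fails loudly the moment a new guardrail kicks in rather than quietly publishing refusals as summaries. The marker strings below are illustrative assumptions.

```python
# Heuristic refusal detector (an illustrative assumption, not a vendor API):
# flags responses that look like guardrail refusals so a RAG pipeline can
# alert operators instead of passing refusals downstream as "summaries".
REFUSAL_MARKERS = (
    "i can't assist", "i cannot assist", "i'm unable to", "i am unable to",
    "i can't help with", "i cannot help with", "against my guidelines",
)

def looks_like_refusal(response: str) -> bool:
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)

def guarded_summarize(prompt: str) -> str:
    response = llm_complete(prompt)  # llm_complete() as stubbed in the sketch above
    if looks_like_refusal(response):
        # Surface the guardrail failure instead of returning the refusal.
        raise RuntimeError("LLM guardrail refusal detected for prompt")
    return response
```

This only detects the failure; it cannot fix it, since the block lives on the vendor's side.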

Importantly, even for non-news-related applications, the ever-changing guardrails of commercial LLM APIs mean that RAG-based workflows will be highly unstable across all domains. Imagine a shipping company that does business in Israel and uses LLMs to analyze its shipping manifests – a task entirely unrelated to news processing. In the aftermath of October 7th, most commercial LLMs fail in highly unexpected ways when asked to process content relating to Israel, even when that content is unrelated to events in Gaza, meaning that even a seemingly innocuous shipping manifest processing pipeline could suddenly fail without warning – collateral damage from the careening, reactive way in which LLM vendors tend to implement their guardrail systems today.