Today's Wall Street Journal explores how a growing number of fast food chains are using AI-powered speech recognition, rather than human employees, to take customer orders in their drive-throughs. A smaller but growing set of companies is going even further: using LLMs to completely take over their primary customer relations channels, with severe reputational risks.
One well-known US pizza chain offers a case in point. The company uses an SMS short code as its primary support contact, sending customers a steady stream of promotional offers by text and asking customers with support needs to text that same number. It does not offer a support phone number and notes that customer service emails may not be answered in a timely fashion, making the short code effectively its sole point of contact.
For at least the past year the company has had a recurring backend problem: roughly every two to three weeks it texts a promotional code that is rejected by its website and stores due to a sync issue between its promotional database and its order processing system. In the past, customers simply replied to the promotional text and typically received a corrected code within 20 seconds. Company policy has long required that customers send a screen capture of the original text when requesting a corrected code; those who merely copy-paste the text of the offer are asked to send a screen capture before a replacement is issued.
This past May, the chain replaced those frontline human customer support representatives with an LLM.
A week later I received a promotional text via the SMS short code, along with an email carrying the same offer (as is typical). When the code did not work, I replied with the usual screen capture. This time the response was very different. Instead of a replacement code I was greeted with: "I'm sorry you're experiencing issues. I'm afraid as an AI language model, I don't have access to your previous queries, so I'm not sure what code you're referring to. Regarding the IMAGEUPLOAD, I'd be happy to forward your concern to the team. Could you please provide me with more details about the issue you're facing and attach any relevant images that you'd like to share with us?"
When I attempted to attach the image again, I received a similar message referencing an "IMAGEUPLOAD," asking me to attach relevant images and stating that it needed further information before it could take action. Given that most commercial LLMs are not multimodal at this time, the company likely built a workflow around image attachments without testing it under real-world conditions: its policy requires an image upload, the LLM cannot process image uploads, and so customers simply get stuck in a perpetual feedback loop.
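To make the failure mode concrete, here is a minimal sketch, assuming a hypothetical message-routing layer (the field names and routing targets below are illustrative, not anything the company has disclosed), of the kind of pre-routing check that would break this loop by escalating any message carrying an image to a human rather than forwarding it to a text-only LLM:

```python
# Illustrative sketch (hypothetical, not the company's actual system): a simple
# pre-routing check that escalates any inbound message with an image attachment
# to a human queue instead of handing it to a text-only LLM.

def route_inbound_message(message: dict) -> str:
    """Decide where an inbound SMS/MMS support message should go.

    `message` is assumed to look like:
        {"text": "My promo code doesn't work", "attachments": ["promo.png"]}
    """
    has_image = any(
        name.lower().endswith((".jpg", ".jpeg", ".png", ".gif"))
        for name in message.get("attachments", [])
    )

    # A text-only LLM cannot read the screen capture that company policy
    # requires, so anything with an image goes straight to a human.
    if has_image:
        return "human_queue"

    return "llm_frontline"


if __name__ == "__main__":
    print(route_inbound_message({"text": "Code didn't work", "attachments": ["promo.png"]}))
    # -> human_queue
    print(route_inbound_message({"text": "What time do you close?", "attachments": []}))
    # -> llm_frontline
```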
After an endless cycle of the LLM requesting an image and then being unable to process the attached image, I was finally able to coax it into handing the thread over to a human representative. Yet the company appears to have gone even further with its LLM use. When the LLM hands a thread over to a human representative, it categorizes the conversation and provides a summary and other information to that representative. In this case, the LLM had summarized it as a refund request, so the human opened with: "We need a copy of the transaction in order to provide a refund. Your order information is not listed anywhere so we cannot provide a refund without proof of transaction." After further back-and-forth, it became clear that the human representative is not given a copy of the customer's interaction with the LLM up to that point, only a digested summary. Thus, in my case, after 10 minutes of back-and-forth with the LLM about an invalid promotional code that I needed replaced, the human was presented with only a one-line summary stating that I was a customer requesting a refund for a missing order. It is remarkable that, after decades of call center modernization, the LLM transition is wiping clean all of the hard-earned knowledge about preserving the transcript of a communication when handing it over to another agent.
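As a rough illustration of what was apparently lost in that handoff, here is a minimal sketch, using hypothetical field names rather than anything the company has described, of a handoff payload that carries the full transcript alongside the LLM's summary, so that a mis-categorized conversation does not strand the customer:

```python
# Illustrative sketch (hypothetical field names): a handoff payload that
# preserves the full message transcript alongside the LLM's one-line summary,
# so the human representative sees the actual conversation rather than only
# the LLM's categorization.

from dataclasses import dataclass, field
from typing import List


@dataclass
class HandoffPacket:
    category: str                                          # LLM's guess, e.g. "refund_request"
    summary: str                                           # one-line digest produced by the LLM
    transcript: List[dict] = field(default_factory=list)   # every message exchanged so far


def build_handoff(category: str, summary: str, messages: List[dict]) -> HandoffPacket:
    # Key design choice: the raw transcript travels with the ticket, so a
    # wrong category or summary can be corrected by reading the history.
    return HandoffPacket(category=category, summary=summary, transcript=list(messages))


if __name__ == "__main__":
    history = [
        {"from": "customer", "text": "The promo code you texted me isn't working."},
        {"from": "llm", "text": "Could you attach any relevant images?"},
    ]
    packet = build_handoff("refund_request", "Customer requesting a refund.", history)
    print(packet.category, "-", len(packet.transcript), "messages preserved")
```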
Eventually the customer service representative provided a replacement code, but only after delays of up to a minute between texts, suggesting the human support team was cut back sharply in the transition to the LLM-based support model and that only a handful of representatives remain.
Remarkably, a week later the process repeated itself, this time with more dangerous results. Since I now knew the LLM-based system required copy-pasted text rather than screen captures, I pasted a copy of the promotional offer. The official support channel replied: "That message is not from us at [COMPANY NAME]. It appears to be a scam message. Please do not click on any links or provide any personal information to the sender."
This prompted an urgent review of my communications with the company, verifying that the offer had arrived through both the SMS short code and email, and that the email could be traced back definitively to the company's own servers.
Immediately I feared that the company had fallen victim to a cybersecurity breach: the promotional text and email had come from the company's own email server and short code, yet here was the company stating it never sent that communication. The seemingly inescapable conclusion was that the company had been breached and that all of the personal information it held on me might now be for sale on the dark web.
Yet, the most remarkable thing happened when I replied that I had confirmed definitively that the message came from the company itself. The LLM wrote back: "I'm sorry for the confusion. The message you received is a legitimate message from us. It will expire tomorrow, so feel free to take advantage of it while it lasts! Let me know if you have any further questions or if you need any assistance with your order."
This led to a repeat of the week prior: several minutes of messaging back and forth with the LLM, attempting to get it to understand that the code didn't work and that I needed a new one, until I was finally able to get it to hand the thread over to a human agent, who took tens of minutes to correct the issue rather than the tens of seconds it previously took.
In an era of heightened cyber vigilance, it is nothing short of extraordinary for a company to inform its customers that legitimate communications sent through its official channels are scams and that they should not click on any links or provide any information lest they fall victim to identity theft or fraud. To an American public increasingly aware of digital fraud, identity theft and the dangers of corporate cyberattacks exposing their personal information to scammers, it is an immensely serious matter when a company officially notifies a customer that a communication they received from it was fraudulent, a notification that, because the message came from the company's official account, implies the company has experienced a cybersecurity breach and its systems have been compromised.
All of this, only for the company to nonchalantly say "oops" moments later and inform the customer that nothing it had just said was true, and then to push the customer back into an endless loop of messages until they can eventually compel the LLM to hand the conversation over to a human.
Asked for comment on all of this, the company remained silent despite multiple requests. Asked specifically why it hadn't wrapped guardrails around its LLM output to flag sensitive outputs for human review before sending, the company continued to remain silent.
It also raises critical questions about whether the LLM is an in-house model or a commercially hosted service and, if the latter, with whom private customer data is being shared and whether the terms of the company's license with the LLM vendor allow that vendor to fold private customer communications into its model training, where they could be regurgitated later.
In 2023 it is nothing short of remarkable that a company would deploy an LLM, without explicit guardrails to prevent it from hallucinating stories about cybersecurity breaches and fraudulent impersonation, into a mission-critical, highly visible frontline role like customer support that involves processing sensitive financial and personal data. Companies thinking of deploying LLMs into frontline roles should apply post-LLM filtering that flags any sensitive content for human review before it is sent. Even then, the strongly hallucination-prone nature of today's LLMs raises critical questions about just how safely companies can deploy them to the front lines.
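As a minimal sketch of what such post-LLM filtering might look like, assuming a simple pattern-based screen (a real deployment would more likely use a trained classifier, and the patterns below are purely illustrative), any draft reply touching on scams, breaches, refunds or identity theft could be held for human review rather than sent automatically:

```python
# A minimal sketch of a post-LLM output filter (hypothetical patterns; a
# production system would likely use a trained classifier rather than
# keywords): any draft reply that touches on scams, breaches, refunds or
# identity theft is held for human review instead of being sent automatically.

import re

SENSITIVE_PATTERNS = [
    r"\bscam\b",
    r"\bfraud(ulent)?\b",
    r"\bbreach\b",
    r"\bidentity theft\b",
    r"\brefund\b",
    r"\bnot from us\b",
]


def review_required(draft_reply: str) -> bool:
    """Return True if the LLM's draft reply should go to a human before sending."""
    return any(re.search(p, draft_reply, re.IGNORECASE) for p in SENSITIVE_PATTERNS)


if __name__ == "__main__":
    draft = ("That message is not from us. It appears to be a scam message. "
             "Please do not click on any links.")
    if review_required(draft):
        print("HOLD: route to human review before sending")
    else:
        print("SEND: low-risk reply")
```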