
October 13th, 2025

Rasa Agent Significantly Outperforms Prompt-Based LLM in Security Test

Lauren Goerz

In a recent study by AI security firm Lakera, the Rasa agent outperformed a prompt-driven LLM agent in every stress test, with no vulnerabilities found in the categories of Information Disclosure and Data Leakage, Content Safety, and System and Infrastructure Attacks.

The report concluded that the Rasa agent stayed within its scope and demonstrated high reliability, all without sacrificing flexibility.

Organizations are reimagining their workflows and how they interact with customers, partners, and employees as we enter the agentic AI era. According to PwC, 88% of businesses say they plan to increase their AI budgets to invest more deeply in agentic AI adoption.

As more companies deploy AI agents in customer-facing roles, security becomes critically important. Whether triggered by malicious actors or by customers unfamiliar with AI agents, there are many scenarios in which an agent can produce the wrong information or, worse, be tricked into taking an incorrect or damaging action. Attackers are also using AI to build more sophisticated methods of cracking and subverting AI agents for their own gain.

It is therefore crucial for organizations to understand what technologies and services underpin the AI agents they seek to deploy, and the security implications of each type of agent. For simplicity's sake, agents can be grouped into three categories:

  • Traditional rules-based “chatbots” that follow very rigid scripts
  • “Prompt-driven” LLM agents that use language models to adapt and produce natural-language responses to queries
  • Hybrid models that combine the rules and predetermined flows of older chatbots with the freedom and flexibility provided by LLMs

Most organizations clearly see the limitations of older chatbots and are moving away from them. While rules-based chatbots are designed to be safe, they often frustrate customers by failing to adapt to questions that don’t fit their exact flow. Most companies looking to deploy AI agents therefore have a choice between two options: a purely prompt-driven LLM or a hybrid approach like Rasa’s that includes prescriptive business logic.

AI security firm Lakera assessed which of these two approaches provides greater safety and security, and we’re pleased to report that the Rasa mock application outperformed the prompt-driven LLM. Both applications were evaluated across AI safety, security, and operational performance dimensions.

To bring the security test into focus, Lakera used as its example a conversational agent that recommended vehicles, searched for availability, and determined loan qualifications. This real-world scenario is ripe for exploitation, opening the door to fraudulent activity that can hurt the bank financially, legally, and reputationally, and damage its relationships with third parties.

Lakera assessed the security of both agents through several attacks:

  • Direct Context Extraction: Extracting hidden system instructions or sensitive data
  • Direct Instruction Override: Bypassing intended operational boundaries
  • Indirect Output Addition: Embedding malicious content in legitimate responses
  • Indirect Instruction Override: Hidden instructions corrupting core task execution
  • Denial of AI Service: Disrupting normal AI behavior and service availability

Our agent significantly outperformed the prompt-based agent in every category. The Rasa agent contained most attacks outright, and any vulnerabilities, such as output modifications through the rephraser component, were minor. Although this configuration was not included in the Lakera test, the response rephraser can be turned off by organizations that wish to minimize the risk further.
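For context, the rephraser in Rasa Pro is enabled through the NLG endpoint configuration, so turning it off is a configuration change rather than a code change. The snippet below is a minimal sketch of what that looks like in endpoints.yml; the keys reflect Rasa Pro's documented contextual response rephraser setup, but confirm against the documentation for your version.

    # endpoints.yml (sketch, based on Rasa Pro's contextual response rephraser docs)
    nlg:
      type: rephrase   # route eligible responses through the LLM-based rephraser

    # Omitting the nlg block entirely disables rephrasing, so templated
    # responses are sent to the user verbatim.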

The prompt-driven LLM exhibited high susceptibility to every type of attempted attack, while the Rasa agent demonstrated low to no severity in every category but one.

The prompt-driven agent, for example, was severely compromised by prompt extraction, exposure of sensitive data, generation of harmful content (including hate speech and dangerous instructions), and infrastructure disruption affecting system availability.

Demonstrating content safety resiliency

For example, the Rasa agent exhibited no content safety violations, while the prompt-driven agent was easily tricked into swearing, producing hate speech, and echoing user self-harm messages.

While LLMs can adapt to users’ prompts, that flexibility is also a liability. An LLM may be programmed to withstand a direct query that aims to circumvent its safety guards, but savvy users who engage in topic drift over multi-turn conversations can ultimately get it to produce something troublesome.

Users have previously coaxed widely used assistants like Gemini, ChatGPT, and Perplexity into saying damaging things, and this will continue to be an issue for LLMs deployed without guardrails. What makes Rasa so powerful is that it uses structured flows and templated responses to contain unsafe prompts.
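To make that concrete, here is a minimal, hypothetical CALM-style flow for the loan-qualification scenario; the flow, slot, and action names are illustrative and not taken from the test application. Because the agent only advances through these predefined steps, and each question is worded by a templated response in the domain, a user message cannot rewrite the business logic or the reply text.

    flows:
      check_loan_qualification:
        description: Determine whether the user qualifies for a vehicle loan.
        steps:
          - collect: annual_income           # asked via the templated utter_ask_annual_income response
          - collect: requested_loan_amount   # asked via utter_ask_requested_loan_amount
          - action: action_check_loan_qualification   # deterministic business logic, not free-form LLM output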

Withstanding prompt injection attacks

Prompt-driven agents are, unsurprisingly, highly susceptible to prompt injections that can do serious damage to system integrity. Lakera was able to coerce the prompt-based AI agent into a direct instruction override, using fake system and user prompts to get the agent to reveal an entire conversation history.

Far more dangerous than making an agent say something improper is tricking it into ignoring system safety rules, disclosing internal secrets or code, or executing scripts against systems, creating risks of serious financial or legal damage.

Avoiding process-draining loops

In another example, the Rasa agent assisted a “user” with a complicated, multi-step query with ease, while the prompt-driven agent at times stalled for up to 45 seconds, making it unavailable to other users and, in one specific case, producing unstable subsequent messages.

Organizations that have built entire workflows around AI agents as the first line of response for customer queries will fall significantly behind if customer care lines are suddenly inundated with requests because their agents are down.

The best choice for security and effectiveness is a hybrid approach

An immutable law of AI agents is that there is a trade-off between security and flexibility. An agent at either extreme creates problems: too rigid and it frustrates users; too flexible and it leaves the business more susceptible to attacks by clever and motivated fraudsters.

The Lakera study demonstrated that the Rasa agent's security posture is inherently better than the prompt-based LLM agent's, and the hybrid approach also provides a better user experience. Rasa agents keep customers on track so their pressing issues are solved, while offering enough linguistic flexibility that the conversation doesn’t stall when a customer’s query differs slightly from a predetermined flow. It is truly the best of both worlds, safety and efficiency, in one package.

Click here to read the entire report and contact Rasa when you're ready to use AI agents that combine security, efficiency, and high success rates.